José G. Quenum
May I please have a concrete example of how to read from and write to a Kafka stream using the Spark.jl implementation? I can't find any example in...
Thanks for that. Could you also provide documentation on how to read and write a stream using the library? I've struggled to get it to access the data.
I've updated the session as follows (note the `spark-tags` jar name uses the standard `_2.12` Scala-suffix convention):

```Julia
spark = SparkSession.builder.
    appName("SoftwareTools").
    master("spark://196.216.167.103:7077").
    config("spark.jars",
        "/home/sysdev/spark-3.5.2-bin-hadoop3/jars/commons-pool2-2.11.1.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-sql-kafka-0-10_2.12-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-streaming_2.12-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/kafka-clients-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-token-provider-kafka-0-10_2.12-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-tags_2.12-3.5.2.jar").
    getOrCreate()
```

Then when I do

```Julia
stream_df = spark.readStream.
    format("kafka").
    option("kafka.bootstrap.servers", "localhost:9092").
    option("subscribe", "st-streaming-session").
    option("includeHeaders", "true").
    option("startingOffsets", "earliest").
    load()
rec_evts = stream_df.select(Column("key"), Column("value"))
query = rec_evts.writeStream().outputMode("append").format("console").start().awaitTermination()...
```
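For reference, a minimal read-and-write round trip could be sketched as follows. This assumes Spark.jl's Compose-style API mirrors PySpark (only the calls already used above, plus the standard Kafka sink options); the output topic `st-streaming-out` and the `checkpointLocation` path are placeholders, not names from the thread:

```Julia
using Spark

# Assumes a session `spark` created as above, with the Kafka connector jars
# (spark-sql-kafka-0-10, kafka-clients, commons-pool2, ...) on the classpath.

# Read side: subscribe to a topic; Kafka delivers key/value as binary columns.
stream_df = spark.readStream.
    format("kafka").
    option("kafka.bootstrap.servers", "localhost:9092").
    option("subscribe", "st-streaming-session").
    option("startingOffsets", "earliest").
    load()

# The Kafka sink expects key/value columns on the write side as well.
rec_evts = stream_df.select(Column("key"), Column("value"))

# Write side: publish to another topic; a Kafka sink requires a checkpoint directory.
query = rec_evts.writeStream().
    format("kafka").
    option("kafka.bootstrap.servers", "localhost:9092").
    option("topic", "st-streaming-out").                   # placeholder topic
    option("checkpointLocation", "/tmp/kafka-checkpoint"). # placeholder path
    start()
query.awaitTermination()
```

This needs a running Spark master and Kafka broker, so it is a sketch rather than something runnable standalone.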
Is it normal that printing to the console with `rec_evts.writeStream().outputMode("append").format("console").start().awaitTermination()` returns an empty result when there is data in Kafka? I've tried this in the REPL and it just returns the...
It still doesn't print anything. I get the following printout:

```
-------------------------------------------
Batch: 0
-------------------------------------------
+-----+
|value|
+-----+
+-----+
```

Although it does a micro-batch execution and prints the details below...
Please find below both the Julia and the Scala versions. The Scala version shows the messages in spark-shell.

1 - Julia

```Julia
using Spark

spark = SparkSession.builder.
    appName("SoftwareTools").
    master("spark://IP:7077").
    config("spark.jars",
        "/home/sysdev/spark-3.5.2-bin-hadoop3/jars/commons-pool2-2.12.0.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-streaming_2.12-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-sql-kafka-0-10_2.12-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/kafka-clients-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-token-provider-kafka-0-10_2.12-3.5.2.jar,/home/sysdev/spark-3.5.2-bin-hadoop3/jars/spark-tags_2.12-3.5.2.jar").
    getOrCreate()
stream_df =...
```
I've actually tried both offsets, but I get no output.
```
openjdk version "1.8.0_412"
OpenJDK Runtime Environment (build 1.8.0_412-b08)
OpenJDK 64-Bit Server VM (build 25.412-b08, mixed mode)
```