Failed to Create StreamingContext for Event Hubs on HDInsight
Environment:
- Spark 2.0 on Linux (HDInsight 3.5)
- Mobius v2.0.200
- .NET 4.6.1
- Mono JIT compiler 5.4.1.6
I've run into an issue creating a StreamingContext for an Event Hubs processor. The error reported in YARN is as follows:
[2018-01-12T02:06:18.2598860Z] [wn0-openha] [Info] [StreamingContextIpcProxy] Callback server port number is 46270
[2018-01-12T02:06:18.3164990Z] [wn0-openha] [Error] [JvmBridge] JVM method execution failed: Constructor failed for class org.apache.spark.streaming.api.java.JavaStreamingContext when called with 1 parameters ([Index=1, Type=String, Value=hdfs://mycluster/checkpoints], )
[2018-01-12T02:06:18.3165750Z] [wn0-openha] [Error] [JvmBridge] java.lang.IllegalArgumentException: requirement failed: Spark Streaming cannot be initialized with both SparkContext and checkpoint as null
at scala.Predef$.require(Predef.scala:224)
...
[2018-01-12T02:06:18.3181830Z] [wn0-openha] [Exception] [JvmBridge] JVM method execution failed: Constructor failed for class org.apache.spark.streaming.api.java.JavaStreamingContext when called with 1 parameters ([Index=1, Type=String, Value=hdfs://mycluster/checkpoints], )
at Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge.CallJavaMethod (System.Boolean isStatic, System.Object classNameOrJvmObjectReference, System.String methodName, System.Object[] parameters) [0x0005f] in <6f66514957744af8a393c7667e586f58>:0
Unhandled Exception:
System.Exception: JVM method execution failed: Constructor failed for class org.apache.spark.streaming.api.java.JavaStreamingContext when called with 1 parameters ([Index=1, Type=String, Value=hdfs://mycluster/checkpoints], )
at Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge.CallJavaMethod (System.Boolean isStatic, System.Object classNameOrJvmObjectReference, System.String methodName, System.Object[] parameters) [0x00144] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge.CallConstructor (System.String className, System.Object[] parameters) [0x00000] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Proxy.Ipc.StreamingContextIpcProxy..ctor (System.String checkpointPath) [0x00033] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Proxy.Ipc.SparkCLRIpcProxy.CreateStreamingContext (System.String checkpointPath) [0x00000] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Streaming.StreamingContext.GetOrCreate (System.String checkpointPath, System.Func`1[TResult] creatingFunc) [0x00020] in <6f66514957744af8a393c7667e586f58>:0
at Test+test.main (System.String[] param) [0x00168] in <5a5812ead9c12e4ea7450383ea12585a>:0
[ERROR] FATAL UNHANDLED EXCEPTION: System.Exception: JVM method execution failed: Constructor failed for class org.apache.spark.streaming.api.java.JavaStreamingContext when called with 1 parameters ([Index=1, Type=String, Value=hdfs://mycluster/checkpoints], )
at Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge.CallJavaMethod (System.Boolean isStatic, System.Object classNameOrJvmObjectReference, System.String methodName, System.Object[] parameters) [0x00144] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge.CallConstructor (System.String className, System.Object[] parameters) [0x00000] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Proxy.Ipc.StreamingContextIpcProxy..ctor (System.String checkpointPath) [0x00033] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Proxy.Ipc.SparkCLRIpcProxy.CreateStreamingContext (System.String checkpointPath) [0x00000] in <6f66514957744af8a393c7667e586f58>:0
at Microsoft.Spark.CSharp.Streaming.StreamingContext.GetOrCreate (System.String checkpointPath, System.Func`1[TResult] creatingFunc) [0x00020] in <6f66514957744af8a393c7667e586f58>:0
at Test+test.main (System.String[] param) [0x00168] in <5a5812ead9c12e4ea7450383ea12585a>:0
The checkpoint directory is being created on HDFS. Here is the output of:
$ hadoop fs -ls hdfs://mycluster/checkpoints
Found 1 items
drwxr-xr-x - wxxxxxxxxx hdfs 0 2018-01-12 02:05 hdfs://mycluster/checkpoints/b51a1a4f-b49e-48d1-8fdc-1234ee88984e
I'm using the following command to run the job:
../mobius/runtime/scripts/sparkclr-submit.sh \
--master yarn \
--deploy-mode cluster \
--jars /home/{username}/mobius/runtime/dependencies/eventhubs/spark-streaming-eventhubs_2.11-2.0.3.jar,/home/{username}/mobius/runtime/dependencies/eventhubs/eventhubs-client-1.0.1.jar,/home/{username}/mobius/runtime/dependencies/eventhubs/qpid-amqp-1-0-client-0.32.jar,/home/{username}/mobius/runtime/dependencies/eventhubs/qpid-amqp-1-0-common-0.32.jar \
--exe test.exe /home/{username}/test
POM dependencies used to pull the JARs:
<dependencies>
  <!-- https://mvnrepository.com/artifact/com.microsoft.azure/spark-streaming-eventhubs -->
  <dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>spark-streaming-eventhubs_2.11</artifactId>
    <version>2.0.3</version>
  </dependency>
  <!-- https://mvnrepository.com/artifact/com.microsoft.eventhubs.client/eventhubs-client -->
  <dependency>
    <groupId>com.microsoft.eventhubs.client</groupId>
    <artifactId>eventhubs-client</artifactId>
    <version>1.0.1</version>
  </dependency>
  <!-- https://mvnrepository.com/artifact/org.apache.qpid/qpid-amqp-1-0-client -->
  <dependency>
    <groupId>org.apache.qpid</groupId>
    <artifactId>qpid-amqp-1-0-client</artifactId>
    <version>0.32</version>
  </dependency>
  <!-- https://mvnrepository.com/artifact/org.apache.qpid/qpid-amqp-1-0-common -->
  <dependency>
    <groupId>org.apache.qpid</groupId>
    <artifactId>qpid-amqp-1-0-common</artifactId>
    <version>0.32</version>
  </dependency>
</dependencies>
What I find interesting is that the print statements in the lambda passed to StreamingContext.GetOrCreate are never executed, so is it possible that my Func<StreamingContext> has an error? A sketch of what the lambda does is below. I did consult the compatibility charts and the handful of issues that list known-working JAR combinations, but it could still be a JAR issue.
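For reference, here is roughly what my creating function does. This is a trimmed sketch modeled on the Mobius EventHubs sample, not my exact code: the class and app names, the Event Hubs parameter values, and the batch interval are placeholders.

using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.Spark.CSharp.Core;
using Microsoft.Spark.CSharp.Streaming;

class Test
{
    static void Main(string[] args)
    {
        const string checkpointPath = "hdfs://mycluster/checkpoints";

        StreamingContext ssc = StreamingContext.GetOrCreate(checkpointPath, () =>
        {
            // This line never appears in the YARN logs.
            Console.WriteLine("Creating a new StreamingContext");

            var sparkContext = new SparkContext(new SparkConf().SetAppName("EventHubsTest"));
            var context = new StreamingContext(sparkContext, 10000L); // batch interval in ms
            context.Checkpoint(checkpointPath);

            // Parameter keys follow the Mobius EventHubs sample; all values are placeholders.
            var eventHubsParams = new Dictionary<string, string>
            {
                { "eventhubs.policyname", "<policyname>" },
                { "eventhubs.policykey", "<policykey>" },
                { "eventhubs.namespace", "<namespace>" },
                { "eventhubs.name", "<eventhubname>" },
                { "eventhubs.partition.count", "<partitioncount>" },
                { "eventhubs.consumergroup", "$default" },
                { "eventhubs.checkpoint.dir", checkpointPath },
                { "eventhubs.checkpoint.interval", "10" }
            };

            var messages = EventHubsUtils.CreateUnionStream(context, eventHubsParams);

            // These prints never show up either.
            messages.Map(bytes => Encoding.UTF8.GetString(bytes)).Print();

            return context;
        });

        ssc.Start();
        ssc.AwaitTermination();
    }
}

The "Creating a new StreamingContext" line never appears in the YARN logs, and the JVM error shows JavaStreamingContext being constructed with the single string parameter hdfs://mycluster/checkpoints, so it looks as if the checkpoint-recovery path is taken without my creating function ever running.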
Any help would be appreciated!