Spark + Java 11 requires additional config param for Apache Arrow

Open takuti opened this issue 5 years ago • 0 comments

According to the latest Spark documentation:

For Java 11, -Dio.netty.tryReflectionSetAccessible=true is required additionally for Apache Arrow library. This prevents java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available when Apache Arrow uses Netty internally.

We'd need to tweak pytd/spark.py to handle this case.

Something like this:

from pyspark import SparkConf

# Pass the Netty flag to both the driver and executor JVMs.
conf = (
    SparkConf()
    .setMaster("local[*]")
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .set("spark.sql.execution.arrow.pyspark.enabled", "true")
    .set("spark.executor.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
    .set("spark.driver.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
)
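One caveat with setting the two extraJavaOptions keys directly is that it would overwrite any JVM options the user has already supplied. A minimal sketch of a merge step that avoids this, assuming the options are available as a plain dict before being applied to the SparkConf. The helper name `with_netty_flag` and the constants are hypothetical and not part of pytd:

```python
# Hypothetical helper, not part of pytd: add the Netty flag without
# discarding JVM options that are already configured.
NETTY_FLAG = "-Dio.netty.tryReflectionSetAccessible=true"
JAVA_OPTION_KEYS = (
    "spark.driver.extraJavaOptions",
    "spark.executor.extraJavaOptions",
)


def with_netty_flag(options):
    """Return a copy of `options` with NETTY_FLAG present under both keys."""
    merged = dict(options)
    for key in JAVA_OPTION_KEYS:
        existing = merged.get(key, "")
        # Append only if the flag is not already one of the options.
        if NETTY_FLAG not in existing.split():
            merged[key] = (existing + " " + NETTY_FLAG).strip()
    return merged
```

The resulting dict could then be applied with one `conf.set(key, value)` call per entry. Whether the flag should be added unconditionally or only when running on Java 11 is left open here.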

takuti, Feb 02 '21 03:02