[SPARK-46971][SQL] When the `compression` is null, a `NullPointerException` should not be thrown
What changes were proposed in this pull request?
This PR provides a clearer error message when the `compression` option is null.
Why are the changes needed?
In the original logic, if `compression` is null, Spark throws a `NullPointerException`, which surfaces to the user as an `INTERNAL_ERROR` and gives no hint about what went wrong. For example:
val df = (1 to 5).map(i => ((i % 2).toString)).toDF("a")
df.write.option("compression", null).text("test1")
Before:
scala> df.write.option("compression", null).text("test1")
org.apache.spark.SparkException: [INTERNAL_ERROR] Eagerly executed command failed. You hit a bug in Spark or the Spark plugins you use. Please, report this bug to the corresponding communities or vendors, and provide the full stack trace. SQLSTATE: XX000
at org.apache.spark.SparkException$.internalError(SparkException.scala:107)
at org.apache.spark.sql.execution.QueryExecution$.toInternalError(QueryExecution.scala:550)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:562)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:119)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:109)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:442)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:442)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:34)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:271)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:34)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:34)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:418)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:109)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:96)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:156)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:892)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:389)
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:362)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:240)
at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:834)
... 42 elided
Caused by: java.lang.NullPointerException: Cannot invoke "String.toLowerCase(java.util.Locale)" because "name" is null
at org.apache.spark.sql.catalyst.util.CompressionCodecs$.getCodecClassName(CompressionCodecs.scala:38)
at org.apache.spark.sql.execution.datasources.text.TextOptions.$anonfun$compressionCodec$1(TextOptions.scala:38)
at scala.Option.map(Option.scala:242)
... 17 elided and 62 more
After:
scala> df.write.option("compression", null).text("test1")
org.apache.spark.SparkIllegalArgumentException: [CODEC_NOT_AVAILABLE.WITH_AVAILABLE_CODECS_SUGGESTION] The codec NULL is not available. Available codecs are bzip2, deflate, uncompressed, snappy, none, lz4, gzip. SQLSTATE: 56038
at org.apache.spark.sql.errors.QueryExecutionErrors$.codecNotAvailableError(QueryExecutionErrors.scala:2716)
at org.apache.spark.sql.catalyst.util.CompressionCodecs$.getCodecClassName(CompressionCodecs.scala:40)
at org.apache.spark.sql.execution.datasources.text.TextOptions.$anonfun$compressionCodec$1(TextOptions.scala:38)
at scala.Option.map(Option.scala:242)
... 79 elided
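The two traces differ at the point where the codec name is lowercased: in the "Before" case `toLowerCase` is invoked on a null `name`, while in the "After" case the null is reported as an unavailable codec. The sketch below illustrates that kind of guard in isolation. It is not Spark's actual `CompressionCodecs` code: the object name `CodecLookupSketch`, the method `resolve`, the two-entry codec table, and the use of `Either` instead of an exception are all assumptions made for illustration.

```scala
import java.util.Locale

// Illustrative sketch only; names and the codec table are hypothetical,
// not taken from Spark's CompressionCodecs.
object CodecLookupSketch {
  // Hypothetical short-name table (a small subset, for illustration).
  private val shortNames: Map[String, String] = Map(
    "gzip"  -> "org.apache.hadoop.io.compress.GzipCodec",
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec"
  )

  private val available: String = shortNames.keys.toSeq.sorted.mkString(", ")

  // Returns Right(className) for a known codec, or Left(message) otherwise.
  // Wrapping `name` in Option(...) turns a null into None, so toLowerCase
  // is never called on a null reference.
  def resolve(name: String): Either[String, String] = {
    def unavailable(shown: String): String =
      s"The codec $shown is not available. Available codecs are $available."

    Option(name) match {
      case None    => Left(unavailable("NULL"))
      case Some(n) => shortNames.get(n.toLowerCase(Locale.ROOT)).toRight(unavailable(n))
    }
  }
}
```

With this shape, `CodecLookupSketch.resolve(null)` yields a `Left` carrying a readable message rather than throwing, and `CodecLookupSketch.resolve("GZIP")` resolves case-insensitively to the gzip codec class name.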
Does this PR introduce any user-facing change?
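Yes. When `compression` is null, Spark now reports a `SparkIllegalArgumentException` with the `CODEC_NOT_AVAILABLE.WITH_AVAILABLE_CODECS_SUGGESTION` error class instead of an internal error caused by a `NullPointerException`.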
How was this patch tested?
- Added a new unit test.
- Passed GitHub Actions.
Was this patch authored or co-authored using generative AI tooling?
No.