[SUPPORT] IllegalArgumentException at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:33)
Describe the problem you faced
I have two jobs:
- JobA uses Spark 3.3.2 + Hudi 0.14.1 to append batch log data;
- JobB uses Spark 3.3.2 + Hudi 0.12.3 to read the log data produced by JobA. Then the error in the title occurs in JobA.
Expected behavior
We don't write concurrently, so we should not get this error.
Environment Description
- Hudi version : 0.14.1 (single Spark writer), 0.12.3 (multiple Spark readers with hoodie.metadata.enable set to improve read performance)
- Spark version : 3.3.2
- Hive version : none
- Hadoop version : 3.3.6
- Storage (HDFS/S3/GCS..) : GCS
- Running on Docker? (yes/no) : no, Dataproc
Stacktrace
org.apache.hudi.exception.HoodieException: Failed to apply clean commit to metadata
at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:91) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.runClean(CleanActionExecutor.java:227) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.runPendingClean(CleanActionExecutor.java:193) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.execute(CleanActionExecutor.java:263) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.clean(HoodieSparkCopyOnWriteTable.java:291) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieTableServiceClient.clean(BaseHoodieTableServiceClient.java:763) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:862) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:835) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: java.lang.IllegalArgumentException
at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:33) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:618) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:599) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:224) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.commit(BaseSparkCommitActionExecutor.java:311) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.autoCommit(BaseCommitActionExecutor.java:201) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.commitOnAutoCommit(BaseCommitActionExecutor.java:183) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.updateIndexAndCommitIfNeeded(BaseSparkCommitActionExecutor.java:279) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:184) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:126) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:88) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:156) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:63) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.commitInternal(HoodieBackedTableMetadataWriter.java:1099) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:117) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.processAndCommit(HoodieBackedTableMetadataWriter.java:810) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.update(HoodieBackedTableMetadataWriter.java:899) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:86) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
... 12 more
2024-03-22 02:54:03 ERROR HoodieAsyncService:105 - Service shutdown with error
java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Failed to apply clean commit to metadata
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) ~[?:?]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.async.AsyncCleanerService.waitForCompletion(AsyncCleanerService.java:75) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieTableServiceClient.asyncClean(BaseHoodieTableServiceClient.java:133) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.autoCleanOnCommit(BaseHoodieWriteClient.java:595) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.mayBeCleanAndArchive(BaseHoodieWriteClient.java:579) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:248) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:104) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.HoodieSparkSqlWriterInternal.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:1081) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:520) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:204) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:121) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:150) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560) ~[spark-catalyst_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:116) ~[spark-sql_2.12-3.3.2.jar:0.14.1]
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:860) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:390) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:363) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239) ~[spark-sql_2.12-3.3.2.jar:3.3.2]
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) ~[spark-core_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:973) ~[spark-core_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) ~[spark-core_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) ~[spark-core_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) ~[spark-core_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1061) ~[spark-core_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1070) ~[spark-core_2.12-3.3.2.jar:3.3.2]
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) ~[spark-core_2.12-3.3.2.jar:3.3.2]
Caused by: org.apache.hudi.exception.HoodieException: Failed to apply clean commit to metadata
at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:91) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.runClean(CleanActionExecutor.java:227) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.runPendingClean(CleanActionExecutor.java:193) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.execute(CleanActionExecutor.java:263) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.clean(HoodieSparkCopyOnWriteTable.java:291) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieTableServiceClient.clean(BaseHoodieTableServiceClient.java:763) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:862) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:835) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: java.lang.IllegalArgumentException
at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:33) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:618) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:599) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:224) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.commit(BaseSparkCommitActionExecutor.java:311) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.autoCommit(BaseCommitActionExecutor.java:201) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.commitOnAutoCommit(BaseCommitActionExecutor.java:183) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.updateIndexAndCommitIfNeeded(BaseSparkCommitActionExecutor.java:279) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:184) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:126) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:88) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:156) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:63) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.commitInternal(HoodieBackedTableMetadataWriter.java:1099) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:117) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.processAndCommit(HoodieBackedTableMetadataWriter.java:810) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.update(HoodieBackedTableMetadataWriter.java:899) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:86) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.runClean(CleanActionExecutor.java:227) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.runPendingClean(CleanActionExecutor.java:193) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.action.clean.CleanActionExecutor.execute(CleanActionExecutor.java:263) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.clean(HoodieSparkCopyOnWriteTable.java:291) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieTableServiceClient.clean(BaseHoodieTableServiceClient.java:763) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:862) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:835) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55) ~[hudi-spark3.3-bundle_2.12-0.14.1.jar:0.14.1]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
2024-03-22 02:54:03 WARN HoodieSparkSqlWriterInternal:534 - Closing write client
How many writers are there? There might be multiple cleaners.
We have only one writer here but multiple readers. So we have one cleaner within the writer, right?
Yeah, can you check the metadata file state under the .hoodie dir?
Oh, I found some empty files there.
And if I enable metadata on the reader side, a warning is thrown as below:
24/03/22 06:30:09 WARN AbstractHoodieLogRecordReader: TargetInstantTime 20240322043545698 invalid or extra rollback command block in gs://bucket/tables_prod/ods/hudi_log_events/.hoodie/metadata/files/.files-0000-0_20240322033640958001.log.5_1-0-1
24/03/22 06:30:10 WARN BaseTableMetadata: Metadata record for log_date=2024-03-20 encountered some files to be deleted which was not added before. Ignoring the spurious deletes as the `_hoodie.metadata.ignore.spurious.deletes` config is set to true
java.io.FileNotFoundException:
File not found: gs://bucket/tables_prod/ods/hudi_log_events/log_date=2024-03-20/62c503f0-0fca-463a-8643-c1fa688412e5-0_0-8-211_20240320203557276.parquet
How to avoid this error?
@michael1991 Why are we trying to read data written by a newer Hudi version with an older Hudi version? 0.12.3 and 0.14.1 use different table versions anyway and have different log formats.
Also, can you post your writer configurations?
- Because we are ingesting event log data without a record key, and before 0.14 we had to set a record key for the writer. Meanwhile, in GCP Dataproc, Hudi 0.12.3 is the default version, so we use the 0.12.3 bundle to read data generated by the 0.14.1 bundle.
- Our writer configurations:
lazy val EVENT_HUDI_CONF_MAP: Map[String, String] = Map(
  "hoodie.database.name" -> "database",
  "hoodie.table.name" -> "log_events",
  "hoodie.schema.on.read.enable" -> "true",
  "hoodie.combine.before.upsert" -> "false",
  "hoodie.datasource.write.partitionpath.field" -> "log_date",
  "hoodie.datasource.write.operation" -> "insert_overwrite",
  "hoodie.clean.async" -> "true",
  "hoodie.cleaner.commits.retained" -> "5",
  "hoodie.parquet.compression.codec" -> "snappy",
  "hoodie.copyonwrite.record.size.estimate" -> "64",
  "hoodie.datasource.write.hive_style_partitioning" -> "true"
)
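For context, a map like this is typically passed straight to the DataFrame writer. A minimal usage sketch (the DataFrame `df` and table path `basePath` are placeholders, not taken from the original report):

```scala
// Hypothetical usage sketch: how a config map like EVENT_HUDI_CONF_MAP
// is commonly applied to a Hudi write. `df` and `basePath` are assumed.
df.write
  .format("hudi")
  .options(EVENT_HUDI_CONF_MAP)
  .mode("append") // the actual behavior is driven by hoodie.datasource.write.operation
  .save(basePath)
```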
@michael1991 I don't think that's the right way. We must use 0.14.1 to read the data as well. You can use the OSS Hudi bundle on Dataproc to read the data back and see if the same issue occurs there too. Thanks.
Ok, let me have a try. We also have a Spark job that needs to read base data & event data, then join them to generate a new dataset. The base data has been generated by the 0.12.3 bundle for 1 year, and the event data has been generated by the 0.14.1 bundle since last week. So can we use the 0.14.1 bundle to read two datasets with different table versions, i.e. is the reader side backward compatible?
I disabled metadata on the reader side, so the reader job just treats the files as normal parquet and skips reading metadata. Meanwhile, I disabled metadata on the writer side too, and after a few commits I re-enabled it. But the exception is still thrown at the beginning of every day. The writer job is scheduled hourly and insert-overwrites the whole day partition, since the event data has no record key. So this does not seem to be a writer/reader version compatibility issue.
spark-hudi.log — I got more details for this error: it actually occurs after this rollback error. But from the log, I could not see any error before the rollback happens. Hope this is helpful. I'm not sure whether async cleaning should be enabled when a metadata rollback happens.
Root cause: rollback and async clean are two actions that need to delete the same file, and then a "deadlock" happens. Disabling async cleaning works.
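The workaround above amounts to running the cleaner inline instead of asynchronously. A hedged sketch of the change against the writer config posted earlier (only `hoodie.clean.async` changes; `hoodie.clean.automatic` is shown at its default for clarity):

```scala
// Workaround sketch: run the cleaner inline after each commit instead of
// async, so rollback and clean cannot race on deleting the same files.
val inlineCleanConf: Map[String, String] = EVENT_HUDI_CONF_MAP ++ Map(
  "hoodie.clean.async"     -> "false", // was "true" in the original config
  "hoodie.clean.automatic" -> "true"   // keep automatic (now inline) cleaning
)
```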
Nice findings; this seems like a bug. Can you file a fix for it?
@michael1991 Thanks for identifying the root cause. Do you have a fix in mind? Created a tracking jira for the same - https://issues.apache.org/jira/browse/HUDI-7560
Are you using Spark Structured Streaming or HudiStreamer to write?
I'm using Spark Core and SQL to write, not Structured Streaming or HudiStreamer. In the future, we will try Flink for streaming writes.
@michael1991 Async services do not make sense with Spark Core and SQL writes anyway, so you should use inline table services only.
CC @linliu-code
https://hudi.apache.org/docs/metadata#deployment-model-b-single-writer-with-async-table-services
A lock is needed for async table services with MDT.
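Per the linked deployment model, a single writer with async table services and the metadata table enabled should configure a lock provider. A sketch of that config (class name as documented for recent Hudi versions; verify against the version you run):

```scala
// Sketch per the linked docs: single writer + async table services with the
// metadata table needs a lock provider. InProcessLockProvider is suitable
// when the writer and table services share one Spark driver JVM.
val lockConf: Map[String, String] = Map(
  "hoodie.write.lock.provider" ->
    "org.apache.hudi.client.transaction.lock.InProcessLockProvider"
)
```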
As @xicm mentioned, a lock is needed for async table services; closing the issue then.