
Docker Demo fails at Iceberg sync to Delta

Open sagarlakshmipathy opened this issue 1 year ago • 5 comments

code:

val icebergSourceClientProvider = new IcebergSourceClientProvider()
icebergSourceClientProvider.init(spark.sparkContext.hadoopConfiguration, Collections.emptyMap())
val icebergSourcePerTableConfig = PerTableConfigImpl.builder()
    .tableName(hudiTableName)
    .namespace(namespaceArray)
    .targetTableFormats(Arrays.asList(TableFormat.DELTA))
    .tableBasePath(hudiBasePath)
    .icebergCatalogConfig(icebergCatalogConfig)
    .syncMode(SyncMode.INCREMENTAL)
    .build()
oneTableClient.sync(icebergSourcePerTableConfig, icebergSourceClientProvider) // throws the NoSuchMethodError shown below

error:

java.lang.NoSuchMethodError: org.apache.spark.sql.delta.actions.AddFile.<init>(Ljava/lang/String;Lscala/collection/immutable/Map;JJZLjava/lang/String;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/delta/actions/DeletionVectorDescriptor;)V
  org.apache.xtable.delta.DeltaDataFileUpdatesExtractor.createAddFileAction(DeltaDataFileUpdatesExtractor.java:118)
  org.apache.xtable.delta.DeltaDataFileUpdatesExtractor.lambda$applyDiff$3(DeltaDataFileUpdatesExtractor.java:99)
  java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
  java.util.Iterator.forEachRemaining(Iterator.java:116)
  java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
  java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
  java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
  java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
  java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
  java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
  java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
  java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
  java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
  org.apache.xtable.delta.DeltaDataFileUpdatesExtractor.applyDiff(DeltaDataFileUpdatesExtractor.java:103)
  org.apache.xtable.delta.DeltaDataFileUpdatesExtractor.applySnapshot(DeltaDataFileUpdatesExtractor.java:78)
  org.apache.xtable.delta.DeltaClient.syncFilesForSnapshot(DeltaClient.java:184)
  org.apache.xtable.spi.sync.TableFormatSync.lambda$syncSnapshot$0(TableFormatSync.java:74)
  org.apache.xtable.spi.sync.TableFormatSync.getSyncResult(TableFormatSync.java:160)
  org.apache.xtable.spi.sync.TableFormatSync.syncSnapshot(TableFormatSync.java:70)
  org.apache.xtable.client.OneTableClient.syncSnapshot(OneTableClient.java:179)
  org.apache.xtable.client.OneTableClient.sync(OneTableClient.java:116)
  ammonite.$sess.cell7$Helper.<init>(cell7.sc:11)
  ammonite.$sess.cell7$.<init>(cell7.sc:7)
  ammonite.$sess.cell7$.<clinit>(cell7.sc:-1)
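
For anyone reading the error above: the string after `AddFile.<init>` is a JVM method descriptor, and it names the exact constructor that XTable was compiled against. Below is a minimal sketch that decodes such a descriptor into readable parameter types. `DescriptorDecoder` and `parseParams` are hypothetical names made up for this illustration; they are not part of XTable or Delta.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of XTable or Delta): decodes the parameter
// list of a JVM method descriptor such as the one in the NoSuchMethodError.
class DescriptorDecoder {

    // Maps a JVM primitive type code to its Java name.
    static String primitiveName(char code) {
        switch (code) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            default:
                throw new IllegalArgumentException("Unknown type code: " + code);
        }
    }

    // Returns the parameter types between '(' and ')' in declaration order.
    static List<String> parseParams(String descriptor) {
        int end = descriptor.indexOf(')');
        if (!descriptor.startsWith("(") || end < 0) {
            throw new IllegalArgumentException("Not a method descriptor: " + descriptor);
        }
        List<String> params = new ArrayList<>();
        int i = 1;
        while (i < end) {
            // Each leading '[' adds one array dimension.
            int dims = 0;
            while (descriptor.charAt(i) == '[') {
                dims++;
                i++;
            }
            String base;
            char code = descriptor.charAt(i);
            if (code == 'L') {
                // Object types look like Lpkg/Name; and end at the semicolon.
                int semi = descriptor.indexOf(';', i);
                if (semi < 0 || semi > end) {
                    throw new IllegalArgumentException("Unterminated object type");
                }
                base = descriptor.substring(i + 1, semi).replace('/', '.');
                i = semi + 1;
            } else {
                base = primitiveName(code);
                i++;
            }
            StringBuilder type = new StringBuilder(base);
            for (int d = 0; d < dims; d++) {
                type.append("[]");
            }
            params.add(type.toString());
        }
        return params;
    }

    public static void main(String[] args) {
        // Descriptor copied verbatim from the error in this issue.
        String fromError = "(Ljava/lang/String;Lscala/collection/immutable/Map;JJZ"
            + "Ljava/lang/String;Lscala/collection/immutable/Map;"
            + "Lorg/apache/spark/sql/delta/actions/DeletionVectorDescriptor;)V";
        List<String> params = parseParams(fromError);
        System.out.println(params.size() + " parameters:");
        for (String p : params) {
            System.out.println("  " + p);
        }
    }
}
```

The decoded list can then be compared against the constructors of the `AddFile` class that is actually loaded in the demo environment.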

sagarlakshmipathy avatar Mar 22 '24 18:03 sagarlakshmipathy

+1 I also hit this

kywe665 avatar Mar 22 '24 18:03 kywe665

The issue might stem from the Delta version upgrade, which required an update to the Spark version as well (see commit). It seems that the demo code wasn't revised to reflect these changes at that time.
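
One way to confirm a version mismatch like this is to list the constructors of the class that is actually on the classpath and compare them with the signature in the error. The following is only a sketch using standard Java reflection; `ConstructorLister` and `constructorSignatures` are hypothetical names for this example, and the default class name is taken from the stack trace. It would need to run in the same environment (same jars) as the demo notebook to be meaningful.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper (not part of XTable or Delta).
class ConstructorLister {

    // Returns one "(type, type, ...)" string per public constructor of the
    // named class, or an empty list if the class cannot be loaded.
    static List<String> constructorSignatures(String className) {
        List<String> signatures = new ArrayList<>();
        try {
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            // 'false' loads the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, loader);
            for (Constructor<?> ctor : cls.getConstructors()) {
                StringBuilder sb = new StringBuilder("(");
                Class<?>[] params = ctor.getParameterTypes();
                for (int i = 0; i < params.length; i++) {
                    if (i > 0) {
                        sb.append(", ");
                    }
                    sb.append(params[i].getName());
                }
                signatures.add(sb.append(")").toString());
            }
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is not loadable; report nothing.
        }
        return signatures;
    }

    public static void main(String[] args) {
        // Class name from the stack trace; pass another name as the first argument.
        String className = args.length > 0
            ? args[0]
            : "org.apache.spark.sql.delta.actions.AddFile";
        List<String> signatures = constructorSignatures(className);
        if (signatures.isEmpty()) {
            System.out.println("No public constructors found for " + className
                + " (is the class on the classpath?)");
        } else {
            for (String s : signatures) {
                System.out.println(className + s);
            }
        }
    }
}
```

If none of the printed parameter lists matches the eight parameters in the `NoSuchMethodError`, the Delta jar loaded by the demo is not the version XTable was built against, which would be consistent with the explanation above.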

ashvina avatar Mar 22 '24 18:03 ashvina

That's what I thought too. Let me fix it this weekend.

sagarlakshmipathy avatar Mar 22 '24 18:03 sagarlakshmipathy

Can you assign it to me, @ashvina?

sagarlakshmipathy avatar Mar 22 '24 18:03 sagarlakshmipathy

Thanks, @sagarlakshmipathy. References to AddFile were fixed in the commit I mentioned above. Those changes may provide some hints about how to fix the demo.

ashvina avatar Mar 22 '24 18:03 ashvina

@ashvina, can I fix this issue? I updated the notebook jar version locally and it works. No outputs will be included; only the version is changed.

zhen-d avatar Sep 05 '24 09:09 zhen-d

@zhen-d, can you raise a PR?

the-other-tim-brown avatar Sep 16 '24 15:09 the-other-tim-brown