
[SUPPORT] No .marker files

Open eshu opened this issue 3 years ago • 3 comments

  • Hudi version : 0.11.1
  • Spark version : 3.1.1
  • Storage (HDFS/S3/GCS..) : S3
  • Running on Docker? (yes/no) : no
  • Environment: Glue 3

The issue happens quite rarely while writing, but once it happens, it reproduces persistently.

As far as I understand from the second part of the stack trace, Hudi tries to find files that contain .marker in their names, but there are no such files in the path. The files in .hoodie/.temp/20220802132553801/ follow the name pattern MARKERS\d+ (MARKERS0, MARKERS1, MARKERS2, ..., MARKERS19).

Stacktrace

Caused by: org.apache.hudi.exception.HoodieRollbackException: Error rolling back using marker files written for [==>20220802132553801__compaction__INFLIGHT]
	at org.apache.hudi.table.action.rollback.MarkerBasedRollbackStrategy.getRollbackRequests(MarkerBasedRollbackStrategy.java:103)
	at org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor.requestRollback(BaseRollbackPlanActionExecutor.java:109)
	at org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor.execute(BaseRollbackPlanActionExecutor.java:132)
	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.scheduleRollback(HoodieSparkMergeOnReadTable.java:161)
	at org.apache.hudi.table.HoodieTable.rollbackInflightCompaction(HoodieTable.java:551)
	...
Caused by: java.lang.IllegalArgumentException
	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
	at org.apache.hudi.common.util.MarkerUtils.stripMarkerFolderPrefix(MarkerUtils.java:67)
	at org.apache.hudi.table.marker.DirectWriteMarkers.lambda$allMarkerFilePaths$0(DirectWriteMarkers.java:136)
	at org.apache.hudi.common.fs.FSUtils.processFiles(FSUtils.java:277)
	at org.apache.hudi.table.marker.DirectWriteMarkers.allMarkerFilePaths(DirectWriteMarkers.java:135)
	at org.apache.hudi.table.marker.MarkerBasedRollbackUtils.getAllMarkerPaths(MarkerBasedRollbackUtils.java:62)
	at org.apache.hudi.table.action.rollback.MarkerBasedRollbackStrategy.getRollbackRequests(MarkerBasedRollbackStrategy.java:76)
	... 80 more

Is it possible to recover the dataset?

eshu avatar Aug 03 '22 01:08 eshu

By default, timeline-server-based markers are used. The MARKERS.type file indicates whether direct markers (the .marker files you mentioned) or timeline-server-based markers (aggregated in MARKERS0, etc.) are used. The rollback fails because the marker type is identified incorrectly (similar to issue #6224). There is a pending fix for this: https://github.com/apache/hudi/pull/6266/files
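To check which kind of markers an instant's marker directory actually contains, something like the following rough sketch may help. It works only on a list of file names (for example, copied from an S3 listing of .hoodie/.temp/<instant>/) and relies solely on the naming described in this thread: names containing .marker for direct markers and MARKERS followed by digits for timeline-server-based markers. The function name and the sample file names are hypothetical and not part of Hudi.

```python
import re

# Name pattern of the aggregated marker files reported in this issue
# (MARKERS0, MARKERS1, ..., MARKERS19).
AGGREGATED_MARKER = re.compile(r"MARKERS\d+")


def classify_marker_dir(file_names):
    """Guess the marker type from the file names in one marker directory.

    Hypothetical helper: it inspects names only and does not read MARKERS.type.
    """
    has_direct = any(".marker" in name for name in file_names)
    has_aggregated = any(AGGREGATED_MARKER.fullmatch(name) for name in file_names)
    if has_direct and has_aggregated:
        return "mixed"
    if has_aggregated:
        return "timeline-server-based"
    if has_direct:
        return "direct"
    return "unknown"


if __name__ == "__main__":
    # Listing similar to the one described in this issue.
    print(classify_marker_dir(["MARKERS.type", "MARKERS0", "MARKERS1", "MARKERS19"]))
```

If the result is "timeline-server-based" while the stack trace goes through DirectWriteMarkers, that is consistent with the wrongly identified marker type described above.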

For recovery, you can try disabling hoodie.rollback.using.markers (https://hudi.apache.org/docs/configurations#hoodierollbackusingmarkers).
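As a minimal sketch of that workaround, assuming the writes go through the Spark DataFrame writer from PySpark (as is common on Glue), the setting can be added to the write options. The helper name, table name, and path below are placeholders; only the hoodie.rollback.using.markers and hoodie.table.name keys are real Hudi configs.

```python
def hudi_write_options(table_name, base_options=None):
    """Return Hudi write options with marker-based rollback disabled.

    Hypothetical helper: it copies the caller's options and forces
    hoodie.rollback.using.markers to "false".
    """
    options = dict(base_options or {})
    options["hoodie.table.name"] = table_name
    # Workaround from this thread: do not use marker files for rollback.
    options["hoodie.rollback.using.markers"] = "false"
    return options


if __name__ == "__main__":
    opts = hudi_write_options("my_table")  # "my_table" is a placeholder
    print(opts)
    # With an existing DataFrame `df`, the options would be passed like this
    # (the path is a placeholder):
    # df.write.format("hudi").options(**opts).mode("append").save("s3://bucket/path")
```

This only changes how the rollback is planned; it does not fix the marker type detection itself.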

yihua avatar Aug 04 '22 20:08 yihua

@eshu: Is there any update? If the issue is resolved, could you please close the GitHub issue? If not, let us know and we can debug further.

nsivabalan avatar Sep 12 '22 22:09 nsivabalan

@nsivabalan The workaround works, but the bug still exists. If the workaround counts as a resolution, then yes, it is resolved.

eshu avatar Sep 22 '22 04:09 eshu

@eshu How can we reproduce it? Can you zip and share a problematic sample dataset so we can debug it?

xushiyan avatar Nov 14 '22 04:11 xushiyan

Closing due to inactivity. Please reopen with steps to reproduce. The general flow works in master as well as in 0.12.2 and 0.13.0.

codope avatar Mar 29 '23 06:03 codope