[SUPPORT] No .marker files
- Hudi version : 0.11.1
- Spark version : 3.1.1
- Storage (HDFS/S3/GCS..) : S3
- Running on Docker? (yes/no) : no
- Environment: Glue 3
The issue happens quite rarely while writing, but once it happens, it reproduces persistently.
As far as I understand from the second part of the stacktrace, Hudi tries to find files that contain .marker in their name, but there are no such files in the path. Files in .hoodie/.temp/20220802132553801/ have the name pattern MARKERS\d+ (MARKERS0, MARKERS1, MARKERS2, ..., MARKERS19).
Stacktrace
Caused by: org.apache.hudi.exception.HoodieRollbackException: Error rolling back using marker files written for [==>20220802132553801__compaction__INFLIGHT]
at org.apache.hudi.table.action.rollback.MarkerBasedRollbackStrategy.getRollbackRequests(MarkerBasedRollbackStrategy.java:103)
at org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor.requestRollback(BaseRollbackPlanActionExecutor.java:109)
at org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor.execute(BaseRollbackPlanActionExecutor.java:132)
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.scheduleRollback(HoodieSparkMergeOnReadTable.java:161)
at org.apache.hudi.table.HoodieTable.rollbackInflightCompaction(HoodieTable.java:551)
...
Caused by: java.lang.IllegalArgumentException
at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
at org.apache.hudi.common.util.MarkerUtils.stripMarkerFolderPrefix(MarkerUtils.java:67)
at org.apache.hudi.table.marker.DirectWriteMarkers.lambda$allMarkerFilePaths$0(DirectWriteMarkers.java:136)
at org.apache.hudi.common.fs.FSUtils.processFiles(FSUtils.java:277)
at org.apache.hudi.table.marker.DirectWriteMarkers.allMarkerFilePaths(DirectWriteMarkers.java:135)
at org.apache.hudi.table.marker.MarkerBasedRollbackUtils.getAllMarkerPaths(MarkerBasedRollbackUtils.java:62)
at org.apache.hudi.table.action.rollback.MarkerBasedRollbackStrategy.getRollbackRequests(MarkerBasedRollbackStrategy.java:76)
... 80 more
Is it possible to recover the dataset?
By default, timeline-server-based markers are used. The MARKERS.type file indicates whether direct markers (the .marker files you mentioned) or timeline-server-based markers (aggregated in MARKERS0 etc.) are used. The rollback fails because the marker type is identified wrongly (similar to issue #6224). There is a pending fix for this: https://github.com/apache/hudi/pull/6266/files
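To check which kind of markers a given instant actually has, the file names in its .hoodie/.temp/<instant>/ folder can be classified as in the rough sketch below. This is not Hudi code: the function name guess_marker_type is hypothetical, and the rules (MARKERS followed by digits means timeline-server-based, ".marker" in the name means direct) are assumptions taken from the file names described in this thread.

```python
import re

# Assumption from this thread: timeline-server-based markers are aggregated
# into files named MARKERS0, MARKERS1, ... inside the instant's marker folder.
AGGREGATED_MARKER_FILE = re.compile(r"^MARKERS\d+$")


def guess_marker_type(file_names):
    """Guess the marker type from the file names in one marker folder.

    Hypothetical helper for diagnosis only; Hudi itself records the
    type in the MARKERS.type file.
    """
    names = list(file_names)
    if any(AGGREGATED_MARKER_FILE.match(name) for name in names):
        return "TIMELINE_SERVER_BASED"
    # Assumption: direct markers are individual files with ".marker" in the name.
    if any(".marker" in name for name in names):
        return "DIRECT"
    return "UNKNOWN"


if __name__ == "__main__":
    # File names as reported in this issue.
    listing = ["MARKERS.type", "MARKERS0", "MARKERS1", "MARKERS19"]
    print(guess_marker_type(listing))
```

If the guess disagrees with what the rollback code expects (here it looked for direct markers in a folder that only holds MARKERS\d+ files), that matches the misidentification described above.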
For recovery, you can try disabling hoodie.rollback.using.markers (https://hudi.apache.org/docs/configurations#hoodierollbackusingmarkers), so that the rollback does not rely on marker files.
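A minimal sketch of how the workaround could be passed to a Spark write, assuming a PySpark job (for example on Glue). The helper name hudi_recovery_options and the table name are hypothetical; only the config keys hoodie.table.name and hoodie.rollback.using.markers come from Hudi. The rest of your usual write options still needs to be merged in.

```python
def hudi_recovery_options(table_name, extra_options=None):
    """Build Hudi write options with marker-based rollback disabled.

    Hypothetical helper: it only assembles a dict of option strings.
    """
    options = {
        "hoodie.table.name": table_name,
        # Workaround from this thread: do not use marker files for rollback.
        "hoodie.rollback.using.markers": "false",
    }
    if extra_options:
        # Keep the job's existing options (record key, precombine field, ...),
        # but make sure the workaround flag is not overridden by them.
        merged = dict(extra_options)
        merged.update(options)
        return merged
    return options


if __name__ == "__main__":
    opts = hudi_recovery_options("my_table")  # "my_table" is a placeholder
    print(opts)
    # In the actual job, assuming an existing DataFrame `df` and a base path:
    # df.write.format("hudi").options(**opts).mode("append").save(base_path)
```

Once the failed compaction has been rolled back, the flag can presumably be removed again, although the thread does not confirm this.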
@eshu: Is there any update? If the issue is resolved, can you please close the GitHub issue? If not, let us know and we can debug further.
@nsivabalan The workaround is working, but the bug still exists. If the workaround counts as a resolution, then yes, it is resolved.
@eshu How can we reproduce it? Can you zip and share a problematic sample dataset so we can debug it?
Closing due to inactivity. Please reopen with steps to reproduce. The general flow works in master as well as in 0.12.2 and 0.13.0.