[SUPPORT] Issues w/ incremental query in MOR table

15663671003 opened this issue 3 years ago

Describe the problem you faced

When using Spark to incrementally read a MOR table, the query fails because a Parquet file cannot be found. No write transactions are running while the read executes.

Steps to reproduce the behavior:

  1. Write one commit to a MOR table with the options "hoodie.cleaner.policy": "KEEP_LATEST_FILE_VERSIONS", "hoodie.cleaner.fileversions.retained": 24, "hoodie.compact.inline": "true", "hoodie.compact.inline.max.delta.commits": 10, "hoodie.keep.min.commits": 99, "hoodie.keep.max.commits": 100 (a write sketch using these options follows this list).
  2. After the first batch is committed, the table can be read incrementally without error.
  3. After writing a few more batches, the incremental read fails, while the read-optimized and snapshot views can still be read normally.
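
For context, here is a minimal PySpark write sketch using the options above. The table path matches the one queried later in this issue, but the input schema and the record key, partition path, and precombine fields are assumptions made for illustration.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hudi-mor-write").getOrCreate()

# Hypothetical input batch; column names are illustrative only.
df = spark.createDataFrame(
    [("k1", "4b", "v1", 1659709457000)],
    ["record_key", "par", "value", "ts"])

hudi_options = {
    "hoodie.table.name": "hudi_mor",
    "hoodie.datasource.write.table.type": "MERGE_ON_READ",
    # Assumed key/partition/precombine fields for this sketch.
    "hoodie.datasource.write.recordkey.field": "record_key",
    "hoodie.datasource.write.partitionpath.field": "par",
    "hoodie.datasource.write.precombine.field": "ts",
    # Options quoted from the report above.
    "hoodie.cleaner.policy": "KEEP_LATEST_FILE_VERSIONS",
    "hoodie.cleaner.fileversions.retained": "24",
    "hoodie.compact.inline": "true",
    "hoodie.compact.inline.max.delta.commits": "10",
    "hoodie.keep.min.commits": "99",
    "hoodie.keep.max.commits": "100",
}

(df.write.format("hudi")
   .options(**hudi_options)
   .mode("append")
   .save("/user/hive/warehouse/test.db/hudi_mor"))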

Expected behavior

After the MOR table is updated with several more commits, the incremental query fails to find a Parquet file, while the read-optimized and real-time (snapshot) queries can still be read normally.

Environment Description

  • Hudi version : 0.7.0

  • Spark version : 2.4.0

  • Hive version : 2.1.1

  • Hadoop version : 3.0.0

  • Storage (HDFS/S3/GCS..) : HDFS

  • Running on Docker? (yes/no) : no

Additional context

Add any other context about the problem here.

Stacktrace

>>> op = {'hoodie.datasource.query.type': 'incremental','hoodie.datasource.read.begin.instanttime': '0'}
>>> spark.read.format("hudi").options(**op).load("/user/hive/warehouse/test.db/hudi_mor").count()
[Stage 19:>                                                    (1 + 20) / 25839]22/08/06 01:02:27 WARN scheduler.TaskSetManager: Lost task 20.0 in stage 19.0 (TID 12328, xx.com, executor 37): java.io.FileNotFoundException: File does not exist: hdfs://nameservice1/user/hive/warehouse/test.db/hudi_mor/par=4b/166dc4dd-fe33-47fe-8b1b-23b834a1c3e4-0_4846-55-139569_20220805223417.parquet
        at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1499)
        at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1492)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1507)
        at org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39)
        at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:413)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$lzycompute$1(ParquetFileFormat.scala:371)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$1(ParquetFileFormat.scala:370)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(ParquetFileFormat.scala:374)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(ParquetFileFormat.scala:352)
        at org.apache.hudi.HoodieMergeOnReadRDD.read(HoodieMergeOnReadRDD.scala:98)
        at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1408)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

commits show

╔════════════════╤═════════════════════╤═══════════════════╤═════════════════════╤══════════════════════════╤═══════════════════════╤══════════════════════════════╤══════════════╗
║ CommitTime     │ Total Bytes Written │ Total Files Added │ Total Files Updated │ Total Partitions Written │ Total Records Written │ Total Update Records Written │ Total Errors ║
╠════════════════╪═════════════════════╪═══════════════════╪═════════════════════╪══════════════════════════╪═══════════════════════╪══════════════════════════════╪══════════════╣
║ 20220806004543 │ 2.6 GB              │ 0                 │ 6472                │ 256                      │ 13459201              │ 7297                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805232843 │ 2.7 GB              │ 0                 │ 7499                │ 256                      │ 13455632              │ 8771                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805223417 │ 256.1 GB            │ 0                 │ 12910               │ 256                      │ 1322056815            │ 174778                       │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805222939 │ 2.7 GB              │ 0                 │ 7809                │ 256                      │ 13446221              │ 9105                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805212946 │ 2.6 GB              │ 0                 │ 3534                │ 256                      │ 13422116              │ 3580                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805202932 │ 2.6 GB              │ 0                 │ 3599                │ 256                      │ 13418043              │ 3670                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805192931 │ 2.6 GB              │ 0                 │ 3210                │ 256                      │ 13412437              │ 3192                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805183958 │ 2.6 GB              │ 0                 │ 5240                │ 256                      │ 13418101              │ 5666                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805172927 │ 2.6 GB              │ 0                 │ 6840                │ 256                      │ 13408492              │ 7790                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805171751 │ 2.6 GB              │ 0                 │ 4237                │ 256                      │ 13377870              │ 4469                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805170955 │ 2.6 GB              │ 0                 │ 4368                │ 256                      │ 13359451              │ 4600                         │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805170021 │ 2.6 GB              │ 0                 │ 8856                │ 256                      │ 13364422              │ 10938                        │ 0            ║
╟────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220805144130 │ 511.7 GB            │ 25839             │ 0                   │ 256                      │ 2632982405            │ 0                            │ 0            ║
╚════════════════╧═════════════════════╧═══════════════════╧═════════════════════╧══════════════════════════╧═══════════════════════╧══════════════════════════════╧══════════════╝

cleans show

╔═══════════╤═════════════════════════╤═════════════════════╤══════════════════╗
║ CleanTime │ EarliestCommandRetained │ Total Files Deleted │ Total Time Taken ║
╠═══════════╧═════════════════════════╧═════════════════════╧══════════════════╣
║ (empty)                                                                      ║
╚══════════════════════════════════════════════════════════════════════════════╝

compactions show all

╔═════════════════════════╤═══════════╤═══════════════════════════════╗
║ Compaction Instant Time │ State     │ Total FileIds to be Compacted ║
╠═════════════════════════╪═══════════╪═══════════════════════════════╣
║ 20220805223417          │ COMPLETED │ 12910                         ║
╚═════════════════════════╧═══════════╧═══════════════════════════════╝

15663671003 commented Aug 05 '22 17:08

What is certain is that no write transactions were running during the incremental reads, so why do I get this error?

15663671003 commented Aug 05 '22 17:08

There could be issues with the MOR incremental query in Hudi 0.7.0; MOR incremental reads have been improved since then. Have you tried Hudi 0.11.1 or the latest master to see if the problem still exists in your case?
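
For instance, a session that picks up a newer bundle could be started as below. This is a sketch: the artifact coordinates are an assumption for Spark 2.4 with Scala 2.11, so verify them against the Hudi 0.11.1 release artifacts before use.

from pyspark.sql import SparkSession

# Bundle coordinates assumed for Spark 2.4 / Scala 2.11; verify before use.
spark = (SparkSession.builder
         .appName("hudi-incremental-check")
         .config("spark.jars.packages",
                 "org.apache.hudi:hudi-spark2.4-bundle_2.11:0.11.1")
         .config("spark.serializer",
                 "org.apache.spark.serializer.KryoSerializer")
         .getOrCreate())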

yihua commented Aug 08 '22 05:08

There are some known limitations with incremental queries. For example, there is some interplay between the cleaner and the incremental query: if the cleaner has cleaned up the data files pertaining to commit Cn, and you trigger an incremental query with begin instant Cn, you may see a FileNotFoundException. You may have to relax the cleaner configs if you wish to run incremental queries over older commits.
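
For illustration, a sketch of an incremental read that starts from an instant the cleaner still retains, rather than from '0'; the instant time here is taken from the output of commits show above:

# Begin the incremental read at a commit whose data files still exist.
op = {
    "hoodie.datasource.query.type": "incremental",
    # Commits strictly after this instant are read; pick one that the
    # cleaner has not yet removed.
    "hoodie.datasource.read.begin.instanttime": "20220805223417",
}
df = (spark.read.format("hudi")
      .options(**op)
      .load("/user/hive/warehouse/test.db/hudi_mor"))
print(df.count())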

nsivabalan commented Aug 27 '22 20:08

@15663671003 : gentle ping. do you have any more specific questions for us?

nsivabalan commented Sep 12 '22 22:09

@nsivabalan After triggering compaction and cleaning, the incremental query loses some data. I generally understand this phenomenon, but I want to know whether the data is lost because of archival or because of cleaning. Should I increase the value of "hoodie.keep.min.commits", of "hoodie.cleaner.hours.retained", or of both? Please help me.

15663671003 commented Oct 25 '22 16:10

@15663671003 This is due to the cleaner. I would suggest retaining more commits, which can be achieved by increasing the values of both configs you mentioned above.
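
As a rough sketch only (the values are illustrative, not recommendations), retention could be relaxed along these lines. Note that hoodie.cleaner.hours.retained takes effect with the KEEP_LATEST_BY_HOURS cleaner policy, and the archival settings should keep at least as many commits on the active timeline as the cleaner retains.

# Illustrative relaxed-retention write options; tune to your timeline.
retention_options = {
    # Cleaner: keep more history around for incremental readers.
    "hoodie.cleaner.policy": "KEEP_LATEST_BY_HOURS",
    "hoodie.cleaner.hours.retained": "72",
    # Archival: keep the active timeline long enough for incremental reads.
    "hoodie.keep.min.commits": "150",
    "hoodie.keep.max.commits": "200",
}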

codope commented Nov 28 '22 16:11

@15663671003 Any update on the issue?

codope commented Feb 01 '23 14:02

Will go ahead and close out the issue. Please file a new issue if the above suggestion does not work. Thanks!

nsivabalan commented Feb 06 '23 23:02