
Too many S3 list operations when hoodie.metadata.enable=true

Open njalan opened this issue 2 years ago • 22 comments

I upgraded Hudi from 0.7 to 0.13.1 (actually just replaced the Spark Hudi bundle jar). In Hudi 0.13.1 the metadata table is enabled by default, but there are still so many list operations and the count has not gone down. I can see that metadata/files is generated, and the logs also mention "Listed files in partition from metadata". Did I miss any config? Why are there still so many S3 list operations?

[Screenshot: S3 list operation metrics]

Environment Description

  • Hudi version : 0.13.1

  • Spark version : 3.3.2

  • Hive version : 3

  • Hadoop version : 3.2.2

  • Storage (HDFS/S3/GCS..) : s3

  • Running on Docker? (yes/no) : no

I am using a third party's S3 storage (a locally deployed storage cluster), and they provided me with the S3 list operation metrics. I am using the default config.

njalan avatar Sep 20 '23 01:09 njalan

@njalan Do you mind trying out tag 0.14.0-rc2 and seeing whether the number of list operations is reduced? In parallel, we will look into this.

ad1happy2go avatar Sep 20 '23 08:09 ad1happy2go

@ad1happy2go Thanks for your reply. I will try 0.14.0-rc2 and let you know the results. Is there any configuration I need to add for spark-sql to use the metadata table? By the way, will the command below use the Hudi metadata table? val tripsSnapshotDF = spark.read.format("hudi").load(basePath); tripsSnapshotDF.createOrReplaceTempView("hudi_trips_snapshot")

njalan avatar Sep 20 '23 14:09 njalan

Thanks @njalan. Now I see the actual issue: the Hudi metadata table is not turned on for readers by default. You have to set the props explicitly.

Check the last part here and try with those configs: https://hudi.apache.org/docs/next/performance/

val tripsSnapshotDF = spark.read.
  format("hudi").
  option("hoodie.metadata.enable", "true").
  option("hoodie.enable.data.skipping", "true").
  load(basePath)

ad1happy2go avatar Sep 20 '23 15:09 ad1happy2go

@ad1happy2go For now I can't enable hoodie.enable.data.skipping and hoodie.metadata.index.column.stats.enable, since I ran into some errors when enabling the column stats index. But I can try format("hudi").option("hoodie.metadata.enable","true").

njalan avatar Sep 21 '23 00:09 njalan

@ad1happy2go I added option("hoodie.metadata.enable","true") and also added --conf spark.hadoop.hoodie.metadata.enable=true for spark-sql, but the number of list operations is still high. I collected list metrics from one server over 40 seconds: around 2000 lists in total, with 100-200 S3 lists per table in those 40 seconds. Most of them are listing hive/warehouse/xxx.db/xxx/.hoodie/.

By the way, what is different in 0.14.0-rc2?

njalan avatar Sep 21 '23 06:09 njalan

@njalan The metadata table helps reduce S3 operations on the data directories, but readers and writers will still list the .hoodie directory.

Can you see a noticeable difference in S3 calls outside the .hoodie directory?
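
For context, here is a minimal sketch (Scala, in spark-shell) of how the active timeline under .hoodie can be inspected. It assumes a basePath variable pointing at the table and uses the public HoodieTableMetaClient API; treat it as an illustration of what gets listed on every timeline load, not the exact code path Hudi's reader takes.

import org.apache.hudi.common.table.HoodieTableMetaClient

// Building a meta client reads .hoodie/hoodie.properties and lists the
// instant files under .hoodie/ to load the active timeline.
val metaClient = HoodieTableMetaClient.builder()
  .setConf(spark.sparkContext.hadoopConfiguration)
  .setBasePath(basePath)
  .build()

val timeline = metaClient.getActiveTimeline
println(s"Active instants: ${timeline.countInstants()}")
println(s"Completed commits: ${timeline.getCommitsTimeline.filterCompletedInstants().countInstants()}")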

ad1happy2go avatar Sep 21 '23 08:09 ad1happy2go

@ad1happy2go Why are there hundreds of S3 lists on .hoodie/ within one minute? For most of my Hudi tables, the number of files under .hoodie/ is much higher than in the data directories. In my testing, listing the data directories outside .hoodie is much faster than listing the .hoodie directory. Is there any way to reduce the S3 object lists under .hoodie/? Otherwise, reducing the calls outside the .hoodie directory doesn't help me much. Is there any way to reduce the number of files under .hoodie/?

njalan avatar Sep 22 '23 04:09 njalan

@ad1happy2go I didn't see a big difference in S3 calls outside the .hoodie directory between metadata enabled and disabled.

njalan avatar Sep 22 '23 16:09 njalan

@ad1happy2go Did you get a chance to test? Did you also see too many lists on .hoodie/?

njalan avatar Sep 25 '23 15:09 njalan

@njalan Sorry, I couldn't look into it again yet. I will try it tomorrow. A few more things to check on your end (see the sketch after this list):

  • How many active commits are there in your timeline? Is archival happening properly?
  • How many total S3 objects are there in the .hoodie directory? Compare that number with .hoodie/metadata.
  • Is your metadata table getting compacted properly? Check the metadata table timeline and confirm that archival is happening for the metadata table as well [.hoodie/metadata/.hoodie].
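
A minimal sketch (Scala, in spark-shell) that could help answer the items above. It assumes a basePath variable pointing at the table and that the archived timeline sits in the default .hoodie/archived folder (configurable via hoodie.archivelog.folder); it is only an illustration, not a Hudi-provided tool.

import org.apache.hadoop.fs.Path

// Count the objects under the timeline folders to compare the active timeline,
// the archived timeline, and the metadata table's own timeline.
val fs = new Path(basePath).getFileSystem(spark.sparkContext.hadoopConfiguration)
def countObjects(dir: String): Int = fs.listStatus(new Path(dir)).length

val activeTimelineCount   = countObjects(s"$basePath/.hoodie")
val archivedTimelineCount = countObjects(s"$basePath/.hoodie/archived")
val mdtTimelineCount      = countObjects(s"$basePath/.hoodie/metadata/.hoodie")

println(s".hoodie objects: $activeTimelineCount")
println(s".hoodie/archived objects: $archivedTimelineCount")
println(s".hoodie/metadata/.hoodie objects: $mdtTimelineCount")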

ad1happy2go avatar Sep 26 '23 14:09 ad1happy2go

@ad1happy2go The attached files have the details of one table in production. Why are there thousands of commits_.archive.xx files under the .hoodie directory? In my Hudi 0.9 tables there are also thousands of commits_.archive.xx files under .hoodie. The attached table is on Hudi 0.13 and was upgraded from 0.9 a couple of days ago. It looks like nothing ever cleans up the archived commit files.

base_file_list.log hoodie_file_list.log hoodie_metadata_file_list.log metadata_hoodie_archived_file_list.log metadata_hoodie_file_list.log

njalan avatar Sep 27 '23 01:09 njalan

@ad1happy2go I think the issues below are the same as mine: https://github.com/apache/hudi/issues/7991 https://github.com/apache/hudi/issues/9612

njalan avatar Sep 30 '23 16:09 njalan

@njalan Do you also see similar behaviour for tables that were written only with later Hudi versions (0.13), i.e., tables that were never upgraded from 0.9?

ad1happy2go avatar Oct 03 '23 09:10 ad1happy2go

@ad1happy2go Below are the list counts for one Spark Streaming micro-batch:

Below are the top list operations (list count, then prefix) for the table on Hudi 0.13.1 with metadata enabled:

  • 329 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/
  • 229 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/
  • 50 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/files/
  • 42 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/.aux/.bootstrap/.partitions/00000000-0000-0000-0000-000000000000-0_1-0-1_00000000000001.hfile/
  • 33 hive/warehouse/ods_xxx.db/testing_hudi13/
  • 14 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/.temp/
  • 10 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/.temp/20231010140342361/
  • 9 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/.temp/20231010140158325/
  • 7 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/.temp/20231010140509929/
  • 7 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/.temp/20231010140342361/

Below are the top list operations (list count, then prefix) for the table on Hudi 0.9 with metadata disabled:

  • 274 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/
  • 188 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.aux/.bootstrap/.partitions/00000000-0000-0000-0000-000000000000-0_1-0-1_00000000000001.hfile/
  • 48 hive/warehouse/ods_xxxx.db/testing_hudi09/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140501/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140401/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140301/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140201/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140101/
  • 5 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/
  • 5 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.heartbeat/

Is there any way to reduce the list operations? If each table could cut its list operations by 50%, it would reduce the load significantly, since there are thousands of tables on a locally deployed object storage cluster.
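
Not a fix for the root cause, but as a sketch of the knobs that usually control how many files accumulate under .hoodie (and therefore how expensive each list is): the option names below exist in Hudi 0.13.x, while the values are purely illustrative and assume a DataFrame df and basePath as in the earlier snippets; they would need to be validated against your own retention and recovery requirements.

// Illustrative write options only; tune them for your own retention needs.
val timelineTuning = Map(
  "hoodie.cleaner.commits.retained"           -> "10",   // commits retained by the cleaner
  "hoodie.keep.min.commits"                   -> "20",   // archival lower bound (must exceed cleaner retention)
  "hoodie.keep.max.commits"                   -> "30",   // archival upper bound
  "hoodie.archive.merge.enable"               -> "true", // merge small archive files under .hoodie/archived
  "hoodie.metadata.compact.max.delta.commits" -> "10"    // compact the metadata table timeline regularly
)

df.write
  .format("hudi")
  .options(timelineTuning)  // plus your existing table/key/partition options
  .mode("append")
  .save(basePath)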

njalan avatar Oct 12 '23 14:10 njalan

Thanks a lot for your effort here, @njalan. Really appreciate it. Looks like in your case the metadata table got more list calls. I will work on this. Thanks.

ad1happy2go avatar Oct 13 '23 17:10 ad1happy2go

@ad1happy2go Thanks a lot for your help. Just let me know if you want any other information from me.

njalan avatar Oct 13 '23 17:10 njalan

@ad1happy2go May I know if there are any updates from you? If the object lists can't be reduced, can we cache this metadata on the driver?

njalan avatar Oct 20 '23 16:10 njalan

@njalan I didn't get much time to look into this yet. I will prioritize it this week. Thanks, will update.

ad1happy2go avatar Oct 24 '23 09:10 ad1happy2go

Any updates?

BruceKellan avatar Jan 10 '24 10:01 BruceKellan

@ad1happy2go I did internal benchmarks with different versions of Hudi here. With metadata enabled, across the various versions I didn't see a significant increase in S3 calls.

@njalan @BruceKellan Did you try the 0.14.x release? Do you still see high S3 call counts only with metadata enabled?

ad1happy2go avatar Jan 31 '24 15:01 ad1happy2go

Hey @njalan @BruceKellan: any follow-ups on this?

nsivabalan avatar Apr 09 '24 02:04 nsivabalan

Hi, I'm facing the same issue when using Flink to stream into Hudi on S3 as well. There are too many list requests being made to S3, and it eventually causes the Flink job to get stuck. Any updates on this?

nickefy avatar Jun 27 '24 01:06 nickefy

@nickefy Did you fix it?

njalan avatar Dec 16 '24 04:12 njalan

@ad1happy2go Below are the list-objects metrics when there is only one Spark Streaming job running, on Hudi 0.15.0 and Spark 3.3.2. [Screenshot: S3 list operation metrics]

njalan avatar Dec 17 '24 14:12 njalan

Also facing this issue, with the metadata table both enabled and disabled, in my Spark Structured Streaming job writing to a partitioned Hudi table. The "Listing all partitions with prefix" stage is currently the bottleneck of the pipeline, as the data and the number of partitions keep growing.

ruifmont-te avatar Jan 15 '25 00:01 ruifmont-te

@ad1happy2go Any updates on this one?

njalan avatar Feb 08 '25 06:02 njalan