
Too many S3 list operations when hoodie.metadata.enable=true

Open njalan opened this issue 2 years ago • 22 comments

I upgraded Hudi from 0.7 to 0.13.1 (actually just replaced the Spark Hudi bundle jar). In Hudi 0.13.1 the metadata table is enabled by default, but there are still so many list operations and the count has not gone down. I can see that metadata/files is generated, and the logs also mention "Listed files in partition from metadata". Did I miss any config? Why are there still so many S3 list operations?

[Screenshot: S3 list operation metrics]

Environment Description

  • Hudi version : 0.13.1

  • Spark version : 3.3.2

  • Hive version : 3

  • Hadoop version : 3.2.2

  • Storage (HDFS/S3/GCS..) : s3

  • Running on Docker? (yes/no) : no

I am using a third party's S3 storage (a locally deployed storage cluster), and they provided me with the S3 list operation metrics. I am using the default config.

njalan avatar Sep 20 '23 01:09 njalan

@njalan Do you mind trying out tag 0.14.0-rc2 and seeing whether the number of list operations is reduced? In parallel, we will look into this.

ad1happy2go avatar Sep 20 '23 08:09 ad1happy2go

@ad1happy2go Thanks for your reply. I will try 0.14.0-rc2 and let you know the results. Is there any configuration I need to add for spark-sql to use the metadata table? By the way, will the command below use the Hudi metadata table? val tripsSnapshotDF = spark.read.format("hudi").load(basePath); tripsSnapshotDF.createOrReplaceTempView("hudi_trips_snapshot")

njalan avatar Sep 20 '23 14:09 njalan

Thanks @njalan. Now I see the actual issue: the Hudi metadata table is not turned on for readers by default. You have to set the props explicitly.

Check the last part here and try with those configs: https://hudi.apache.org/docs/next/performance/

val tripsSnapshotDF = spark.read.
  format("hudi").
  option("hoodie.metadata.enable", "true").
  option("hoodie.enable.data.skipping", "true").
  load(basePath)

ad1happy2go avatar Sep 20 '23 15:09 ad1happy2go

@ad1happy2go For now I can't enable hoodie.enable.data.skipping and hoodie.metadata.index.column.stats.enable, since I ran into some errors when enabling the column stats index. But I can try format("hudi").option("hoodie.metadata.enable","true").

njalan avatar Sep 21 '23 00:09 njalan

@ad1happy2go I added option("hoodie.metadata.enable","true") and also added --conf spark.hadoop.hoodie.metadata.enable=true for spark-sql, but the number of list operations is still high. I collected list metrics from one server over 40 seconds: around 2000 lists in total, with 100-200 S3 lists per table in those 40 seconds. Most of them are listing hive/warehouse/xxx.db/xxx/.hoodie/.

By the way, what is different in 0.14.0-rc2?

njalan avatar Sep 21 '23 06:09 njalan

@njalan The metadata table helps reduce S3 operations on the data directories, but readers and writers will still list the .hoodie directory.

Can you see a noticeable difference in S3 calls outside the .hoodie directory?
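
For context, here is a minimal sketch (Scala, in spark-shell) of how the active timeline under .hoodie can be inspected. It assumes a basePath variable pointing at the table and uses the public HoodieTableMetaClient API; treat it as an illustration of what gets listed on every timeline load, not the exact code path Hudi's reader takes.

import org.apache.hudi.common.table.HoodieTableMetaClient

// Building a meta client reads .hoodie/hoodie.properties and lists the
// instant files under .hoodie/ to load the active timeline.
val metaClient = HoodieTableMetaClient.builder()
  .setConf(spark.sparkContext.hadoopConfiguration)
  .setBasePath(basePath)
  .build()

val timeline = metaClient.getActiveTimeline
println(s"Active instants: ${timeline.countInstants()}")
println(s"Completed commits: ${timeline.getCommitsTimeline.filterCompletedInstants().countInstants()}")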

ad1happy2go avatar Sep 21 '23 08:09 ad1happy2go

@ad1happy2go Why are there hundreds of S3 lists on .hoodie/ within one minute? For most of my Hudi tables, the number of files under .hoodie/ is much higher than in the data directories. In my testing, listing the data directories outside .hoodie is much faster than listing the .hoodie directory. Is there any way to reduce the S3 object lists under .hoodie/? Otherwise, reducing the calls outside the .hoodie directory doesn't help me much. Is there any way to reduce the number of files under .hoodie/?

njalan avatar Sep 22 '23 04:09 njalan

@ad1happy2go I didn't see a big difference in S3 calls outside the .hoodie directory between metadata enabled and disabled.

njalan avatar Sep 22 '23 16:09 njalan

@ad1happy2go Did you get a chance to test? Did you also see too many lists on .hoodie/?

njalan avatar Sep 25 '23 15:09 njalan

@njalan Sorry, I couldn't look into it again yet. I will try it tomorrow. A few more things to check on your end (see the sketch after this list):

  • How many active commits are there in your timeline? Is archival happening properly?
  • How many total S3 objects are there in the .hoodie directory? Compare that number with .hoodie/metadata.
  • Is your metadata table getting compacted properly? Check the metadata table timeline and confirm that archival is happening for the metadata table as well [.hoodie/metadata/.hoodie].
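
A minimal sketch (Scala, in spark-shell) that could help answer the items above. It assumes a basePath variable pointing at the table and that the archived timeline sits in the default .hoodie/archived folder (configurable via hoodie.archivelog.folder); it is only an illustration, not a Hudi-provided tool.

import org.apache.hadoop.fs.Path

// Count the objects under the timeline folders to compare the active timeline,
// the archived timeline, and the metadata table's own timeline.
val fs = new Path(basePath).getFileSystem(spark.sparkContext.hadoopConfiguration)
def countObjects(dir: String): Int = fs.listStatus(new Path(dir)).length

val activeTimelineCount   = countObjects(s"$basePath/.hoodie")
val archivedTimelineCount = countObjects(s"$basePath/.hoodie/archived")
val mdtTimelineCount      = countObjects(s"$basePath/.hoodie/metadata/.hoodie")

println(s".hoodie objects: $activeTimelineCount")
println(s".hoodie/archived objects: $archivedTimelineCount")
println(s".hoodie/metadata/.hoodie objects: $mdtTimelineCount")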

ad1happy2go avatar Sep 26 '23 14:09 ad1happy2go

@ad1happy2go The attached files have the details of one table in production. Why are there thousands of commits_.archive.xx files under the .hoodie directory? In my Hudi 0.9 tables there are also thousands of commits_.archive.xx files under .hoodie. The attached table is on Hudi 0.13 and was upgraded from 0.9 a couple of days ago. It looks like nothing ever cleans up the archived commit files.

base_file_list.log hoodie_file_list.log hoodie_metadata_file_list.log metadata_hoodie_archived_file_list.log metadata_hoodie_file_list.log

njalan avatar Sep 27 '23 01:09 njalan

@ad1happy2go I think the issues below are the same as mine: https://github.com/apache/hudi/issues/7991 https://github.com/apache/hudi/issues/9612

njalan avatar Sep 30 '23 16:09 njalan

@njalan Do you also see similar behaviour for tables that were written only with later Hudi versions (0.13), i.e., tables that were never upgraded from 0.9?

ad1happy2go avatar Oct 03 '23 09:10 ad1happy2go

@ad1happy2go Below are the list counts for one Spark Streaming micro-batch:

Below are the top list operations (list count, then prefix) for the table on Hudi 0.13.1 with metadata enabled:

  • 329 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/
  • 229 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/
  • 50 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/files/
  • 42 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/.aux/.bootstrap/.partitions/00000000-0000-0000-0000-000000000000-0_1-0-1_00000000000001.hfile/
  • 33 hive/warehouse/ods_xxx.db/testing_hudi13/
  • 14 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/.temp/
  • 10 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/.temp/20231010140342361/
  • 9 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/.temp/20231010140158325/
  • 7 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/.temp/20231010140509929/
  • 7 hive/warehouse/ods_xxx.db/testing_hudi13/.hoodie/metadata/.hoodie/.temp/20231010140342361/

Below are the top list operations (list count, then prefix) for the table on Hudi 0.9 with metadata disabled:

  • 274 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/
  • 188 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.aux/.bootstrap/.partitions/00000000-0000-0000-0000-000000000000-0_1-0-1_00000000000001.hfile/
  • 48 hive/warehouse/ods_xxxx.db/testing_hudi09/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140501/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140401/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140301/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140201/
  • 9 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/20231010140101/
  • 5 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.temp/
  • 5 hive/warehouse/ods_xxxx.db/testing_hudi09/.hoodie/.heartbeat/

Is there any way to reduce the list operations? If each table could cut its list operations by 50%, it would reduce the load significantly, since there are thousands of tables on a locally deployed object storage cluster.
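
Not a fix for the root cause, but as a sketch of the knobs that usually control how many files accumulate under .hoodie (and therefore how expensive each list is): the option names below exist in Hudi 0.13.x, while the values are purely illustrative and assume a DataFrame df and basePath as in the earlier snippets; they would need to be validated against your own retention and recovery requirements.

// Illustrative write options only; tune them for your own retention needs.
val timelineTuning = Map(
  "hoodie.cleaner.commits.retained"           -> "10",   // commits retained by the cleaner
  "hoodie.keep.min.commits"                   -> "20",   // archival lower bound (must exceed cleaner retention)
  "hoodie.keep.max.commits"                   -> "30",   // archival upper bound
  "hoodie.archive.merge.enable"               -> "true", // merge small archive files under .hoodie/archived
  "hoodie.metadata.compact.max.delta.commits" -> "10"    // compact the metadata table timeline regularly
)

df.write
  .format("hudi")
  .options(timelineTuning)  // plus your existing table/key/partition options
  .mode("append")
  .save(basePath)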

njalan avatar Oct 12 '23 14:10 njalan

Thanks a lot for your effort here, @njalan. Really appreciate it. Looks like in your case the metadata table got more list calls. I will work on this. Thanks.

ad1happy2go avatar Oct 13 '23 17:10 ad1happy2go

@ad1happy2go Thanks a lot for your help. Just let me know if you want any other information from me.

njalan avatar Oct 13 '23 17:10 njalan

@ad1happy2go May I know if there are any updates from you? If the object lists can't be reduced, can we cache this metadata on the driver?

njalan avatar Oct 20 '23 16:10 njalan

@njalan I didn't get much time to look into this yet. I will prioritize it this week. Thanks, will update.

ad1happy2go avatar Oct 24 '23 09:10 ad1happy2go

Any updates?

BruceKellan avatar Jan 10 '24 10:01 BruceKellan

@ad1happy2go I did internal benchmarks with different versions of Hudi here. With metadata enabled, across the various versions I didn't see a significant increase in S3 calls.

@njalan @BruceKellan Did you try the 0.14.x release? Do you still see high S3 call counts only with metadata enabled?

ad1happy2go avatar Jan 31 '24 15:01 ad1happy2go

Hey @njalan @BruceKellan: any follow-ups on this?

nsivabalan avatar Apr 09 '24 02:04 nsivabalan

Hi, I'm facing the same issue when using Flink to stream into Hudi on S3 as well. There are too many list requests being made to S3, and it eventually causes the Flink job to get stuck. Any updates on this?

nickefy avatar Jun 27 '24 01:06 nickefy

@nickefy Did you fix it?

njalan avatar Dec 16 '24 04:12 njalan

@ad1happy2go Below are the list-objects metrics when there is only one Spark Streaming job running, on Hudi 0.15.0 and Spark 3.3.2. [Screenshot: S3 list operation metrics]

njalan avatar Dec 17 '24 14:12 njalan

Also facing this issue, with the metadata table both enabled and disabled, in my Spark Structured Streaming job writing to a partitioned Hudi table. The "Listing all partitions with prefix" stage is currently the bottleneck of the pipeline, as the data and the number of partitions keep growing.

ruifmont-te avatar Jan 15 '25 00:01 ruifmont-te

@ad1happy2go Any updates on this one?

njalan avatar Feb 08 '25 06:02 njalan