[SUPPORT] show_compaction procedure returns empty even after multiple compactions have completed

Open pravin1406 opened this issue 2 years ago • 0 comments

To Reproduce

  1. Enable inline compaction to run after n deltacommits, then perform n upserts into a Hudi table.
  2. On the nth upsert, a compaction is also triggered.
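
The setup in the steps above could be sketched as a set of Hudi write options. This is only an illustration, not the exact configuration used in this report: the table type (MERGE_ON_READ, inferred from the deltacommit files), the table name, and the value n = 5 are assumptions, and required settings such as the record key and precombine field are left out.

```python
def inline_compaction_options(table_name: str, max_delta_commits: int) -> dict:
    """Build Hudi write options that ask for inline compaction after n deltacommits."""
    return {
        "hoodie.table.name": table_name,
        # Assumption: the table is merge-on-read, since the timeline holds deltacommits.
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
        "hoodie.datasource.write.operation": "upsert",
        # Run compaction as part of the write once enough deltacommits exist.
        "hoodie.compact.inline": "true",
        "hoodie.compact.inline.max.delta.commits": str(max_delta_commits),
    }


# Hypothetical values: the table name is taken from the path, and n = 5 is made up.
options = inline_compaction_options("pravin_hudi_new", 5)

# With a Spark session, each of the n upserts would be written roughly like this:
# df.write.format("hudi").options(**options).mode("append").save("/tmp/pravin_hudi_new")
```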

Expected behavior

When compaction runs, I can see the compaction.requested and compaction.inflight files, but afterwards no .compaction file is created; a .commit file is created instead.

When I run the show_compaction procedure, it never shows that any compaction has run, although I can confirm that the files were in fact compacted. I also tried to fetch the latest completed compaction from the Hoodie meta client, but that also returns empty.

It seems that a .commit file, which has "commit" as its operationType, was created in the .hoodie metadata instead of a .compaction file. Is this the expected behavior? If so, the show_compaction procedure is rendered useless, because it will never show correct information.

drwxr-xr-x - b0219039 supergroup 0 2023-12-12 03:12 /tmp/pravin_hudi_new/.hoodie/.temp
-rw-r--r-- 2 b0219039 supergroup 54907 2023-12-12 03:12 /tmp/pravin_hudi_new/.hoodie/20231212031157356.commit
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-12 03:11 /tmp/pravin_hudi_new/.hoodie/20231212031157356.compaction.inflight
-rw-r--r-- 2 b0219039 supergroup 1568 2023-12-12 03:11 /tmp/pravin_hudi_new/.hoodie/20231212031157356.compaction.requested
drwxr-xr-x - b0219039 supergroup 0 2023-12-12 03:11 /tmp/pravin_hudi_new/.hoodie/.aux
-rw-r--r-- 2 b0219039 supergroup 55197 2023-12-12 03:11 /tmp/pravin_hudi_new/.hoodie/20231212031136710.deltacommit
-rw-r--r-- 2 b0219039 supergroup 1470 2023-12-12 03:11 /tmp/pravin_hudi_new/.hoodie/20231212031136710.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-12 03:11 /tmp/pravin_hudi_new/.hoodie/20231212031136710.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 758 2023-12-11 06:42 /tmp/pravin_hudi_new/.hoodie/20231205033753119.savepoint
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-11 06:42 /tmp/pravin_hudi_new/.hoodie/20231205033753119.savepoint.inflight
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-06 09:10 /tmp/pravin_hudi_new/.hoodie/20231206091015577.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-06 09:10 /tmp/pravin_hudi_new/.hoodie/20231206091015577.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-06 09:10 /tmp/pravin_hudi_new/.hoodie/20231206091015577.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-06 09:05 /tmp/pravin_hudi_new/.hoodie/20231206090447449.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-06 09:04 /tmp/pravin_hudi_new/.hoodie/20231206090447449.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-06 09:04 /tmp/pravin_hudi_new/.hoodie/20231206090447449.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-06 09:00 /tmp/pravin_hudi_new/.hoodie/20231206090017580.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-06 09:00 /tmp/pravin_hudi_new/.hoodie/20231206090017580.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-06 09:00 /tmp/pravin_hudi_new/.hoodie/20231206090017580.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-06 08:51 /tmp/pravin_hudi_new/.hoodie/20231206085104622.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-06 08:51 /tmp/pravin_hudi_new/.hoodie/20231206085104622.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-06 08:51 /tmp/pravin_hudi_new/.hoodie/20231206085104622.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-05 16:33 /tmp/pravin_hudi_new/.hoodie/20231205163332332.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-05 16:33 /tmp/pravin_hudi_new/.hoodie/20231205163332332.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-05 16:33 /tmp/pravin_hudi_new/.hoodie/20231205163332332.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-05 16:18 /tmp/pravin_hudi_new/.hoodie/20231205161742495.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-05 16:17 /tmp/pravin_hudi_new/.hoodie/20231205161742495.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-05 16:17 /tmp/pravin_hudi_new/.hoodie/20231205161742495.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 1819 2023-12-05 03:47 /tmp/pravin_hudi_new/.hoodie/20231205034746961.clean
-rw-r--r-- 2 b0219039 supergroup 1827 2023-12-05 03:47 /tmp/pravin_hudi_new/.hoodie/20231205034746961.clean.inflight
-rw-r--r-- 2 b0219039 supergroup 1827 2023-12-05 03:47 /tmp/pravin_hudi_new/.hoodie/20231205034746961.clean.requested
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-05 03:47 /tmp/pravin_hudi_new/.hoodie/20231205034728832.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-05 03:47 /tmp/pravin_hudi_new/.hoodie/20231205034728832.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-05 03:47 /tmp/pravin_hudi_new/.hoodie/20231205034728832.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 54906 2023-12-05 03:38 /tmp/pravin_hudi_new/.hoodie/20231205033812277.commit
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-05 03:38 /tmp/pravin_hudi_new/.hoodie/20231205033812277.compaction.inflight
-rw-r--r-- 2 b0219039 supergroup 1569 2023-12-05 03:38 /tmp/pravin_hudi_new/.hoodie/20231205033812277.compaction.requested
-rw-r--r-- 2 b0219039 supergroup 54017 2023-12-05 03:38 /tmp/pravin_hudi_new/.hoodie/20231205033753119.deltacommit
-rw-r--r-- 2 b0219039 supergroup 113 2023-12-05 03:38 /tmp/pravin_hudi_new/.hoodie/20231205033753119.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-12-05 03:37 /tmp/pravin_hudi_new/.hoodie/20231205033753119.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 55198 2023-11-06 13:49 /tmp/pravin_hudi_new/.hoodie/20231106134909550.deltacommit
-rw-r--r-- 2 b0219039 supergroup 1470 2023-11-06 13:49 /tmp/pravin_hudi_new/.hoodie/20231106134909550.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 0 2023-11-06 13:49 /tmp/pravin_hudi_new/.hoodie/20231106134909550.deltacommit.requested
-rw-r--r-- 2 b0219039 supergroup 54892 2023-11-06 13:35 /tmp/pravin_hudi_new/.hoodie/20231106133458204.deltacommit
-rw-r--r-- 2 b0219039 supergroup 773 2023-11-06 13:35 /tmp/pravin_hudi_new/.hoodie/20231106133458204.deltacommit.inflight
-rw-r--r-- 2 b0219039 supergroup 903 2023-11-06 13:35 /tmp/pravin_hudi_new/.hoodie/hoodie.properties
drwxr-xr-x - b0219039 supergroup 0 2023-11-06 13:35 /tmp/pravin_hudi_new/.hoodie/metadata
-rw-r--r-- 2 b0219039 supergroup 0 2023-11-06 13:35 /tmp/pravin_hudi_new/.hoodie/20231106133458204.deltacommit.requested
drwxr-xr-x - b0219039 supergroup 0 2023-11-06 13:34 /tmp/pravin_hudi_new/.hoodie/archived
drwxr-xr-x - b0219039 supergroup 0 2023-11-06 13:34 /tmp/pravin_hudi_new/.hoodie/.schema
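
In the listing above, each compaction.requested file shares its instant time with a .commit file. As a workaround sketch, finished compactions could be picked out by that pairing. The function name `completed_compactions` is hypothetical, and treating a matching .commit as a completed compaction is an assumption drawn from this listing, not a statement about how show_compaction works internally.

```python
from collections import defaultdict


def completed_compactions(file_names):
    """Return instant times that have both a compaction.requested and a .commit file."""
    suffixes = defaultdict(set)
    for name in file_names:
        # Split "20231212031157356.compaction.requested" into instant time and suffix.
        instant, _, suffix = name.partition(".")
        if instant.isdigit():  # skips entries such as hoodie.properties
            suffixes[instant].add(suffix)
    return sorted(
        instant
        for instant, seen in suffixes.items()
        if "compaction.requested" in seen and "commit" in seen
    )


# A subset of the file names from the .hoodie listing in this issue.
timeline = [
    "20231212031157356.commit",
    "20231212031157356.compaction.inflight",
    "20231212031157356.compaction.requested",
    "20231212031136710.deltacommit",
    "20231205033812277.commit",
    "20231205033812277.compaction.inflight",
    "20231205033812277.compaction.requested",
    "20231205033753119.deltacommit",
    "hoodie.properties",
]

print(completed_compactions(timeline))
# prints ['20231205033812277', '20231212031157356']
```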

hdfs dfs -text /tmp/pravin_hudi_new/.hoodie/20231212031157356.compaction.requested
2023-12-21 14:36:45,565 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2023-12-21 14:36:46,358 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
{"operations":{"array":[{"baseInstantTime":{"string":"20231205033812277"},"deltaFilePaths":{"array":[".d4f39b50-e121-4e83-9cc5-5e8668cf80ac-0_20231205033812277.log.1_0-14-211"]},"dataFilePath":{"string":"d4f39b50-e121-4e83-9cc5-5e8668cf80ac-0_0-24-26_20231205033812277.parquet"},"fileId":{"string":"d4f39b50-e121-4e83-9cc5-5e8668cf80ac-0"},"partitionPath":{"string":""},"metrics":{"map":{"TOTAL_LOG_FILES":1.0,"TOTAL_IO_READ_MB":0.0,"TOTAL_LOG_FILES_SIZE":54163.0,"TOTAL_IO_WRITE_MB":0.0,"TOTAL_IO_MB":0.0}},"bootstrapFilePath":null}]},"extraMetadata":{"map":{}},"version":{"int":2}}
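
The compaction plan printed above is easier to read once the type wrappers ({"string": ...}, {"array": ...}, and so on) are stripped. The sketch below assumes these wrappers follow Avro's JSON encoding of union types; the `unwrap` helper is hypothetical and would misread a genuine one-entry map whose key happens to be a type name.

```python
import json

# Type names used as single-key wrappers in the plan output.
WRAPPERS = {"string", "array", "map", "int", "long"}


def unwrap(value):
    """Recursively strip single-key type wrappers such as {"string": "..."}."""
    if isinstance(value, dict):
        if len(value) == 1 and next(iter(value)) in WRAPPERS:
            return unwrap(next(iter(value.values())))
        return {key: unwrap(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unwrap(item) for item in value]
    return value


# The JSON payload from the hdfs dfs -text output in this issue.
plan_json = (
    '{"operations":{"array":[{"baseInstantTime":{"string":"20231205033812277"},'
    '"deltaFilePaths":{"array":[".d4f39b50-e121-4e83-9cc5-5e8668cf80ac-0_20231205033812277.log.1_0-14-211"]},'
    '"dataFilePath":{"string":"d4f39b50-e121-4e83-9cc5-5e8668cf80ac-0_0-24-26_20231205033812277.parquet"},'
    '"fileId":{"string":"d4f39b50-e121-4e83-9cc5-5e8668cf80ac-0"},'
    '"partitionPath":{"string":""},'
    '"metrics":{"map":{"TOTAL_LOG_FILES":1.0,"TOTAL_IO_READ_MB":0.0,'
    '"TOTAL_LOG_FILES_SIZE":54163.0,"TOTAL_IO_WRITE_MB":0.0,"TOTAL_IO_MB":0.0}},'
    '"bootstrapFilePath":null}]},"extraMetadata":{"map":{}},"version":{"int":2}}'
)

plan = unwrap(json.loads(plan_json))
for op in plan["operations"]:
    # One line per file group scheduled for compaction.
    print(op["fileId"], op["baseInstantTime"], len(op["deltaFilePaths"]))
```

Read this way, the plan schedules a single file group whose base file was written at instant 20231205033812277, with one log file to merge.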

Environment Description

  • Hudi version : 0.12.1

  • Spark version : 3.2.0

  • Hive version : 3.1.1

  • Hadoop version : 3.1.1

  • Storage (HDFS/S3/GCS..) : hdfs

  • Running on Docker? (yes/no) : no

pravin1406 • Dec 21 '23 09:12