
[SUPPORT] Unable to archive if no non-table service actions are performed on the data table

Open voonhous opened this issue 3 years ago • 4 comments

Hello Hudi, this is a question regarding the design considerations between metadata table (MDT) and the archiving commit action on a data table (DT).

When performing archival of commits on the DT, at least one compaction must have been performed on the MDT beforehand.

    // If metadata table is enabled, do not archive instants which are more recent than the last compaction on the
    // metadata table.
    if (config.isMetadataTableEnabled()) {
      try (HoodieTableMetadata tableMetadata = HoodieTableMetadata.create(table.getContext(), config.getMetadataConfig(),
          config.getBasePath(), FileSystemViewStorageConfig.SPILLABLE_DIR.defaultValue())) {
        Option<String> latestCompactionTime = tableMetadata.getLatestCompactionTime();
        if (!latestCompactionTime.isPresent()) {
          LOG.info("Not archiving as there is no compaction yet on the metadata table");
          instants = Stream.empty();
        } else {
          LOG.info("Limiting archiving of instants to latest compaction on metadata table at " + latestCompactionTime.get());
          instants = instants.filter(instant -> HoodieTimeline.compareTimestamps(instant.getTimestamp(), HoodieTimeline.LESSER_THAN,
              latestCompactionTime.get()));
        }
      } catch (Exception e) {
        throw new HoodieException("Error limiting instant archival based on metadata table", e);
      }
    }

Assuming that a DT has the MDT enabled (the default for Spark entrypoints), and ONLY INSERT_OVERWRITE actions are performed on the DT (a table service action generating replacecommits), archival of commits will never be performed.

This is because compaction on the MDT is never performed when only such actions are performed on the DT.

As such, one can see that the archival service on the DT depends on the MDT's compaction service, which in turn depends on the DT's data manipulation operations.
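The gating described above (and in the quoted snippet) can be sketched as a simplified model. The class and method below are hypothetical, not Hudi's actual API; they only mirror the quoted filter: instants on the DT are archivable only if they are older than the latest MDT compaction, and with no MDT compaction at all, nothing is archivable.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class ArchivalGateSketch {
  // Returns the subset of candidate DT instants that may be archived,
  // fenced by the latest compaction timestamp on the metadata table.
  static List<String> archivableInstants(List<String> candidateInstants,
                                         Optional<String> latestMdtCompactionTime) {
    if (!latestMdtCompactionTime.isPresent()) {
      // Mirrors "Not archiving as there is no compaction yet on the metadata table"
      return List.of();
    }
    String fence = latestMdtCompactionTime.get();
    return candidateInstants.stream()
        // Mirrors HoodieTimeline.compareTimestamps(..., LESSER_THAN, fence):
        // instant timestamps are lexicographically ordered strings
        .filter(ts -> ts.compareTo(fence) < 0)
        .collect(Collectors.toList());
  }
}
```

With only replacecommits on the DT, `latestMdtCompactionTime` stays empty forever, so the archivable set is always empty.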

TLDR: I am unsure what design considerations led to this restriction, hence I am consulting the community as to why this is the case.

Thank you.

Environment Description

  • Hudi version : 0.11.1

  • Spark version : 3.1

  • Running on Docker? (yes/no) : no

Related JIRA ticket: HUDI-4876

voonhous avatar Sep 19 '22 07:09 voonhous

@TengHuo @fengjian428 @hbgstc123

voonhous avatar Sep 19 '22 07:09 voonhous

@yihua : Can you take this up please.

nsivabalan avatar Sep 19 '22 23:09 nsivabalan

@voonhous I'm going to take a look at this use case. Is INSERT_OVERWRITE the only write action, i.e., no other inserts, updates, deletes, bulk inserts, etc.? It is not a table service action, though it generates a replacecommit, which is in the same format as clustering's. INSERT_OVERWRITE should also update the metadata table with delta commits, and there shouldn't be any reason that compaction does not kick in on the metadata table. Do you have reproducible steps?

The reasoning behind the constraint, which prevents the archival of data table commits newer than the latest compaction commit of the metadata table, is that compaction of the metadata table is itself fenced: we trigger it only when there are no inflight instants in the data table. This ensures that all base files in the metadata table are always in sync with the data table (without any holes); at most there can be some extra invalid commits among the delta log files in the metadata table.
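The fencing condition described above can be sketched as follows. This is a hedged, minimal model (names are hypothetical, not Hudi's actual API): MDT compaction is allowed only when the DT timeline has no pending instants, so every MDT base file reflects a fully completed prefix of the DT timeline.

```java
import java.util.List;

public class MdtCompactionFenceSketch {
  // Compaction on the metadata table may be scheduled only when the data
  // table has no inflight or requested instants. Otherwise a compacted base
  // file could bake in a commit that is later rolled back, leaving a "hole"
  // relative to the data table timeline.
  static boolean canScheduleMdtCompaction(List<String> pendingDtInstants) {
    return pendingDtInstants.isEmpty();
  }
}
```

Under this fence, any extra commits that slip into MDT delta log files before a DT rollback remain invalid log entries only; the base files stay consistent, which is why archival on the DT is bounded by the last MDT compaction.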

yihua avatar Sep 21 '22 21:09 yihua

@yihua Thank you for the reply.

Is the INSERT_OVERWRITE the only write action

Yes, INSERT_OVERWRITE is the only action being performed on the table, i.e. an insert always rewrites a certain partition, regardless of whether the partition already exists.

For the sake of simplicity, the example below to reproduce this issue does not involve a partitioned table.

drop table if exists insert_overwrite_archive_test purge;
create table if not exists insert_overwrite_archive_test(
	id int,
	name string,
	price double,
	_ts long
) using hudi 
tblproperties (
	type = 'cow',
	primaryKey = 'id',
	preCombineField = '_ts'
) location 'hdfs://insert_overwrite_archive_test';

-- INSERT_OVERWRITE 64 times (the same statement is executed 64 times in total)
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
-- ... repeat the statement above until 64 INSERT OVERWRITE statements have been run

After the INSERT_OVERWRITE operations have completed, we can check the hdfs directory as such:

$ hdfs dfs -ls hdfs://insert_overwrite_archive_test/.hoodie | grep -c '\.replacecommit$'
64

Let us also check the contents of the archive folder:

$ hdfs dfs -ls hdfs://insert_overwrite_archive_test/.hoodie/archived

The above command should return nothing, as no archiving has been done.

voonhous avatar Sep 22 '22 06:09 voonhous

@yihua I still don't quite understand this part:

This ensures that all base files in the metadata table are always in sync with the data table (w/o any holes) and only there could be some extra invalid commits among delta log files in metadata table.

Can you please elaborate? Thank you.

voonhous avatar Sep 23 '22 09:09 voonhous

Probably there is some misunderstanding. Any action in the data table will be applied to the metadata table, be it commit, delta commit, clustering, insert_overwrite, delete_partition, clean, rollback, etc. So, there should not be any issues.

nsivabalan avatar Oct 22 '22 21:10 nsivabalan

Actually, you are right. We have a bug around this: https://issues.apache.org/jira/browse/HUDI-5078. Will put up a fix shortly. Thanks for bringing it up.

nsivabalan avatar Oct 22 '22 22:10 nsivabalan

https://github.com/apache/hudi/pull/7037

nsivabalan avatar Oct 22 '22 23:10 nsivabalan