hive icon indicating copy to clipboard operation
hive copied to clipboard

HIVE-29016: Disable caching on the Iceberg REST Catalog

Open okumin opened this issue 7 months ago • 9 comments

What changes were proposed in this pull request?

Remove caching on the Iceberg REST API server. It's mainly because the current code didn't work with caching correctly, as stated in HIVE-29016.

This is an alternative to https://github.com/apache/hive/pull/5871

Why are the changes needed?

Disable caching.

Does this PR introduce any user-facing change?

No. This feature is not yet shipped.

How was this patch tested?

Unit test

okumin avatar Jun 17 '25 09:06 okumin

+1 We should revisit the caching catalog implementation, as it seems to be broken.

@deniskuzZ the unit test that creates a table through http does not fail though. ( TestHMSCatalog.testTableAPI() ). Any other unit test that reproduces that issue would be much appreciated.

henrib avatar Jun 17 '25 13:06 henrib

+1 We should revisit the caching catalog implementation, as it seems to be broken.

@deniskuzZ the unit test that creates a table through http does not fail though. ( TestHMSCatalog.testTableAPI() ). Any other unit test that reproduces that issue would be much appreciated.

@okumin, how did you create the table? have you used RESTCatalog instead of manual http calls construction as in TestHMSCatalog?

CreateTableRequest create = CreateTableRequest.builder().
         withName(tblName).
         withLocation(location).
         withSchema(schema).build();
  URL url = iceUri.resolve("namespaces/" + DB_NAME + "/tables").toURL();
  Object response = clientCall(jwt, url, "POST", create);

deniskuzZ avatar Jun 17 '25 14:06 deniskuzZ

@henrib, transaction API doesn't work at all:

org.apache.iceberg.exceptions.ServiceFailureException: Server error: IllegalStateException: Cannot wrap catalog that does not produce BaseTransaction

that is what @okumin mentioned

reproduce:

  @Test
  public void testTableAPI() throws Exception {
    URI iceUri = URI.create("http://localhost:" + catalogPort + "/"+catalogPath);
    Schema schema = getTestSchema();
    final String tblName = "tbl_" + Integer.toHexString(RND.nextInt(65536));
    final TableIdentifier TBL = TableIdentifier.of(DB_NAME, tblName);
    String location = temp.newFolder(TBL.toString()).toString();

    Map<String, String> properties = ImmutableMap.of(
      "type", "rest", 
      "uri", iceUri.toString());
    final String catalogName = catalog.name();
    
    RESTCatalog restCatalog = (RESTCatalog) CatalogUtil.buildIcebergCatalog(catalogName, properties, new Configuration());
    restCatalog.initialize(catalogName, properties);

    restCatalog
        .buildTable(TBL, schema)
        .withLocation(location)
        .createTransaction()
        .commitTransaction();
    
    Table table = catalog.loadTable(TBL);
    Assert.assertEquals(location, table.location());
  }

MetastoreConf.setLongVar(conf, MetastoreConf.ConfVars.ICEBERG_CATALOG_CACHE_EXPIRY, -1L); solves the issue

but i am not sure if HMSCachingCatalog can be even used with transaction API. If not, we should revert cc @okumin

deniskuzZ avatar Jun 18 '25 13:06 deniskuzZ

With this modification in HMSCachingCatalog:

  @Override
  public Catalog.TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
    tableCache.invalidate(identifier);
    return nsCatalog.buildTable(identifier, schema);
  }

This test succeeds:

  @Test
  public void testTableAPI2() throws Exception {
    String cname = catalog.name();
    URI iceUri = URI.create("http://localhost:" + catalogPort + "/"+catalogPath);
    String jwt = generateJWT();
    Schema schema = getTestSchema();
    final String tblName = "tbl_" + Integer.toHexString(RND.nextInt(65536));
    final TableIdentifier TBL = TableIdentifier.of(DB_NAME, tblName);
    String location = temp.newFolder(TBL.toString()).toString();

    Configuration configuration = new Configuration();
    configuration.set("iceberg.catalog", cname);
    configuration.set("iceberg.catalog."+cname+".type", "rest");
    configuration.set("iceberg.catalog."+cname+".uri", iceUri.toString());
    configuration.set("iceberg.catalog."+cname+".token", jwt);

    String catalogName = configuration.get(CATALOG_NAME);
    Assert.assertEquals(cname, catalogName);
    Map<String, String> properties = getCatalogPropertiesFromConf(configuration, catalogName);
    Assert.assertFalse(properties.isEmpty());
    RESTCatalog restCatalog = (RESTCatalog) CatalogUtil.buildIcebergCatalog(catalogName, properties, configuration);
    restCatalog.initialize(catalogName, properties);

    restCatalog
            .buildTable(TBL, schema)
            .withLocation(location)
            .createTransaction()
            .commitTransaction();

    Table table = catalog.loadTable(TBL);
    Assert.assertEquals(location, table.location());
  }

henrib avatar Jun 18 '25 15:06 henrib

With this modification in HMSCachingCatalog:

  @Override
  public Catalog.TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
    tableCache.invalidate(identifier);
    return nsCatalog.buildTable(identifier, schema);
  }

This test succeeds:

  @Test
  public void testTableAPI2() throws Exception {
    String cname = catalog.name();
    URI iceUri = URI.create("http://localhost:" + catalogPort + "/"+catalogPath);
    String jwt = generateJWT();
    Schema schema = getTestSchema();
    final String tblName = "tbl_" + Integer.toHexString(RND.nextInt(65536));
    final TableIdentifier TBL = TableIdentifier.of(DB_NAME, tblName);
    String location = temp.newFolder(TBL.toString()).toString();

    Configuration configuration = new Configuration();
    configuration.set("iceberg.catalog", cname);
    configuration.set("iceberg.catalog."+cname+".type", "rest");
    configuration.set("iceberg.catalog."+cname+".uri", iceUri.toString());
    configuration.set("iceberg.catalog."+cname+".token", jwt);

    String catalogName = configuration.get(CATALOG_NAME);
    Assert.assertEquals(cname, catalogName);
    Map<String, String> properties = getCatalogPropertiesFromConf(configuration, catalogName);
    Assert.assertFalse(properties.isEmpty());
    RESTCatalog restCatalog = (RESTCatalog) CatalogUtil.buildIcebergCatalog(catalogName, properties, configuration);
    restCatalog.initialize(catalogName, properties);

    restCatalog
            .buildTable(TBL, schema)
            .withLocation(location)
            .createTransaction()
            .commitTransaction();

    Table table = catalog.loadTable(TBL);
    Assert.assertEquals(location, table.location());
  }

@okumin, can we add the proposed change to this PR?

deniskuzZ avatar Jun 18 '25 15:06 deniskuzZ

@henrib, why do we need to call tableCache.invalidate(identifier);? isn't it enough to call nsCatalog directly?

  @Override
  public Catalog.TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
    return nsCatalog.buildTable(identifier, schema);
  }

deniskuzZ avatar Jun 18 '25 18:06 deniskuzZ

@henrib, why do we need to call tableCache.invalidate(identifier);? isn't it enough to call nsCatalog directly? @deniskuzZ you are right, it is certainly not necessary to invalidate the cache for a table that does not exist yet

henrib avatar Jun 18 '25 19:06 henrib

Hi @okumin, could you please incorporate the above minor change here as well? Thanks!

deniskuzZ avatar Jun 19 '25 07:06 deniskuzZ

I added the recommended change. I did not add a test case because we're replacing the existing ones with the more comprehensive test suite in https://issues.apache.org/jira/browse/HIVE-29019. I am feeling the current implementation can't pass the RCK, so I will check failure cases at once in another PR.

okumin avatar Jun 20 '25 12:06 okumin

May be we can try and fix the actual cache instead ? Have a look at 5882

henrib avatar Jun 20 '25 14:06 henrib

May be we can try and fix the actual cache instead ? Have a look at 5882

Since it's a blocker for Hive-4.1 and we don't have extensive test coverage, I would propose to disable, at least until we cut the 4.1 branch

deniskuzZ avatar Jun 20 '25 15:06 deniskuzZ

Merged. @deniskuzZ @henrib Thanks for your review comments!

okumin avatar Jun 21 '25 07:06 okumin