
Upgrade services from iceberg 1.2.0 to iceberg 1.5.2

Open jiang95-dev opened this issue 1 year ago • 2 comments

Summary

Upgrade services from iceberg 1.2.0 to iceberg 1.5.2. Integrations, apps, and tables-test-fixtures will remain on iceberg 1.2.0.

Notable changes from iceberg 1.2.0 to iceberg 1.5.2 include, but are not limited to, the following:

  • Add view support
  • Add FileIO that supports ADLSv2 storage
  • Support file and partition delete granularity
  • Track partition statistics in TableMetadata
  • Add last updated timestamp and snapshot ID to partitions metadata table
  • Add support for Spark 3.5 and remove support for Spark 3.2
  • Add fast_forward procedure

Changes

Added openhouse.iceberg-conventions-1.2 to strictly pin the iceberg dependency versions to 1.2.0.
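For readers unfamiliar with Gradle convention plugins, here is a minimal sketch of how a plugin such as openhouse.iceberg-conventions-1.2 could pin the version. It is written in the Gradle Kotlin DSL and uses a resolution rule. Both are assumptions made for illustration: the file location is hypothetical, and the plugin in this PR may be written in Groovy and may use a different mechanism (for example strict dependency constraints).

```kotlin
// Hypothetical file: buildSrc/src/main/kotlin/openhouse.iceberg-conventions-1.2.gradle.kts
// Sketch only: the real convention plugin in this PR may look different.

// Apply the rule to every configuration of any project that uses this plugin.
configurations.configureEach {
    resolutionStrategy.eachDependency {
        // Force every artifact in the Iceberg group to 1.2.0, including
        // versions that arrive transitively from other modules.
        if (requested.group == "org.apache.iceberg") {
            useVersion("1.2.0")
            because("this module stays on iceberg 1.2.0 while services move to 1.5.2")
        }
    }
}
```

A module that should stay on iceberg 1.2.0 would then apply the plugin by its id, `id("openhouse.iceberg-conventions-1.2")`, in its own `plugins` block.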

Excluded the iceberg libraries from the tables and tables-test-fixtures modules to remove the circular dependency.
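As an illustration of what such an exclusion can look like, the sketch below shows a consuming module dropping the Iceberg artifacts that a project dependency would otherwise bring in transitively. It uses the Gradle Kotlin DSL, and the project path `:tables-test-fixtures` and the choice of the `implementation` configuration are assumptions. The exclusions in this PR may be declared elsewhere or in Groovy.

```kotlin
// Hypothetical build.gradle.kts of a module that consumes tables-test-fixtures.
plugins {
    `java-library`
}

dependencies {
    // Depend on the fixtures module, but do not inherit its Iceberg jars,
    // so this module resolves Iceberg from its own declared version only.
    implementation(project(":tables-test-fixtures")) {
        exclude(group = "org.apache.iceberg")
    }
}
```

Excluding by group rather than by individual module keeps the rule valid if the fixtures module later adds another Iceberg artifact.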

After the change, modules using iceberg 1.5.2 will be:

  • services
  • cluster:storage
  • internalcatalog
  • htscatalog

Modules using iceberg 1.2.0 will be:

  • apps
  • integrations
  • libs
  • tables-test-fixtures

Checkboxes:

  • [ ] Client-facing API Changes
  • [ ] Internal API Changes
  • [ ] Bug Fixes
  • [ ] New Features
  • [ ] Performance Improvements
  • [ ] Code Style
  • [x] Refactoring
  • [ ] Documentation
  • [ ] Tests

For all the boxes checked, please include additional details of the changes made in this pull request.

Testing Done

Comprehensive spark compatibility tests performed in the local cluster: https://docs.google.com/document/d/1yXORH5ety5Gdr6Avsh7XEmIghiJ8wQrg0l4G4_9Soxo/edit#heading=h.h69v5xcbp1md

  • [ ] Manually Tested on local docker setup. Please include commands ran, and their output.
  • [ ] Added new tests for the changes made.
  • [x] Updated existing tests to reflect the changes made.
  • [ ] No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • [ ] Some other form of testing like staging or soak time in production. Please explain.

For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.

Additional Information

  • [ ] Breaking Changes
  • [ ] Deprecations
  • [ ] Large PR broken into smaller PRs, and PR plan linked in the description.

For all the boxes checked, include additional details of the changes made in this pull request.

jiang95-dev • Aug 28 '24 00:08

This might have been looked into already, but are there any spec changes between 1.2 and 1.5? Also, can we do a staging test before merging?

Added some notable changes, but there are no changes in the spec. Also added the Spark compatibility testing report.

jiang95-dev • Aug 29 '24 06:08

one question.

Also, now that we have things like ADLSv2 storage, can we remove Kai's copy?

I want to do that in a separate PR.

jiang95-dev • Aug 29 '24 20:08

Performed the same tests in the staging environment. @HotSushi

jiang95-dev • Oct 11 '24 19:10