Upgrade services from iceberg 1.2.0 to iceberg 1.5.2
Summary
Upgrade services from iceberg 1.2.0 to iceberg 1.5.2. Integrations, apps, and tables-test-fixtures will remain on iceberg 1.2.0.
Notable changes from iceberg 1.2.0 to iceberg 1.5.2 include, but are not limited to, the following:
- Add view support
- Add FileIO that supports ADLSv2 storage
- Support file and partition delete granularity
- Track partition statistics in TableMetadata
- Add last updated timestamp and snapshot ID to partitions metadata table
- Add support for Spark 3.5 and remove support for Spark 3.2
- Add fast_forward procedure
Changes
Added openhouse.iceberg-conventions-1.2 to strictly pin the iceberg dependency version to 1.2.0.
Excluded the iceberg libraries from the tables and tables-test-fixtures modules to remove the circular dependency.
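For readers unfamiliar with the mechanism, the following is a minimal sketch of how a strict version pin and an exclusion can be expressed in the Gradle Groovy DSL. It is not the actual content of this PR: the file location, the artifact chosen (`iceberg-core`), and the project path `:services:tables` are assumptions made for illustration only.

```groovy
// Hypothetical convention plugin, e.g.
// buildSrc/src/main/groovy/openhouse.iceberg-conventions-1.2.gradle
// (location and artifact list are assumptions, not taken from this PR)
plugins {
  id 'java'
}

def icebergVersion = '1.2.0'

dependencies {
  // "strictly" makes resolution fail instead of silently upgrading
  // when another dependency requests a different iceberg version.
  implementation('org.apache.iceberg:iceberg-core') {
    version { strictly icebergVersion }
  }
}
```

```groovy
// Hypothetical consuming module's build.gradle
// (the project path below is an assumption)
dependencies {
  // Drop the transitive iceberg artifacts so this module supplies its own version.
  implementation(project(':services:tables')) {
    exclude group: 'org.apache.iceberg'
  }
}
```

With a strict pin, a conflicting transitive version surfaces as a resolution error at build time rather than as a silently mixed classpath, which is why it suits modules that must stay on 1.2.0.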
After the change, modules using iceberg 1.5.2 will be:
- services
- cluster:storage
- internalcatalog
- htscatalog
Modules using iceberg 1.2.0 will be:
- apps
- integrations
- libs
- tables-test-fixtures
Checkboxes:
- [ ] Client-facing API Changes
- [ ] Internal API Changes
- [ ] Bug Fixes
- [ ] New Features
- [ ] Performance Improvements
- [ ] Code Style
- [x] Refactoring
- [ ] Documentation
- [ ] Tests
For all the boxes checked, please include additional details of the changes made in this pull request.
Testing Done
Comprehensive Spark compatibility tests were performed on the local cluster: https://docs.google.com/document/d/1yXORH5ety5Gdr6Avsh7XEmIghiJ8wQrg0l4G4_9Soxo/edit#heading=h.h69v5xcbp1md
- [ ] Manually Tested on local docker setup. Please include commands ran, and their output.
- [ ] Added new tests for the changes made.
- [x] Updated existing tests to reflect the changes made.
- [ ] No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
- [ ] Some other form of testing like staging or soak time in production. Please explain.
For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.
Additional Information
- [ ] Breaking Changes
- [ ] Deprecations
- [ ] Large PR broken into smaller PRs, and PR plan linked in the description.
For all the boxes checked, include additional details of the changes made in this pull request.
This might have been looked into already, but are there any spec changes between 1.2 and 1.5? Also, can we do a staging test before merge?
Added some notable changes, but there are no changes in the spec. Also added the Spark compatibility testing report.
one question.
Also, now that we have things like ADLSv2 storage, can we remove Kai's copy?
I want to do that in a separate PR.
Performed the same tests in the staging environment. @HotSushi