ucx
[FEATURE]: Migrate tables in unsupported filesystem
Is there an existing issue for this?
- [X] I have searched the existing issues
Problem statement
External tables stored in adl:// and wasbs:// will be crawled and marked with What.EXTERNAL_NO_SYNC.
We will need more What enum values to differentiate the following scenarios:
- Hiveserde tables, such as ParquetHiveSerDe, which cannot be migrated with SYNC but can be migrated in place by creating a UC table with a supported data source (for example, create external table ... using parquet ... location).
- Hiveserde tables that have to be migrated using CTAS.
- Tables in an unsupported filesystem such as adl:// and wasbs://. These require either:
  - migrating the storage to ADLS Gen2 first, updating the HMS table location, and then migrating to UC,
  - or deep cloning or CTAS-ing the table into UC.
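To make the three scenarios concrete, here is a minimal sketch of how a table currently marked EXTERNAL_NO_SYNC could be refined into finer categories. It is not UCX code: the enum below is a standalone stand-in, every member other than EXTERNAL_NO_SYNC is a hypothetical name, and the `refine_no_sync` helper and the `IN_PLACE_SERDES` mapping are assumptions made for illustration.

```python
from enum import Enum, auto
from typing import Optional


class What(Enum):
    """Standalone stand-in for the What enum discussed in this issue."""
    EXTERNAL_NO_SYNC = auto()             # named in this issue
    EXTERNAL_HIVESERDE_IN_PLACE = auto()  # hypothetical new member
    EXTERNAL_HIVESERDE_CTAS = auto()      # hypothetical new member
    UNSUPPORTED_FILESYSTEM = auto()       # hypothetical new member


# Filesystem schemes called out in this issue as unsupported.
UNSUPPORTED_SCHEMES = ("adl://", "wasbs://")

# Assumption: SerDes that map directly onto a supported UC data source.
IN_PLACE_SERDES = {"ParquetHiveSerDe": "PARQUET"}


def refine_no_sync(location: Optional[str], serde: Optional[str]) -> What:
    """Refine a table that the crawler marked EXTERNAL_NO_SYNC."""
    # The storage scheme is checked first: the data has to move (or be
    # copied) regardless of the table format.
    if location and location.lower().startswith(UNSUPPORTED_SCHEMES):
        return What.UNSUPPORTED_FILESYSTEM
    if serde:
        # Compare on the short class name, e.g. "ParquetHiveSerDe".
        short_name = serde.rsplit(".", 1)[-1]
        if short_name in IN_PLACE_SERDES:
            return What.EXTERNAL_HIVESERDE_IN_PLACE
        return What.EXTERNAL_HIVESERDE_CTAS
    # Nothing more specific is known; keep the existing marker.
    return What.EXTERNAL_NO_SYNC
```

Checking the filesystem before the SerDe is a design choice in this sketch: a Hiveserde table on adl:// would otherwise be offered an in-place migration that cannot work until the storage is moved.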
Proposed Solution
- Add more What enum values.
- Discuss the strategy of how to migrate those tables in the future.
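As a starting point for the strategy discussion, the sketch below builds the SQL statement each candidate strategy from the problem statement would issue. The function name, the strategy keys, and the argument names are all hypothetical, identifiers are not escaped, and the statements should be checked against the Databricks SQL documentation before being relied on.

```python
from typing import Optional


def migration_sql(strategy: str, source: str, target: str,
                  data_source: Optional[str] = None,
                  location: Optional[str] = None) -> str:
    """Return the SQL for one migration strategy (hypothetical keys)."""
    if strategy == "in_place":
        # Re-register the existing files as a UC external table.
        if not data_source or not location:
            raise ValueError("in_place needs data_source and location")
        return (f"CREATE EXTERNAL TABLE IF NOT EXISTS {target} "
                f"USING {data_source} LOCATION '{location}'")
    if strategy == "ctas":
        # Copy the data by selecting everything from the HMS table.
        return f"CREATE TABLE IF NOT EXISTS {target} AS SELECT * FROM {source}"
    if strategy == "deep_clone":
        # Copy data and metadata with a deep clone.
        return f"CREATE TABLE IF NOT EXISTS {target} DEEP CLONE {source}"
    raise ValueError(f"unknown strategy: {strategy}")


# Example with made-up table names and location.
print(migration_sql("ctas", "hive_metastore.db.t", "main.db.t"))
```

The storage-first option (moving the data to ADLS Gen2 and updating the HMS table location) is deliberately left out, since it happens at the cloud level rather than through a single SQL statement.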
Additional Context
Related issue:
- #355 which reports unsupported table in dashboard.
- #1064 Migrate UC External Location should skip unsupported filesystem
Effort might be 3 weeks for a cloud-level copy or a few days for CTAS.
@HariGS-DB @FastLee to triage and find the better time estimate