[Feature] Change forward index and dictionary properties during segment reload
label=feature
Currently, in the segment reload path, we support the following operations:
- Adding a new column, removing/updating an autogenerated column
- Adding/removing various indexes such as the inverted index, JSON index, etc.
- Creating/modifying/removing the star-tree index
- Adding min/max values to column metadata
However, we do not support the following operations on the dictionary and forward index. The proposal here is to add support for them in the segment reload path:
- Adding a dictionary on a column
- Changing the compression type on a raw column
- Removing the dictionary on a column
Without these operations, table owners currently have to backfill the segments, which is particularly troublesome in APPEND use cases.
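To make the request concrete, here is a rough sketch of the kind of per-column settings a table owner would want to change and then apply via reload. It assumes the `fieldConfigList` entries of the table config with their `encodingType` and `compressionCodec` fields; the column names and the two helper functions are hypothetical and only illustrate the shape of the change.

```python
import json


def raw_field_config(column, codec):
    """Hypothetical helper: entry asking for a raw (no-dictionary) forward index."""
    return {"name": column, "encodingType": "RAW", "compressionCodec": codec}


def dictionary_field_config(column):
    """Hypothetical helper: entry asking for a dictionary-encoded forward index."""
    return {"name": column, "encodingType": "DICTIONARY"}


# Illustrative columns only: one switched to raw with a new codec,
# one switched (back) to dictionary encoding.
field_config_list = [
    raw_field_config("clickPayload", "ZSTANDARD"),
    dictionary_field_config("country"),
]

print(json.dumps({"fieldConfigList": field_config_list}, indent=2))
```

With reload support, a change like this would be picked up by existing segments instead of requiring a backfill.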
Support for changing compression type has been merged in PRs https://github.com/apache/pinot/pull/9454 and https://github.com/apache/pinot/pull/9510
The next step is to do the same for dictionary creation and deletion. cc @Jackie-Jiang @vvivekiyer
The third part, adding support for creating a dictionary, has been merged for both SV and MV columns in https://github.com/apache/pinot/pull/9678.
The last part is to delete/disable the dictionary.
This effort is completed. For offline segments, the following segment format updates can now be made on the reload path without having to refresh or backfill:
- Create dictionary and rewrite dependent structures
- Delete dictionary and rewrite dependent structures
- Change compression codec for raw forward index and rewrite it
Support has been added for both SV and MV columns.
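As a summary of the three cases above, the following is a minimal sketch of how the required operation could be derived by comparing a column's current state in the segment with the desired state from the table config. This is not Pinot's actual implementation; `ColumnState`, `plan_reload_operation`, and the operation names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnState:
    """Hypothetical view of a column's forward index layout."""
    has_dictionary: bool
    compression_codec: Optional[str] = None  # only meaningful for raw columns


def plan_reload_operation(current: ColumnState, desired: ColumnState) -> str:
    """Return which of the three reload-path rewrites (if any) is needed."""
    if not current.has_dictionary and desired.has_dictionary:
        # Raw -> dictionary: build the dictionary, rewrite dependent structures.
        return "CREATE_DICTIONARY"
    if current.has_dictionary and not desired.has_dictionary:
        # Dictionary -> raw: drop the dictionary, rewrite dependent structures.
        return "DELETE_DICTIONARY"
    if not current.has_dictionary and current.compression_codec != desired.compression_codec:
        # Raw -> raw with a different codec: rewrite the raw forward index.
        return "CHANGE_COMPRESSION"
    return "NO_OP"


print(plan_reload_operation(ColumnState(False, "SNAPPY"), ColumnState(False, "ZSTANDARD")))
```

The same decision applies regardless of whether the column is SV or MV; only the rewrite itself differs.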
Closing this as completed.
Thank you @vvivekiyer for seeing this through. cc @Jackie-Jiang