[Feature] Change forward index and dictionary properties during segment reload

Open vvivekiyer opened this issue 3 years ago • 1 comments

label=feature

Currently, in the segment reload path, we support the following operations:

  1. Adding a new column, removing/updating an autogenerated column
  2. Adding/removing various indexes such as inverted index, JSON index, etc.
  3. Create/modify/remove startree index
  4. Add min/max value to column metadata

However, we do not support the following operations related to dictionary/forward-index. The proposal here is to add support for the following operations in the segment reload path:

  1. Adding a dictionary to a column
  2. Changing the compression type of a raw column
  3. Removing the dictionary from a column

Without these operations, table owners currently have to backfill the segments, which is particularly troublesome in APPEND use cases.

vvivekiyer avatar Sep 09 '22 00:09 vvivekiyer

Support for changing compression type has been merged in PRs https://github.com/apache/pinot/pull/9454 and https://github.com/apache/pinot/pull/9510

Next step is to do the same for dictionary creation / deletion. cc @Jackie-Jiang @vvivekiyer

siddharthteotia avatar Oct 17 '22 23:10 siddharthteotia

The third part, support for creating a dictionary, has been merged for both SV and MV columns in https://github.com/apache/pinot/pull/9678.

The last part is to delete / disable the dictionary.

siddharthteotia avatar Nov 02 '22 21:11 siddharthteotia

This effort is completed. For offline segments, the following segment format updates can now be made on the reload path without having to refresh / backfill:

  • Create dictionary and rewrite dependent structures
  • Delete dictionary and rewrite dependent structures
  • Change compression codec for raw forward index and rewrite it

Support has been added for both SV and MV columns.
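
For readers who want to try this, here is a minimal sketch of the table config change that would trigger these rewrites on reload. It is not taken from the PRs above. It assumes the column's encoding is controlled through `fieldConfigList` entries with `encodingType` and `compressionCodec`, and that a reload is triggered with `POST /segments/{tableName}/reload` on the controller. Depending on the Pinot version, `noDictionaryColumns` under `tableIndexConfig` may also need to be updated. The table name `events`, the column name `payload`, the controller address, and both helper functions are hypothetical.

```python
import copy
import json


def set_column_encoding(table_config, column, encoding_type, compression_codec=None):
    """Return a copy of table_config with the column's field config updated.

    Hypothetical helper: it only edits the JSON document and does not talk to Pinot.
    """
    updated = copy.deepcopy(table_config)
    field_configs = updated.setdefault("fieldConfigList", [])
    entry = next((fc for fc in field_configs if fc.get("name") == column), None)
    if entry is None:
        entry = {"name": column}
        field_configs.append(entry)
    entry["encodingType"] = encoding_type
    if encoding_type == "RAW" and compression_codec is not None:
        # A compression codec only applies to a raw (no-dictionary) forward index.
        entry["compressionCodec"] = compression_codec
    else:
        entry.pop("compressionCodec", None)
    return updated


def reload_url(controller, table_name):
    """Build the controller URL assumed to trigger a segment reload."""
    return f"{controller.rstrip('/')}/segments/{table_name}/reload"


# Hypothetical starting point: 'payload' is dictionary-encoded.
table_config = {
    "tableName": "events",
    "tableType": "OFFLINE",
    "fieldConfigList": [{"name": "payload", "encodingType": "DICTIONARY"}],
}

# Delete the dictionary: switch to a raw forward index with a compression codec.
raw_config = set_column_encoding(table_config, "payload", "RAW", "ZSTANDARD")

# Create the dictionary again: switch back to dictionary encoding.
dict_config = set_column_encoding(raw_config, "payload", "DICTIONARY")

print(json.dumps(raw_config["fieldConfigList"], indent=2))
print("POST", reload_url("http://localhost:9000", "events"))
```

After the updated table config is saved, the reload call is what makes the servers rewrite the forward index and dependent structures of existing segments. Changing only `compressionCodec` on a column that is already `RAW` corresponds to the third bullet above.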

Closing this as completed.

Thank you @vvivekiyer for seeing this through. cc @Jackie-Jiang

siddharthteotia avatar Nov 30 '22 18:11 siddharthteotia