Implement TTL support for Pinot upsert

Open deemoliu opened this issue 3 years ago • 2 comments

Apache Pinot has provided native upsert support since v0.6.0 (#4261). It allows users to modify existing records and has successfully onboarded many use cases. We have observed that Pinot upsert clusters usually have high heap memory usage, because the upsert metadata (primaryKeyIndexes and validDocIndexes) is stored on the heap of the Pinot hosts. For use cases with high primary-key cardinality, the heap usage of these upsert tables usually becomes the hardware bottleneck.

In some use cases, records that share a primary key are updated frequently during a time window and are never updated again once the window has passed. Each primary key therefore has a lifecycle and becomes inactive after the time window. Currently these primary keys do not expire until the retention period elapses, so they remain in primaryKeyIndexes. We should introduce a TTL (time-to-live) for Pinot primary keys: a primary key expires after the TTL, and inactive keys can then be removed from the upsert metadata to save heap space.
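To make the proposal concrete, here is a minimal sketch of TTL-based key expiry. It is not Pinot's implementation: the class `TtlPrimaryKeyIndex`, its fields, and the choice to measure expiry against the largest comparison time seen so far (rather than wall-clock time) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a primary-key index with TTL (not Pinot's actual classes).
class TtlPrimaryKeyIndex {
  // Location of the latest record for a key, plus the time value used to order updates.
  static final class RecordLocation {
    final String segmentName;
    final int docId;
    final long comparisonTimeMs;

    RecordLocation(String segmentName, int docId, long comparisonTimeMs) {
      this.segmentName = segmentName;
      this.docId = docId;
      this.comparisonTimeMs = comparisonTimeMs;
    }
  }

  private final Map<String, RecordLocation> primaryKeyToLocation = new HashMap<>();
  private final long ttlMs;
  private long largestSeenTimeMs = Long.MIN_VALUE;

  TtlPrimaryKeyIndex(long ttlMs) {
    this.ttlMs = ttlMs;
  }

  // Upsert: for each key, keep the location with the newest comparison time.
  void upsert(String primaryKey, String segmentName, int docId, long comparisonTimeMs) {
    largestSeenTimeMs = Math.max(largestSeenTimeMs, comparisonTimeMs);
    RecordLocation incoming = new RecordLocation(segmentName, docId, comparisonTimeMs);
    primaryKeyToLocation.merge(primaryKey, incoming,
        (current, candidate) -> candidate.comparisonTimeMs >= current.comparisonTimeMs ? candidate : current);
  }

  // Assumption: a key is expired when its latest record is older than (largest seen time - TTL).
  // Returns the number of keys removed from the in-heap map.
  int removeExpiredKeys() {
    if (primaryKeyToLocation.isEmpty()) {
      return 0;
    }
    long watermark = largestSeenTimeMs - ttlMs;
    int before = primaryKeyToLocation.size();
    primaryKeyToLocation.values().removeIf(location -> location.comparisonTimeMs < watermark);
    return before - primaryKeyToLocation.size();
  }

  boolean contains(String primaryKey) {
    return primaryKeyToLocation.containsKey(primaryKey);
  }

  int size() {
    return primaryKeyToLocation.size();
  }
}
```

In this sketch, `removeExpiredKeys()` would be called at a natural checkpoint such as segment commit, so the heap footprint tracks the number of active keys rather than every key seen within the retention period.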

A few challenges that we want to solve:

  • snapshot management for validDocIndexes
  • TTL for primary keys in primaryKeyIndexes
  • snapshot backup in the deep store

deemoliu avatar Oct 04 '22 17:10 deemoliu

We summarized the challenges and thoughts for partial upsert in this design

Please review cc @Jackie-Jiang @chenboat @yupeng9

deemoliu avatar Oct 04 '22 17:10 deemoliu

ValidDocIds Snapshot management PR and Pinot doc.

deemoliu avatar Oct 04 '22 17:10 deemoliu

After discussion with @Jackie-Jiang @yupeng9 @chenboat

We can break the feature down into the following parts.

  • Design doc updates
  • Part 1. When committing a segment, update replaceSegment to clean up keys
      • Part 1.1 Clean up keys in primary key indexes
      • Part 1.2 Generate snapshot locally
      • Part 1.3 [Deepstore] Upload snapshot to Deepstore
  • Part 2. Periodic job in Pinot controller (upload snapshot if not persisted)
  • Part 3. Add a download snapshot API on the server side
  • Part 4. When loading segments, get snapshot to avoid re-computation
      • Part 4.1 Get snapshot from peer server
      • Part 4.2 [Deepstore] Get snapshot from Deepstore
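The snapshot steps (parts 1.2 and 4) can be sketched roughly as below. This is an illustration under stated assumptions, not Pinot code: `ValidDocIdsSnapshotLoader` and its methods are hypothetical names, `java.util.BitSet` stands in for whatever bitmap type the server actually uses, and each snapshot source (local file, peer server, Deepstore) is modeled as a plain `Supplier` that returns the snapshot bytes if it has them.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper; BitSet is a stand-in for the real valid-doc bitmap type.
class ValidDocIdsSnapshotLoader {
  // Part 1.2: serialize the valid-doc bitmap so it can be persisted as a snapshot.
  static byte[] toSnapshot(BitSet validDocIds) {
    return validDocIds.toByteArray();
  }

  // Restore a bitmap from previously persisted snapshot bytes.
  static BitSet fromSnapshot(byte[] snapshotBytes) {
    return BitSet.valueOf(snapshotBytes);
  }

  // Part 4: try each snapshot source in order (for example local, peer, Deepstore).
  // Only when no source has a snapshot do we fall back to the expensive re-computation.
  static BitSet load(List<Supplier<Optional<byte[]>>> sources, Supplier<BitSet> recompute) {
    for (Supplier<Optional<byte[]>> source : sources) {
      Optional<byte[]> snapshot = source.get();
      if (snapshot.isPresent()) {
        return fromSnapshot(snapshot.get());
      }
    }
    return recompute.get();
  }
}
```

Ordering the sources from cheapest to most expensive keeps the common case fast, while the re-computation fallback keeps loading correct even when no snapshot has been persisted yet.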

deemoliu avatar Dec 06 '22 17:12 deemoliu

Thanks for summarizing it. Part 1.3 is not required. The controller will ask the server for the snapshot, and the controller is then responsible for uploading it.

Jackie-Jiang avatar Dec 07 '22 20:12 Jackie-Jiang

The POC was done in #10047; however, there were unhandled corner cases. These corner cases were addressed in #10915.

deemoliu avatar Jul 25 '23 18:07 deemoliu