HDDS-10110. Use RocksDB key count estimates instead of OM metrics file.
What changes were proposed in this pull request?
- Use RocksDB column family key count estimates to initialize OM metrics that need to survive restarts or be updated on snapshot install.
- Remove the OM metrics JSON file.
- Clarify usage of existing OM count metrics. Previous definitions of these metrics were inconsistent.
  - numKeys: The number of entries in OBS or Legacy buckets, regardless of the value of ozone.om.enable.filesystem.paths.
  - numDirs: The number of directories in FSO buckets.
  - numFiles: The number of files in FSO buckets.
See the Jira description for the history and details of changes to these OM metrics. With this solution, all metrics should be corrected on OM restart, with no repairs required.

Opening this as a draft PR to get comments on the approach while tests are being finished.
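The clarified metric definitions can be read as a simple classification rule. The following is a minimal sketch of that rule only, not the code in this patch: `BucketLayout`, `CountMetric`, and `MetricClassifier` are hypothetical names introduced for illustration.

```java
// Hypothetical illustration of the metric definitions listed above.
// None of these types are taken from the patch itself.
enum BucketLayout { OBJECT_STORE, LEGACY, FILE_SYSTEM_OPTIMIZED }

enum CountMetric { NUM_KEYS, NUM_DIRS, NUM_FILES }

final class MetricClassifier {
  private MetricClassifier() { }

  /**
   * Picks the count metric that an entry contributes to.
   * OBS and Legacy entries always count as keys, whatever the value of
   * ozone.om.enable.filesystem.paths. FSO entries count as directories
   * or files.
   */
  static CountMetric metricFor(BucketLayout layout, boolean isDirectory) {
    switch (layout) {
      case OBJECT_STORE:
      case LEGACY:
        return CountMetric.NUM_KEYS;
      case FILE_SYSTEM_OPTIMIZED:
        return isDirectory ? CountMetric.NUM_DIRS : CountMetric.NUM_FILES;
      default:
        throw new IllegalArgumentException("Unknown layout: " + layout);
    }
  }
}
```

Under this reading, a directory-like entry in a Legacy bucket still counts toward numKeys, while the same entry in an FSO bucket counts toward numDirs.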
What is the link to the Apache JIRA
HDDS-10110
How was this patch tested?
- [ ] WIP
What's the cost of running `countRowsInTable` as part of metric collection, which is periodic? Also, RocksDB has a metric that estimates keys; I'm not sure whether any existing RocksDB metrics give table-level estimates. Ref: `rocksdb_om_db_estimate_num_keys`
@kerneltime As far as I understand, the metrics based on `countRowsInTable` (and `countEstimatedRowsInTable`) from RocksDB are not collected periodically. They are set this way only during OM `start()`, `restart()`, and `reloadOMState()`. The numbers of volumes and buckets (these two without estimates), files, and directories were already set like this; with this change, the number of keys will be as well. While OM is running, the metrics are incremented and decremented as requests are processed.
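The lifecycle described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch's implementation: `EstimatingTable` and `KeyCountMetric` are hypothetical names, and the table is a stand-in for whatever reads the RocksDB column family estimate.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a RocksDB-backed table that can report an
// estimated key count. A real implementation would query RocksDB.
interface EstimatingTable {
  long estimatedKeyCount();
}

// Hypothetical metric holder: seeded from an estimate at start, restart,
// or state reload, then adjusted per request while the service runs.
final class KeyCountMetric {
  private final AtomicLong value = new AtomicLong();

  /** Called on start, restart, or state reload: overwrite with the estimate. */
  void resetFromEstimate(EstimatingTable table) {
    value.set(table.estimatedKeyCount());
  }

  /** Called when a request creates an entry. */
  void incr() {
    value.incrementAndGet();
  }

  /** Called when a request deletes an entry. */
  void decr() {
    value.decrementAndGet();
  }

  long get() {
    return value.get();
  }
}
```

In this sketch the estimate is read only when `resetFromEstimate` is called, so its cost is paid once per start or reload rather than on every metric collection cycle. Any drift accumulated at runtime is discarded the next time the metric is reseeded.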
/pending conflicts
Thank you very much for the patch. I am closing this PR temporarily as there was no activity recently and it is waiting for response from its author.
It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.
It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.
If you need ANY help to finish this PR, please contact the community on the mailing list or the Slack channel.