workload: create workload that replicates pressure on sql stats
We should create a workload generator that gets us into a state that puts heavy pressure on the persisted sql stats recording system. Ideally, the workload should:
- Generate a high volume of high-cardinality statement fingerprints
- Optionally initialize with 1M rows in `system.statement_statistics` and `system.transaction_statistics`
Note that the insights workload has a `--scaleStats` option that generates high-cardinality sql stats using different app names -- this is sufficient for some cases, but we'd also like to be able to generate cardinality using other stmt fingerprint dimensions.
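As a rough illustration of that last point: a workload could vary more than one fingerprint dimension over a plain SQL connection, e.g. rotating `application_name` (as `--scaleStats` does) while also changing the statement shape itself by varying the number of projected expressions, since each arity normalizes to a distinct fingerprint. This sketch is not the eventual workload code; the connection string, loop counts, and the `lib/pq` driver are arbitrary choices for the example.

```go
// Sketch: drive high-cardinality statement fingerprints over a regular
// SQL connection by rotating app names and varying statement shape.
package main

import (
	"database/sql"
	"fmt"
	"log"
	"strings"

	_ "github.com/lib/pq"
)

func main() {
	db, err := sql.Open("postgres",
		"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	const apps, shapes = 100, 1000
	for a := 0; a < apps; a++ {
		// Each application_name is its own dimension in the stats key.
		if _, err := db.Exec(fmt.Sprintf("SET application_name = 'hc-app-%d'", a)); err != nil {
			log.Fatal(err)
		}
		for s := 1; s <= shapes; s++ {
			// SELECT 1; SELECT 1, 2; ...: the constants are normalized away
			// in the fingerprint, but the differing arity still yields a
			// distinct statement fingerprint per shape.
			cols := make([]string, s)
			for i := range cols {
				cols[i] = fmt.Sprint(i)
			}
			if _, err := db.Exec("SELECT " + strings.Join(cols, ", ")); err != nil {
				log.Fatal(err)
			}
		}
	}
}
```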
Jira issue: CRDB-36142
Going to take a stab at this since it will be useful for testing https://github.com/cockroachdb/cockroach/pull/119283 and some fingerprint changes.
I started working on this over the past day or so, but I now have to switch to higher-priority work.
Sharing some thoughts here re: finding a way to set up a stressed SQL stats subsystem, to help revisit this later.
- Pre-filling the persisted tables appears to be quite difficult within the workload subsystem.
- Why? We don't want to rewrite the existing flush code used to insert data into the persisted tables. However, we're unable to reuse the existing code because it runs SQL via an internal executor, which is unavailable to a workload (as it runs in a separate process).
- Once the persisted tables are filled, we still do want a way to stress the in-memory stats.
- A workload could be useful here, just to provide a way to execute queries that'd be expected to generate high-cardinality stats once the initial fill of the persisted tables has been done.
- If we can't use the workload to fill the persisted stats, what alternatives do we have?
- An API endpoint (admin server?) triggered via a debug command that generates random data & populates the table.
- We'd likely want a mechanism to protect against this endpoint being called accidentally.
- The endpoint should be able to make use of the PersistedSQLStats code, and since it runs in the server process, it can use the internal executor to insert the data.
- The endpoint would likely have to spawn a bag of async workers to populate the tables (a rough sketch follows this list). In a loop, each worker would:
- Randomly generate a stmt stat for an aggregated timestamp
- Call into the SQL stats code to write the stat to disk.
- Backup/Restore? I'm very unfamiliar with this feature, but it could enable us to generate a large amount of data, store it somewhere, and point the restore at that location to fill the tables.
- The downside is that this would probably be brittle. If the table schema changes at all, it could require us to completely regenerate the backup file, which creates toil.
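To make the worker-loop idea above a bit more concrete, here is a structural sketch of what such an endpoint's populate routine might look like. The statsWriter interface, statementStats type, and the FlushStats/randomStatementStats helpers are hypothetical stand-ins, not real APIs; the actual endpoint would call into the PersistedSQLStats code via the internal executor.

```go
// Package sqlstatsgen sketches the "bag of async workers" idea for the
// hypothetical debug endpoint described above. Everything here is a
// placeholder: the real endpoint would call into the PersistedSQLStats
// code (which has access to the internal executor) to write the stats.
package sqlstatsgen

import (
	"context"
	"sync"
	"time"
)

// statementStats stands in for whatever randomly generated statement
// statistics payload the endpoint would produce.
type statementStats struct{}

// statsWriter is a hypothetical wrapper around the server-side SQL stats
// code that can persist a batch of generated stats for one aggregation window.
type statsWriter interface {
	FlushStats(ctx context.Context, aggTS time.Time, stats []statementStats) error
}

// randomStatementStats is a stand-in for the random-data generator.
func randomStatementStats(n int) []statementStats { return make([]statementStats, n) }

// populate spawns numWorkers goroutines; each one repeatedly picks an
// aggregated timestamp, generates a random batch of statement stats for it,
// and asks the SQL stats code to write the batch to disk.
func populate(ctx context.Context, w statsWriter, numWorkers, batchesPerWorker, batchSize int) error {
	var wg sync.WaitGroup
	errCh := make(chan error, numWorkers)
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for b := 0; b < batchesPerWorker; b++ {
				// Spread batches across hourly aggregation windows so the
				// persisted tables accumulate many aggregated timestamps.
				aggTS := time.Now().Truncate(time.Hour).Add(-time.Duration(b) * time.Hour)
				if err := w.FlushStats(ctx, aggTS, randomStatementStats(batchSize)); err != nil {
					errCh <- err
					return
				}
			}
		}()
	}
	wg.Wait()
	close(errCh)
	return <-errCh // nil when no worker reported an error
}
```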