Optimize FIND_JOBS_BY_SHOW
This is an experiment to optimize FIND_JOBS_BY_SHOW which is the heaviest query in our use-case.
Environment / Test data
- CentOS8 VM, Broadwell 24 cores, PostgreSQL13 1 instance
- ~8K jobs, ~8K layers, ~10M frames (1 layer per job, 1K/10K frames per layer)
- extract 500 jobs sorted with
int_priorityandts_started
Original FIND_JOBS_BY_SHOW
- SQL https://gist.github.com/splhack/6905e35defd6bcbd92aeb82c7e645e41
- Analyze result 249ms https://gist.github.com/splhack/bd7c5221acb43e095dffb3e8ddafd472
Optimized FIND_JOBS_BY_SHOW (~7 times faster than original)
- SQL https://gist.github.com/splhack/8faead2d482f3dde2cba8226f6115dda
- Analyze result 36ms https://gist.github.com/splhack/f93f4e784649a94bd9fdfc50129a9349
Explanation
- Use CTE to sort
jobandjob_resourcefirst with some light filters (job state, pause, os, show ID, facility ID) and addROW_NUMBERasrank - Filter with job, folder and layer calculation on the "sorted" jobs
Caveat
- There are no guarantees that CTE provides the "sorted" result to outer query, thus the outer query should have had
ORDER BY rank, and it makes the query slow.
Theory
- Sort jobs with priority (and ts_started for FIFO #1060)
- O(N x log N) merge sort if we do everytime
- O(log N) binary search for insert if we can keep sorted list
- Filter jobs
- Worst case O(N) linear search on the sorted jobs to check whether the utilization of job, folder and layer exceeds the maximum number of cores and gpus or not
I like where this is going. One question, does the mentioned caveat means the benchmark is not actually comparing apples to apples? I mean, would the result of the proposed implementation be good enough to dispatch jobs in a somehow ordered fashion or do we need to add ORDER BY rank to make it usable?
One question, does the mentioned caveat means the benchmark is not actually comparing apples to apples?
The benchmark results (EXPLAIN ANALYZE) are exact apples to apples comparison.
Exact same environment, exact same data.
The only difference is the SQL, FIND_JOBS_BY_SHOW vs FIND_JOBS_BY_SHOW_OPTIMIZED.
would the result of the proposed implementation be good enough to dispatch jobs in a somehow ordered fashion
Yes, it is. At least, in the range of amount data (~8K jobs, ~8K layers), the query result are the exact same.
'The exact same' means that both FIND_JOBS_BY_SHOW and FIND_JOBS_BY_SHOW_OPTIMIZED returns the exact same entities in the exact same order.
The caveat is because PostgreSQL (or any relational database) does not guarantee the outer query can see the exact same order with the inner query order.
But, with PostgreSQL version 13 and most likely earlier versions, the CTE (Common Table Expression, WITH clause query) will generate a temporary dataset in memory or materialize it to storage in the specified order because the CTE query requires to set the rank with ROW_NUMBER using the int_priority (and ts_started) order.
Thus, (I guess) in the current PostgreSQL implementation, the outer query just uses the temporary dataset as is. There are no technical reason that the outer query will see the temporary dataset in different order than the CTE query.
I can add dispatcher.find_jobs_optimization_enabled property to switch the SQL, just like #1060
Agreed, this looks good.
I can add dispatcher.find_jobs_optimization_enabled property to switch the SQL, just like #1060
I think this is a good idea for now.