go
go copied to clipboard
services/horizon/internal/db2/history: Improve performance of effects query
PR Checklist
PR Structure
- [ ] This PR has reasonably narrow scope (if not, break it down into smaller PRs).
- [ ] This PR avoids mixing refactoring changes with feature changes (split into two PRs otherwise).
- [ ] This PR's title starts with name of package that is most changed in the PR, ex.
services/friendbot, orallordocif the changes are broad or impact many packages.
Thoroughness
- [ ] This PR adds tests for the most critical parts of the new functionality or fixes.
- [ ] I've updated any docs (developer docs,
.mdfiles, etc... affected by this change). Take a look in thedocsfolder for a given service, like this one.
Release planning
- [ ] I've updated the relevant CHANGELOG (here for Horizon) if needed with deprecations, added features, breaking changes, and DB schema changes.
- [ ] I've decided if this PR requires a new major/minor version according to semver, or if it's mainly a patch change. The PR is targeted at the next release branch if it's not a patch change.
What
The effects query below is one of the top 5 queries in the RDS performance insights dashboard:
SELECT heff.*, hacc.address FROM history_effects heff LEFT JOIN history_accounts hacc ON hacc.id = heff.history_account_id WHERE (
heff.history_operation_id >= $1
AND (
heff.history_operation_id > $1OR
(heff.history_operation_id = $1AND heff.order > $2)
)) ORDER BY heff.history_operation_id asc, heff.order asc LIMIT ?
The output from explain analyze is:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1001.17..371030.36 rows=200 width=473) (actual time=5.777..7.560 rows=200 loops=1)
-> Gather Merge (cost=1001.17..1640230.48 rows=886 width=473) (actual time=5.776..7.543 rows=200 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop Left Join (cost=1.15..1639128.19 rows=369 width=473) (actual time=0.025..0.491 rows=181 loops=3)
-> Parallel Index Scan using hist_e_by_order on history_effects heff (cost=0.71..1638226.97 rows=369 width=416) (actual time=0.012..0.059 rows=181 loops=3)
Index Cond: (history_operation_id >= '217570991537152001'::bigint)
Filter: ((history_operation_id > '217570991537152001'::bigint) OR ((history_operation_id = '217570991537152001'::bigint) AND ("order" > 12)))
Rows Removed by Filter: 4
-> Index Scan using index_history_accounts_on_id on history_accounts hacc (cost=0.43..2.44 rows=1 width=65) (actual time=0.002..0.002 rows=1 loops=542)
Index Cond: (id = heff.history_account_id)
Planning Time: 25.022 ms
Execution Time: 35.089 ms
(13 rows)
The where clause can be simplified from:
WHERE (
heff.history_operation_id >= 217570991537152001
AND (
heff.history_operation_id > 217570991537152001 OR
(heff.history_operation_id = 217570991537152001AND heff.order > 12)
))
to:
WHERE (heff.history_operation_id, heff.order) > (217570991537152001, 12)
Doing that removes the need for parallel workers and results in a faster execution time:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1.15..580.59 rows=200 width=473) (actual time=0.049..1.195 rows=200 loops=1)
-> Nested Loop Left Join (cost=1.15..11046392.26 rows=3812781 width=473) (actual time=0.048..1.177 rows=200 loops=1)
-> Index Scan using hist_e_by_order on history_effects heff (cost=0.71..1822418.86 rows=3812781 width=416) (actual time=0.032..0.103 rows=200 loops=1)
Index Cond: (ROW(history_operation_id, "order") > ROW('217570991537152001'::bigint, 12))
-> Index Scan using index_history_accounts_on_id on history_accounts hacc (cost=0.43..2.42 rows=1 width=65) (actual time=0.005..0.005 rows=1 loops=200)
Index Cond: (id = heff.history_account_id)
Planning Time: 19.978 ms
Execution Time: 1.242 ms
(8 rows)
After deploying this change to staging, I saw that this effects query no longer appeared on the RDS performance insights dashboard.
I also noticed a decline in the number of scans on the index_history_accounts_on_id index after deploying this change to staging:
Also, since this change was deployed to staging, we have not received any 503s on the /effects endpoint:
Known limitations
[N/A]