go icon indicating copy to clipboard operation
go copied to clipboard

services/horizon/internal/db2/history: Improve performance of effects query

Open tamirms opened this issue 2 years ago • 0 comments

PR Checklist

PR Structure

  • [ ] This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • [ ] This PR avoids mixing refactoring changes with feature changes (split into two PRs otherwise).
  • [ ] This PR's title starts with name of package that is most changed in the PR, ex. services/friendbot, or all or doc if the changes are broad or impact many packages.

Thoroughness

  • [ ] This PR adds tests for the most critical parts of the new functionality or fixes.
  • [ ] I've updated any docs (developer docs, .md files, etc... affected by this change). Take a look in the docs folder for a given service, like this one.

Release planning

  • [ ] I've updated the relevant CHANGELOG (here for Horizon) if needed with deprecations, added features, breaking changes, and DB schema changes.
  • [ ] I've decided if this PR requires a new major/minor version according to semver, or if it's mainly a patch change. The PR is targeted at the next release branch if it's not a patch change.

What

The effects query below is one of the top 5 queries in the RDS performance insights dashboard:

SELECT heff.*, hacc.address FROM history_effects heff LEFT JOIN history_accounts hacc ON hacc.id = heff.history_account_id WHERE (
					 heff.history_operation_id >= $1
				AND (
					 heff.history_operation_id > $1OR
					(heff.history_operation_id = $1AND heff.order > $2)
				)) ORDER BY heff.history_operation_id asc, heff.order asc LIMIT ?

The output from explain analyze is:

                                                                                 QUERY PLAN                                                                                  
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1001.17..371030.36 rows=200 width=473) (actual time=5.777..7.560 rows=200 loops=1)
   ->  Gather Merge  (cost=1001.17..1640230.48 rows=886 width=473) (actual time=5.776..7.543 rows=200 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop Left Join  (cost=1.15..1639128.19 rows=369 width=473) (actual time=0.025..0.491 rows=181 loops=3)
               ->  Parallel Index Scan using hist_e_by_order on history_effects heff  (cost=0.71..1638226.97 rows=369 width=416) (actual time=0.012..0.059 rows=181 loops=3)
                     Index Cond: (history_operation_id >= '217570991537152001'::bigint)
                     Filter: ((history_operation_id > '217570991537152001'::bigint) OR ((history_operation_id = '217570991537152001'::bigint) AND ("order" > 12)))
                     Rows Removed by Filter: 4
               ->  Index Scan using index_history_accounts_on_id on history_accounts hacc  (cost=0.43..2.44 rows=1 width=65) (actual time=0.002..0.002 rows=1 loops=542)
                     Index Cond: (id = heff.history_account_id)
 Planning Time: 25.022 ms
 Execution Time: 35.089 ms
(13 rows)

The where clause can be simplified from:

WHERE (
					 heff.history_operation_id >= 217570991537152001
				AND (
					 heff.history_operation_id > 217570991537152001 OR
					(heff.history_operation_id = 217570991537152001AND heff.order > 12)
				)) 

to:

WHERE (heff.history_operation_id, heff.order) > (217570991537152001, 12)

Doing that removes the need for parallel workers and results in a faster execution time:

                                                                            QUERY PLAN                                                                             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.15..580.59 rows=200 width=473) (actual time=0.049..1.195 rows=200 loops=1)
   ->  Nested Loop Left Join  (cost=1.15..11046392.26 rows=3812781 width=473) (actual time=0.048..1.177 rows=200 loops=1)
         ->  Index Scan using hist_e_by_order on history_effects heff  (cost=0.71..1822418.86 rows=3812781 width=416) (actual time=0.032..0.103 rows=200 loops=1)
               Index Cond: (ROW(history_operation_id, "order") > ROW('217570991537152001'::bigint, 12))
         ->  Index Scan using index_history_accounts_on_id on history_accounts hacc  (cost=0.43..2.42 rows=1 width=65) (actual time=0.005..0.005 rows=1 loops=200)
               Index Cond: (id = heff.history_account_id)
 Planning Time: 19.978 ms
 Execution Time: 1.242 ms
(8 rows)

After deploying this change to staging, I saw that this effects query no longer appeared on the RDS performance insights dashboard.

I also noticed a decline in the number of scans on the index_history_accounts_on_id index after deploying this change to staging:

Screenshot 2024-03-06 at 1 15 08 PM

Also, since this change was deployed to staging, we have not received any 503s on the /effects endpoint:

Screenshot 2024-03-06 at 1 19 12 PM

Known limitations

[N/A]

tamirms avatar Mar 06 '24 13:03 tamirms