Expiration background job for activities is not using an index
The query is quite heavy and there is no index on the timestamp column alone, so the delete does not use any index and therefore takes quite some time:
MariaDB [oc]> EXPLAIN DELETE FROM `oc_activity` WHERE `timestamp` < 1234567;
+------+-------------+-------------+------+---------------+------+---------+------+---------+-------------+
| id   | select_type | table       | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+------+-------------+-------------+------+---------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | oc_activity | ALL  | NULL          | NULL | NULL    | NULL | 6673188 | Using where |
+------+-------------+-------------+------+---------------+------+---------+------+---------+-------------+
There is no index on timestamp, but adding one would speed up the housekeeping cron job for activities. Potentially check back with Joas.
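For reference, a minimal sketch of the index that would help here (the index name is just an example; on MariaDB the table carries the oc_ prefix):
CREATE INDEX activity_timestamp ON `oc_activity` (`timestamp`);
With such an index in place, the EXPLAIN above should list it under possible_keys instead of scanning all rows.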
I run on Kubernetes, and I don't necessarily keep or analyse logs from cron jobs.
So my monitoring tells me the database is going to reach 5 GB soon. This instance has a bit more than 40 members, and the activity table is 3.6 GB.
I changed the activity_expire_days setting (in config.php) to 60 days, without much effect. I guess the expiry job times out, silently for me.
I manually deleted some entries:
delete from activity where timestamp < 1627779661
It was a lot of trial and error to pick the right timestamp, as the database would otherwise time out.
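One way around those timeouts would be to delete in smaller batches. A sketch for PostgreSQL (the cutoff timestamp and the batch size of 10000 are just examples); since Postgres has no DELETE ... LIMIT, it goes through ctid:
DELETE FROM activity
WHERE ctid IN (
    SELECT ctid FROM activity
    WHERE timestamp < 1627779661
    LIMIT 10000
);
-- repeat the statement until it affects 0 rows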
I created the index:
CREATE INDEX activity_timestamp ON public.activity(timestamp);
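On a live instance it may be worth building it with CREATE INDEX CONCURRENTLY instead, so writes to the table are not blocked while the index is built (note that CONCURRENTLY cannot run inside a transaction block):
CREATE INDEX CONCURRENTLY activity_timestamp ON public.activity (timestamp);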
Now queries like this one:
select count(*) from activity where timestamp < 1654045261 ;
  count
---------
 2104123
(1 row)
are a lot faster.
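To double-check that the planner actually uses the new index, the query can be prefixed with EXPLAIN (a sketch; if the index is picked up, the plan shows an index or bitmap scan on activity_timestamp rather than a sequential scan over the whole table):
EXPLAIN SELECT count(*) FROM activity WHERE timestamp < 1654045261;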
The oldest activity currently in the database is from this date: Wednesday, December 1, 2021 7:33:21 AM
I hope the cron job will run this weekend so that I can report back.
It worked, so we just need to add an index to fix it :)
Do you manage these files manually, or do you have some CLI to template them? I could try to open a PR.
Would you mind sharing how big the index is and how big your table is?
From what I understand, Postgres doesn't return freed space to the operating system automatically.
So when I ran the commands, I got the following (the first command returns the table size alone, and the second the table size plus all its indexes):
nextcloud=# SELECT pg_size_pretty(pg_relation_size('activity'));
pg_size_pretty
----------------
2173 MB
(1 row)
nextcloud=# SELECT pg_size_pretty( pg_total_relation_size('activity') );
pg_size_pretty
----------------
3337 MB
(1 row)
That covers 1st December until now, so pretty much 8 months.
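As far as I understand, a plain VACUUM only marks dead rows as reusable within the existing files, while VACUUM FULL rewrites the table and actually shrinks it on disk. Before rewriting, something like this should show how much of the table is dead rows (a sketch using the standard pg_stat_user_tables view):
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'activity';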
Then I decided to reclaim the space:
vacuum full;
And then the result was:
nextcloud=# SELECT pg_size_pretty(pg_relation_size('activity'));
pg_size_pretty
----------------
241 MB
(1 row)
nextcloud=# SELECT pg_size_pretty( pg_total_relation_size('activity') );
pg_size_pretty
----------------
323 MB
(1 row)
Is that enough info? If you need more, let me know!
On another Nextcloud instance, before and after the index creation:
nextcloud=# SELECT pg_size_pretty(pg_relation_size('activity'));
pg_size_pretty
----------------
2203 MB
(1 row)
nextcloud=# SELECT pg_size_pretty( pg_total_relation_size('activity') );
pg_size_pretty
----------------
3150 MB
(1 row)
nextcloud=# CREATE INDEX activity_timestamp ON public.activity(timestamp);
CREATE INDEX
nextcloud=# SELECT pg_size_pretty( pg_total_relation_size('activity') );
pg_size_pretty
----------------
3158 MB
(1 row)
So the index in this case takes 8 MB.
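That figure comes from the difference of the two totals; it should also be possible to query the index size directly (a sketch, assuming the index was created with the name activity_timestamp as above):
SELECT pg_size_pretty(pg_relation_size('activity_timestamp'));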
Okay, that sounds acceptable. We have a performance hackweek in 10 days. I will bring it up there and discuss it.
How can I add this index in a migration? I'm willing to PR it, but I've read through the files without much success. If you could help a bit, I think I can open a PR :)
This needs to be done via the db:add-missing-indices command in the server, because otherwise there is a big issue during the upgrade.
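For reference, admins typically run that command through occ; the exact invocation depends on the setup, but a common example is:
sudo -u www-data php occ db:add-missing-indices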
With truncate, you don't need to vacuum afterward \o/
https://www.postgresql.org/docs/9.1/sql-truncate.html
truncate activity;