pgsql-bloat-estimation icon indicating copy to clipboard operation
pgsql-bloat-estimation copied to clipboard

Big bloat discrepencies compared to pgstattuple

Open MichaelDBA opened this issue 2 years ago • 5 comments

Using the table bloat checker here I get this on my database (PG 14):

 current_database | schemaname |          tblname           |  real_size  | extra_size  |     extra_pct      | fillfactor | bloat_size  |     bloat_pct      | is_na
------------------+------------+----------------------------+-------------+-------------+--------------------+------------+-------------+--------------------+-------
 cc_perf          | cc_merge   | ccx_ex_contactdrpservices  | 29351370752 | 14890582016 | 50.732151972784294 |        100 | 14890582016 | 50.732151972784294 | f

But with pgstattuple I get this: SELECT tuple_percent, dead_tuple_percent, free_percent FROM pgstattuple('cc_merge.ccx_ex_contactdrpservices');

 tuple_percent | dead_tuple_percent | free_percent
---------------+--------------------+--------------
         94.65 |                  0 |         3.68

Why the big discrepency?

MichaelDBA avatar Mar 08 '23 17:03 MichaelDBA

Hi,

Either your stats are wrong, or you have a lot of toasted values...

Unfortunately, columns size stats reports the size of the pointer itself when a value is toasted away and the query doesn't deal with this :/

See: https://github.com/ioguix/pgsql-bloat-estimation#toasted-fields

ioguix avatar Mar 10 '23 10:03 ioguix

let me investigate the toast side of things and get back with you, thanks for the feedback!

MichaelDBA avatar Mar 10 '23 11:03 MichaelDBA

I am not able to reproduce the situation anymore, so let's close this issue as a non-issue

MichaelDBA avatar Mar 19 '23 12:03 MichaelDBA

I am re-opening this since I have an example to work with again. Will be updating this shortly...

MichaelDBA avatar Jul 16 '23 14:07 MichaelDBA

I have logically (pg_dump/pg_restore) copied a table from one database to another, vsked_analysis. So now I can see what a completely unbloated table looks like in the target db. Both tables have the same amount of rows. There are 13 indexes on the table including the primary key. Both were vacuumed and analyzed right before gathering these stats.

My general observations are that your query is pretty good especially on the non-bloated table, but like pgstattuple, it underestimates the bloat significantly on the source, bloated table. I was wondering if you could provide any insight based on the following captured metrics.

SELECT pg_size_pretty(pg_total_relation_size('public.vsked_analysis'));

source:   154 MB
target:  5182 MB

I also ran pgstattuple on both the regular table and the toast table for them:

source --> freespace regular table =   2911964     toast table = 10331660  pctfree =  9.32
target  --> freespace regular table = 38006056     toast table = 94434512  pctfree = 43.47

Your table query:

 current_database | schemaname |    tblname     | real_size | extra_size | extra_pct | fillfactor | bloat_size | bloat_pct | is_na
------------------+------------+----------------+-----------+------------+-----------+------------+------------+-----------+-------
 v                | public     | vsked_analysis | 150601728 |  -16228352 |         0 |        100 |          0 |         0 | f

In the source:

 current_database | schemaname |    tblname     | real_size | extra_size |    extra_pct     | fillfactor | bloat_size |    bloat_pct     | is_na
------------------+------------+----------------+-----------+------------+------------------+------------+------------+------------------+-------
 v                | public     | vsked_analysis | 302776320 |   73908224 | 24.4101731601732 |        100 |   73908224 | 24.4101731601732 | f

MichaelDBA avatar Jul 16 '23 14:07 MichaelDBA