The new table bloat query is different from the check_postgres.pl version in several ways:
- Rewritten to use WITH statements for better maintainability and clarity
- Conditional logic for old Postgres versions and 32-bit platforms taken out
- Index bloat removed, since we have a separate query for that
- Columns formatted to be more immediately comprehensible
- Some attributes (such as JSON and polygon fields) have no stats, so those tables can't be estimated.
- There's no good way to estimate bloat for compressed (TOAST) attributes and rows.
Results look like this:
databasename | schemaname | tablename | pct_bloat | mb_bloat | table_mb --------------+------------+-------------------+-----------+----------+---------- members_2014 | public | current_member | 92 | 16.98 | 18.547 members_2014 | public | member_response | 87 | 17.46 | 20.000 members_2014 | public | archive_member | 84 | 35.16 | 41.734 members_2014 | public | survey | 57 | 28.59 | 50.188
pct_bloat is how much of the table (0 to 100) is estimated to be dead space. MB_bloat is how many megabytes of bloat are estimated to exist. Table_mb is the actual size of the table in megabytes.
The suggested criteria is to list tables which are either more than 50% bloat and bigger than 10MB, or more than 25% bloat and bigger than 1GB. However, you should calibrate this according to your own database.
You might have miss my last blog post about table bloat:
«I'm not clear on where that error is coming from, so help improving the math is very welcome!»
I wrote about bloat coming from type alignment there. Maybe it takes part on the +/- 20% error you have with your query ? Note that my query is doing some rough estimation on toasted data as well.
See you at pgconf.eu un spain next week ?
Yes, I did and there's no way to search your blog so I couldn't find it. Damn, now I'm going to have to rewrite my query.Delete
No, I won't make it to Madrid. There wasn't really a business or community justification to go.
« there's no way to search your blog so I couldn't find it »Delete
There's an index page listing all my articles: http://blog.ioguix.net/
Plus, they are published on planet.postgresql.org.
Then, you can follow my gists as well.
Hey Josh, I think you might find it interesting. It is a comparison of different table bloat estimation queries.ReplyDelete
tablename | pgx_scripts | pgtoolkit | ioguix | pgstattuple
table7 | 81 | 81.36 | 80.93 | 80.26
table1 | 75 | 54.93 | 74.64 | 44.70
table5_p2 | 61 | 62.04 | 56.33 | 59.83
table2 | 48 | 56.52 | 47.82 | 45.77 X
table5_p1 | 34 | 34.78 | 23.95 | 32.28
table4 | 28 | 44.00 | 44.00 | 46.39
table5 | 17 | 16.83 | 50.91 | 50.05 X
table3 | 0 | -4.00 | 0 | 6.42
the test DB DDL: https://github.com/grayhemp/pgtoolkit/blob/master/misc/scratch.sql#L5-L228
Oh, wow, thanks! Looks like I need to do some fine-tuning with tables 4 and 5.Delete
Yeah, I actually tried to figure out how @ioguix get table5's estimation so precise but with no success :( So, if you will have any thoughts I would be glad to hear/read them.Delete
I moved these bloat estimation queries to a proper github repository some weeks ago. These comments here just remembered me I had to update the gist to redirect visitors to this new repo :-)
@Sergey, as I kept fixing and nip-ticking these queries, I would be curious to know how they perform in their current version with your test case. As instance, last versions of the queries now pay attention to the fillfactor, so the result for table1 might be better now I guess..
@ioguix here we are: https://gist.github.com/grayhemp/55d6b547b4f40d69ba7dDelete
Your result is almost perfect!
Thank you for your time and these tests Sergey!