Sometimes your Postgres tables suffer from bloat: a large amount of stored data that serves no operational purpose. That data can also hinder replication to other systems. You may decide to archive it, or simply delete it if it is unneeded. How long does that take, though?
Furthermore, to actually reclaim the storage space you have to run a VACUUM FULL. When Postgres updates or deletes a row it does not physically remove the data; the old row version is merely marked as dead and left in place (see the MVCC chapter of the Postgres documentation for further reading).
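You can get a feel for how much dead data your deletes have left behind via the statistics views; a minimal sketch using the standard pg_stat_user_tables view:

```sql
-- Dead vs. live tuple counts per table, worst offenders first
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

A high n_dead_tup relative to n_live_tup suggests a table that would benefit from a vacuum.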
Careful with VACUUM FULL
It takes an ACCESS EXCLUSIVE lock on the entire table, preventing any reads or writes from any other connection until it completes.
How Long Do Deletes Take?
This is meant to give a general idea; your numbers will of course differ.
The test was run on two AWS Aurora PostgreSQL instances:

db.t3.medium: instance with 4 GB RAM and 2 vCPUs
db.r5.2xlarge: cluster writer instance with 64 GB RAM and 8 vCPUs

Deletes were done using a bigint id column as the selector.
Deleting a million records:
DELETE FROM WHERE id <= 1000000;
DELETE 1000000
db.t3.medium: 87 seconds. Rate: 11_500 records/second
db.r5.2xlarge: 12 seconds. Rate: 83_333 records/second
Deleting 10 million records:
DELETE FROM WHERE id <= 30000000 AND id >= 20000000;
DELETE 10000000
db.t3.medium: 564 seconds, or about 9.5 minutes. Rate: 17_730 records/second
db.r5.2xlarge: 100 seconds, or 1 min 40 secs. Rate: 100_000 records/second
Deleting 50 million records:
DELETE FROM WHERE id < 400000000;
DELETE 50000000
db.t3.medium: 2514 seconds, or about 42 minutes. Rate: 19_888 records/second
db.r5.2xlarge: 496 seconds, or 8 min 16 secs. Rate: 100_000 records/second
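Single-statement deletes of this size hold their locks and generate all their WAL in one long transaction. A common alternative is to delete in smaller batches; a minimal sketch, assuming a hypothetical table named big_table with a bigint id (not the table from the benchmark above):

```sql
-- Delete in batches of 10_000 so each transaction stays short
DELETE FROM big_table
WHERE id IN (
    SELECT id
    FROM big_table
    WHERE id <= 1000000
    LIMIT 10000
);
-- Repeat until the command reports DELETE 0
```

Batching trades total elapsed time for shorter transactions, which is friendlier to replication and to concurrent queries.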
How Long Does a VACUUM FULL Take?
In total, about 400_000_000 records were deleted, amounting to about 35 GB.
VACUUM FULL VERBOSE ;
db.t3.medium: 1912 seconds, or about 32 minutes. A rough rate of reclaiming a gigabyte of storage per minute.
db.r5.2xlarge: 131 seconds, or 2 min 11 secs. A gigabyte every 4 seconds.
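To verify how much space was actually reclaimed, you can compare the relation size before and after the vacuum; a minimal sketch using the standard pg_size_pretty and pg_total_relation_size functions (big_table is a placeholder name):

```sql
-- Total on-disk size of a table, including its indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('big_table'));
```

Run it once before the VACUUM FULL and once after; the difference is the storage handed back to the filesystem.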