Our Blog

Ongoing observations by End Point people

How fast is pg_upgrade anyway?

By Greg Sabino Mullane · Wednesday, July 1, 2015

Tags: postgres

Back in the old days, upgrading Postgres required doing a pg_dump and loading the resulting logical SQL into the new database. This could be a very slow, very painful process, requiring a lot of downtime. While there were other solutions (such as Bucardo) that allowed little (or even zero) downtime, setting them up was a large complex task. Enter the pg_upgrade program, which attempts to upgrade a cluster with minimal downtime. Just how fast is it? I grew tired of answering this question from clients with vague answers such as "it depends" and "really, really fast" and decided to generate some data for ballpark answers.

Spoiler: it's either about 3.5 times as fast as pg_dump, or insanely fast at a flat 15 seconds or so. Before going further, let's discuss the methodology used.

I used the venerable pgbench program to generate some sample tables and data, and then upgraded the resulting database, going from Postgres version 9.3 to 9.4. The pgbench program comes with Postgres, and simply requires an --initialize argument to create the test tables. There is also a --scale argument you can provide to increase the amount of initial data - each increment increases the number of rows in the largest table, pgbench_accounts, by one hundred thousand rows. Here are the scale runs I did, along with the number of rows and overall database size for each level:

Effect of --scale
--scaleRows in pgbench_accountsDatabase size
10010,000,0001418 MB
15015,000,0002123 MB
20020,000,0002829 MB
25025,000,0003535 MB
30030,000,0004241 MB
35035,000,0004947 MB
40040,000,0005652 MB
45045,000,0006358 MB
50050,000,0007064 MB
55055,000,0007770 MB
60060,000,0008476 MB

To test the speed of the pg_dump program, I used this simple command:

$ pg_dump postgres | psql postgres -q -p 5433 -f -

I did make one important optimization, which was to set fsync off on the target database (version 9.4). Although this setting should never be turned off in production - or anytime you cannot replace all your data, upgrades like this are an excellent time to disable fsync. Just make sure you flip it back on again right away! There are some other minor optimizations one could make (especially boosting maintenance_work_mem), but for the purposes of this test, I decided that the fsync was enough.

For testing the speed of pg_upgrade, I used the following command:

$ pg_upgrade -b $BIN1 -B $BIN2 -d $DATA1 -D $DATA2 -P 5433

The speed difference can be understood because pg_dump rewrites the entire database, table by table, row by row, and then recreates all the indexes from scratch. The pg_upgrade program simply copies the data files, making the minimum changes needed to support the new version. Because of this, it will always be faster. How much faster depends on a lot of variables, e.g. the number and size of your indexes. The chart below shows a nice linear slope for both methods, and yielding on average a 3.48 increase in speed of pg_upgrade versus pg_dump:

pg_dump versus pg_upgrade
--scaleDatabase sizepg_dump
(seconds)
pg_upgrade
(seconds)
Speedup
1001.4 GB210.074.72.82
1502.1 GB305.079.43.86
2002.8 GB457.6122.23.75
2503.5 GB636.1172.13.70
3004.2 GB832.2215.13.87
3504.9 GB1098.8320.73.43
4005.7 GB1172.7361.43.25
4506.4 GB1340.2426.73.15
5007.1 GB1509.6476.33.17
5507.8 GB1664.0480.03.47
6008.5 GB1927.06073.17

If you graph it out, you can see both of them having a similar slope, but with pg_upgrade as the clear winner:

I mentioned earlier that there were some other optimizations that could be done to make the pg_dump slightly faster. As it turns out, pg_upgrade can also be made faster. Absolutely, beautifully, insanely faster. All we have to do is add the --link argument. What this does is rather than copying the data files, it simply links them via the filesystem. Thus, each large data file that makes up the majority of a database's size takes a fraction of a second to link to the new version. Here are the new numbers, generated simply by adding a --link to the pg_upgrade command from above:

pg_upgrade --link is crazy fast
--scaleDatabase sizepg_upgrade --link
(seconds)
1001.4 GB12.9
1502.1 GB13.4
2002.8 GB13.5
2503.5 GB13.2
3004.2 GB13.6
3504.9 GB14.4
4005.7 GB13.1
4506.4 GB13.0
5007.1 GB13.2
5507.8 GB13.1
6008.5 GB12.9

No, those are not typos - an average of thirteen seconds despite the size of the database! The only downside to this method is that you cannot access the old system once the new system starts up, but that's a very small price to pay, as you can easily backup the old system first. There is no point in graphing these numbers out - just look at the graph above and imagine a nearly flat line traveling across the bottom of the graph :)

Are there any other options that can affect the time? While pgbench has a handy --foreign-keys argument I often use to generate a more "realistic" test database, both pg_dump and pg_upgrade are unaffected by any numbers of foreign keys. One limitation of pg_upgrade is that it cannot change the --checksum attribute of a database. In other words, if you want to go from a non-checksummed version of Postgres to a checksummed version, you need to use pg_dump or some other method. On the plus side, my testing found negligible difference between upgrading a checksummed versus a non-checksummed version.

Another limitation of the pg_upgrade method is that all internal stats are blown away by the upgrade, so the database starts out in a completely unanalyzed state. This is not as much an issue as it used to be, as pg_upgrade will generate a script to regenerate these stats, using the handy --analyze-in-stages argument to vacuum. There are a few other minor limitations to pg_upgrade: read the documentation for a complete list. In the end, pg_upgrade is extraordinarily fast and should be your preferred method for upgrading. Here is a final chart showing the strengths and weaknesses of the major upgrade methods.

Postgres upgrade methods compared
MethodStrengthsWeaknesses
pg_dump
  • Always works
  • Battle tested
  • Slowest method
  • Maximum downtime
  • Requires lots of disk space
pg_upgrade
  • Very fast
  • --link mode super fast
  • Cannot always be used (finicky)
  • Stats are lost
  • Minimal but non-zero downtime
Bucardo
  • Handles complex cases
  • Zero-downtime possible
  • Complex to setup
  • Requires primary keys on large tables
  • Requires lots of disk space

(As an addendum of sorts, pg_upgrade is fantastic, but the Holy Grail is still out of sight: true in-place upgrades. This would mean dropping in a new major version (similar to the way revisions can be dropped in now), and this new version would be able to read both old and new data file formats, and doing an update-on-write as needed. Someday!)

Comments

Archive