Adding new table columns with default values in PostgreSQL 11

October 18, 2018

In PostgreSQL version 10 or less, if you add a new column to a table without specifying a default value then no change is made to the actual values stored. Any existing row will just fill in a NULL for that column. But if you specify a default value, the entire table gets rewritten with the default value filled in on every row. That rewriting behavior changes in PostgreSQL 11.

In a new feature I worked on and committed, the default value is just stored in the catalog and used where needed in rows existing at the time the change was made. New rows, and new versions of existing rows, are written with the default value in place, as happens now. Any row that doesn’t have that column must have existed before the table change was made, and uses this value stored in the catalog when the row is fetched. The great advantage of this is that adding a column with a default value is now quite a fast and cheap operation, whereas before for very large tables it has been horribly, often intolerably slow. Rewriting a whole multi-terabyte table is really something you want to avoid.

The default value doesn’t have to be a static expression . It can be any non-volatile expression, e.g. CURRENT_TIMESTAMP. Volatile expressions such as random() will still result in table rewrites. For a non-volatile expression, it is evaluated at the time the statement runs and the result is what is stored in the catalog for use with existing rows. Of course, for new rows the expression will be re-evaluated at the time the row is created, as happens now.

Any time that the table does get rewritten, say by VACUUM FULL, all this is cleaned up, as there will be no rows left needing the value from the catalog. Likewise, any time that a row is updated the column’s value is filled in on the new row version, so over time the proportion of rows using the value fetched from the catalog will decrease.

Share this