You might know that Postgres supports materialized views and foreign data wrappers (fdw). Briefly, materialized views allow for queries to be materialized and refreshed on demand. Foreign data wrappers allow data to be pulled from foreign data sources, like Nosql stores and other Postgres servers.
What you might not have considered is that materialized views and foreign data wrappers can be used together. Materialized views speed data access by summarizing data so it isn't necessary to query the base tables that are referenced by materialized views. Foreign data wrappers, because they are remote, can be slow to access. By combining the two features, you can get fast access to remote data.
Transparent data integration for Postgres-based databases. Download Now.
Let's see this in action! First, let's set up the foreign table:
CREATE DATABASE fdw_test;
\connect fdw_test;
CREATE TABLE world (greeting TEXT);
\connect test
CREATE EXTENSION postgres_fdw;
CREATE SERVER postgres_fdw_test FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', dbname 'fdw_test');
CREATE USER MAPPING FOR PUBLIC SERVER postgres_fdw_test
OPTIONS (password '');
CREATE FOREIGN TABLE other_world (greeting TEXT)
SERVER postgres_fdw_test
OPTIONS (table_name 'world');
\det
List of foreign tables
Schema | Table | Server
--------+-------------+-------------------
public | other_world | postgres_fdw_test
populate it with some data:
INSERT INTO other_world
SELECT *
FROM generate_series(1, 100000);
and create a materialized view on the foreign table:
CREATE MATERIALIZED VIEW mat_view (first_letter, count) AS
SELECT left(greeting, 1), COUNT(*)
FROM other_world
GROUP BY left(greeting, 1);
Now we can compare select times for foreign tables and materialized views:
\timing
SELECT left(greeting, 1) AS first_letter, COUNT(*)
FROM other_world
GROUP BY left(greeting, 1);
first_letter | count
--------------+-------
1 | 11112
2 | 11111
3 | 11111
4 | 11111
5 | 11111
6 | 11111
7 | 11111
8 | 11111
9 | 11111
Time: 354.571 ms
SELECT * FROM mat_view;
first_letter | count
--------------+-------
1 | 11112
2 | 11111
3 | 11111
4 | 11111
5 | 11111
6 | 11111
7 | 11111
8 | 11111
9 | 11111
Time: 0.783 ms
The materialized view is much faster, but also a contrived example. It is interesting that refreshing the materialized view takes a similar time to selecting from the foreign table:
REFRESH MATERIALIZED VIEW mat_view;
Time: 364.889 ms
The above output is from Postgres 9.6. Thanks to this improvement in Postgres 10:
Push aggregates to foreign data wrapper servers, where possible (Jeevan Chalke, Ashutosh Bapat)
This reduces the amount of data that must be passed from the foreign data wrapper server, and offloads aggregate computation from the requesting server. The postgres_fdw fdw is able to perform this optimization. There are also improvements in pushing down joins involving extensions.
Postgres 10 gets faster foreign table aggregate selects than in 9.6, but still slower than using materialized views:
SELECT left(greeting, 1) AS first_letter, COUNT(*)
FROM other_world
GROUP BY left(greeting, 1);
first_letter | count
--------------+-------
1 | 11112
2 | 11111
3 | 11111
4 | 11111
5 | 11111
6 | 11111
7 | 11111
8 | 11111
9 | 11111
Time: 55.052 ms
You don't even need to use aggregates in materialized views — you can just copy the contents of a remote table into a materialized view for faster access, and refresh the materialized view occasionally (though logical replication in Postgres 10 does this better):
CREATE MATERIALIZED VIEW mat_view2 AS
SELECT *
FROM other_world;
Now we can run a performance test on the foreign table and its local copy:
\o /dev/null
SELECT *
FROM other_world;
Time: 317.428 ms
SELECT * FROM mat_view2;
Time: 34.861 ms
In summary, materialized views and foreign data wrappers are two features that work well together. Materialized views allow remote data to be cached locally, either entire tables or aggregate summarizations. The cache can be refreshed using refresh materialized view. Also, Postgres 10 speeds up aggregate queries on foreign tables.
Bruce Momjian is a Senior Database Architect at EnterpriseDB.
This post originally appeared on Bruce's personal blog.