In this blog, I’ll describe what PostGIS is, how you can set this up quickly and easily using EnterpriseDB’s Advanced Server and scratch the surface of what’s possible using PostGIS.
First of all, what is PostGIS?
PostGIS is a spatial extender for Postgres and adds more geometric types and spatial functionality to Postgres. You can use PostGIS to easily calculate distances between geometric objects, examples of which are points, lines, and polygons; you can easily calculate areas, whether one or more points sit within any geometric shapes (e.g. which town does a house sit within), whether lines cross shape boundaries and so on. One example of current enterprise use of PostGIS is by local administrative organizations (e.g. city councils) who use PostGIS to calculate business rates/land taxes for land ownership/use.
What PostGIS is not...
is something that looks like google maps, although you can set up a tile server application that can present PostGIS data in exactly that way. A future blog will show you how to do that using OpenStreetMap data as your data source and PostGIS as your database.
This blog will show you how easy it is to set up PostGIS on your database server and will finish off by showing you how to use just a few of the spatial functions that are available to you.
So Let’s get started with the installation.
I’ll assume that you have just installed your minimal copy of Centos 7 and your Centos 7 has access to the Internet.
Once we’re all set, let’s log in as the root user.
Let's install a few packages.
yum -y install epel-release
yum -y install openssh-server openssh-clients
yum -y install ntp
yum -y update
Now I’ll set up the time correctly for my location - Sydney, Australia.
timedatectl set-timezone Australia/Sydney
systemctl start ntpd
systemctl enable ntpd
You’ll need an EDB account to complete the following installations. If you don’t already have an EDB account, you can create one (which will give you your access credentials) here: https://www.enterprisedb.com/repository-access-request
Next, we’ll install the EDB REPO and set up our EDB YUM repo username and password.
yum -y install http://yum.enterprisedb.com/edbrepos/edb-repo-latest.noarch.rpm
export YUMUSERNAME=<YOUR_EDB_YUM_REPO_USERNAME>
export YUMPASSWORD=<YOUR_EDB_YUM_REPO_PASSWORD>
sed -i "s/<username>:<password>/$YUMUSERNAME:$YUMPASSWORD/g" /etc/yum.repos.d/edb.repo
sed -i "\/edbas11/,/gpgcheck/ s/enabled=0/enabled=1/" /etc/yum.repos.d/edb.repo
sed -i "\/enterprisedb-tools]/,/gpgcheck/ s/enabled=0/enabled=1/" /etc/yum.repos.d/edb.repo
sed -i "\/edb-repos/,/gpgcheck/ s/enabled=0/enabled=1/" /etc/yum.repos.d/edb.repo
sed -i "\/enterprisedb-dependencies/,/gpgcheck/ s/enabled=0/enabled=1/" /etc/yum.repos.d/edb.repo
Install EDB Advanced server v 11 and PostGIS
setenforce 0
yum -y install edb-as11
yum -y install edb-as11-postgis
setenforce 1
mkdir /usr/edb/as11/data
chown enterprisedb /usr/edb/as11/data
chgrp enterprisedb /usr/edb/as11/data
chmod 700 /usr/edb/as11/data
We’ll create a file with your environment variables that we will use globally; you can do the same in a bash.rc file if you prefer.
vi /etc/profile.d/edb.sh
PGDATA=/usr/edb/as11/data
PATH=/usr/edb/as11/bin:$PATH
MANPATH=/usr/edb/as11/share/man:$MANPATH
LD_LIBRARY_PATH=/usr/edb/as11/lib:$LD_LIBRARY_PATH
export PGDATA
export PATH
export MANPATH
export LD_LIBRARY_PATH
Save the file, then type:
source /etc/profile.d/edb.sh
Initialize the Database cluster
su enterprisedb
initdb -D /usr/edb/as11/data/
And startup Advanced Server
cd /usr/edb/as11/data/
pg_ctl -D /usr/edb/as11/data/ -W start
Login to the database using the psql client
psql -U enterprisedb -p 5444 -d template1
At this point you would normally configure your firewall and/or security groups, secure your database, remove default access to the public group and public schema, add a password to the enterprisedb user account, set up pg_hba.conf to tighten up access control and so on.
I’m going to skip all of those steps here as our goal is to just install and run a demo of PostGIS.
Note: if you want advice on setting up, fully securing and tuning EnterpriseDB’s Advanced Server, you can contact EDB using one of the contact methods on the EDB website.
Set up a new database template where we'll add our PostGIS extensions
CREATE DATABASE template_postgis WITH template = template1;
UPDATE pg_database SET datistemplate = TRUE WHERE datname = 'template_postgis';
\c template_postgis
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
CREATE EXTENSION postgis_sfcgal;
CREATE EXTENSION fuzzystrmatch;
CREATE EXTENSION postgis_tiger_geocoder;
CREATE EXTENSION address_standardizer;
CREATE EXTENSION address_standardizer_data_us;
CREATE EXTENSION hstore;
We now have PostGIS database template setup, PostGIS installed, as well as some relevant supporting extensions. So let’s go ahead and create a database called test, create a single table and then insert the EDB office names, longitudes and latitudes so that we can use this data to calculate the distances in Km between the different EDB offices.
We'll set up a test database using the template database we just created and then create the table we need.
CREATE DATABASE test WITH template = 'template_postgis';
\c test
CREATE TABLE edb_offices
(
edb_pk SERIAL PRIMARY KEY,
office_name VARCHAR(128),
lng numeric(9,6),
lat numeric (9,6)
);
Insert the basic data we need; the data consists of the EDB office names along with their longitude and latitudes.
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('Australia HQ' , 151.206940 , -33.839632);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('USA HQ' , -71.248010 , 42.510452);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('EMEA HQ' , -0.757880 , 51.415287);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('Japan HQ' , 139.713593 , 35.711357);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('Korea HQ' , 127.077235 , 37.391944);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('Pakistan HQ' , 73.019291 , 33.671297);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('Singapore HQ' , 103.859465 , 1.295488);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('India HQ' , 73.877699 , 18.533235);
INSERT INTO edb_offices(office_name , lng , lat) VALUES ('Netherlands HQ' , 4.846080 , 52.345114);
Now for the interesting part.
Let’s write a query that calculates the distance between the EDB offices using the PostGIS ST_DistanceSpheroid and ST_DistanceSphere functions.
I won’t go into too much detail about how this works here other than to say that the ST_DistanceSphere function calculates distances over the surface of a sphere, whereas ST_DistanceSpheroid calculates the distance over an oblate spheroid which is a little more accurate but takes a little longer to calculate.
Calculate km distances between EDB offices
SELECT
a.office_name,
b.office_name,
ROUND
(
CAST
(
ST_DistanceSpheroid
(
geometry(point(a.lng, a.lat)),
geometry(point(b.lng, b.lat)),
'SPHEROID["WGS 84",6378137,298.257223563]'
) /1000 As numeric
),2
) AS spheroid_distance_in_km,
ROUND
(
CAST
(
ST_DistanceSphere
(
geometry(point(a.lng, a.lat)),
geometry(point(b.lng, b.lat))
) /1000 As numeric
),2
) AS sphere_distance_in_km
FROM
edb_offices a
CROSS JOIN
edb_offices b
WHERE
a.edb_pk<b.edb_pk
ORDER BY
spheroid_distance_in_km
DESC
;
and the result is:
office_name | office_name | spheroid_distance_in_km | sphere_distance_in_km
---------------+-----------------+-------------------------+-----------------------
Australia HQ | EMEA HQ | 17029.74 | 17034.35
Australia HQ | Netherlands HQ | 16639.74 | 16644.99
Australia HQ | USA HQ | 16222.13 | 16223.09
USA HQ | Singapore HQ | 15127.78 | 15119.45
USA HQ | India HQ | 12362.90 | 12344.39
Australia HQ | Pakistan HQ | 11063.97 | 11078.08
{…}
Singapore HQ | India HQ | 3788.17 | 3789.71
Pakistan HQ | India HQ | 1679.37 | 1685.44
Japan HQ | Korea HQ | 1145.71 | 1143.20
EMEA HQ | Netherlands HQ | 399.41 | 398.20
And that’s it, it’s that easy.
It should be noted that if you just want to run the distance calculations shown above you could just set up the table data, create a small stored procedure which used the haversine formula, Vincenty’s algorithm or another algorithm of your choice to do the calculations although PostGIS has much more functionality than just distance calculations between lng/lat points.
I’ll blog more on the subject of PostGIS over the coming months so that you can get a better idea of the full capabilities it has to offer.