Today’s connected enterprise requires a single database that can handle both structured and unstructured data efficiently and that adapts dynamically to swiftly changing and emerging data types. For many organizations, that database is Postgres. With JSON, Postgres can support document databases alongside relational tables and even combine structured and unstructured data. This provides organizations with the atomicity, consistency, isolation and durability (ACID) compliance and common business logic required to ensure data integrity.
Postgres database administrators are expanding their expertise working with the NoSQL capabilities in Postgres. Critical to this skill set is building document databases using JSON. This post is the second in a series dedicated to helping DBAs enrich their abilities for implementing the new advantages of Postgres and transform their data center into an innovation platform. The first post, Building JSON Documents from Relational Tables, published on March 3, 2016, explained how to create nested JSON documents from relational tables with PostgreSQL 9.5 using a data set of 47,000 airports.
Enterprise-ready Postgres tools for high availability, monitoring, and disaster recovery. Download Now.
downloaded here. (* READ includes CREATE, UPDATE and DELETE. READ and indexing examples will be explored in a later post in this series.)
This post explores the Create, Read, Update, and Delete operations in PostgreSQL, known as CRUD. It is the second in a series of posts exploring PostgreSQL's unique JSON capabilities within the RDBMS framework. The (many) examples use the airports.json data set created in the previous post and can beThe airports.json data set contains a document for ~47K worldwide airports. It is based on the six csv files / RDBMS tables in the http://ourairports.com data set. See the previous post for more details.
PostgreSQL's JSON CRUD capabilities expand the range of JSON applications we can build with PostgreSQL because they free developers from the limited previous workflow need to:
- Create JSON in the application / get JSON from an external source
- Store JSON documents in the database
- Retrieve documents when needed
- Modify JSON by
a. Retrieve the document to the application tier
b. Use application code to make changes
c. Put the document back in the database when changes are finished
With PostgreSQL's JSON CRUD capabilities, developers can now:
- Create JSON (and GeoJSON!) from PostgreSQL tables, application code or external sources
- Store JSON documents in the PostgreSQL
- Retrieve tags, values, arrays, subdocuments and documents using SQL
- Modify JSON inside PostgreSQL using
CREATE
CREATE documents by loading from a *.json file and create new elements within documents.
Load *.json file of documents into PostgreSQL
The airports.json file is available here for download or it can be built using the SQL in the blog post, Building JSON Documents from Relational Tables.
NOTE: Think about loading *.json files just like loading any other type of data. I generally use Foreign Data Wrappers (FDWs) because they are simple, fast, and easy to troubleshoot. The syntax using COPY is even simpler and is included for reference:
CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER ext_table FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE airports_fdw (
airport JSONB) SERVER ext_table
OPTIONS (filename '/path/to/file/airports.json', format 'csv',
header 'false', delimiter '|', quote '$');
-- delimiter and quote should not appear in the file
CREATE TABLE airports (
id BIGSERIAL PRIMARY KEY,
airport JSONB);
INSERT INTO airports (airport)
(SELECT airport FROM airports_fdw);
Load using COPY
A simpler, less flexible alternative is to use COPY, http://www.postgresql.org/docs/9.5/static/sql-copy.html.
COPY airports (airport)
FROM '/path/to/file/airports.json'
WITH delimiter '|' quote '$' CSV;
See what a record looks like
Look at Baltimore/Washington International Thurgood Marshall Airport (KBWI) as an example.
SELECT JSONB_PRETTY(airport)
FROM airports
WHERE airport ->> 'ident' = 'KBWI';
NOTE: ->> operator returns JSONB. -> returns TEXT/ INT. See http://www.postgresql.org/docs/9.5/static/functions-json.html
Add an outer-level tag and value
The airport JSON documents don’t have a tag to store the airlines that service them. Add a tag, carrier_hub, to Southwest Airlines hubs using the JSONB concatenation operator, ||, and jsonb_set.
UPDATE airports
SET airport = airport || '{"carrier_hub": ["Southwest"]}'
WHERE airport ->> 'ident' IN (
'KBWI', -- Baltimore-Washington International
'KMDW', -- Chicago Midway
'KLAS', -- Las Vegas
'KDAL'); -- Love Field, Dallas;
Add an id tag to the airport JSON
jsonb set.http://www.postgresql.org/docs/9.5/static/functions-json.html, is the primary PostgreSQL JSON CRUD operator. Note the optional create missing flag, which make jsonb_set behave like a INSERT when false, the default, or like an UPSERT when the flag is true.
Note that jsonb_set has access to other data within PostgreSQL, such as columns, variables, etc..
UPDATE airports_json
SET airport = jsonb_set(
airport,
'{id}',
id::TEXT::JSONB,
true); -- creates id column in JSON using airports_json.id
Verify the results:
SELECT
airport ->> 'name',
airport ->> 'ident',
airport ->> 'carrier_hub'
FROM airports
WHERE airport ->> 'ident' IN ('KBWI', 'KMDW', 'KLAS', 'KDAL');
Add an element to an array
The tag airport_keyword is an array of cities that the airport is associated with. For example, Baltimore/Washington International Thurgood Marshall Airport (KBWI), Ronald Reagan Washington National Airport (KDCA) and Washington Dulles International Airport (KIAD), all have the WAS element.
The SQL below adds 'BAL' to the airport_keyword array for BWI, Baltimore-Washington Airport using the concatenation operator, ||, and JSONB_SET with a subquery.
UPDATE airports
SET airport = JSONB_SET(
airport,
'{airport_keywords}',
(SELECT (airport -> 'airport_keywords') || TO_JSONB('BWI'::TEXT)
FROM airports WHERE airport ->> 'ident' = 'KBWI'),
false)
WHERE airport ->> 'ident' = 'KBWI';
Verify the results
SELECT airport ->> 'airport_keywords'
FROM airports
WHERE airport ->> 'ident' = 'KBWI';
READ
READing documents, subdocuments, arrays, tags and values will be in the next blog post along with indexing.
UPDATE
Change the value of a first-level tag
Change Salisbury Ocean City Wicomico Regional Airport, KSBY, from a medium_airport to a large_airport.
Check type first
SELECT airport ->> 'type'
FROM airports
WHERE airport ->> 'ident' = 'KSBY';
Update type
UPDATE airports
SET airport = JSONB_SET(
airport,
'{type}',
TO_JSONB('large airport'::TEXT),
false)
WHERE airport ->> 'ident' = 'KSBY';
Recheck type
Change the value of a tag in a nested document
Change the Howard County General Hospital Heliport, MD25, runway surface from 'GRASS / SOD' to 'ASPH'.
Note that I used a nested JSONB_SET. The inner level is to change the value in the runways nested document and the outer level is to update the parent document.
Check the surface first:
SELECT JSONB_ARRAY_ELEMENTS(airport -> 'runways') ->> 'surface'
FROM airports
WHERE airport ->> 'ident' = 'MD25';
JSONB_ARRAY_ELEMENTS returns a row for each JSONB subdocument.
It is a helpful technique to think of the -> operator, which returns JSONB, in a series of pipes - just like in Linux. In this example, the first (and only) element in the runways array of subdocuments is ‘piped’ to the ->> operator, which returns TEXT / INT/
Update the value.
UPDATE airports
SET airport = JSONB_SET(
airport,
'{runways}',
(WITH heli_runs AS (
SELECT airport -> 'runways' AS heli_run
FROM airports
WHERE airport ->> 'ident' = 'MD25')
SELECT JSONB_SET(
JSONB_ARRAY_ELEMENTS(heli_run),
'{surface}',
TO_JSONB('asph'::text),
false)
FROM heli_runs),
false)
WHERE airport ->> 'ident' = 'MD25';
Recheck surface
DELETE
Remove an outer-level tag
Use the minus operator, '-', to remove the airport_wikipedia and region_wikipedia tags from each document.
UPDATE airports
SET airport = (airport - 'airport_wikipedia' - 'region_wikipedia');
Remove an element from an array
Remove the 'Idlewild' element from the airport_keywords array for John F. Kennedy International Airport, KJFK.
UPDATE airports
SET airport =
JSONB_SET(
airport,
'{airport_keywords}',
(WITH kjfk_keys AS (
SELECT JSONB_ARRAY_ELEMENTS(airport -> 'airport_keywords') keywords
FROM airports
WHERE airport ->> 'ident' = 'KJFK')
SELECT JSONB_AGG(keywords)
FROM kjfk_keys
WHERE keywords::TEXT != '"Idlewild"'))
WHERE airport ->> 'ident' = 'KJFK';
Remove a tag from a sub-document
BWI airport has four runways:
●22 / 04 6,000 ft
●28 / 10 10,520 ft
●33R / 15L 5,000 ft
●33L / 15R 9,501 ft
NOTE: Runways are named as 10 * the magnetic direction a plane is heading when on the centerline. For example, runway 28 points almost due west: 280°. The second name is the same runway in the other direction, 180° off, or subtracting 18. Runway 22 - 18 = runway 04. Parallel runways follow the same rules and add L (left) or R (right).
This SQL removes the le_displaced_threshold_ft from BWI runway '28'.
NOTE: A runway’s displaced threshold is the distance past the beginning of runway pavement that a plan should land.
UPDATE airports
SET airport =
JSONB_SET(
airport,
'{runways}',
(SELECT
(WITH kbwi_runways AS (
SELECT JSONB_ARRAY_ELEMENTS (airport -> 'runways') runway
FROM airports
WHERE airport ->> 'ident' = 'KBWI')
SELECT runway - 'le_displaced_threshold_ft'
FROM kbwi_runways
WHERE runway ->> 'he_ident' = '28')
||
(WITH kbwi_runways AS (
SELECT JSONB_ARRAY_ELEMENTS(airport -> 'runways') AS runway
FROM airports
WHERE airport ->> 'ident' = 'KBWI')
SELECT JSONB_AGG(runway)
FROM kbwi_runways
WHERE runway ->> 'he_ident' != '28'))
)
WHERE airport ->> 'ident' = 'KBWI';
The logic of this SQL is:
- Create a row/JSONB_document for each runway with JSONB ARRAY ELEMENTS, which returns each element in a JSONB array as a result set row.
- Find the row for runway '28' and remove the le displaced threshold ft ' tag
- Concatenate the modified runway '28' subdocument with the other runway subdocuments
- Update the BWI airport JSONB document with the updated array or runways.
It would be simpler if there was only one runway or we knew the array position of runway '28'.
Summary
PostgreSQL 9.5 has powerful and unique JSON CRUD capabilities that enable new use cases, data models and (hopefully) new markets!
Jamey Hanson is a Data Architect at Freedom Consulting Group, an EnterpriseDB Partner.