How to Secure PostgreSQL: Security Hardening Best Practices and Tips

December 02, 2024

Securing data is mission-critical for the success of any enterprise and the safety of its customers. This article is intended as a comprehensive overview that will help you examine the security of your Postgres deployment from end to end.

The vast majority of the discussion will focus on features, functionality, and techniques that apply equally to PostgreSQL and EDB Postgres Advanced Server (EPAS); however, it will also touch on a couple of features that are only available in EPAS. These will be clearly noted. This article references Postgres 12.3, the latest version available at the time of writing.

As part of the review of the components and process of securing Postgres, we'll look at the following sections:

  1. Access
  2. Authentication
  3. Roles
  4. Data access control
  5. Encryption

Access

The first part of any security review is to look at how the server is connected to and accessed. As with any security configuration, follow the principle of least privilege when considering how to configure your system; that is, only allow as much access as is required to implement a working system and no more.

Physical Access

It can be extremely difficult to prevent someone with physical access to a server from obtaining the data, but there are several measures that can be taken, both physical and technological.

First, physical access should be limited as much as possible by ensuring the server is located in a secure facility. This may be a privately owned server room, in which case measures can be taken to ensure that only authorized personnel can enter the room and that monitoring such as CCTV is employed. In the case that a co-location facility is used, ensure that the chosen provider has a strictly enforced security policy appropriately designed to prevent unauthorized access, and in facilities that allow users to enter, that locking racks and cages are available to keep other customers away from your hardware.

There's little that can be done in this regard with the major cloud providers other than to trust that they do implement the high levels of physical security that they claim. However, for both cloud providers and co-location facilities, it is essential to check that they have appropriate documentation attesting to the level of security they provide, such as SOC 2 or SOC 3 reports.

Connecting

There are two ways to connect to a Postgres server: via a Unix Domain Socket or a TCP/IP socket.

  • Unix Domain Socket

    Unix Domain Sockets (UDS) are the default method for connecting to a Postgres database on Unix-like platforms. They are not currently available on Windows, but will be in Postgres v13 and later.

    UDS are only accessible from the machine on which they are present (and therefore are not subject to direct remote attacks) and appear as special files on the file system. This means that access to them is subject to the same access controls as other files (though only write permission is needed to use the socket) and can be controlled by managing the permissions and group ownership of the socket through the unix_socket_permissions and unix_socket_group configuration options, as well as the permissions on the directory in which the socket is created. Sockets are always owned by the user that the Postgres server is running as.

    To offer even more flexibility, Postgres can create multiple sockets (though by default, only one is created) using the unix_socket_directories configuration option, each of whose directories can have different permissions as required to segregate different users or applications and help to apply the principle of least privilege.

    If your application is running on the same host as the database server, seriously consider allowing access to the server via one or more UDS only.

  • TCP/IP socket

    If you need to access your Postgres server from a remote system, as is often the case when implementing applications with multiple tiers or services, or just for remote administration using tools such as pgAdmin, you will need to use a TCP/IP network socket.

    As is generally the case when it comes to security, we want to minimize the potential attack area for anyone attempting to gain access to the system. How this is done depends on how the server is hosted on the network. If it's inside a corporate network, it may be hosted on multiple VLANs or physical networks, which can be used for different purposes, such as applications, management, and storage access, for example.

    The system should only be configured to listen for and accept connections on the networks that are required; by default, a source code build of Postgres will listen only on the localhost or loopback address, which prevents connections from other machines. However, some pre-packaged builds of Postgres override this, so you should check your installation.

    Use the listen_addresses configuration parameter in postgresql.conf to ensure Postgres only listens and accepts connections on the required network addresses, thus preventing access from, say, the storage network.
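For illustration, the relevant connection settings in postgresql.conf might look like the following sketch; the addresses, extra socket directory, and group name are assumptions for the example:

```
# Listen only on the loopback address and the application network
listen_addresses = 'localhost, 192.168.0.5'

# Create sockets in a restricted directory as well as the default
unix_socket_directories = '/var/run/postgresql, /var/run/postgresql-app'

# Restrict socket access to the owning user and group
unix_socket_permissions = 0770
unix_socket_group = 'appusers'
```

Note that changes to these parameters require a server restart to take effect.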

Firewall

Firewalls are an important tool to prevent unauthorized access to network ports. Many also offer logging facilities that can be used as part of a broader initiative to proactively detect intrusion attempts, helping mitigate them before they succeed.

  • Local machine

    Most modern operating systems include firewalls, such as Windows Defender Firewall on Windows and iptables on Linux, and there are also a number of third-party products you might choose.

    Typical firewalls will allow you to define inbound and outbound rules that specify the traffic that is allowed. These rules will consist of several common parameters:

    • The protocol, e.g., TCP (over IPv4 or IPv6)
    • The local port, e.g. 5432 (the default port for PostgreSQL)
    • The source address; i.e., where the connection attempt is coming from.

    Some firewalls offer additional options that give far greater flexibility; for example, Windows Defender Firewall allows you to specify a program instead of a port number.

    As always, we want to minimize access to Postgres, so it would be quite normal to create a rule for TCP traffic (over IPv4 and/or IPv6) arriving on port 5432 to be rejected (or black-holed) unless it comes from the address of our application server. The source address can usually be a list of addresses or subnets.

    If your server has any Foreign Data Wrappers or similar extensions installed, it may also be desirable to create outbound rules to prevent them from being used to connect to anything other than a predefined set of servers.

    While configuring Windows Defender Firewall is quite straightforward, configuring iptables is much more complex. Linux distributions such as Red Hat and Ubuntu offer management tools to make this easier, and there are also other open source tools available, such as Ferm and Shorewall.
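As a sketch of the rule described above, an nftables ruleset might look like the following; the application server address is an assumption for the example, and a real host firewall would need additional rules (for SSH access, for instance):

```
# Minimal sketch of a host firewall for a Postgres server,
# not a complete ruleset
table inet postgres_fw {
    chain input {
        type filter hook input priority 0; policy drop;

        # Allow established/related traffic and the loopback interface
        ct state established,related accept
        iif "lo" accept

        # Allow Postgres connections only from the application server
        tcp dport 5432 ip saddr 192.168.0.10 accept
    }
}
```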

Minimize access to your server as much as possible using a firewall.

  • Cloud provider

    Most cloud providers recommend against using firewalls in virtual instances, suggesting instead that users make use of the firewalls built into the platform. This typically makes management much easier, allowing rule sets to be created that can be reused and attached to multiple servers and allowing management through their web and command line interfaces and REST APIs.

    Firewalls at the cloud providers are implemented as part of their network infrastructure and generally work in much the same way as the host firewalls described in the previous section, i.e., specify the source addresses, protocol, and destination port for traffic to allow.

    Most cloud providers also offer Virtual Private Clouds (VPC), in which a number of servers can coexist in a single virtual environment with its own private network or networks. This type of configuration has become the default and makes it very easy to deploy a multi-tiered system on the public cloud while keeping the non-public tiers segregated from the internet in general.

    The use of multiple subnets within a VPC can make it easy to further segregate servers, keeping the public tiers in a "DMZ" subnet, with only minimal access to the database servers that are in a private subnet with no direct internet connection.

Transport Encryption

If traffic to the database server is flowing across the network, it is good (arguably essential) practice to encrypt that traffic. Postgres uses OpenSSL to provide transport security through the use of TLS (previously SSL), though work has been underway for some time to add support for Microsoft Secure Channel (Schannel) and Apple Secure Transport.

To encrypt connections in Postgres you will need at least a server certificate and key, ideally protected with a passphrase that can be securely entered at server startup, either manually or using a script that retrieves the passphrase on behalf of the server, as specified using the ssl_passphrase_command configuration parameter. Passphrases are not supported on Windows, at least as of Postgres 12. The server certificate and key are specified using the ssl_cert_file and ssl_key_file parameters, respectively.

If you have an existing Certification Authority (CA) in use, you can use certificates issued by that CA with Postgres. The ssl_ca_file and ssl_crl_file configuration parameters allow you to provide the CA (and intermediate) certificates and the certificate revocation list to the server. This gives you the flexibility to revoke certificates in response to security incidents and have the server reject client certificates, or the client reject server certificates. It also allows you to configure the client and server to reject each other if the identity of either cannot be verified through the chain of trust, to prevent as-yet undetected spoofing. The use of certificates for client authentication is discussed below.

It's important to ensure that your use of TLS is secure as well. There are several configuration parameters that can be set to ensure that you're not using ciphers or other options that may no longer be considered secure. It is recommended that you check and appropriately configure the following configuration parameters in your postgresql.conf configuration file:

  • ssl_ciphers
  • ssl_ecdh_curve
  • ssl_dh_params_file
  • ssl_min_protocol_version

No recommendation is made in this article on what those parameters should be set to, as inevitably, they will change over time. You should periodically check to ensure you're using options that continue to be regarded as secure and update them when appropriate.

If traffic to your server flows over the network, ensure it's encrypted using the strongest possible ciphers and other options.
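Putting this together, a postgresql.conf fragment enabling TLS might look like this sketch; the file names are the conventional defaults, and the minimum protocol version shown is an assumption that should be reviewed against current guidance:

```
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
ssl_ca_file = 'root.crt'
ssl_min_protocol_version = 'TLSv1.2'

# Review ssl_ciphers, ssl_ecdh_curve and ssl_dh_params_file against
# current best practice rather than relying on the defaults.
```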

Authentication

After access, the next security component to consider is client authentication: how we verify the identity of users and control whether they can connect to the server, as configured through the pg_hba.conf configuration file.

pg_hba.conf

The pg_hba.conf file (typically found in the Postgres data directory) defines the access rules and authentication methods for the data server. Lines in the file are processed sequentially when a connection is established, and the first line that matches the properties of the connection is used to determine the authentication method.

There are seven possible formats for lines in the file (as well as comments, which start with a #). These fall into three main variants; the rest follow the same structure as one of those three but with a different connection type in the first field. Here are some examples:

local     my_db     my_user     scram-sha-256

In this example, a connection attempt from my_user to the database my_db over a local (UDS) connection using scram-sha-256 will be accepted.

host        my_db     my_user     172.16.253.47/32        md5

In this example, a connection attempt from my_user to the database my_db from 172.16.253.47 using md5 as the authentication method will be accepted.

Note that the address shown in the example has /32 on the end to denote that all 32 bits of the address must match. To match connections from anywhere in the 172.16.0.0/16 subnet, we could write the address as 172.16.0.0/16 or 172.16.0.0 255.255.0.0 (the third format variant).

The fields in each line are always the connection type, the database name(s), the user name(s), the client network address/subnet (where needed), and the authentication method, plus any options that may be applicable. See the documentation for more information.

As a rule of thumb, any network connections should use either the hostssl or hostgssenc connection types to ensure that connections are encrypted.
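For example, a pg_hba.conf line requiring TLS-encrypted connections from an application subnet might look like this; the database, user, and subnet are assumptions for the example:

```
hostssl   my_db     my_user     192.168.0.0/24     scram-sha-256
```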

In the following subsections we will examine several of the most commonly used authentication methods.

Trust

The trust authentication method should only be used in exceptional circumstances, if at all, as it allows a matching client to connect to the server with no further authentication. Trust is useful for testing and development work on the local machine when connecting via a UDS, and when only fully trusted users have access to the machine and data security is not a concern.

It is also a useful mechanism for resetting passwords in the server if there is no other way to log in; temporarily allow trust access to connections from a UDS, connect to the server and reset the password, and then disable the trust access again.

Use trust with extreme care. It can be very dangerous!

Peer and Ident

Peer and ident are both methods of allowing users to be authenticated by the underlying operating system. Many Postgres packages come pre-configured to use peer authentication.

The peer authentication method is only available for local connections. When peer is used, the server gets the client’s username from the operating system and checks that it matches the requested database username.

The ident authentication method is only available for network connections. It works in a similar way to peer authentication, except that it relies on an ident server running on the client to confirm the username.

Both peer and ident allow the use of username maps to handle acceptable mismatches between the username known to the client and that known to the database server.

Note that ident should not be relied upon, as the client running the ident server is unlikely to be guaranteed trustworthy.

md5 vs. SCRAM

For many years md5 was the preferred password hashing mechanism in Postgres, and though it is still widely used, it's strongly recommended that users move to scram-sha-256 where password authentication is required.

Both md5 and scram-sha-256 use a challenge-response mechanism to prevent sniffing and store hashed passwords on the server. However, scram-sha-256 stores the hashes in what is currently considered to be a cryptographically secure form to avoid issues if an attacker gains access to the hash.

If you need to support password authentication with a standalone Postgres server, you should be using scram-sha-256 as the authentication method. Do not use md5 in new deployments!

LDAP vs. Kerberos

LDAP and Kerberos are often utilized in corporate environments when integrated with Single Sign On (SSO) systems. In such systems, the Postgres server is configured to authenticate the user through an LDAP directory or Kerberos infrastructure.

In LDAP systems, there are various ways that user access can be controlled, for example, by only granting access to users that are members of a specific organizational unit or group. When setting up your pg_hba.conf file, additional options can be specified at the end of the line, including an LDAP search filter, which will only allow users that match the filter to connect to the database.
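As a sketch, an LDAP entry in pg_hba.conf using search+bind mode with a search filter might look like the following; the server, base DN, and group are assumptions for the example, and a real directory may also require bind credentials:

```
hostssl all all 192.168.0.0/24 ldap ldapserver=ldap.example.com ldapbasedn="dc=example,dc=com" ldapsearchfilter="(&(uid=$username)(memberOf=cn=dbusers,ou=groups,dc=example,dc=com))"
```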

Kerberos authentication is available through the gssapi authentication method in Postgres. Setting it up can be a little more daunting than LDAP and other authentication methods, but it is considered secure and offers automatic authentication for client software that supports it. There is a related Windows-specific sspi authentication method that can be used in Windows domains.

Though LDAP authentication is very popular (based on user feedback the author has received), Kerberos authentication should always be preferred as, unlike LDAP, the user's password is never sent to the Postgres server.

TLS Certificates

TLS (sometimes referred to as SSL) certificates can be used for authentication, in addition to being a requirement for TLS encryption as discussed in the Transport Encryption section above. Certificate authentication works by trusting a top-level certificate (or one of its children or "intermediate" certificates) to issue certificates only to trusted clients. Clients in possession of a certificate and key issued by the same authority that issued the server certificate and key can be considered trusted.

In a simple example, you would first create a Certificate Authority (CA) certificate and key. This is extremely valuable and sensitive, so it must be kept completely secure. Then, you create a certificate and key for the Postgres server and sign it using the CA certificate and key.

Both the server certificate and key are then installed into the Postgres server, along with a copy of the CA's certificate (but not the CA's key).

Client certificates and keys can then also be created and signed by the CA as required.
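The steps above can be sketched using the openssl command line tools as follows; the subject names and validity periods are assumptions for the example, and the root.key file in particular must be kept completely secure:

```shell
# Create a self-signed root CA certificate and key
openssl req -new -x509 -days 3650 -nodes \
    -out root.crt -keyout root.key -subj "/CN=example-root-ca"

# Create a server key and certificate signing request (CSR)
openssl req -new -nodes \
    -out server.csr -keyout server.key -subj "/CN=dbhost.example.com"

# Sign the server certificate with the CA
openssl x509 -req -in server.csr -days 365 \
    -CA root.crt -CAkey root.key -CAcreateserial -out server.crt

# Create and sign a client certificate; the CN must match the
# database username when the cert authentication method is used
openssl req -new -nodes \
    -out client.csr -keyout client.key -subj "/CN=my_user"
openssl x509 -req -in client.csr -days 365 \
    -CA root.crt -CAkey root.key -CAcreateserial -out client.crt

# Postgres requires the server key to be protected
chmod 600 server.key
```

The server.crt and server.key files are installed on the database server along with root.crt, while client.crt, client.key, and root.crt are distributed to the client.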

When the client connects to Postgres and the cert authentication method is used, the Postgres server will check that the certificate presented by the client is trusted and that the Common Name (CN) field of the certificate matches the username for the client. Username mapping can also be done as with peer and ident.

The client can also specify several options when connecting to the server, including whether to (and to what extent) verify the trustworthiness of the server's certificate. This gives protection against spoofing.

Both the client and the server can use certificate revocation lists to keep track of any certificates that should no longer be trusted.

Certificates are an ideal way to authenticate automated systems that need to connect across the network to a Postgres server.

Additional Configuration

There are two additional configuration options that are worth considering:

authentication_timeout is a parameter that can be set in postgresql.conf. Its purpose is to set the maximum amount of time in which authentication must be completed before the server closes the connection. This is to ensure that incomplete connection attempts don't occupy a connection slot indefinitely.

auth_delay is a contrib module for Postgres that can be loaded through the shared_preload_libraries configuration option in postgresql.conf. Its purpose is to pause briefly when an authentication attempt fails before failure is reported to make brute force attacks much more difficult.
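For example, both can be configured in postgresql.conf; the values shown here are illustrative, not recommendations:

```
# Close connections that have not authenticated within 30 seconds
authentication_timeout = 30s

# Load auth_delay, and pause for half a second after each failure
shared_preload_libraries = 'auth_delay'
auth_delay.milliseconds = 500
```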

Other Authentication Methods

The methods discussed above are some of the more popular authentication methods available in Postgres. There are a number of others available, but these are less widely used and have more specialized applications.

Roles

The next critical component in securing a Postgres deployment is the creation and setting of roles, which can limit database access for specified users.

What is a Role?

Very old (practically prehistoric) versions of PostgreSQL offered users and user groups as ways to group user accounts together. In PostgreSQL 8.1, this system was replaced with the SQL-standard-compliant roles system.

A role can be a member of other roles or have roles that are members of it. We sometimes refer to this as "granting" a role to another role. Roles have several attributes that can be set, including ones that effectively make them user accounts that can be used to log in to the database server. An example of granting a role to another role is shown below:

GRANT pg_monitor TO nagios;

This makes the nagios role a member of pg_monitor, thereby giving nagios access to the extended functionality reserved for superusers and members of the pg_monitor role.

Role Attributes

Roles have several fixed attributes that can be set:

  • LOGIN: Can this role be used to log in to the database server?
  • SUPERUSER: Is this role a superuser?
  • CREATEDB: Can this role create databases?
  • CREATEROLE: Can this role create new roles?
  • REPLICATION: Can this role initiate streaming replication?
  • PASSWORD: The password for the role, if set.
  • BYPASSRLS: Can this role bypass Row Level Security checks?
  • VALID UNTIL: An optional timestamp after which time the password will no longer be valid.

Roles with the SUPERUSER flag set automatically bypass all permission checks except the right to log in.

There are several other less commonly used role attributes that can be set. See the documentation for more information.

Grant SUPERUSER (and potentially dangerous attributes such as CREATEDB and CREATEROLE) with great care. Do not use a role with SUPERUSER privileges for day-to-day work.
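To illustrate, a minimal separation of duties might be sketched as follows; the role names and attribute choices are assumptions for the example:

```sql
-- A group role that will own the schema; it cannot log in
CREATE ROLE app_owner NOLOGIN;

-- A login role for the application, with no dangerous attributes
CREATE ROLE app_user LOGIN PASSWORD 'change-me' VALID UNTIL '2026-01-01';

-- An administrative role; note that CREATEDB and CREATEROLE are
-- granted deliberately, and SUPERUSER is not
CREATE ROLE app_admin LOGIN CREATEDB CREATEROLE;
```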

Password Complexity

PostgreSQL (as opposed to EDB Postgres Advanced Server) doesn't include any password complexity enforcement functionality by default. It does include a hook that can be used to plug in a module to do password complexity checks, but this will have no effect if the user changes their password using a pre-hashed string.

A sample password check module, passwordcheck, can be found in Postgres' contrib directory in the source tree and is included with most package sets. It can be used as a starting point for developing something more complex that meets an organization's specific requirements, though doing so does require C development work.

The most effective way to enforce password complexity in Postgres is to use an external identity service for authentication, such as LDAP or Kerberos, as described above.

Password Profiles

EDB Postgres Advanced Server offers a password profile feature that can be used with the password (never use this, as the password will be transferred in plain text!), md5, and scram-sha-256 authentication methods configured in pg_hba.conf. Password profiles can be configured by the superuser and applied to one or more roles. A profile allows you to define the following options:

  • FAILED_LOGIN_ATTEMPTS: The number of failed login attempts that may occur before the role is locked out for the amount of time specified in the PASSWORD_LOCK_TIME parameter.
  • PASSWORD_LIFE_TIME: The number of days a password can be used before the user is prompted to change it.
  • PASSWORD_GRACE_TIME: The length of the grace period after a password expires until the user is forced to change their password. When the grace period expires, a user will be allowed to connect but will not be allowed to execute any command until they update their expired password.
  • PASSWORD_REUSE_TIME: The number of days a user must wait before reusing a password.
  • PASSWORD_REUSE_MAX: The number of password changes that must occur before a password can be reused.
  • PASSWORD_VERIFY_FUNCTION: The name of a PL/SQL function that can check password complexity.

Note that if the user changes their password by supplying a new one in a pre-hashed form, then it is not possible to verify the complexity with the PASSWORD_VERIFY_FUNCTION option or re-use with the PASSWORD_REUSE_MAX option. To mitigate this, the PASSWORD_ALLOW_HASHED option may be set to false in the password profile.
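On EDB Postgres Advanced Server, a profile might be created and applied with commands along these lines; this is a sketch, and the limits shown are illustrative rather than recommendations:

```sql
-- Lock an account for 1 day after 5 failed login attempts, expire
-- passwords after 90 days with a 3-day grace period, and reject
-- pre-hashed passwords so complexity can always be verified
CREATE PROFILE app_profile LIMIT
    FAILED_LOGIN_ATTEMPTS 5
    PASSWORD_LOCK_TIME 1
    PASSWORD_LIFE_TIME 90
    PASSWORD_GRACE_TIME 3
    PASSWORD_ALLOW_HASHED FALSE;

-- Apply the profile to a role
ALTER ROLE my_user PROFILE app_profile;
```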

If you're running EDB Postgres Advanced Server and not using an external authentication provider such as LDAP or Kerberos, consider using password profiles to ensure your users maintain strong, regularly changed passwords.

SET ROLE

The SET ROLE SQL command may be applied by a user to change the user identifier of the current session to the name of any role of which they are a member. This may be used to either add or restrict privileges from the session and may be reset using RESET ROLE (thus making SET ROLE unsuitable for use as a multi-tenancy solution).

SET ROLE is similar to using sudo su - on a Unix-like system: it essentially allows you to run SQL commands as that other user.

By default, when a role is a member of another role, it will automatically inherit the privileges of that role. To use SET ROLE effectively, the NOINHERIT keyword should be used when creating the member role to prevent this automatic inheritance, requiring the use of SET ROLE to explicitly gain those privileges when needed.
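A sketch of this pattern, with hypothetical role and table names:

```sql
-- A privileged group role that must be assumed explicitly
CREATE ROLE order_admin NOLOGIN;
GRANT DELETE ON orders TO order_admin;

-- alice is a member of order_admin, but does not inherit its
-- privileges automatically
CREATE ROLE alice LOGIN NOINHERIT;
GRANT order_admin TO alice;

-- alice must elevate explicitly to perform the dangerous task
SET ROLE order_admin;
DELETE FROM orders WHERE order_date < '2019-01-01';
RESET ROLE;
```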

In addition to SET ROLE, there is also a SET SESSION AUTHORIZATION command, which is only available to superusers. The high-level difference between them is that SET ROLE will change the current_user value but not session_user, whilst SET SESSION AUTHORIZATION will change both.

In practical terms, this means that after running SET SESSION AUTHORIZATION, any subsequent SET ROLE commands will be restricted to those that the session_user could perform, regardless of the fact that the original session_user was a superuser. This allows superusers to more accurately imitate another user.

Consider using SET ROLE to allow users to temporarily elevate their privileges only when required to perform more potentially dangerous tasks.

Monitoring Roles

Postgres comes with several built-in monitoring roles (originally developed by your humble author!) that have access to functionality that was restricted to superusers only in earlier versions of Postgres. These roles allow you to grant specific privileges to roles that are used to monitor the system without having to give them full superuser access:

  • pg_monitor: A role which combines all the following roles:
  • pg_read_all_settings: Read all configuration variables, even those normally visible only to superusers.
  • pg_read_all_stats: Read all pg_stat_* views and use various statistics-related extensions, even those normally visible only to superusers.
  • pg_stat_scan_tables: Execute monitoring functions that may take ACCESS SHARE locks on tables, potentially for a long time.

Use the monitoring roles to give elevated privileges to the roles you use to monitor your database servers to avoid the need to give them superuser access. Ensure that your roles have the minimum privileges required to do what you need.

Data Access Control

As part of examining the security setup of your Postgres deployment, it is important to look at data access control and how we can prevent users from accessing data they should not be able to.

ACLs

Access Control Lists or ACLs are somewhat cryptic strings attached to objects such as tables, functions, views, and even columns in Postgres. An ACL contains a list of privileges (select, insert, execute, and so on) granted to each role; an optional flag (*) on each privilege which, if present, denotes that the role may grant that privilege to other roles; and the name of the role that granted the privileges.

An example of an ACL for a table created by Joe might be as follows:

joe=arwdDxt/joe =r/joe sales_team=arw/joe

The first section tells us that Joe has all the available privileges on the table (INSERT, SELECT, UPDATE, DELETE, TRUNCATE, REFERENCES, and TRIGGER), originally granted by Joe (when he created the table).

The second section tells us that read access has been granted to PUBLIC (a special pseudo role that means everyone) by Joe, and the third section tells us that the Sales Team has been granted INSERT, SELECT, and UPDATE privileges, again, by Joe.

The privilege flags in ACLs vary quite significantly based on the type of object in question; please review the documentation for further details.

It's useful to understand how ACLs are written in Postgres, particularly if you prefer working with command line tools, which will typically show them in the internal format. Graphical tools such as pgAdmin will parse and display the ACL in a visual format that is much easier to read.

Any well-designed system should use roles in conjunction with ACLs to protect the schema and data in the database. It is good practice to have the schema (i.e., the tables and other objects) owned by a non-superuser role that is not used by the application to connect to the database and is not granted to login roles.

Create group roles that reflect the permissions or roles within your application that have the required database privileges and grant those roles to login roles as required. It is not usually a good idea to grant privileges directly to login roles used by end users, as that can quickly become difficult to manage.

Spend time fully understanding the privileges required in your system for users and applications to be able to do their jobs. Minimize privileges to only those required, separate schema ownership from data, and make use of group roles to simplify privilege management for individual login roles.

GRANT and REVOKE

ACLs are managed on objects in Postgres using the GRANT and REVOKE SQL commands. In most cases, when an object is created, only the owner has any privileges to use or work with that object in any way, exceptions being that PUBLIC is granted EXECUTE permission on functions and procedures, CONNECT and TEMPORARY permission on databases, and USAGE permission on languages, data types, and domains. Any of these privileges can be revoked if required.

Permission to modify or drop an object is always reserved for the owner of the object and superusers. Object ownership can be reassigned using the appropriate ALTER ... OWNER TO command.

Default privileges can be overridden using the ALTER DEFAULT PRIVILEGES command for some object types. This allows you to configure the system such that certain privileges are automatically granted to roles when new objects are created. For instance, Joe, in the previous example, could issue a command such as the one below to grant the Sales Team insert, select, and update privileges on any new tables (but not preexisting ones, which may need to be updated manually):

ALTER DEFAULT PRIVILEGES 
GRANT INSERT, SELECT, UPDATE 
ON TABLES 
TO sales_team;

Assuming that when a new object is created it doesn't automatically include the required privileges in the ACL, we can use GRANT and REVOKE to set up the ACL as required. To continue our previous example, Joe might use the following SQL command to grant the Sales Team permissions on the orders table:

GRANT INSERT, SELECT, UPDATE ON orders TO sales_team;

To revoke any automatically granted privileges or to revoke previously granted privileges to meet changing business needs, we can use the REVOKE SQL command:

REVOKE UPDATE ON orders FROM sales_team;

Assuming the Sales Team previously had the INSERT, SELECT, and UPDATE privileges, as seen in the earlier example, this would remove the UPDATE privilege, allowing them to view and add orders but not modify them.

It is worth noting that the use of ACLs on columns can sometimes catch people out: the wildcard in a SELECT * FROM query will not silently exclude the columns the user doesn't have access to; instead, the query will fail with a permission denied error for the table. In such cases, the user should explicitly list the columns they have permission to SELECT.
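As a sketch, assuming a hypothetical customers table with a sensitive credit_card column:

```sql
-- Grant access to specific columns only
GRANT SELECT (id, name, email) ON customers TO support_team;

-- As a member of support_team, this fails with "permission denied
-- for table customers", because * expands to include credit_card:
--   SELECT * FROM customers;

-- Listing only the permitted columns succeeds
SELECT id, name, email FROM customers;
```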

Having created group roles to organize login users, use the GRANT and REVOKE SQL commands to give the group roles the minimum level of privilege required to work. Use default privileges where appropriate as a time-saver, but be careful that doing so doesn't give more privileges than are appropriate in the future. Use GRANT to give the privileges to the required login roles by making them members of the group roles.

RLS

Row Level Security or RLS is a technology available in Postgres that allows you to define policies that limit the visibility of rows in a table to certain roles. Before we dive into the details of how an RLS policy can be set up, there are two important caveats to note:

  1. Superusers and roles with the BYPASSRLS attribute always bypass row level security policies, as do table owners, unless they force the policy on themselves.
  2. The existence of a row may be inferred by a user through "covert channels." For example, a unique constraint on a field such as a social security number might prevent the user from inserting another row with the same value. The user cannot access the existing row, but they can infer that a record with that social security number already exists.

By default, row level security is turned off on tables in Postgres. It can be enabled with a command such as ALTER TABLE...ENABLE ROW LEVEL SECURITY, which applies a default-deny policy, preventing access to all data unless or until other policies have been created.
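
Continuing the orders example, row level security could be enabled with:

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

At this point, roles subject to RLS will see no rows at all until a policy allowing access is created.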

The policy itself consists of a name, the table to which the policy applies, the optional role to which it applies, and the USING clause, which defines how matching or allowed rows will be identified. For example, we might limit access to orders to the Sales Team member who created them:

CREATE POLICY sales_team_policy ON orders TO sales_team USING (sales_person = current_user);

We can also specify operations to which the policy applies. The following example would allow all members of the Sales Team to select any orders, but only the original salesperson would be able to update or delete an order:

CREATE POLICY sales_team_select_policy ON orders FOR SELECT USING (true);
CREATE POLICY sales_team_modify_policy ON orders USING (sales_person = current_user);

By default, permissive policies are used, meaning that where there are multiple policies that apply, they are combined using a boolean OR. It is also possible to use restrictive policies, where a boolean AND is used when evaluating whether access to a row satisfies the combined policies.
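
As a sketch, a restrictive policy could be layered on top of the permissive policies above so that, whatever else applies, only rows for a single region are ever visible (the region column here is a hypothetical addition to the orders table):

CREATE POLICY sales_team_region_policy ON orders
    AS RESTRICTIVE
    USING (region = 'EMEA');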

Row Level Security policies can take some effort to set up, and index design must also consider them, but there are cases when it may be essential to do so, such as in a medical records system where it may be a legal requirement to restrict access to patient records to the medical staff that are directly responsible for the patient's care.

Consider the legal and ethical requirements to restrict access to specific rows in each of your tables, and design and implement RLS policies to meet those requirements where necessary. Take care to minimize covert channels by avoiding the use of sensitive data in constraints.

Views

Views are obviously useful for encapsulating commonly executed queries into an object that can be queried as if it were a table, but they can also be used to prevent unauthorized access to data, by ensuring that roles do not have the ability to select from the underlying tables and must access the data through the view instead. A classic example can be found in Postgres itself: the pg_catalog.pg_authid table contains a row for each role in the database, including a column containing the hash of the role's password, if one has been set. Because the hash is considered sensitive information, the table does not grant SELECT privileges to any role other than the superuser with which the database was initialized.

A view (pg_catalog.pg_roles) is provided instead, which can be selected from by any user. When selecting from the view, the password is always returned as ********. This is arguably more convenient than simply using an ACL on the password column in the underlying table as that would cause a permissions error if queried with SELECT * FROM.

When using updateable views, a WITH CHECK OPTION clause is available when defining the view. When it is omitted, the view will allow the user to insert or update records in such a way that they wouldn't be visible through the view; otherwise, the insert or update will only be allowed if the resulting row would be visible to the user. If WITH LOCAL CHECK OPTION is specified, row visibility is checked only against conditions defined on the view being used directly, but when WITH CASCADED CHECK OPTION is used (the default if a check option is specified), row visibility is checked against the view being used directly as well as any underlying views.
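
A sketch of such a view, again assuming a hypothetical region column on the orders table:

CREATE VIEW emea_orders AS
    SELECT * FROM orders WHERE region = 'EMEA'
    WITH CASCADED CHECK OPTION;

With the check option in place, an attempt to insert or update a row through this view with a region other than 'EMEA' will be rejected.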

Consider using views over secured tables as a method of allowing access to a limited subset of the columns in the underlying table to appropriate roles.

Security Barriers

Using views to restrict access to a column is quite common, but people often also use them to restrict access to certain rows. While there is certainly value in doing that, one must be mindful of one particularly nasty side effect: it's possible to trick the Postgres optimizer into leaking the hidden data!

This is not actually a bug; it's the way the system is intended to work. Essentially, when a user executes a query against a view and includes a call to a very low-cost function in that outer query, the optimizer may choose to evaluate that function against every row in the data underlying the view before it applies the selectivity clauses in the view, thus allowing the function to access the restricted data. This is demonstrated nicely in a blog post by my colleague Robert Haas.

To solve this problem, we use a security barrier, which is basically an option that is passed when the view is created. This tells Postgres to always execute the qualifiers on the view first, thus ensuring that the function never sees the hidden rows.
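
For example, a security barrier version of a view restricting the orders table to the current salesperson's rows might look like this:

CREATE VIEW my_orders WITH (security_barrier) AS
    SELECT * FROM orders WHERE sales_person = current_user;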

Related to security barriers is the LEAKPROOF parameter for functions. This can only be used by superusers when creating a function and serves to certify that the function doesn't leak any information besides the intended return value. This allows Postgres to better optimize queries where a function is used with a security barrier view, safe in the knowledge that the function won't leak any information.
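
As a sketch, a superuser could certify a simple function as leakproof like so (note that whether a function really is leakproof is a judgement the superuser is making; Postgres does not verify it):

CREATE FUNCTION is_positive(n integer) RETURNS boolean
    LANGUAGE sql IMMUTABLE LEAKPROOF
    AS 'SELECT n > 0';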

Be careful when using views to hide rows to ensure that they are marked as security barriers to avoid leaking data. Consider whether RLS might be a better solution for limiting access to specific rows.

Security Definer Functions

By default, functions and procedures in Postgres are what we call SECURITY INVOKER functions. That means that when they are called, they execute with the privileges of the calling role.

Passing the SECURITY DEFINER option when creating a function means that whenever the function is called, it will execute with the privileges of its owner instead of the calling role. This is similar to the setuid bit in Unix file permissions, which, when set, allows an executable to run with the permissions of its owner instead of the user that executed it.

This ability can be useful in various situations. One example might be a function, called by a trigger on a table, that writes a record to an audit log table to which login and group roles otherwise have no access at all. It is important to carefully consider the consequences of using SECURITY DEFINER functions, though – in particular, ensure that they are kept as simple as possible, perform only a single task, and don't take any parameters that might allow them to be used for purposes for which they were not intended.
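
A sketch of such an audit trigger function (the audit.order_log table and its columns are illustrative assumptions):

CREATE FUNCTION log_order_change() RETURNS trigger
    LANGUAGE plpgsql
    SECURITY DEFINER
    -- Pin the search path so the function cannot be hijacked
    -- via objects in an attacker-controlled schema:
    SET search_path = audit, pg_temp
AS $$
BEGIN
    INSERT INTO audit.order_log (order_id, changed_by, changed_at)
        VALUES (NEW.order_id, session_user, now());
    RETURN NEW;
END;
$$;

CREATE TRIGGER orders_audit
    AFTER INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE FUNCTION log_order_change();

Pinning search_path on SECURITY DEFINER functions, as above, is a hardening measure recommended by the PostgreSQL documentation itself.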

Consider using SECURITY DEFINER functions to provide specific functionality to roles that cannot perform that task directly themselves. Be careful to consider the possible ramifications and ways in which such functions could be misused, and ensure they are limited to performing only the intended task.

Data Redaction

Data redaction is a technique that hides specific pieces of sensitive information from users by dynamically changing the value that is displayed. While this can be done to some extent with views in Postgres as described above, EDB Postgres Advanced Server includes native data redaction functionality.

Redaction is implemented in EPAS using data redaction policies on tables. In short, these policies specify one or more columns on a table to operate on, an expression that determines whether the policy should be applied, a function to perform the redaction, a scope, and any exceptions to the policy. See the documentation link above for an example of how policies can be created.

When using EDB Postgres Advanced Server and working with sensitive data such as credit card numbers, consider using data redaction policies to dynamically change the data displayed to a redacted form such as "XXXX XXXX XXXX XXXX 8397" to prevent users from having access to sensitive data unnecessarily.

Encryption

The final component to consider when executing end-to-end security for Postgres is encrypting sensitive data. There are several methods and extensions available that can be configured to provide additional security through encryption.

pgcrypto

pgcrypto is a standard extension of Postgres and EPAS that is included as a contrib module in the source tree and most binary distributions. Its purpose is to provide SQL functions for encryption and hashing that can be utilized as part of the logic in your database design.

Installation

With most binary distributions of Postgres, pgcrypto can be installed by first ensuring that the contrib modules are installed on the server. In installer-based distributions, such as those for Windows and macOS, the contrib modules are typically installed as part of the database server itself. Linux packages such as Debian/Ubuntu's DEBs and Red Hat/SUSE RPMs may include them in a sub-package. For example, the PostgreSQL community's packages from yum.postgresql.org have a package called postgresql12-contrib for PostgreSQL 12.

Once the package is installed on your server, simply run the CREATE EXTENSION command in the desired database as a superuser:

CREATE EXTENSION pgcrypto;

Consider using the pgcrypto extension in your databases when you require the ability to hash or encrypt individual pieces of data to meet regulatory and similar requirements.

Hashing

Hashing is a method of generating a cryptographically secure representation of a piece of data, typically of a fixed length (the size of which depends on the algorithm used). Importantly, it is non-reversible; that is, the original data cannot be extracted from the hash value. However, because the hashed value is, for practical purposes, unique to the original data, it can be used as a checksum to see if the data has been changed, or to see if a user-provided value matches the original value.

Hashing is mostly used to store passwords and other sensitive information that may need to be verified but not returned.

As an example, we can use pgcrypto to hash a password that the user will use in the future:

INSERT INTO users 
    (username, email, password)
VALUES
    ('pgsnake', 'dave.page@enterprisedb.com', crypt('new password', gen_salt('bf')));

To verify this password later we can SELECT the user record from the table:

SELECT
    *
FROM
    users
WHERE
   username = 'pgsnake' AND 
   password = crypt('entered password', password);

If a record is returned, the password was entered correctly – otherwise, it was incorrect.

It is important to note that when passwords are included in SQL commands, as shown above, they may end up being written to log files on the database server. Network communications may also leak commands like these if not protected with encryption.

Never store user passwords in plain text or obfuscated form in the database, and never use a reversible encrypted form unless the functionality of the application absolutely requires it (for example, if writing a password manager application). Use non-reversible hashing wherever possible for application passwords and other information that must be verified but not returned.

Encryption

pgcrypto also provides functionality for encrypting data that is useful when storing information that needs to be retrieved but should be stored in a secure form. There are "raw" encryption/decryption functions provided with pgcrypto as well as PGP functions.

The PGP functions are strongly encouraged over use of the raw functions, which use a user-provided key directly as the cipher key, provide no integrity checking, expect the user to manage all encryption parameters, and work with bytea data, not text.

Symmetric key encryption is the easiest to use as it doesn't require a PGP key. For example, we can demonstrate encryption and decryption of data as shown with this simple SQL command in which the inner function call encrypts the data and the outer one decrypts it:

SELECT pgp_sym_decrypt(
    pgp_sym_encrypt('Hi There', 'password'), 
    'password');

Note that the cipher text returned by the encryption function and passed to the decryption function is in bytea format.

To use public key functionality, a key is first required. This can be generated using GnuPG with a command such as:

gpg --gen-key

The PostgreSQL documentation suggests that the preferred key type is "DSA and Elgamal." Once the key is generated, you'll need to export it:

# List the keys in the keyring:
gpg --list-secret-keys
 
# Export a public key in ASCII armor format:
gpg -a --export KEYID > public.key
 
# Export a secret key in ASCII armor format:
gpg -a --export-secret-keys KEYID > secret.key

The public key can now be used to encrypt data using the SQL encryption function. Note that the PGP functions expect the key as bytea, so an ASCII-armored key must first be converted using pgcrypto's dearmor() function:

pgp_pub_encrypt('<data>', dearmor('<public key>'))

Similarly, the data can later be decrypted using:

pgp_pub_decrypt(<cipher text>, dearmor('<private key>'))

Again, note that the cipher text is in bytea format.
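
Putting the pieces together, a round trip using the exported keys might look like this, with dearmor() converting the ASCII-armored key text into the bytea form the PGP functions expect (paste the contents of public.key and secret.key in place of the placeholders):

SELECT pgp_pub_decrypt(
    pgp_pub_encrypt('Hi There', dearmor('<public key>')),
    dearmor('<private key>'));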

Encryption is used when storing sensitive data in the database that needs to be retrieved in the future. Consider carefully whether symmetric or public key encryption is most appropriate for your use case: public key generally makes more sense when exchanging data with others (because there's no shared secret), while symmetric key encryption may make more sense for a self-contained application.

Key Management

One major issue with the use of encryption in a database is key management. In its simplest form, an application may have a hard-coded or centrally configured key that it uses when encrypting and decrypting data. Unless the application can change the key (which may also be expensive if there's a lot of data to re-encrypt), then that key will be valid for the lifetime of the application, and it also means that all users are sharing one single key. These factors greatly increase the chances of that key being known to multiple people (e.g., the administrators of the application), some of whom may leave the organization, taking that knowledge with them.

Key management systems alleviate some of these problems by offering ways to store keys in a secure service separately from the database and application and to potentially use different keys for different users or purposes. Some, such as Bruce Momjian's pgcryptokey extension, also offer functionality for re-encrypting through SQL commands. That can still be expensive, of course, but the extension does make it trivial to do.

Key management systems can also avoid the need for users to ever see the actual keys; their access to the keys can be controlled through a password or passphrase (which may be authenticated using Kerberos or a similar enterprise identity management system), with the key itself being passed directly to the database server or application as needed.

At the time of writing there is an ongoing discussion in the PostgreSQL community about the development of a key management system as a feature of the database server. Those who are interested in this feature or interested in seeing how features are discussed and added to PostgreSQL might want to read the original and current email threads.

Consider whether the use of a key management system may be appropriate for managing your cryptographic keys to avoid the use of shared keys or to separate their storage from the application.

File System and Full Disk Encryption

When using file system encryption (or full disk encryption, as the benefits are essentially the same for the scope of this blog), we typically encrypt the volumes that are used to store the database and write-ahead log, or often the entire system. These types of encryption are transparent to the database server and require no configuration in Postgres.

It is important to note that file system encryption and data encryption within Postgres protect against different attack vectors. The operating system may make use of a password or key management system very early in the boot phase to ensure that keys are kept externally, but once a server with file system encryption is booted and running with its filesystems mounted, all the data is accessible in the same way as it would be on a machine without encryption.

This gives us protection against physical attacks on non-running hardware, such as a stolen hard disk. File system or full disk encryption does not protect against attacks on a system that is up and running, nor does it enable us to control the visibility of data in the database for different users.

There are file system and full disk encryption options available bundled with most operating systems, as commercial products, and as open source software. Among the most common options are FileVault, which is included with Apple macOS; BitLocker, for use with Microsoft Windows; and LUKS, on Linux systems.

Encrypted volumes are also available on all major cloud providers to protect data. For example, Amazon's Elastic Block Store (EBS) provides an option for creating encrypted volumes, which can use a default key or one provided through their key management system.

It's worth noting that Amazon does, of course, have access to both your keys and the physical devices on which the volumes are provisioned, but they go to great lengths to ensure separation of duties between the staff who may have access to the keys and the staff who may have access to the hardware.

In your humble author's opinion, it is a good idea to use file system or full disk encryption on any computer to protect against physical loss of hardware. All popular operating systems have built-in options to allow this.

Conclusion

In this article, we've looked at the security of your Postgres implementation, from the client perspective to the on-disk storage. We saw how several factors related to server access can affect the security of your Postgres servers and that the following should be considered as part of any deployment or review:

  • Physical access
  • Server access via Unix Domain Sockets and the network
  • Firewalls
  • Transport encryption

We also saw how determining user authentication mechanisms to authenticate different connection attempts is critical to securing your Postgres deployment. Roles are also an important part of security in Postgres, and by configuring and securing them properly, we can use them to minimize the risk to our database servers using the principle of least privilege.

Techniques for securing and minimizing access to sensitive data in Postgres require planning and careful design but can significantly improve the security of your data. Finally, we saw how encryption security for your Postgres deployment encompasses both data and file system/full disk encryption; it may also be desirable to integrate with a key management system.

In all aspects of this journey, there are options that are appropriate for one deployment, and other options that are appropriate for others. Additional functionality should also be considered, such as sepgsql – which can work with SELinux and could take an entire blog series to describe on its own!

Hopefully, this overview is helpful for reviewing the security of your deployments, but do remember that each deployment scenario is unique, and the suggestions made here are not a "one size fits all" solution.

What are the essential steps to secure a PostgreSQL database?

To secure a PostgreSQL database, start by enforcing strong passwords and user authentication methods, such as SCRAM-SHA-256. Configure the pg_hba.conf file to restrict access based on IP addresses and authentication types. Implement network security measures like firewalls to limit access to the database server, and ensure that SSL/TLS encryption is enabled to protect data in transit. Regularly update PostgreSQL and its extensions to patch vulnerabilities, and follow best practices for user roles and permissions to minimize risks.

How do I manage user roles and permissions effectively?

Managing user roles and permissions in PostgreSQL involves creating roles that follow the principle of least privilege, ensuring users only have the permissions necessary for their tasks. Use role hierarchies to simplify the management process and avoid privilege creep by regularly auditing user access and adjusting permissions as needed. Consider implementing groups for common roles, allowing easier assignment of permissions, and leveraging row-level security to control access to specific data.

What is the best way to store sensitive data securely?

Storing sensitive data securely requires using encryption both at rest and in transit. For data at rest, consider utilizing PostgreSQL's built-in support for encryption functions or storing encrypted data at the application level. For data in transit, enable SSL/TLS to ensure that all communication between clients and the PostgreSQL server is encrypted. Additionally, access to sensitive data should be tightly controlled through rigorous role management and data masking techniques where applicable.

How can I enforce data retention and compliance policies?

To enforce data retention and compliance policies in PostgreSQL, create automated processes to archive or delete data that exceeds retention periods. Implement data masking for sensitive information in non-production environments and regularly review and update privacy policies to reflect current regulations. Use PostgreSQL’s logging capabilities to track data access and modifications, providing an audit trail that can be helpful for compliance purposes.

What logging features does PostgreSQL offer for security auditing?

PostgreSQL offers robust logging features that enable administrators to record various activities, including user logins, data changes (INSERTs, UPDATEs, DELETEs), and structural changes (DDL statements). You can configure logging parameters in the postgresql.conf file to control the level of detail recorded. For more advanced auditing, third-party extensions like pgAudit can provide enhanced logging capabilities, allowing for specific event tracking and detailed auditing of database actions.

What should I do to prevent SQL injection attacks?

To prevent SQL injection attacks, always use parameterized queries or prepared statements instead of dynamically constructing SQL queries from user inputs. This practice ensures that user inputs are treated as data rather than executable commands, significantly reducing the risk of injection. Additionally, validate and sanitize user inputs meticulously, and implement strict access controls to limit the permissions of the application connecting to the database.

How do I migrate my PostgreSQL database securely?

When migrating a PostgreSQL database, ensure that the migration is performed over secure connections (using SSL/TLS) to protect data during transfer. Use tools like pg_dump and pg_restore to safely export and import data, ensuring that sensitive information is encrypted during transit. Before completing the migration, verify that all security measures are in place on the new instance, such as updated pg_hba.conf configurations and appropriate user roles.

What is row-level security, and how do I implement it?

Row-level security (RLS) in PostgreSQL allows for fine-grained access control by defining policies that dictate which rows of data users can access based on their roles. To implement RLS, enable it for a specific table using the command ALTER TABLE <table_name> ENABLE ROW LEVEL SECURITY, and then create policies that specify the conditions under which users can view or modify rows. This feature is particularly useful in multi-tenant applications where data isolation is crucial.

How do I handle backups securely?

To handle backups securely, encrypt backup files using strong encryption algorithms both at rest and during transport. Schedule regular backups using automated tools like pg_dump or third-party solutions that provide additional security features, such as encryption and integrity checks. Ensure that access to backup files is limited to authorized personnel only and implement policies for regular testing of backup restoration procedures to validate their reliability.

What are the risks of using third-party PostgreSQL extensions?

Using third-party PostgreSQL extensions can introduce risks such as compatibility issues, security vulnerabilities, or conflicts with existing system configurations. It's essential to thoroughly review the extension's code, documentation, and community feedback before installation. Regularly update extensions to their latest versions, and only source them from reputable repositories to minimize potential security risks and ensure that they comply with PostgreSQL's security standards.
