Exploring Postgres's arena allocator by writing an HTTP server from scratch

November 06, 2024

Postgres manages memory through nested arenas called MemoryContexts. MemoryContexts are convenient because, for the most part, you don't need to worry about explicitly freeing memory because memory allocated within the context will be freed when the context is freed. So the next two things you worry about, things that took me a while to understand, are 1) in which MemoryContext should some particular object be allocated (i.e. how long should some particular object live) and 2) when should I reach for creating a new MemoryContext?

To help get a feel for how we can use MemoryContexts we're going to do something a little strange. We're going to build a little HTTP server and web framework from scratch inside of a Postgres extension. This is not (and many other choices made in this post are also not) a good idea for a production extension. But it may prove a sufficiently interesting scenario in which to explore the use of MemoryContexts.

We'll be able to set up the server with some custom routing and handling like this:

DROP EXTENSION IF EXISTS pgweb;
CREATE EXTENSION pgweb;
-- Create a handler for a route
DROP FUNCTION IF EXISTS handle_hello_world;
CREATE FUNCTION handle_hello_world(params JSON) RETURNS TEXT AS $$
BEGIN
 RETURN 'Hello, ' || (params->>'name') || E'!\n';
END;
$$ LANGUAGE plpgsql;
-- Register the handler with the route
SELECT pgweb.register_get('/hello', 'handle_hello_world');
-- Start the server
SELECT pgweb.serve('127.0.0.1', 9003);

It will be about 500 lines of code and you can see the completed project on GitHub.

I'm going to assume you are on a Debian virtual machine (we'll install unstable packages, so beware) so I can give some precise prerequisite instructions. With the right packages installed, you can run the actual code from this post on any Linux distro or macOS. But the memory leak tool we'll use is only available on Linux.

Prerequisite: Grab and build Postgres

Set up the unstable Debian repository and install a couple of packages 1) to build Postgres and 2) for finding memory leaks:

$ sudo sh -c 'echo deb http://cloudfront.debian.net/debian sid main >> /etc/apt/sources.list'
$ sudo apt-get install -y bpfcc-tools libbpfcc libbpfcc-dev linux-headers-$(uname -r) build-essential git bison flex libcurl4-openssl-dev

Now for Postgres:

$ cd ~ $ git clone https://github.com/postgres/postgres
$ cd postgres
$ git checkout REL_16_STABLE
$ ./configure --prefix=$(pwd)/build --libdir=$(pwd)/build/lib --enable-cassert --enable-debug --without-icu --without-readline --without-zlib
$ make -j8
$ make install
$ export PATH="$PATH:$(pwd)/build/bin"

Now we can set up the basic extension boilerplate.

Postgres extension boilerplate

Make a new directory for the extension:

$ cd ~
$ mkdir pgweb
$ cd pgweb

All the following files will be in the pgweb directory.

Inside Makefile write:

MODULES = pgweb
EXTENSION = pgweb
DATA = pgweb--0.0.1.sql
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

And inside pgweb.control write:

default_version = '0.0.1'
module_pathname = '$libdir/pgweb'
relocatable = true

Alright, now we need to think about the SQL interface we'll expose.

The SQL interface

We need users to be able to register routes. And we need them to be able to turn on the server.

So in pgweb--0.0.1.sql write:

-- pgweb--0.0.1.sql
-- Create the extension schema
CREATE SCHEMA pgweb;
-- Registers a route with a Postgres function handler.
-- Example:
--   DROP FUNCTION IF EXISTS handle_hello_world
--   CREATE FUNCTION handle_hello_world(params JSON) RETURNS TEXT AS $$
--   BEGIN
--     RETURN 'Hello, ' || (params->>'name') || E'!\n';
--   END;
--   $$ LANGUAGE plpgsql;
--
--   SELECT pgweb.register_get('/hello', 'handle_hello_world');
CREATE OR REPLACE FUNCTION pgweb.register_get(TEXT, TEXT)
RETURNS VOID AS 'pgweb', 'pgweb_register_get'
LANGUAGE C STRICT;
GRANT EXECUTE ON FUNCTION pgweb.register_get(TEXT, TEXT) TO PUBLIC;
-- Starts the web server at the address and port.
-- Example:
--   SELECT pgweb.serve('127.0.0.1', 9090);
CREATE OR REPLACE FUNCTION pgweb.serve(TEXT, INT)
RETURNS VOID AS 'pgweb', 'pgweb_serve'
LANGUAGE C STRICT;
GRANT EXECUTE ON FUNCTION pgweb.serve(TEXT, INT) TO PUBLIC;
-- Example:
--  $ curl 127.0.0.1:9090/hello?name=Phil
--  Hello, Phil!

The bulk of this is just comments to illustrate how we'll be able to use this extension. And again all this really does is just expose two C functions (we will write soon) as functions that can be called from SQL.

The implementation

First let's set up a top-level MemoryContext, called PGWServerContext that will hold all objects that should live for the lifetime of the server. In pgweb.c:

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <time.h>
#include <unistd.h>
#include "postgres.h"
#include "catalog/pg_type_d.h"
#include "fmgr.h"
#include "nodes/pg_list.h"
#include "parser/parse_func.h"
#include "utils/builtins.h"
#include "utils/datum.h"
#include "utils/json.h"
#include "utils/memutils.h"
#include "utils/regproc.h"
PG_MODULE_MAGIC;
typedef struct PGWHandler PGWHandler;
typedef enum PGWRequestMethod PGWRequestMethod;
typedef struct PGWRequest PGWRequest;
static bool pgweb_handle_connection(int client_fd);
static void pgweb_handle_request(PGWRequest *request, PGWHandler *handler, char **errmsg);
static Datum pgweb_request_params_to_json(PGWRequest *request);
static void pgweb_parse_request_url(PGWRequest *r, int buflen, int *bufp, char **errmsg);
static PGWRequestMethod pgweb_parse_request_method(PGWRequest *r, int buflen, int *bufp, char **errmsg);
static void pgweb_parse_request(PGWRequest *request, char *buf, int buflen, char **errmsg);
static void pgweb_send_response(PGWRequest *request, int code, char *status, char *body);
static MemoryContext PGWServerContext = NULL;

Allocations within a MemoryContext

To allocate memory within the current memory context (a global called CurrentMemoryContext) we cannot use malloc (and similarly cannot use free). We must use CurrentMemoryContext-aware methods from Postgres like palloc, pstrdup, and psprintf. These methods may use malloc under the hood, but if we use malloc itself directly we lose all the benefits of the MemoryContext infrastructure.

We can switch between MemoryContexts with MemoryContextSwitchTo. And we can free all memory within the MemoryContext with MemoryContextReset.

_PG_init

Now we can write a _PG_init() method (that is called by Postgres when the extension is loaded) that will initialize this memory. We're going to have our PGWServerContext be a nested arena inside of TopMemoryContext (the builtin top-level MemoryContext in Postgres which exists for the life of the process).

void _PG_init(void)
{
   PGWServerContext = AllocSetContextCreate(TopMemoryContext,
                                            "PGWServerContext",
                                            ALLOCSET_DEFAULT_SIZES);
}

Next let's handle registering routes.

Registering routes

We need to map paths to a route handler method. To keep things simple we'll use an association list.

typedef struct PGWHandler {
   char *route;
   char *funcname;
} PGWHandler;
static List /* PGWHandler * */ *handlers;

We'll store the handlers list in the PGWServerContext. To do this we must switch into the PGWServerContext before we allocate. And then to be a good citizen we'll switch back out of this context to whatever the previous context was when we are done allocating.

PG_FUNCTION_INFO_V1(pgweb_register_get);
Datum
pgweb_register_get(PG_FUNCTION_ARGS)
{
   MemoryContext oldctx;
   PGWHandler *handler;
   oldctx = MemoryContextSwitchTo(PGWServerContext);
   handler = palloc(sizeof(PGWHandler));
   handler->route = TextDatumGetCString(PG_GETARG_DATUM(0));
   handler->funcname = TextDatumGetCString(PG_GETARG_DATUM(1));
   handlers = lappend(handlers, handler);
   MemoryContextSwitchTo(oldctx);
   PG_RETURN_VOID();
}

Serving routes

Next we need to actually listen on a port and accept new connections. This is a very basic part of UNIX APIs so I will not describe them and refer you to the section in Beej's Guide to Network Programming that covers these APIs if you want to read more.

Basically, we will accept one TCP connection at a time and block until we handle the connection and respond with the appropriate HTTP response. Like many parts of this post this is not a good idea for a production web server. But it keeps the code simple and workable for the example usage we have in mind.

Like the pgweb_register_get method we will switch into the PGWServerContext before and after the meat of the logic so that any allocations happen inside this context.

PG_FUNCTION_INFO_V1(pgweb_serve);
Datum
pgweb_serve(PG_FUNCTION_ARGS)
{
   char *address;
   int32 port = PG_GETARG_INT32(1);
   int server_fd;
   struct sockaddr_in server_addr;
   MemoryContextSwitchTo(PGWServerContext);
   address = TextDatumGetCString(PG_GETARG_DATUM(0));
   memset(&server_addr, 0, sizeof(server_addr));
   server_addr.sin_family = AF_INET;
   server_addr.sin_addr.s_addr = inet_addr(address);
   server_addr.sin_port = htons(port);
   server_fd = socket(AF_INET, SOCK_STREAM, 0);
   if (server_fd == -1)
   {
       int e = errno;
       elog(ERROR, "Could not create socket: %s.", strerror(e));
   }
   if (bind(server_fd, (struct sockaddr *) &server_addr, sizeof(server_addr)) == -1)
   {
       int e = errno;
       elog(ERROR, "Could not bind to %s:%d: %s.", address, port, strerror(e));
   }
   if (listen(server_fd, 10 /* Listen backlog. */) == -1)
   {
       int e = errno;
       elog(ERROR, "Could not listen to %s:%d: %s.", address, port, strerror(e));
   }
   elog(INFO, "Listening on %s:%d.", address, port);
   while (1)
   {
       struct sockaddr_in peer_addr;
       socklen_t peer_addr_size;
       int client_fd = accept(server_fd, (struct sockaddr *) &peer_addr, &peer_addr_size);
       bool stayalive;
       if (client_fd == -1)
       {
           int e = errno;
           elog(ERROR, "Could not accept connection: %s.", strerror(e));
       }
       stayalive = pgweb_handle_connection(client_fd);
       Assert(CurrentMemoryContext == PGWServerContext);
       close(client_fd);
       if (!stayalive)
       {
           elog(INFO, "Shutting down.");
           break;
       }
   }
   close(server_fd);
   MemoryContextReset(PGWServerContext);
   PG_RETURN_VOID();
}

Also notice the Assert(CurrentMemoryContext == PGWServerContext); after the call to pgweb_handle_connection (which we will write next). The assertion is to help us as we're developing make sure that the pgweb_handle_connection also always switches back to the memory context from which it was called (if it ever switches away).

Working with connections

Inside of a connection we will allocate a new MemoryContext to handle all allocations that need only live for the duration of the request itself.

static MemoryContext PGWRequestContext = NULL;

Now let's sketch out the types for an HTTP connection itself. We will only even attempt to parse the request method (GET or POST) and the request URL. Though we will decompose the URL into its path and parameter components.

typedef enum PGWRequestMethod {
   PGW_REQUEST_METHOD_GET,
   PGW_REQUEST_METHOD_POST,
} PGWRequestMethod ;
typedef struct PGWRequestParam {
   char *key;
   char *value;
} PGWRequestParam;
typedef struct PGWRequest {
   int conn_fd; /* Where to read/write */
   char *buf; /* Bytes we have already read */
   PGWRequestMethod method;
   char *url; /* The entire requested URL. */
   char *path; /* Only the path portion of the URL, excluding URL parameters. */
   List /* PGWRequestParam * */ *params; /* All keyword parameters in the URL. */
} PGWRequest;

If we were going to handle multiple concurrent requests, we would want to allocate a new MemoryContext for every request. And we would store the per-request MemoryContext on this PGWRequest struct itself. We would get rid of the global PGWRequestContext. For example:

typedef struct PGWRequest {
   MemoryContext context; /* Where all request memory is allocated */
   int conn_fd; /* Where to read/write */
   char *buf; /* Bytes we have already read */
   PGWRequestMethod method;
   char *url; /* The entire requested URL. */
   char *path; /* Only the path portion of the URL, excluding URL parameters. */
   List /* PGWRequestParam * */ *params; /* All keyword parameters in the URL. */
} PGWRequest;

But since our code only handles a single connection and request at a time, we can simplify things and just use the global PGWRequestContext above.

Handling a connection

To handle a connection we need to read bytes off the wire, parse the bytes into a request, find the appropriate handler for the path, call the handler, otherwise return a 404 if there is no handler for the path.

All of this code will allocate within the PGWRequestContext. So we need to make sure it is initialized and we need to switch into this context for the body of the method. When the method ends, we will "reset" the MemoryContext to free all memory allocated within the context, all at once.

static bool
pgweb_handle_connection(int client_fd)
{
   char *buf;
   ssize_t n;
   char *errmsg = NULL;
   ListCell *lc;
   bool handler_found = false;
   int errcode = 500;
   MemoryContext oldctx;
   clock_t start = clock();
   clock_t stop;
   bool stayalive = true;
   PGWRequest request;
   memset(&request, 0, sizeof(request));
   if (PGWRequestContext == NULL)
       PGWRequestContext = AllocSetContextCreate(PGWServerContext,
                                                    "PGWRequestContext",
                                                    ALLOCSET_DEFAULT_SIZES);
   oldctx = MemoryContextSwitchTo(PGWRequestContext);
   buf = palloc(4096);
   n = recv(client_fd, buf, 4096, 0);
   // Let's just not support longer requests.
   if (n == 4096)
   {
       errmsg = "Request is too long.";
       goto done;
   }
   pgweb_parse_request(&request, buf, n, &errmsg);
   if (errmsg != NULL)
       goto done;
   request.conn_fd = client_fd;
   /* Special case for  */
   if (strcmp(request.url, "/_exit") == 0)
   {
       stayalive = false;
       goto done;
   }
   foreach (lc, handlers)
   {
       PGWHandler *handler = lfirst(lc);
       if (strcmp(handler->route, request.path) == 0)
       {
           pgweb_handle_request(&request, handler, &errmsg);
           handler_found = true;
           break;
       }
   }
   if (!handler_found)
   {
       errcode = 404;
       errmsg = "Not found";
   }
done:
   if (errmsg)
       pgweb_send_response(&request,
                           errcode,
                           errcode == 404 ? "Not Found" : "Internal Server Error",
                           errmsg);
   stop = clock();
   elog(INFO, "[%fs] %s %s",
        (double)(stop - start) / CLOCKS_PER_SEC,
        request.method == PGW_REQUEST_METHOD_GET ? "GET" : "POST",
        request.url);
   Assert(CurrentMemoryContext == PGWRequestContext);
   MemoryContextReset(PGWRequestContext);
   MemoryContextSwitchTo(oldctx);
   return stayalive;
}

Since we loop forever accepting a single connection at a time on a single thread/single process, we hack in this special /_exit path to allow us to shut down the server cleanly.

From here we need to implement three helper methods:

  • pgweb_parse_request
  • pgweb_handle_request
  • pgweb_send_response

Sending a response is simple so let's do that first.

Sending an HTTP response

According to RFC9112 for HTTP, we must write out a status-line to send an HTTP response. We will also send Content-Length and Content-Type headers before sending the HTTP response body itself.

static void
pgweb_send_response(PGWRequest *request, int code, char *status, char *body)
{
   char *buf = psprintf("HTTP/1.1 %d %s\r\n"
                        "Content-Length: %lu\r\n"
                        "Content-Type: text/plain\r\n"
                        "\r\n"
                        "%s",
                        code,
                        status,
                        strlen(body),
                        body);
   ssize_t n = send(request->conn_fd, buf, strlen(buf), 0);
   Assert(CurrentMemoryContext == PGWRequestContext);
   if (n != strlen(buf))
   {
       int e = errno;
       elog(ERROR, "Failed to send response to client: %s.", strerror(e));
   }
}

We could pfree(buf) at the end of this method. We can also at any point create dynamic and short-lived MemoryContexts. For example:

static void
pgweb_send_response(PGWRequest *request, int code, char *status, char *body)
{
   char *buf;
   ssize_t n;
   MemoryContext ctx = AllocSetContextCreate(CurrentMemoryContext,
                                            "SendResponseContext",
                                            ALLOCSET_DEFAULT_SIZES);
   Assert(CurrentMemoryContext == PGWRequestContext);
   MemoryContextSwitchTo(ctx);
   buf = psprintf("HTTP/1.1 %d %s\r\n"
                        "Content-Length: %lu\r\n"
                        "Content-Type: text/plain\r\n"
                        "\r\n"
                        "%s",
                        code,
                        status,
                        strlen(body),
                        body);
   MemoryContextReset(ctx);
   MemoryContextSwitchTo(PGWRequestContext);
   n = send(request->conn_fd, buf, strlen(buf), 0);
   if (n != strlen(buf))
   {
       int e = errno;
       elog(ERROR, "Failed to send response to client: %s.", strerror(e));
   }
}

This can be useful if we are doing a lot of allocations at once that can shortly be freed and we are not near the point where the CurrentMemoryContext will be reset. However, sending a response happens at the end of the lifetime of PGWRequestContext so there is not much value in aggressively and explicitly freeing what we allocated to send a response.

Parsing an HTTP request

Next let's handle parsing a request. Within the request-line we'll basically only look at the HTTP method and the requested URL. And we'll just completely ignore any headers and request body that may or may not exist.

static void
pgweb_parse_request(PGWRequest *request, char *buf, int buflen, char **errmsg)
{
   int bufp = 0;
   Assert(CurrentMemoryContext == PGWRequestContext);
   request->buf = buf;
   request->method = pgweb_parse_request_method(request, buflen, &bufp, errmsg);
   if (request->method == -1)
   {
       /* pgweb_parse_request_method should handle setting the errmsg in this case. */
       Assert(errmsg != NULL);
       return;
   }
   Assert(request->buf[bufp] == ' ');
   bufp++;
   pgweb_parse_request_url(request, buflen, &bufp, errmsg);
}

If the HTTP method is not GET or PUT we'll error out.

static PGWRequestMethod
pgweb_parse_request_method(PGWRequest *r, int buflen, int *bufp, char **errmsg)
{
   int bufp_original = *bufp;
   int len;
   Assert(CurrentMemoryContext == PGWRequestContext);
   while (*bufp < buflen && r->buf[*bufp] != ' ')
       (*bufp)++;
   if (*bufp == buflen)
   {
       *errmsg = psprintf("Incomplete request: '%s'", pnstrdup(r->buf, buflen));
       return -1;
   }
   len = *bufp - bufp_original;
   if (len == 3 && strncmp(r->buf + bufp_original, "GET", len) == 0)
       return PGW_REQUEST_METHOD_GET;
   if (len == 4 && strncmp(r->buf + bufp_original, "POST", len) == 0)
       return PGW_REQUEST_METHOD_POST;
   *errmsg = psprintf("Unsupported method: '%s'", pnstrdup(r->buf + bufp_original, len));
   return -1;
}

And finally we'll parse request parameters.

static void
pgweb_parse_request_url(PGWRequest *r, int buflen, int *bufp, char **errmsg)
{
   int bufp_original = *bufp;
   int bufp_tmp = *bufp;
   int len = 0;
   char *key = NULL;
   char *value = NULL;
   PGWRequestParam *param = NULL;
   bool path_found = false;
   Assert(CurrentMemoryContext == PGWRequestContext);
   r->params = NIL;
   while (*bufp < buflen && r->buf[*bufp] != ' ')
   {
       len = *bufp - bufp_tmp;
       if (r->buf[*bufp] == '?')
       {
           r->path = pnstrdup(r->buf + bufp_tmp, len);
           path_found = true;
           (*bufp)++;
           bufp_tmp = *bufp;
           continue;
       }
       if (r->buf[*bufp] == '=')
       {
           key = pnstrdup(r->buf + bufp_tmp, len);
           (*bufp)++;
           bufp_tmp = *bufp;
           continue;
       }
       if (r->buf[*bufp] == '&')
       {
           value = pnstrdup(r->buf + bufp_tmp, len);
           (*bufp)++;
           bufp_tmp = *bufp;
           param = palloc0(sizeof(PGWRequestParam));
           param->key = key;
           param->value = value;
           r->params = lappend(r->params, param);
           continue;
       }
       (*bufp)++;
   }
   len = *bufp - bufp_original;
   if (!path_found)
       r->path = pnstrdup(r->buf + bufp_original, len);
   else if (key != NULL && strlen(key) > 0)
   {
       param = palloc0(sizeof(PGWRequestParam));
       param->key = key;
       param->value = pnstrdup(r->buf + bufp_tmp, *bufp - bufp_tmp);
       *bufp += len;
       r->params = lappend(r->params, param);
   }
   r->url = pnstrdup(r->buf + bufp_original, len);
}

Importantly we make sure to allocate request parameters within the PGWRequestContext so they get freed when the request ends.

Finally, we need to implement route handling!

Calling a handle for a route

To spice things up, and to demonstrate control over which MemoryContext we allocate, we're going to not just look up the registered handler for a route and call it. We're also going to cache the response the handler returns. If we ever see the same URL again, if it is in our cache, we'll return the cached result immediately and never call the handler again. The cache will be allocated in the PGWServerContext because it must live for the duration of the process rather than the duration of a single request.

typedef struct PGWResponseCache {
   char *url;
   char *response;
} PGWResponseCache;
static List /* PGWResponseCache * */ *response_cache;
static void
pgweb_handle_request(PGWRequest *request, PGWHandler *handler, char **errmsg)
{
   ListCell *lc;
   char *msg = NULL;
   PGWResponseCache *cached = NULL;
   /* If there's a cached response, use it. */
   foreach (lc, response_cache)
   {
       cached = lfirst(lc);
       if (strcmp(cached->url, request->url) == 0)
       {
           msg = cached->response;
           elog(INFO, "Cached request.");
           break;
       }
   }
   /* No cached response, run the route handler! */
   if (!msg)
   {
       List *func_name_list = stringToQualifiedNameList(handler->funcname, NULL);
       Oid argtypes[] = {JSONOID};
       Oid func_oid = LookupFuncName(func_name_list,
                                     sizeof(argtypes) / sizeof(Oid),
                                     argtypes,
                                     false);
       FmgrInfo func_info;
       Datum params = pgweb_request_params_to_json(request);
       Datum result;
       fmgr_info(func_oid, &func_info);
       result = FunctionCall1(&func_info, params);
       msg = TextDatumGetCString(result);
       /* Cache this response for the future. */
       Assert(CurrentMemoryContext == PGWRequestContext);
       MemoryContextSwitchTo(PGWServerContext);
       cached = palloc0(sizeof(*cached));
       cached->url = pstrdup(request->url);
       cached->response = pstrdup(msg);
       response_cache = lappend(response_cache, cached);
       MemoryContextSwitchTo(PGWRequestContext);
   }
   pgweb_send_response(request, 200, "OK", msg);
}

Now we have only one last helper to implement, pgweb_request_params_to_json, which will convert our association list of request parameters to a Postgres JSON Datum so that the SQL handlers can have this nicely accessible mapping of parameters.

We'll do this by building up a JSON string of the request parameter mappings and then casting the string to Postgres's JSON type.

static Datum
pgweb_request_params_to_json(PGWRequest *request)
{
   ListCell *lc;
   StringInfoData json_string;
   Assert(CurrentMemoryContext == PGWRequestContext);
   initStringInfo(&json_string);
   appendStringInfoString(&json_string, "{");
   foreach (lc, request->params)
   {
       PGWRequestParam *param = lfirst(lc);
       if (json_string.len > 1)
           appendStringInfoString(&json_string, ", ");
       /* We're just going to assume there's no quotes in key or value. */
       appendStringInfo(&json_string,
                        "\"%s\": \"%s\"",
                        param->key,
                        param->value);
   }
   appendStringInfoString(&json_string, "}");
   return DirectFunctionCall1(json_in,
                              CStringGetDatum(json_string.data));
}

Here is another place where we do intermediate allocations, some of which we could free up at the end of the function. We could pfree(json_string.data) before returning for example. Or we could again create a new temporary MemoryContext and reset it before returning. Similar to what we discussed being possible in pgweb_send_response.

In the current implementation however these allocations will stick around until the request ends.

And that's it! Let's try it out.

Tying everything together

We've already install Postgres and pg_config and all the other Postgres binaries should be in our $PATH. So within this extension directory run:

$ make
$ make install

And let's start up a Postgres database in which to try out the extension:

$ initdb testdata
The files belonging to this database system will be owned by user "phil".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
Transparent data encryption is disabled.
creating directory testdata ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
   pg_ctl -D testdata -l logfile start
$ pg_ctl -D testdata -l logfile start
waiting for server to start.... done
server started

Now remember our test SQL script:

$ cat test.sql
DROP EXTENSION IF EXISTS pgweb;
CREATE EXTENSION pgweb;
DROP FUNCTION IF EXISTS handle_hello_world;
CREATE FUNCTION handle_hello_world(params JSON) RETURNS TEXT AS $$
BEGIN
 RETURN 'Hello, ' || (params->>'name') || E'!\n';
END;
$$ LANGUAGE plpgsql;
-- For debugging memory usage.
SELECT pg_backend_pid();
SELECT pgweb.register_get('/hello', 'handle_hello_world');
SELECT pgweb.serve('127.0.0.1', 9003);

Let's run it:

$ psql postgres -f test.sql
psql:test.sql:1: NOTICE:  extension "pgweb" does not exist, skipping
DROP EXTENSION
CREATE EXTENSION
psql:test.sql:4: NOTICE:  function handle_hello_world() does not exist, skipping
DROP FUNCTION
CREATE FUNCTION
pg_backend_pid
----------------
         52001
(1 row)
register_get
--------------
(1 row)
psql:test.sql:16: INFO:  Listening on 127.0.0.1:9003.

It is now blocking waiting for requests. So in a new terminal curl the server!

$ curl '127.0.0.1:9003/hello?name=Tim'
Hello, Tim!

What about how we handle unregistered paths?

$ curl 127.0.0.1:9003
Not found

Nice. And to shut down?

$ curl '127.0.0.1:9003/_exit'
curl: (52) Empty reply from server

Great! The psql postgres -f test.sql command above should have now gracefully exited. We can run that SQL script again at any point to turn on the server again.

Looking for memory leaks

In the beginning I said I'd assume you are on Debian to make the install instructions easier. But now I'm going to assume you are on Linux in general (i.e. not macOS) so that we can run the memleak program from the bcc tools repo.

Let's turn off ever resetting a memory context:

$ git diff
diff --git a/pgweb.c b/pgweb.c
index b7eaa98..2c760f3 100644
--- a/pgweb.c
+++ b/pgweb.c
@@ -373,7 +373,7 @@ done:
                request.url);
       Assert(CurrentMemoryContext == PGWRequestContext);
-       MemoryContextReset(PGWRequestContext);
+       //MemoryContextReset(PGWRequestContext);
       MemoryContextSwitchTo(oldctx);
       return stayalive;
@@ -440,6 +440,6 @@ pgweb_serve(PG_FUNCTION_ARGS)
       }
       close(server_fd);
-       MemoryContextReset(PGWServerContext);
+       //MemoryContextReset(PGWServerContext);
       PG_RETURN_VOID();
}

Now rebuild and reinstall the extension.

$ make
$ make install

Destroy and rebuild the Postgres database:

$ pg_ctl -D testdata -l logfile stop
$ rm -rf logfile initdb
$ initdb testdata
The files belonging to this database system will be owned by user "server".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory testdata ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... US/Eastern
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
   pg_ctl -D testdata -l logfile start
$ pg_ctl -D testdata -l logfile start
waiting for server to start.... done
server started

And run the SQL test script:

$ psql postgres -f test.sql
psql:test.sql:1: NOTICE:  extension "pgweb" does not exist, skipping
DROP EXTENSION
CREATE EXTENSION
psql:test.sql:4: NOTICE:  function handle_hello_world() does not exist, skipping
DROP FUNCTION
CREATE FUNCTION
pg_backend_pid
----------------
         66893
(1 row)
register_get
--------------
(1 row)
psql:test.sql:16: INFO:  Listening on 127.0.0.1:9003.

Note that the pg_backend_pid is 66893. Since the server is running within the same process that executes the pgweb.serve() function, the memory leaks that happen within this function will show up in this process.

In a new terminal, start up the memleak program against this process.

$ sudo memleak-bpfcc -p 66893
Attaching to pid 66893, Ctrl+C to quit.

And in another terminal, let's make a ton of curl requests (10,000) against the server to exercise the paths and allocations.

$ cat test.sh
#!/bin/bash
for ((i=0;i<10000;i++)); do
       curl "127.0.0.1:9003/hello?name=Test$i"
done
$ bash test.sh
Hello, Test0!
… omitted …
Hello, Test9999!

And check out the leaks reported by memleak like this:

        66895872 bytes in 12 allocations from stack
               0x0000ffffbbf0c4cc      [unknown] [libc.so.6]
               0x0000ffffbbf0d254      [unknown] [libc.so.6]
               0x0000ffffbbf0e50c      [unknown] [libc.so.6]
               0x0000ffffbbf0f0cc      malloc+0x21c [libc.so.6]
               0x0000fffffffff000      [unknown] [[uprobes]]
               0x0000aaaac416a708      palloc+0x68 [postgres]
               0x0000ffffbbcf1598      pgweb_serve+0x148 [pgweb.so]
               0x0000aaaac3e250d4      ExecInterpExpr+0x1620 [postgres]
               0x0000aaaac3e5e204      ExecResult+0x114 [postgres]
               0x0000aaaac3e28f8c      standard_ExecutorRun+0x14c [postgres]
               0x0000aaaac3ff5b34      PortalRunSelect+0x194 [postgres]
               0x0000aaaac3ff7448      PortalRun+0x268 [postgres]
               0x0000aaaac3ff1d3c      exec_simple_query+0x37c [postgres]
               0x0000aaaac3ff2928      PostgresMain+0x848 [postgres]
               0x0000aaaac3f526d0      ServerLoop.isra.0+0x1c2c [postgres]
               0x0000aaaac3f53824      PostmasterMain+0xec4 [postgres]
               0x0000aaaac3c25f44      main+0x214 [postgres]
               0x0000ffffbbea7740      [unknown] [libc.so.6]
               0x0000ffffbbea7818      __libc_start_main+0x98 [libc.so.6]
               0x0000aaaac3c26270      _start+0x30 [postgres]

About 65328Kib of data leaked during this workload. And that is just one leak it found.

(I am not sure why it can't tell me the line number of the palloc here. But at least we see it coming from within the pgweb_serve callstack.)

Now gracefully exit the server by running:

$ curl localhost:9003/_exit
curl: (52) Empty reply from server

And let's drop that diff we introduced above that commented out resetting memory contexts. i.e. undo this:

$ git diff pgweb.c
diff --git a/pgweb.c b/pgweb.c
index 210656d..94b0888 100644
--- a/pgweb.c
+++ b/pgweb.c
@@ -375,7 +375,7 @@ done:
                request.url);
       Assert(CurrentMemoryContext == PGWRequestContext);
-       MemoryContextReset(PGWRequestContext);
+       //MemoryContextReset(PGWRequestContext);
       MemoryContextSwitchTo(oldctx);
       return stayalive;
@@ -442,6 +442,6 @@ pgweb_serve(PG_FUNCTION_ARGS)
       }
       close(server_fd);
-       MemoryContextReset(PGWServerContext);
+       //MemoryContextReset(PGWServerContext);
       PG_RETURN_VOID();
}

After undoing that diff we will always reset the memory context when we're done with it.

Rebuild and reinstall the extension.

$ make
$ make install

And re-run the test SQL script to start the server:

$ psql postgres -f test.sql
DROP EXTENSION
CREATE EXTENSION
DROP FUNCTION
CREATE FUNCTION
pg_backend_pid
----------------
         87001
(1 row)
register_get
--------------
(1 row)
psql:test.sql:16: INFO:  Listening on 127.0.0.1:9003.

In a new terminal run memleak against this new backend pid:

$ sudo memleak-bpfcc -p 87001
Attaching to pid 87001, Ctrl+C to quit.

And in a new terminal run the 10,000 request workload:

$ bash test.sh
Hello, Test0!
… omitted …
Hello, Test9999!

And note the largest outstanding allocations according to memleak.

        1052672 bytes in 1 allocations from stack
               0x0000ffffbbf0c4cc      [unknown] [libc.so.6]
               0x0000ffffbbf0d254      [unknown] [libc.so.6]
               0x0000ffffbbf0e50c      [unknown] [libc.so.6]
               0x0000ffffbbf0f0cc      malloc+0x21c [libc.so.6]
               0x0000fffffffff000      [unknown] [[uprobes]]
               0x0000aaaac4169fbc      MemoryContextAlloc+0x5c [postgres]
               0x0000aaaac416b338      pstrdup+0x28 [postgres]
               0x0000ffffbbcf1c10      pgweb_serve+0x770 [pgweb.so]
               0x0000aaaac3e250d4      ExecInterpExpr+0x1620 [postgres]
               0x0000aaaac3e5e204      ExecResult+0x114 [postgres]
               0x0000aaaac3e28f8c      standard_ExecutorRun+0x14c [postgres]
               0x0000aaaac3ff5b34      PortalRunSelect+0x194 [postgres]
               0x0000aaaac3ff7448      PortalRun+0x268 [postgres]
               0x0000aaaac3ff1d3c      exec_simple_query+0x37c [postgres]
               0x0000aaaac3ff2928      PostgresMain+0x848 [postgres]
               0x0000aaaac3f526d0      ServerLoop.isra.0+0x1c2c [postgres]
               0x0000aaaac3f53824      PostmasterMain+0xec4 [postgres]
               0x0000aaaac3c25f44      main+0x214 [postgres]
               0x0000ffffbbea7740      [unknown] [libc.so.6]
               0x0000ffffbbea7818      __libc_start_main+0x98 [libc.so.6]
               0x0000aaaac3c26270      _start+0x30 [postgres]

Just 1028Kib outstanding! And this almost certainly comes from the cache which lives for the life of process.

If we now curl the exit route (that resets the PGWServerContext and thus frees the cache):

$ curl localhost:9003/_exit
curl: (52) Empty reply from server

And observe the output of memleak:

        16384 bytes in 1 allocations from stack
               0x0000aaaac4161f18      AllocSetAlloc+0x378 [postgres]
               0x0000aaaac416a480      MemoryContextAllocExtended+0x70 [postgres]
               0x0000aaaac4147570      element_alloc+0x50 [postgres]
               0x0000aaaac41482d0      hash_search_with_hash_value+0x1a0 [postgres]
               0x0000aaaac3d71b1c      oper+0x1c8 [postgres]
               0x0000aaaac3d71f50      make_op+0x70 [postgres]
               0x0000aaaac3d6b964      transformAExprOp+0x74 [postgres]
               0x0000aaaac3d6b954      transformAExprOp+0x64 [postgres]
               0x0000aaaac3d6c204      transformExpr+0x20 [postgres]
               0x0000aaaac3d78f5c      transformTargetEntry+0xbc [postgres]
               0x0000aaaac3d7afd4      transformTargetList+0x80 [postgres]
               0x0000aaaac3d3c618      transformStmt+0xcb4 [postgres]
               0x0000aaaac3d3de38      parse_analyze_withcb+0x54 [postgres]
               0x0000aaaac3ff16c8      pg_analyze_and_rewrite_withcb+0x34 [postgres]
               0x0000aaaac3e6a0c8      _SPI_prepare_plan+0x98 [postgres]
               0x0000aaaac3e6aa68      SPI_prepare_extended+0x98 [postgres]
               0x0000ffffb2b2cd00      exec_prepare_plan+0x40 [plpgsql.so]
               0x0000ffffb2b2d56c      exec_eval_expr+0x27c [plpgsql.so]
               0x0000ffffb2b33118      exec_stmts+0x1a14 [plpgsql.so]
               0x0000ffffb2b34a94      exec_stmt_block+0x544 [plpgsql.so]
               0x0000ffffb2b34bcc      exec_toplevel_block.constprop.0+0x58 [plpgsql.so]
               0x0000ffffb2b354d0      plpgsql_exec_function+0x200 [plpgsql.so]
               0x0000ffffb2b405e8      plpgsql_call_handler+0xf8 [plpgsql.so]
               0x0000aaaac4142cdc      FunctionCall1Coll+0x3c [postgres]
               0x0000ffffbbcf1bd8      pgweb_serve+0x738 [pgweb.so]
               0x0000aaaac3e250d4      ExecInterpExpr+0x1620 [postgres]
               0x0000aaaac3e5e204      ExecResult+0x114 [postgres]
               0x0000aaaac3e28f8c      standard_ExecutorRun+0x14c [postgres]
               0x0000aaaac3ff5b34      PortalRunSelect+0x194 [postgres]
               0x0000aaaac3ff7448      PortalRun+0x268 [postgres]
               0x0000aaaac3ff1d3c      exec_simple_query+0x37c [postgres]
               0x0000aaaac3ff2928      PostgresMain+0x848 [postgres]
               0x0000aaaac3f526d0      ServerLoop.isra.0+0x1c2c [postgres]
               0x0000aaaac3f53824      PostmasterMain+0xec4 [postgres]
               0x0000aaaac3c25f44      main+0x214 [postgres]
               0x0000ffffbbea7740      [unknown] [libc.so.6]
               0x0000ffffbbea7818      __libc_start_main+0x98 [libc.so.6]
               0x0000aaaac3c26270      _start+0x30 [postgres]

We see only even smaller outstanding allocations and these may be happening in memory contexts we don't control, despite being within our call stack. Because, again, anyone can switch to any MemoryContext they want to at any time.

Lastly, let's try out the theory that our major outstanding allocations (when we correctly reset memory contexts) are because of the cache. We can test this by making the same request 10,000 times rather than making a different request 10,000 times.

$ git diff
diff --git a/test.sh b/test.sh
index 7db8f8b..95b6521 100644
--- a/test.sh
+++ b/test.sh
@@ -1,5 +1,5 @@
#!/bin/bash
for ((i=0;i<10000;i++)); do
-       curl "127.0.0.1:9003/hello?name=Test$i"
+       curl "127.0.0.1:9003/hello?name=Test1"
done

Re-run the test SQL script:

$ psql postgres -f test.sql
DROP EXTENSION
CREATE EXTENSION
DROP FUNCTION
CREATE FUNCTION
pg_backend_pid
----------------
         97085
(1 row)
register_get
--------------
(1 row)
psql:test.sql:16: INFO:  Listening on 127.0.0.1:9003.

Attach memleak to this new backend process:

$ sudo memleak-bpfcc -p 97085
Attaching to pid 97085, Ctrl+C to quit.

And run this new workload:

$ bash test.sh
Hello, Test1!
… 9,998 similar log lines omitted …
Hello, Test1!

And observe that even without exiting the server process the biggest allocation remains the same: not our allocation:

        16384 bytes in 1 allocations from stack
               0x0000aaaac4161f18      AllocSetAlloc+0x378 [postgres]
               0x0000aaaac416a480      MemoryContextAllocExtended+0x70 [postgres]
               0x0000aaaac4147570      element_alloc+0x50 [postgres]
               0x0000aaaac41482d0      hash_search_with_hash_value+0x1a0 [postgres]
               0x0000aaaac3d71b1c      oper+0x1c8 [postgres]
               0x0000aaaac3d71f50      make_op+0x70 [postgres]
               0x0000aaaac3d6b964      transformAExprOp+0x74 [postgres]
               0x0000aaaac3d6b954      transformAExprOp+0x64 [postgres]
               0x0000aaaac3d6c204      transformExpr+0x20 [postgres]
               0x0000aaaac3d78f5c      transformTargetEntry+0xbc [postgres]
               0x0000aaaac3d7afd4      transformTargetList+0x80 [postgres]
               0x0000aaaac3d3c618      transformStmt+0xcb4 [postgres]
               0x0000aaaac3d3de38      parse_analyze_withcb+0x54 [postgres]
               0x0000aaaac3ff16c8      pg_analyze_and_rewrite_withcb+0x34 [postgres]
               0x0000aaaac3e6a0c8      _SPI_prepare_plan+0x98 [postgres]
               0x0000aaaac3e6aa68      SPI_prepare_extended+0x98 [postgres]
               0x0000ffffb2b2cd00      exec_prepare_plan+0x40 [plpgsql.so]
               0x0000ffffb2b2d56c      exec_eval_expr+0x27c [plpgsql.so]
               0x0000ffffb2b33118      exec_stmts+0x1a14 [plpgsql.so]
               0x0000ffffb2b34a94      exec_stmt_block+0x544 [plpgsql.so]
               0x0000ffffb2b34bcc      exec_toplevel_block.constprop.0+0x58 [plpgsql.so]
               0x0000ffffb2b354d0      plpgsql_exec_function+0x200 [plpgsql.so]
               0x0000ffffb2b405e8      plpgsql_call_handler+0xf8 [plpgsql.so]
               0x0000aaaac4142cdc      FunctionCall1Coll+0x3c [postgres]
               0x0000ffffbbcf1bd8      pgweb_serve+0x738 [pgweb.so]
               0x0000aaaac3e250d4      ExecInterpExpr+0x1620 [postgres]
               0x0000aaaac3e5e204      ExecResult+0x114 [postgres]
               0x0000aaaac3e28f8c      standard_ExecutorRun+0x14c [postgres]
               0x0000aaaac3ff5b34      PortalRunSelect+0x194 [postgres]
               0x0000aaaac3ff7448      PortalRun+0x268 [postgres]
               0x0000aaaac3ff1d3c      exec_simple_query+0x37c [postgres]
               0x0000aaaac3ff2928      PostgresMain+0x848 [postgres]
               0x0000aaaac3f526d0      ServerLoop.isra.0+0x1c2c [postgres]
               0x0000aaaac3f53824      PostmasterMain+0xec4 [postgres]
               0x0000aaaac3c25f44      main+0x214 [postgres]
               0x0000ffffbbea7740      [unknown] [libc.so.6]
               0x0000ffffbbea7818      __libc_start_main+0x98 [libc.so.6]
               0x0000aaaac3c26270      _start+0x30 [postgres]

So indeed we are able to account for all of our allocations and observe when we misplace/do not free them.

Hopefully this little project helped to build your intuition for how to program with MemoryContext in Postgres!

Share this

More Blogs

Explaining ABI Breakage in PostgreSQL 17.1

PostgreSQL comes out with a scheduled major release every year and scheduled minor releases for all supported versions every quarter. But in the November minor releases, two issues caused the...
December 06, 2024