I am Developer! (And You Can Too!)

April 05, 2019

A while back, 2ndQuadrant notified a few of us that we should get more involved in Postgres Development in some capacity. Since I’ve essentially fallen off the map when it comes to corresponding with the mailing lists, it seemed like a good way to get back into the habit.

But wait! Don’t we want more community involvement in general? Of course we do! So now is also a great opportunity to share my journey in the hopes others follow suit. Let’s go!

In the Beginning

So how does one start contributing? Do I have to know C? Must I poke around in the internals for weeks to form an understanding, however tenuous, of the Black Magic that animates Postgres? Perhaps I should chant incantations to summon some dark entity to grant the otherworldly powers necessary to comprehend the mind-bleedingly high-level conversations regularly churning within the deeply foreboding confines of the Hackers mailing list?

No.

God no.

If those were the prerequisites for getting involved, there would be approximately one person pushing Postgres forward, and his name is Tom Lane. Everyone else would be too terrified to even approach the process. I certainly count myself among those more timid individuals.

Instead, let’s start somewhere simple, and with something small. Sometimes all it takes to make a difference is to piggyback on the coattails of someone else who knows what they’re doing. If we’re too inexperienced to submit a patch, maybe we can review one instead.

Getting the Party Started (In Here)

Postgres development marches forward steadily in a form of punctuated equilibrium. Every few months, we throw a bunch of patches against the wall, and see what sticks. That cadence is currently marshaled by a master list of past and present commit fests going back to late 2014. It’s hardly an exhaustive resource dating back to antiquity, but it doesn’t need to be.

All we really need is the most recent iteration that’s in progress. At the time of this writing, that’s 2019-03.

Decisions, Decisions

Now for what might just be the most difficult portion of the whole endeavor: choosing a patch to review. Patches are grouped by category, and range from the relatively mundane documentation revision to the more brain-numbing voodoo of planner bugs.

Some of these, by their very nature, may seem storied and impenetrable. The “Logical decoding of two-phase transactions” patch, for example, has been in commitfests since early 2017! It’s probably best to focus on patches that offer high utility, or seem personally interesting. There’s certainly no shortage of selection!

Just browse through a few, click through the patches until one seems interesting, and read the emails. As a personal hint, use this link after opening the “Emails” link in another tab:

 

Don’t click through each message, see them all!

It will let you view the entire conversation on the patch until now, and can be invaluable for understanding some of the background before moving forward.

I personally elected to review the patch on a separate table level option to control compression for a few reasons:

  1. It’s really just an extension of the per-table storage options, modeled on a parameter that already exists: toast_tuple_target.
  2. The patch included documentation for the new parameter, so I’d have some basis for how it should work.
  3. The patch included tests, so I could compare how Postgres behaved with and without it.
  4. The existing conversation was relatively short, so I could provide some meaningful input that wouldn’t be immediately overshadowed by more experienced participants. Gotta maximize that value!
  5. More selfishly, I worked with the author, so I could clarify things off list if necessary.

With that out of the way, it was time to press “Become Reviewer” and get to “work”.

Bits and Pieces

Each patch summary page will, or should, list all of the patches in the “Emails” pane. Simply grab the latest of these and download it into your local copy of the Postgres source.

Wait, you don’t have one of those? Do you have git installed? If not, that’s a whole different conversation. So let’s just assume git is available and get some Postgres code:

git clone git://git.postgresql.org/git/postgresql.git

Then, from inside the freshly cloned postgresql directory, the easiest way to proceed is to work on your own local branch. In most cases, we want to apply our patch to HEAD, so we just need to create a new branch as a snapshot of that before we start reviewing. This is what I did:

git checkout -b cf_table_compress

Now when HEAD advances, it won’t mess with our testing until we rebase or pull HEAD into our branch. But what about obtaining and applying the patch? Remember those email links? Know how to use wget? Great! Here’s the patch I retrieved:

wget https://www.postgresql.org/message-id/attachment/99931/0001-Add-a-table-level-option-to-control-compression.patch

Then we need to apply the patch itself. The patch utility applies patches to code referenced in the patch file. Using it can sometimes be confusing, but there are really only two things relevant to us: the -p parameter, and the patch itself.

Just drop the patch in the base folder of your Postgres repository clone, and do this:

patch -p1 < 0001-Add-a-table-level-option-to-control-compression.patch

Assuming there were no errors, the patch is now applied to the Postgres code, with -p1 having stripped the first path component from every file name listed in the patch. That, in fact, can be our first criterion.

  1. Did the patch apply cleanly?
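If the -p parameter still feels murky, here is a minimal sketch of what it does, run entirely in a throwaway directory rather than a real Postgres checkout. The file names and directory layout are made up for the demonstration, and the --dry-run flag assumes GNU patch (or a compatible implementation). It previews whether a patch would apply cleanly without touching any files.

```shell
#!/bin/sh
# Throwaway demo of "patch -p1"; every path here is hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"

# "a" holds the old file, "b" the new one, mimicking the a/ and b/
# prefixes found in git-style patches. "tree" plays the source checkout.
mkdir -p a/src b/src tree/src
printf 'hello\n' > a/src/demo.txt
printf 'hello\nworld\n' > b/src/demo.txt
cp a/src/demo.txt tree/src/demo.txt

# diff exits with status 1 when the files differ, which is expected here.
diff -u a/src/demo.txt b/src/demo.txt > demo.patch || true

cd tree
# Preview first: reports problems without modifying anything.
patch -p1 --dry-run < ../demo.patch
# -p1 strips the leading "a/" or "b/" so the path becomes src/demo.txt.
patch -p1 < ../demo.patch
```

If the dry run complains about failed hunks, that is already worth reporting: the patch probably needs a rebase against current HEAD.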

Compile, compile, compile

So what now? Well, really we just follow the usual process of building most code, with one exception. If we want to test a running Postgres instance, it’s best to put the binaries somewhere they won’t interfere with the rest of the system, whether that’s a container or just a separate directory. The easiest way is to just set the --prefix before building.

So do this:

./configure --prefix=/opt/cf_table_compress
make
make install

And it’s also good to do this:

make check
make installcheck

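That will invoke the test routines both before and after installing the build, to prove everything works. The first command runs the tests against a temporary instance created from the build tree, while the second runs them against the installed build and expects that server to already be running. It’s a good idea to do this even if the patch didn’t include its own tests, because the patch itself may have broken existing functionality. So now we have some additional criteria to share: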

  1. Did the code compile cleanly?
  2. Did all tests complete successfully?
  3. Did the build install properly?
  4. Does the installed version work?
  5. Did tests succeed against the installed version?
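The last two checks need a live server from the new build. Here is a rough sketch of one way to get there, assuming the same prefix as above. The data directory, log file location, and port 5433 are my own choices for illustration, not anything the patch or Postgres requires. Run it from the top of the source tree, and not as root, since initdb refuses to run that way.

```shell
#!/bin/sh
# Sketch: start a scratch server from the installed build so that
# "make installcheck" has something to connect to.
PREFIX=/opt/cf_table_compress   # same --prefix used for ./configure
DATADIR="$PREFIX/data"          # assumed location; any empty directory works
PORT=5433                       # assumed free port, away from a system Postgres on 5432

if [ -x "$PREFIX/bin/initdb" ] && [ -x "$PREFIX/bin/pg_ctl" ]; then
    # Create a fresh cluster and wait (-w) for the server to come up.
    "$PREFIX/bin/initdb" -D "$DATADIR"
    "$PREFIX/bin/pg_ctl" -D "$DATADIR" -o "-p $PORT" -l "$PREFIX/server.log" -w start

    # PGPORT tells the regression tests which server to talk to.
    PGPORT=$PORT make installcheck

    # Shut the scratch server down again when finished.
    "$PREFIX/bin/pg_ctl" -D "$DATADIR" -m fast stop
    server_status=done
else
    echo "No build found under $PREFIX; run 'make install' first."
    server_status=skipped
fi
```

Leave the server running instead of stopping it if you plan to poke at the patched behavior by hand afterward.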

Blazing a New Trail

At this point we’re free to apply any further criteria we deem necessary. Some starting points that could prove useful:

  • Does the patch do what it claims?
  • Is the functionality accurately described?
  • Can you think of any edge cases it should handle? Test those.
  • Is there anything in the documentation that could be clarified?
  • If applicable, were there any performance regressions?
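As an illustration of poking at an edge case, here is a small sketch using the existing toast_tuple_target parameter rather than anything from the patch under review. It assumes a server from the new build is already listening on port 5433 (my own choice of port) and that the table name toast_demo is free to use. It sets the parameter to its documented minimum of 128 and compares the raw and stored size of a large, highly compressible value.

```shell
#!/bin/sh
# Sketch: try a boundary value for an existing storage parameter.
PSQL=/opt/cf_table_compress/bin/psql   # psql from the patched build
PORT=5433                              # assumed port of the scratch server

if [ -x "$PSQL" ] && "$PSQL" -p "$PORT" -d postgres -c 'SELECT 1' >/dev/null 2>&1; then
    "$PSQL" -p "$PORT" -d postgres -v ON_ERROR_STOP=1 <<'SQL'
-- 128 is the smallest value toast_tuple_target accepts.
CREATE TABLE toast_demo (id int, payload text)
  WITH (toast_tuple_target = 128);

-- A long run of one character compresses extremely well.
INSERT INTO toast_demo VALUES (1, repeat('x', 100000));

-- Compare the logical length with what is actually stored.
SELECT octet_length(payload)  AS raw_bytes,
       pg_column_size(payload) AS stored_bytes
  FROM toast_demo;

DROP TABLE toast_demo;
SQL
    edge_status=ran
else
    echo "No server reachable on port $PORT; start one before testing."
    edge_status=skipped
fi
```

The same shape works for whatever new option a patch introduces: try the smallest allowed value, the largest, and something just outside the range, then note anything surprising.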

Really, the sky is the limit. Just remember that sunlight is the best disinfectant. You’re not just proving that contributing isn’t as hard as it looks; your efforts can directly make Postgres better by proving a patch is worthy of inclusion. You, yes you, have the power to reveal bad behavior before it becomes a bug we have to fix later.

Take as much time as you need, and be as meticulous as you feel necessary. Just share your observations before too long, or existing reviewers or constraints of the commitfest itself may beat you to the punch.

Sharing the Wisdom

Remember that original email thread that the patch summary page referenced? Once a patch has been put through its paces, it’s time to tell everyone what you did, how it went, and how many thumbs (in which directions) the patch deserves.

Did it completely destroy everything, corrupt all data you inserted, and also make vague threats against your pets? Make a note about all of that. Did it perform as expected, and also clean your gutters before the spring rainfalls? We want to know that too. You were the one performing the tests, and did all the hard work to decide what you wanted to verify or augment. Don’t keep it to yourself!

With that said, there really is no formal review procedure. There appears to be a loosely advocated format, but really all that’s necessary is to convey your experiences without omitting anything critical. My own initial effort was probably unduly ostentatious, but hey, I’m still learning. It’s OK to be a bit awkward until experience digs a more comfortable rut.
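Since nothing critical should be left out, it can help to collect the outcome of each step as you go. This is only a sketch of one way to do that; the run_step helper and the labels are my own invention, and the true and false commands are stand-ins so the script runs anywhere. In a real review, each stand-in would be the actual command, such as make or make check.

```shell
#!/bin/sh
# Sketch: run each review step and collect a PASS/FAIL line per step.
report=""

# run_step LABEL COMMAND [ARGS...]
# Runs the command quietly and appends its outcome to the report.
run_step() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        report="${report}${label}: PASS
"
    else
        report="${report}${label}: FAIL
"
    fi
}

# Stand-in commands for demonstration only. Real usage would look like:
#   run_step "Code compiles cleanly" make
run_step "Patch applies cleanly" true
run_step "Regression tests pass" false

# Print the summary, ready to paste into a review email.
printf '%s' "$report"
```

A pass/fail list is only the skeleton, of course. The commentary about what you tried and what surprised you is what the author and committers actually want to read.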

We want to hear from you! We crave your input! Be a part of the Postgres team.

You know you want to.
