The use of server-side logic, particularly stored procedures, has been a highly contentious topic among database professionals for decades. The question has always been what amount of logic should be encoded in the database vs. in client applications or application servers.
Beyond using the database as a simple data store, there are three levels in which logic can be added to Postgres:
-- Database constraints, e.g. unique, check, foreign keys
-- Triggers
-- Manually-called stored procedures, i.e., functions not called as triggers
Let's look at each of these in turn:
- Database constraints are often seen as optional by database application developers. They don't really do anything except prevent invalid data from being entered into the database. The overhead of performing these checks is often seen as negative, particularly foreign key checks. (Default clauses that call stored procedures are similar to triggers; the serial data type uses this.)
- Triggers are stored procedures assigned to tables that are executed automatically during insert, update, delete, or truncate commands. They can be triggered before the command runs, usually for complex constraint checks or to modify incoming data, e.g. capitalization. When triggered after the command, they perform post-command operations, such as adding entries to a log table.
- Manually-called stored procedures are functions called usually in where clauses or in the target list of select queries. Stored procedures, including ones used as triggers, can be written in many languages.
Now, on to the question of when to use these features. Some Postgres users use none of these features for reasons of performance and portability. Others use all of them, and require applications to call stored procedures to perform tasks, rather than issuing sql statements. This allows database administrators to control every aspect of database access. For most users, the best solution is something in the middle, and figuring out when to use what can be tricky.
For database constraints, it is often possible to perform constraint checks in the application, rather than in the database. While this is possible for most checkconstraints, it is hard to do for unique constraints because multi-user applications rarely share state except inside the database. If database constraints are not used, applications and manually-issued sql queries must perform the checks, and any mistakes must be detected later and corrected. If multiple applications access the database, they must perform the checks in the same way — this is particularly difficult if they are written in different languages. Application upgrades also require constraint synchronization in multi-application environments.
Triggers can be avoided by having applications perform the checks and queries that triggers would have performed. Performing data checks application-side have the same downsides as avoiding database constraints. Additional application queries necessary when triggers are avoided can lead to slower performance due to network round-trip delays.
Avoiding manually-called stored procedures requires all logic to be in the application. This can lead to serious slowdowns because if a function cannot be used in a where clause, the entire data set must be transferred to the client application where filtering can be done.
Ultimately, the decision of when to use server-side logic revolves around efficiency — efficiency of hardware utilization, efficiency of development, and efficiency of data management. Your use of server-side logic will be dependent on which area of efficiency is most important to you. This email thread outlines many of the tradeoffs seen by Postgres users.
Bruce Momjian is a Senior Database Architect at EnterpriseDB.
This post originally appeared on Bruce's personal blog.