Explore the qualities that make a PostgreSQL Database as a Service (DBaaS) most effective for your operations
A customer once asked us what it would take for them to create an on-premise Database as a Service (DBaaS) from scratch. While this is an interesting question, it requires a deeper understanding of the features and characteristics of an ideal DBaaS.
First, let’s define DBaaS, which is not as clear as it seems. The NIST Definition of Cloud Computing by the US Department of Commerce National Institute of Standards and Technology (NIST) is a good starting point. While it doesn’t define DBaaS directly, its explanation of other cloud service models can help.
NIST identifies three service models for cloud computing: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).
Infrastructure as a Service provides processing, storage, networks, and other fundamental computing resources. Users deploy and run their own software, including operating systems and applications, but they don’t own, manage, or control the underlying cloud infrastructure. They do control operating systems, storage, and deployed applications, and may have limited control over select networking components such as host firewalls.
IaaS is similar to traditional software and database management since users control most aspects of their infrastructure except for the location. The main difference is that a cloud server is used instead of a physical one; think of it as the next logical extension to virtual machines.
Platform as a Service (PaaS) is a cloud service model where users can deploy their own applications made with programming languages, libraries, services, and tools offered and supported by the cloud provider onto the cloud infrastructure. The user does not manage or control the underlying cloud infrastructure, including network, servers, operating systems, or storage, but controls the deployed applications and possibly configuration settings for the application-hosting environment.
PaaS differs from IaaS in that the user pays the cloud provider to use the development platform and infrastructure instead of purchasing these themselves. The provider installs and maintains any needed components on the server, including patching and upgrading the applications and the operating system. Think of PaaS as a deployed environment ready for application development.
Software as a Service (SaaS) is a cloud service model where the user utilizes the provider’s applications on a cloud infrastructure. The applications are accessible from various client devices through a thin client interface, such as a web browser (e.g., web-based email), or a program interface. The user does not own the underlying software and does not manage or control the underlying cloud infrastructure, including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings. All these tasks are the domain of the cloud provider. The required software is deployed on a cloud infrastructure, and the user can access and utilize it as needed. Microsoft Office 365 and Google Docs are typical examples of Software as a Service.
Where does a database fit into all of these “as a service” offerings or models? According to the strict NIST definitions, DBaaS does not map directly onto any of the three models. It is most closely aligned with Platform as a Service, but more narrowly defined: DBaaS is PaaS focused on a database. DBaaS is a cloud service model where the user is given a database deployed on a cloud provider's infrastructure. The cloud provider performs all or most of the administrative tasks and maintenance of the database and operating system. The user focuses on utilizing the database. Depending on the cloud provider and the underlying database, the user may have some control over the database and configuration parameters.
Now that we have defined DBaaS, the focus will shift to the ideal characteristics of DBaaS. Again, The NIST Definition of Cloud Computing is a good starting point as it lists five essential characteristics of cloud computing:
- On-demand self-service: Automated provisioning without human intervention
- Broad network access: Access from anywhere with any number of platforms
- Resource pooling: Multi-tenant utilization of hardware and software (with boundaries) to allow for greater flexibility and greater utilization of resources vs. idle servers sitting in a data center
- Rapid elasticity: Seamless ability to rapidly provision and release resources based on demand
- Measured service: Resource utilization is monitored, controlled, and reported by the cloud service provider, which offers transparency for both the provider and end-user of the utilized service, and the user is billed for what is used
While these are essential characteristics of any cloud service model and serve as a basis for any “as a service” model, it is important to consider the additional characteristics and details outlined below for a DBaaS model.
Control Plane Architecture
Solid architecture is key to success, whether you’re building a house or developing a Database as a Service model. Simply using an API to call scripts like bash, Chef, Ansible, or Terraform to deploy a database in the cloud doesn’t make it DBaaS. A control plane isn’t just about provisioning pipelines that utilize scripts; instead, it is the centralized brain of the DBaaS. It offers a user interface for managing database lifecycle events such as provisioning, scaling, and backing up, and actively manages the databases during operational issues such as hardware failures.
Responding to failures is especially important here. This is because DBaaS faces more complex challenges than typical Software as a Service. Supporting each new user requires deploying and managing independent stateful distributed systems, often on independent hardware. A control plane-based architecture allows more sophisticated management and recovery processes to be implemented in a scalable way.
One off-the-shelf option here is EDB’s Cloud Native Postgres (CNP), a control plane-based architecture that can be wrapped in APIs to deliver much of the DBaaS experience. Compared with EDB’s Terraform/Ansible scripts, or even TPAexec for deploying EDB Postgres Distributed, CNP is control plane-based, while the others are closer to provisioning scripts. The control plane does not need to do everything; its functionality should align with the responsibility model of the DBaaS. Most DBaaS control planes, for example, will not automatically tune the database engine for users.
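The difference between a provisioning script and a control plane can be illustrated with a reconciliation loop: rather than running once, the control plane repeatedly compares the desired state with what it observes and derives corrective actions. The following is a minimal sketch only; `ClusterState`, `reconcile`, and the action strings are hypothetical names invented for illustration and do not reflect how CNP or any other product is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical, simplified view of one database cluster."""
    replicas: int
    storage_gb: int


def reconcile(desired: ClusterState, observed: ClusterState) -> list[str]:
    """Return the actions needed to move `observed` toward `desired`."""
    actions = []
    if observed.replicas < desired.replicas:
        actions.append(f"add {desired.replicas - observed.replicas} replica(s)")
    elif observed.replicas > desired.replicas:
        actions.append(f"remove {observed.replicas - desired.replicas} replica(s)")
    if observed.storage_gb < desired.storage_gb:
        actions.append(f"grow storage to {desired.storage_gb} GB")
    return actions


desired = ClusterState(replicas=2, storage_gb=100)
# Assume a replica was lost to a hardware failure after provisioning.
observed = ClusterState(replicas=1, storage_gb=100)

print(reconcile(desired, observed))  # ['add 1 replica(s)']
print(reconcile(desired, desired))   # []
```

A one-shot provisioning script would stop after the initial deployment; a control plane would run a comparison like this continuously and act on the result.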
Self-Service/On-Demand
Self-service and on-demand require that provisioning, deployment, scaling, configuration, and other activities be performed automatically in minutes, not hours or days. These processes should be API-driven, with human intervention needed only to initiate a request, not to carry out the provisioning and deployment (emailing the admin for provisioning, scaling, restoring, etc., is not self-service).
High availability and disaster recovery should be built in, or at least offered as an option (a development tier, for example, might omit them). Database recovery, for example, should happen automatically, without the user needing to take action to recover a database instance. In a leader failure scenario, a replica should automatically be promoted and new connections routed to the new leader, while the old leader is automatically repaired and rejoins the cluster as a replica.
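The leader failure scenario described above can be sketched as a small state transition. This is an illustrative sketch under simplifying assumptions: the `cluster` dictionary layout and the `fail_over` function are hypothetical, and the first healthy replica is promoted, whereas a real control plane would typically also consider how up to date each replica is.

```python
def fail_over(cluster: dict) -> dict:
    """Promote a healthy replica if the current leader is unhealthy.

    `cluster` is a hypothetical structure:
    {"leader": str, "replicas": [str, ...], "healthy": {name: bool}}
    """
    leader = cluster["leader"]
    if cluster["healthy"][leader]:
        return cluster  # Leader is fine; nothing to do.

    candidates = [r for r in cluster["replicas"] if cluster["healthy"][r]]
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")

    new_leader = candidates[0]
    remaining = [r for r in cluster["replicas"] if r != new_leader]
    # The old leader is queued to rejoin as a replica once it is repaired.
    return {
        "leader": new_leader,
        "replicas": remaining + [leader],
        "healthy": dict(cluster["healthy"]),
    }


cluster = {
    "leader": "pg-1",
    "replicas": ["pg-2", "pg-3"],
    "healthy": {"pg-1": False, "pg-2": True, "pg-3": True},
}
after = fail_over(cluster)
print(after["leader"])    # pg-2
print(after["replicas"])  # ['pg-3', 'pg-1']
```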
Configuration must also be self-service, and not just database configuration (e.g., shared buffers). Network and other infrastructure configurations should also have API-driven controls to manage network segments that can access the database. The underlying database should also be configurable by Database as a Service users; the more configuration/access, the better. However, this must be balanced against the control plane’s ability to manage and keep the system viable.
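As one illustration of self-service network configuration, a user-managed allowlist of network segments could be evaluated as shown here. This is a sketch using Python's standard `ipaddress` module; the `is_allowed` function and the example CIDR ranges are assumptions made for illustration, not part of any specific DBaaS API.

```python
import ipaddress


def is_allowed(client_ip: str, allowed_cidrs: list[str]) -> bool:
    """Return True if `client_ip` falls inside any allowed network segment."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Hypothetical allowlist that a user might set through the service's API.
allowed = ["10.0.0.0/16", "192.168.10.0/24"]

print(is_allowed("10.0.42.7", allowed))    # True
print(is_allowed("203.0.113.9", allowed))  # False
```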
Resource Pooling
Resource pooling is also mentioned in the NIST document but should be reiterated with additional details. The underlying infrastructure should be abstracted away from the user. Based on the self-service characteristic, they should not have to provision their VMs and then ask the Database as a Service to deploy on them, as this is neither control plane-based architecture nor self-service.
DBaaS can be multi-tenanted at various levels. Deployments can take many forms, such as multi-tenancy within a database engine where many DBaaS customers share a single PostgreSQL instance. There can also be multi-tenancy within a virtual machine, with many single-tenant PostgreSQL instances on one virtual machine. Likewise, multi-tenancy can occur within an environment where each virtual machine has its own PostgreSQL, with many such machines in the DBaaS environment.
We recommend multi-tenancy within an environment. Avoid sharing resources at the database or virtual machine levels, as this overly complicates the deployment. Containers can help with multi-tenancy within a virtual machine, but they are extremely hard to get right. They serve better as a packaging mechanism than a resource isolation mechanism.
Measured Services
Consumption for a Database as a Service model should be measured based on time or utilization. For example, instead of billing for the number of cores, the user should be charged for core-hours used. In a more cloud-native, serverless deployment, billing could be based on requests rather than capacity. The measurement should be transparent and result in a predictable cost model.
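To make the core-hours idea concrete, consider this small worked example. The usage figures and the rate of 0.05 per core-hour are invented for illustration, and `monthly_charge` is a hypothetical helper, not a real pricing formula.

```python
def core_hours(cores: int, hours: float) -> float:
    """Core-hours consumed by running `cores` cores for `hours` hours."""
    return cores * hours


def monthly_charge(usage: list[tuple[int, float]], rate_per_core_hour: float) -> float:
    """Total charge for a list of (cores, hours) usage periods."""
    total_core_hours = sum(core_hours(cores, hours) for cores, hours in usage)
    return round(total_core_hours * rate_per_core_hour, 2)


# 4 cores for 200 hours, then scaled up to 8 cores for 40 hours.
usage = [(4, 200.0), (8, 40.0)]

# 4 * 200 + 8 * 40 = 1120 core-hours; 1120 * 0.05 = 56.0
print(monthly_charge(usage, rate_per_core_hour=0.05))  # 56.0
```

Billing on provisioned cores alone would charge for 8 cores across the whole period; billing on core-hours charges only for what was actually running.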
Rapid Elasticity
Rapid elasticity refers to scalability, both up and down. The Database as a Service must orchestrate the scaling automatically. The user should not need to manually fail over database instances as instances are resized; instead, the system scales gracefully. A DBaaS might also scale automatically in response to customer demand. For example, it could increase storage size when storage reaches 90% utilization, or add read replicas when existing read replicas are resource-constrained.
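The automatic scaling rules mentioned above can be sketched as a simple decision function. The 90% storage threshold comes from the example in the text; the 80% CPU threshold, the function name `scaling_actions`, and the use of CPU as the measure of a "resource-constrained" replica are assumptions made for illustration.

```python
def scaling_actions(
    storage_used_gb: float,
    storage_total_gb: float,
    replica_cpu: list[float],
    storage_threshold: float = 0.90,
    cpu_threshold: float = 0.80,  # Assumed value, not from the article.
) -> list[str]:
    """Decide which scaling actions to take for one cluster."""
    actions = []
    if storage_used_gb / storage_total_gb >= storage_threshold:
        actions.append("increase storage")
    # Add a replica only when every existing read replica is constrained.
    if replica_cpu and min(replica_cpu) >= cpu_threshold:
        actions.append("add read replica")
    return actions


print(scaling_actions(92, 100, [0.85, 0.90]))  # ['increase storage', 'add read replica']
print(scaling_actions(50, 100, [0.85, 0.30]))  # []
```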
Shared Responsibility
This characteristic was not explicitly mentioned in the NIST document. The concept of shared responsibility dictates that the cloud provider and the user each bear part of the accountability for ensuring that the Database as a Service is deployed, configured, and maintained, and that it functions as designed.
The cloud provider handles maintenance tasks such as database patching, hardware upgrades, high availability, and backup and restore. The user is responsible for query performance, password management, and resource allocation/selection (determining the hardware suitable for the expected workload). The responsibilities can vary with different cloud services, but the idea of shared responsibility remains.
This raises an interesting question for internal DBaaS: Who is the service provider? Is there an SRE or platform team managing the control plane that upgrades database instances or receives alerts in case of a failure? In a public DBaaS, this role is usually filled by the cloud vendor (EDB, AWS, Azure, etc.); for an internal DBaaS, it must be assigned explicitly and should not be overlooked.
No Vendor Lock-In
One of the goals of moving to a cloud deployment like Database as a Service is to avoid proprietary vendor databases that lock the business into long-term contracts and prevent agility. Agility means you can move the database from on-premise to AWS or Azure, or even switch between them, without major changes to the underlying applications and database.
The goal is not to abstract too far from the familiar database; it should look and feel like PostgreSQL. A database service that doesn’t behave like PostgreSQL can be tough for users. It is preferable to build a control plane around a close-to-open-source PostgreSQL and provide users with the database they are used to.
Governance and Security
Governance and security should be at the forefront of any “as a service” model, and should not be added as an afterthought.
Governance determines how individuals interact with the service. Some users might be able to connect to a database but not create a new one, while others can scale a database but not delete it. The Database as a Service management interface should allow you to limit certain user actions and track all activities.
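The governance rules described above (limiting actions per user and tracking all activity) can be sketched as a role-based permission check with an audit trail. The role names, the `ROLE_PERMISSIONS` mapping, and the `authorize` function are hypothetical and chosen only to mirror the examples in the text.

```python
# Hypothetical role-to-permission mapping mirroring the examples in the text.
ROLE_PERMISSIONS = {
    "developer": {"connect"},
    "operator": {"connect", "scale"},
    "admin": {"connect", "create", "scale", "delete"},
}

# Every authorization decision is recorded, whether allowed or denied.
audit_log: list[tuple[str, str, str, bool]] = []


def authorize(user: str, role: str, action: str) -> bool:
    """Check whether `role` permits `action`, and record the attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append((user, role, action, allowed))
    return allowed


print(authorize("ana", "operator", "scale"))    # True
print(authorize("ana", "operator", "delete"))   # False
print(authorize("ben", "developer", "create"))  # False
print(len(audit_log))                           # 3
```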
Security is more than just patching the database and operating system with the latest security fixes. The design should also minimize the impact of a security breach, for example by implementing network segmentation or OS-level controls to limit potential virtual machine escapes. Therefore, threat modeling and other standard processes are essential when designing a DBaaS. Control plane security is also critical: if the control plane is compromised, all database instances on the platform are also compromised.
In conclusion, creating a Database as a Service from scratch requires significant effort. You need to establish the required infrastructure and architecture to support the control plane, self-service and on-demand, resource pooling, rapid elasticity, and shared responsibility capabilities while trying to avoid vendor lock-in and implementing governance and security.
We believe EDB Postgres® AI Cloud Service provides all these characteristics.
DBaaS stands for Database as a Service, a cloud-based service model that enables users and organizations to create, manage, maintain, and query databases in the cloud.
DBaaS operates by providing a cloud-based environment where the service provider hosts and manages database instances, allowing users to focus on development and data management rather than infrastructure setup.
DBaaS offers scalability, automated backups, high availability, and reduced management overhead, making it easier for enterprises to handle their database needs without extensive in-house resources.
DBaaS can host various types of databases, including relational databases like PostgreSQL and MySQL, as well as non-relational databases such as MongoDB and Cassandra.
Many DBaaS providers support databases across multiple cloud environments, facilitating hybrid and multi-cloud deployment strategies.
Security measures typically include encryption, access controls, compliance with industry standards, and regular security audits conducted by the service provider.
Regular backups are crucial in DBaaS to safeguard data against loss or corruption, with many service providers offering automated backup solutions.
Potential limitations include reduced control over the hardware and software stack, vendor lock-in, and performance constraints based on the provider's infrastructure.
While most DBaaS offerings come with predefined configurations, many providers allow for customization regarding database parameters and resource allocations.
Costs can vary widely depending on the provider, based on storage, compute resources, and additional features. It’s essential to assess your needs and the provider's pricing model.
Data migration from on-premise databases to DBaaS is a common practice and typically involves data transfer methods such as logical dumps or replication.
Most DBaaS platforms provide integrated management tools for monitoring, administration, and performance tuning, allowing users to manage their databases effectively.
To ensure compliance, select a DBaaS provider that adheres to relevant data protection regulations and offers features that support compliance auditing and reporting.
Evaluate factors like security features, performance guarantees, support services, pricing, scalability, and the provider's reputation in the industry before making a decision.