Migrating Postgres to the Cloud

October 18, 2021

This blog post is designed to give developers and administrators practical steps and considerations when moving a database application from bare metal, virtual machines, or Kubernetes to public cloud infrastructure. While it does not provide every detail of the move, it does explore the main things that must be considered.
 


Why move to the cloud?

You already have a running application and database on your own infrastructure, so why move it to the cloud? Common reasons include:

  • Improved database uptime
  • Improved network connectivity
  • Easier deployment
  • Hardware size flexibility
  • Greater flexibility for uneven workloads
  • Cost reduction
  • Company policy

 

What are you moving?

Databases rarely operate in isolation. There are usually one or more applications that interact with the database, plus middleware, monitoring, backup, failover, and other tooling. This post focuses on moving the database to the cloud, and discusses issues related to moving database tooling.

 

How is it currently hosted?

As stated earlier, the database could be hosted on bare metal, virtual machines, or Kubernetes/containers. Why is that important? Because you will want to replicate your current setup as closely as possible in the cloud. If you are using Kubernetes/containers, you have already abstracted the dependencies of the database, so you might be able to move Kubernetes pods directly to the cloud infrastructure. If you are using virtual machines, you probably already have tooling to recreate these virtual machines in the cloud. If you are using bare metal, the database might be intricately tied to the operating system, even in ways the staff doesn’t fully grasp. As part of the migration, all database dependencies will need to be identified, and you must confirm that each of them can be replicated in the cloud environment.
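Extensions are one dependency that is easy to check ahead of time. The sketch below is only an illustration: it assumes you have already run the two catalog queries shown (against pg_extension on the current server and pg_available_extensions on the cloud target) and collected the names into lists. The function name missing_extensions and the sample extension lists are hypothetical.

```python
# Catalog queries you could run by hand or from your own tooling.
# pg_extension lists what is installed in the current database;
# pg_available_extensions lists what the target server can install.
INSTALLED_ON_SOURCE_SQL = "SELECT extname FROM pg_extension ORDER BY extname"
AVAILABLE_ON_TARGET_SQL = "SELECT name FROM pg_available_extensions ORDER BY name"


def missing_extensions(installed_on_source, available_on_target):
    """Return the source extensions that the target cannot provide."""
    return sorted(set(installed_on_source) - set(available_on_target))


# Hypothetical query results, for illustration only.
source = ["pg_stat_statements", "plpgsql", "postgis"]
target = ["pg_stat_statements", "plpgsql"]
print(missing_extensions(source, target))
```

An empty result does not prove the move is safe, since operating system packages, cron jobs, and file paths are dependencies too, but a non-empty one is an early warning.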

 

Migrating databases to the cloud: The moving process

Moving databases is never easy. First, they contain a lot of state, often terabytes of it, and moving that much data can be slow.  Second, databases are usually part of the critical enterprise infrastructure, meaning that downtime must be minimized.

The database moving process has two parts: moving the data, and moving everything else, such as the binaries, configuration, and extensions.
Postgres provides several ways of moving data to another system:

  1. File system snapshot
  2. Logical dump, e.g., pg_dumpall
  3. Binary replication
  4. Logical replication

The first method requires the server to be down, while the last two methods provide continuous synchronization, which can be helpful in reducing downtime. If you use the first or third option, you will need to run the same Postgres and operating system versions in the cloud. Moving the other parts is mostly a mechanical process.
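The trade-offs among the four methods can be written down as a small lookup table. This is a sketch that encodes only the properties stated above; the class, field, and function names are hypothetical, and a real decision would also weigh database size and tolerance for downtime.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MigrationMethod:
    name: str
    server_must_be_down: bool   # source server is stopped while data is copied
    continuous_sync: bool       # target keeps following the source until cutover
    needs_same_versions: bool   # same Postgres and OS versions on both sides


METHODS = [
    MigrationMethod("file system snapshot", True, False, True),
    MigrationMethod("logical dump", False, False, False),
    MigrationMethod("binary replication", False, True, True),
    MigrationMethod("logical replication", False, True, False),
]


def candidates(same_versions_available: bool, need_continuous_sync: bool) -> List[str]:
    """Names of the methods that satisfy the two constraints."""
    return [
        m.name
        for m in METHODS
        if (same_versions_available or not m.needs_same_versions)
        and (m.continuous_sync or not need_continuous_sync)
    ]


# The cloud runs different versions and downtime must be kept short.
print(candidates(same_versions_available=False, need_continuous_sync=True))
```

With those two constraints, only logical replication remains in the list.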
 


The cloud target: IaaS vs. DBaaS 

There are various cloud vendors and offerings. Some offerings are infrastructure-as-a-service (IaaS), and others are database-as-a-service (DBaaS). Which one is right for you depends on whether you want the flexibility (and work) of creating and maintaining a database install yourself, or are willing to accept less flexibility in exchange for having the cloud provider create and maintain the database. Both options are useful; the question is whether the flexibility and reduced cost of IaaS are valuable to you.

If you choose a fully managed database as a service, you should consider your support needs and the potential need for a strategic partnership. If the databases you are moving to the cloud are mission-critical, it is important to pick a DBaaS provider with deep expertise in that specific database software. Even when you use proven database software, like Postgres, your application may encounter a bug or need an urgent patch. Make sure the cloud provider can deliver that. Some cloud providers are strategically invested in the database software that they manage for you in the cloud, and they actively participate in the ongoing development of that software. Others just operate it and don’t influence the roadmap. If you plan to move strategically important capabilities to a DBaaS, you may benefit from partnering with a provider who is actively involved in the software’s development, and can improve the software over time to support your long-term needs.
 


So many options

With physical hardware, you buy it once and either upgrade it later or replace it. With cloud computing, you can change hardware capabilities easily. This allows you to right-size your hardware for your current needs, and adjust it later as your workload changes. Cloud hardware might also behave differently from your current systems in its CPU, memory, and I/O characteristics, so it is important to test your workload with various cloud hardware configurations.

 

Putting it all together

Once you have chosen a method to transfer your data, it is time to make the move. How long will the transfer take?  How will you deal with data needs during the switchover?  How will you test if the new configuration meets your performance and recovery needs?
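For the first question, a back-of-the-envelope calculation is a reasonable starting point. The sketch below estimates network transfer time only; the function name and the 70% default link efficiency are assumptions for illustration, and the time needed to create the dump and load it on the other side comes on top.

```python
def estimate_transfer_hours(size_gb: float, bandwidth_mbps: float,
                            efficiency: float = 0.7) -> float:
    """Rough network transfer time in hours.

    size_gb        -- data to move, in decimal gigabytes
    bandwidth_mbps -- nominal link speed, in megabits per second
    efficiency     -- assumed fraction of the link you actually get
    """
    if size_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, bandwidth > 0, 0 < efficiency <= 1")
    size_megabits = size_gb * 8 * 1000          # GB -> megabits
    seconds = size_megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600


# Example: 20 GB over a 100 Mbit/s link.
print(f"{estimate_transfer_hours(20, 100) * 60:.0f} minutes")
```

Under these assumptions, 20 GB takes roughly 38 minutes, while 2 TB over the same link would take close to three days, which is where the continuous synchronization methods start to matter.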

For example, suppose you have a 20 GB database used by a single Python application. The application will remain on local infrastructure—only the database will be moved to the cloud. The application can be down for a few hours, and the database is small so the simple logical dump/restore method is ideal.  Database performance, customization, and cost are not critical, so we will choose DBaaS. Once the cloud database cluster is created and the logical dump loaded, the only step left is to configure the Python application to point to the cloud database instance. Authentication might require adjustment.
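The steps in this example can be sketched in Python. This is an outline under assumptions, not a finished migration script: the host names, user names, and file name are hypothetical, the commands are only built and not executed, and a managed service may restrict what a full pg_dumpall dump is allowed to restore (roles in particular), so the dump might need adjusting. Passwords are deliberately left out of the URI; libpq can read them from the PGPASSWORD environment variable or a ~/.pgpass file.

```python
from urllib.parse import quote, urlencode


def dump_command(host, user, outfile):
    """Argument list for a logical dump of the whole cluster."""
    return ["pg_dumpall", "--host", host, "--username", user, "--file", outfile]


def restore_command(host, user, infile, dbname="postgres"):
    """Argument list for loading the plain-SQL dump with psql."""
    return ["psql", "--host", host, "--username", user,
            "--dbname", dbname, "--file", infile]


def connection_uri(host, dbname, user, port=5432, sslmode="require"):
    """libpq-style connection URI for the application to use after cutover."""
    query = urlencode({"sslmode": sslmode})
    return (f"postgresql://{quote(user, safe='')}@{host}:{port}/"
            f"{quote(dbname, safe='')}?{query}")


# Hypothetical names, for illustration only.
print(dump_command("db.local.example", "postgres", "cluster.sql"))
print(restore_command("db.cloud.example", "admin_user", "cluster.sql"))
print(connection_uri("db.cloud.example", "appdb", "app_user"))
```

The first two lists can be passed to subprocess.run once the application has been stopped, and the URI is what the Python application's database driver would be reconfigured to use. Requiring SSL is a sensible default here because the application and the database no longer share a local network.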

 

Migrate tooling?

A database is rarely used alone. There are usually monitoring, alerting, backup, failover, and load balancing aspects that must be handled. Are these requirements moving to the cloud too? Does the cloud infrastructure make these easier or harder? Should some of these tools remain in your local infrastructure?

 

Getting your data out

While you are probably focused on getting your data into the cloud infrastructure, eventually you might need to get your data out.  Some cloud providers make getting data out either hard or expensive, so consider your exit strategy when choosing a cloud provider.

To experience a fully managed database as a service (DBaaS) that is run and managed by database specialists, try EDB Cloud!
 
