Back to blog

Expanding WarehousePG Cluster on existing hosts using gpexpand

July 01, 2026

WarehousePG is our fork of the formerly open-source Greenplum Database. WarehousePG offers a Massively Parallel Processing (MPP) architecture for high-performance, petabyte-scale data processing.

What is gpexpand utility?

gpexpand utility helps in expanding an existing WarehousePG Database across new hosts or existing hosts in the system.

In this guide, I’ll walk through expanding a WarehousePG database cluster using the gpexpand utility by adding new segments on existing segment hosts.

When you run the gpexpand utility, it runs in interactive mode and this helps in creating the new expansion schema.

In this example, I am going to add 2 more segments to the existing hosts:

Step 1: Start the gpexpand Utility

[gpadmin@mdw logs]$ gpexpand
20250604:20:59:20:051060 gpexpand:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.2.1-WHPG-rc.1 build commit:165c56297075f86a39ce8ccbf912a0689e8b8df9'
20250604:20:59:21:051060 gpexpand:mdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.2.1-WHPG-rc.1 build commit:165c56297075f86a39ce8ccbf912a0689e8b8df9) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-26), 64-bit compiled on May  8 2025 13:42:44 Bhuvnesh C. WarehousePG'
20250604:20:59:21:051060 gpexpand:mdw:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20250604:20:59:21:051060 gpexpand:mdw:gpadmin-[WARNING]:-One or more segments are either down or not in preferred role.
...
...
Would you like to initiate a new System Expansion Yy|Nn (default=N):
> y

Step 2: Choose Hosts and Mirroring Strategy

Since we're adding segments to existing hosts, we still input the hostnames (even though they're already part of the cluster). 

I have two segment hosts : sdw1 and sdw2.

Enter a comma separated list of new hosts you want
to add to your array.  Do not include interface hostnames.
**Enter a blank line to only add segments to existing hosts**[]:
> sdw1, sdw2

In this example, I am going to use the existing hosts to expand my cluster.

For mirroring strategy, I chose spread mirroring:

What type of mirroring strategy would you like?
spread|grouped (default=grouped):
> spread

Next, specify how many new primary segments to add per host:

I want to add one primary segment per host. Here I am specifying the data directories for the new primary and mirror.

How many new primary segments per host do you want to add? (default=0):
> 1
Enter new primary data directory 1:
> /data1/primary
Enter new mirror data directory 1:
> /data1/mirror
  • After the above step, it generates an expansion file:

Generating configuration file...
..
20250604:21:00:29:051097 gpexpand:mdw:gpadmin-[INFO]:-Generating input file...
Input configuration file was written to 'gpexpand_inputfile_20250604_210029'.
Please review the file and make sure that it is correct then re-run
with: 
gpexpand -i gpexpand_inputfile_20250604_210029

Step 3: Review the Generated Expansion File

[gpadmin@mdw logs]$ cat gpexpand_inputfile_20250604_210029
sdw1|sdw1|6006|/data1/primary/gpseg12|26|12|p
sdw2|sdw2|7006|/data1/mirror/gpseg12|29|12|m
sdw2|sdw2|6006|/data1/primary/gpseg13|27|13|p
sdw1|sdw1|7006|/data1/mirror/gpseg13|28|13|m

This file defines:

  • Two new primary segments (p) — one on each host.

  • Two mirror segments (m) — mirrored across the other host.

Step 4: Check Current Segment Distribution

Before proceeding, let’s verify the number of segments per host:

gpadmin=# select count(*) from gp_segment_configuration group by hostname;
count
-------
    1
   12
   12
(3 rows)

Every host has 12 segments each. Now after running the expansion, this number will be 14 ( Adding one primary and one mirror ) on each host.

Step 5: Run the Expansion

gpexpand -i gpexpand_inputfile_20250604_210029

Output:

20250604:21:21:08:051698 gpexpand:mdw:gpadmin-[INFO]:-Heap checksum setting consistent across cluster
20250604:21:21:08:051698 gpexpand:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions
20250604:21:21:08:051698 gpexpand:mdw:gpadmin-[INFO]:-did not find gppkg binary. skip package sync
20250604:21:21:09:051698 gpexpand:mdw:gpadmin-[INFO]:-Locking catalog
20250604:21:21:09:051698 gpexpand:mdw:gpadmin-[INFO]:-Locked catalog
...
...
20250604:21:21:22:051698 gpexpand:mdw:gpadmin-[INFO]:-Creating expansion schema
20250604:21:21:23:051698 gpexpand:mdw:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database postgres
20250604:21:21:24:051698 gpexpand:mdw:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database template1
20250604:21:21:25:051698 gpexpand:mdw:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database gpadmin
20250604:21:21:25:051698 gpexpand:mdw:gpadmin-[INFO]:-Starting new mirror segment synchronization
20250604:21:21:48:051698 gpexpand:mdw:gpadmin-[INFO]:-************************************************
20250604:21:21:48:051698 gpexpand:mdw:gpadmin-[INFO]:-Initialization of the system expansion complete.
20250604:21:21:48:051698 gpexpand:mdw:gpadmin-[INFO]:-To begin table expansion onto the new segments
20250604:21:21:48:051698 gpexpand:mdw:gpadmin-[INFO]:-rerun gpexpand
20250604:21:21:48:051698 gpexpand:mdw:gpadmin-[INFO]:-************************************************
20250604:21:21:48:051698 gpexpand:mdw:gpadmin-[INFO]:-Exiting...

Step 6: Run gpexpand Again for Table Expansion

As per the message above, we need to run it one more time for table expansion:

[gpadmin@mdw ~]$ gpexpand
20250604:21:22:09:051967 gpexpand:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.2.1-WHPG-rc.1 build commit:165c56297075f86a39ce8ccbf912a0689e8b8df9'
20250604:21:22:09:051967 gpexpand:mdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.2.1-WHPG-rc.1 build commit:165c56297075f86a39ce8ccbf912a0689e8b8df9) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-26), 64-bit compiled on May  8 2025 13:42:44 Bhuvnesh C. WarehousePG'
20250604:21:22:09:051967 gpexpand:mdw:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20250604:21:22:10:051967 gpexpand:mdw:gpadmin-[INFO]:-Expanding gpadmin.public.test
20250604:21:22:10:051967 gpexpand:mdw:gpadmin-[WARNING]:-Encountered unexpected issue when expanding table gpadmin.public.test, skipping
20250604:21:22:10:051967 gpexpand:mdw:gpadmin-[INFO]:-Resetting status_detail for gpadmin.public.test
20250604:21:22:11:051967 gpexpand:mdw:gpadmin-[INFO]:-Expanding gpadmin.public.test1
20250604:21:22:11:051967 gpexpand:mdw:gpadmin-[INFO]:-Finished expanding gpadmin.public.test1
20250604:21:22:15:051967 gpexpand:mdw:gpadmin-[INFO]:-EXPANSION COMPLETED SUCCESSFULLY
20250604:21:22:15:051967 gpexpand:mdw:gpadmin-[INFO]:-Exiting...

Step 7: Verify Cluster Health with gpstate

Run `gpstate` to see everything is fine:

[gpadmin@mdw ~]$ gpstate -e
20250604:21:29:16:052215 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20250604:21:29:16:052215 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.2.1-WHPG-rc.1 build commit:165c56297075f86a39ce8ccbf912a0689e8b8df9'
20250604:21:29:16:052215 gpstate:mdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.2.1-WHPG-rc.1 build commit:165c56297075f86a39ce8ccbf912a0689e8b8df9) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-26), 64-bit compiled on May  8 2025 13:42:44 Bhuvnesh C. WarehousePG'
20250604:21:29:16:052215 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20250604:21:29:16:052215 gpstate:mdw:gpadmin-[INFO]:-Gathering data from segments...
..
20250604:21:29:20:052215 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20250604:21:29:20:052215 gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20250604:21:29:21:052215 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20250604:21:29:21:052215 gpstate:mdw:gpadmin-[INFO]:-All segments are running normally

Step 8: Confirm Segment Count

gpadmin=# select count(*) from gp_segment_configuration group by hostname;
count
-------
  1
 14
 14
(3 rows)

Step 9: Verify Data Distribution

We have the new segments added to the existing cluster. I have only one table and data is redistributed to the new segments ( id 12 and 13).

gpadmin=# select count(*) , gp_segment_id from test1 group by gp_segment_id;
count | gp_segment_id
-------+---------------
 709 |             5
 724 |            13
 727 |             1
 726 |             6
 711 |             2
 670 |             8
 687 |             0
 771 |             3
 701 |            12
 683 |             9
 724 |            10
 718 |            11
 743 |             7
 706 |             4
(14 rows)

In this example, we expanded the cluster by adding segments on existing hosts (sdw1, sdw2), confirmed their addition, and validated data redistribution.

Expanding WarehousePG on New hosts? Follow this article: WarehousePG cluster on New hosts

To learn more about WarehousePG, visit our WarehousePG docs page or explore the WarehousePG GitHub repository.

Share this