Loading data (sync or bring your own)

Loading data with Lakehouse sync

If you have a transactional database running in EDB Postgres AI Cloud Service, you can sync tables from that database into a Managed Storage Location. See "How to lakehouse sync" for further details.

Bringing your own data

You can point your Lakehouse node at an arbitrary S3 bucket that contains Delta Tables. However, this currently comes with some major caveats (which will eventually be resolved):

Caveats

  • The tables must be stored as Delta Lake Tables within the location.
  • A "Delta Lake Table" (or "Delta Table") is a folder of Parquet files along with some JSON metadata.
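To make the "folder of Parquet files along with some JSON metadata" description concrete, here is a minimal sketch of a layout check. It assumes the table has been copied to a local directory (a real table would normally live in the bucket) and relies on the Delta Lake convention that the JSON metadata is kept in a `_delta_log` subfolder. The function name `looks_like_delta_table` is hypothetical, and the check is only a heuristic. It does not validate the table's contents.

```python
import tempfile
from pathlib import Path


def looks_like_delta_table(table_dir: Path) -> bool:
    """Heuristic layout check (hypothetical helper, not part of any EDB tool).

    Returns True when the folder has a `_delta_log` subfolder holding at
    least one JSON commit file, plus at least one Parquet data file
    outside `_delta_log`.
    """
    log_dir = table_dir / "_delta_log"
    if not log_dir.is_dir():
        return False

    # JSON metadata: commit files written by Delta Lake writers.
    has_commit = any(log_dir.glob("*.json"))

    # Data files: Parquet files anywhere under the table folder, ignoring
    # checkpoint Parquet files that can appear inside `_delta_log`.
    has_data = any(
        "_delta_log" not in path.relative_to(table_dir).parts
        for path in table_dir.rglob("*.parquet")
    )
    return has_commit and has_data


if __name__ == "__main__":
    # Build a fake table layout with empty placeholder files for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp) / "orders"
        (table / "_delta_log").mkdir(parents=True)
        (table / "_delta_log" / "00000000000000000000.json").write_text("{}")
        (table / "part-00000.parquet").write_bytes(b"")
        print(looks_like_delta_table(table))
```

A folder that fails this check is unlikely to be readable as a Delta Table. Passing it only means the expected files are present.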

Loading data into your bucket

You can use the lakehouse-loader utility to export data from an arbitrary Postgres instance to Delta Tables in a storage bucket. See Delta Lake Table Tools for more information on how to obtain and use that utility.

For further details, see the External Tables documentation.

