Limitations

Blocking operations

Pipelines currently doesn't support asynchronous operations. All embedding operations are blocking and take place in the foreground. The impact of this behavior depends on the type of pipeline being executed:

  • When executing bulk data preparation or bulk embedding, the system will be blocked until the embedding is complete for all the items currently in the database or storage.
  • When using auto-embedding, the system will be blocked until the embedding is complete for each item being inserted or updated.
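
Because these calls block, a client application that must stay responsive may want to issue them from a worker thread. The following is a minimal sketch of that pattern, not part of Pipelines. The function `run_blocking_embedding` is a hypothetical stand-in that only simulates a blocking call with a short sleep. In a real client, it would be replaced by the actual database call. This approach doesn't make the embedding itself asynchronous: the call still occupies the connection that issued it until it returns.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor


def run_blocking_embedding(item_count: int) -> int:
    """Hypothetical stand-in for a blocking bulk-embedding call.

    A real client would run a database call here that returns only
    after every item has been embedded.
    """
    time.sleep(0.01 * item_count)  # simulate work proportional to item count
    return item_count


def embed_in_background(executor: ThreadPoolExecutor, item_count: int) -> Future:
    """Submit the blocking call to a worker thread and return a Future."""
    return executor.submit(run_blocking_embedding, item_count)


with ThreadPoolExecutor(max_workers=1) as executor:
    future = embed_in_background(executor, 5)
    # The calling thread is free to do other work here.
    embedded = future.result()  # blocks only when the result is needed
```

A single worker keeps the sketch simple and avoids issuing several blocking calls at once.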

Observability

Observability is currently limited to the logs and metrics provided by the underlying components. There's no single, unified view for monitoring the entire system.

There's also currently no dedicated way to monitor the progress of the initial pipeline execution. We recommend tracking progress through the system logs.

Large documents

Pipelines currently doesn't chunk large documents.
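
One possible workaround is to split large documents on the client side before storing them. The following is a minimal sketch of fixed-size chunking with overlap, assuming plain-text input. The function name `chunk_text` and the parameters `max_chars` and `overlap` are illustrative choices, not part of Pipelines, and suitable sizes depend on the embedding model in use.

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Consecutive chunks share `overlap` characters so that content cut at
    a chunk boundary still appears whole in one of the two chunks.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be at least 0 and less than max_chars")

    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


example_chunks = chunk_text("abcdefghij", max_chars=4, overlap=1)
```

Character-based splitting is the simplest option. Splitting on sentence or paragraph boundaries may give better results for some content.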

Data filtering

While Pipelines can limit which documents are embedded using SQL filters and views based on the content of rows, it currently doesn't support filtering on the content of data in S3 storage. For S3 storage, selection is limited to subpaths and prefix filtering.
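
To illustrate the difference, prefix filtering selects objects by their key only and never inspects their content. The following sketch shows that behavior in general terms. It isn't Pipelines' implementation, and the object keys and the helper `select_by_prefix` are made up for the example.

```python
def select_by_prefix(keys: list[str], prefix: str) -> list[str]:
    """Return the object keys that start with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical object keys in a bucket.
object_keys = [
    "reports/2024/q1.txt",
    "reports/2024/q2.txt",
    "reports/2023/q4.txt",
    "images/logo.png",
]

# Only the key is examined; the content of each object plays no role.
selected = select_by_prefix(object_keys, "reports/2024/")
```

Organizing objects under meaningful subpaths is therefore the main way to control which ones are embedded.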

Load balancing for models

There's currently no load balancing mechanism for model access.

Data formats

Retriever Pipelines currently supports only text and image formats. Other formats, including structured data, video, and audio, aren't currently supported.

Auto-Embedding for Images

  • Auto-Embedding is only supported for text data. Image data embeddings can be manually computed.

Auto-Data Preparation

  • Auto-Data Preparation is not currently supported. Bulk data preparation can be executed manually.

Upgrading

The aidb and pgfs extensions currently don't support Postgres extension upgrades. To move to a new version of the extensions, you must therefore drop and re-create them. Note that DROP EXTENSION with CASCADE also drops any objects that depend on the extension:

DROP EXTENSION aidb CASCADE;
DROP EXTENSION pgfs CASCADE;
CREATE EXTENSION aidb CASCADE;
CREATE EXTENSION pgfs CASCADE;
