Linux command

streambed command

Network

Requires network access or remote resources.

Common examples

Sync

streambed sync --source-url=[postgres://user:pass@host:5432/db] --s3-bucket=[bucket] --s3-endpoint=[https://s3] --s3-prefix=[path] --query-addr=:5433

Backfill

streambed resync --source-url=[postgres://...] --s3-bucket=[bucket] --s3-prefix=[path]

Serve queries

streambed query --s3-bucket=[bucket] --s3-prefix=[path] --query-addr=:5433

Delete a table's data and state

streambed cleanup --s3-bucket=[bucket] --s3-prefix=[path] --table=[name]

Show help for the sync subcommand

streambed sync --help

Description

streambed is a change data capture (CDC) tool written in Go. It tails the PostgreSQL write-ahead log (WAL) through logical replication, writes the resulting changes to S3 as Parquet files, and commits Apache Iceberg metadata so that the data can be queried as a versioned analytical table.

The subcommands divide the work as follows. `sync` runs as a long-lived daemon and can optionally expose a query endpoint that is compatible with the PostgreSQL wire protocol, so existing Postgres clients can read the Iceberg tables without changes. `resync` performs a one-shot backfill using `COPY` under a consistent snapshot. `query` runs the wire-protocol server on its own against tables that are already populated. `cleanup` deletes the S3 objects and tracking state for a given table.

streambed is aimed at offloading analytical workloads from a production Postgres instance to an Iceberg-on-S3 lakehouse while still letting tools that speak the Postgres protocol query the result.
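
Because the query endpoint speaks the PostgreSQL wire protocol, a standard client such as `psql` should be able to connect to it. The following is a minimal sketch and is not taken from the streambed documentation: it assumes the server was started with `--query-addr=:5433` on the local machine, and `my_table` is a hypothetical table name. Whether the server expects a particular user or database name is not stated here, so none is passed. The command is printed rather than executed, so the snippet is safe to paste.

```shell
# Assumed values: adjust for your deployment.
QUERY_HOST="localhost"                  # host where `streambed sync` or `streambed query` runs
QUERY_PORT="5433"                       # port from --query-addr=:5433
SQL="SELECT count(*) FROM my_table"     # my_table is a hypothetical table name

# Build the psql invocation (--host, --port, and --command are standard psql options).
CMD="psql --host=$QUERY_HOST --port=$QUERY_PORT --command=\"$SQL\""

# Print the command for review; copy it into a shell to actually run the query.
echo "$CMD"
```

Any other client that uses the Postgres protocol could be substituted for `psql` in the same way, pointing at the same host and port.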

Parameters

sync
Stream WAL changes, write Iceberg, optionally serve queries.
resync
One-shot backfill via `COPY` under a consistent snapshot.
query
Postgres-wire query server for existing Iceberg tables.
cleanup
Delete S3 objects and state for a table.
--source-url _URL_
Postgres connection string (or `STREAMBED_SOURCE_URL`).
--s3-bucket _NAME_
Destination S3 bucket (or `STREAMBED_S3_BUCKET`).
--s3-endpoint _URL_
S3 endpoint URL, for use with MinIO or other S3-compatible storage.
--s3-prefix _PATH_
Key prefix within the bucket.
--query-addr _HOST:PORT_
Bind address for the Postgres-wire query server.
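
The parameter list above names `STREAMBED_SOURCE_URL` and `STREAMBED_S3_BUCKET` as alternatives to `--source-url` and `--s3-bucket`. A minimal sketch of that usage follows, with placeholder values copied from the examples. It assumes the remaining options are still passed as flags, since no environment variables are listed for them here. The leading `echo` makes this a dry run.

```shell
# Placeholder values copied from the examples; replace with real ones.
export STREAMBED_SOURCE_URL="postgres://user:pass@host:5432/db"
export STREAMBED_S3_BUCKET="bucket"

# With both variables set, --source-url and --s3-bucket are left off the command line.
# `echo` only prints the command; remove it to start the daemon.
echo streambed sync --s3-prefix="path" --query-addr=":5433"
```

Supplying the connection string through the environment also keeps the password out of the command's argument list, which can be visible to other users through process listings.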
