A RisingWave adapter plugin for dbt.
RisingWave is a cloud-native streaming database that uses SQL as the interface language. It is designed to reduce the complexity and cost of building real-time applications. See https://www.risingwave.com.
dbt enables data analysts and engineers to transform data using software engineering workflows. For the broader RisingWave integration guide, see https://docs.risingwave.com/integrations/other/dbt.
- Install dbt-risingwave.

      python3 -m pip install dbt-risingwave

- Get RisingWave running by following the official guide: https://www.risingwave.dev/docs/current/get-started/.

- Configure ~/.dbt/profiles.yml.

      default:
        outputs:
          dev:
            type: risingwave
            host: 127.0.0.1
            user: root
            pass: ""
            dbname: dev
            port: 4566
            schema: public
        target: dev

- Run dbt debug to verify the connection.
Detailed reference: docs/.
Use schema_authorization when dbt should create schemas with a specific owner:

    {{ config(materialized='table', schema_authorization='my_role') }}

See docs/configuration.md for model-level and dbt_project.yml examples.
The adapter supports RisingWave session settings such as streaming_parallelism, streaming_parallelism_for_backfill, and streaming_max_parallelism in both profiles and model configs.
See docs/configuration.md for the full configuration matrix.
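As a rough illustration of the model-level form, the sketch below sets streaming_parallelism on a materialized view. The model name orders_mv, the upstream model orders, and the value 4 are placeholders chosen for this example, not recommendations; docs/configuration.md remains the reference for accepted values and precedence.

```sql
-- models/orders_mv.sql (hypothetical model; names and values are illustrative)
{{ config(
    materialized='materialized_view',
    streaming_parallelism=4
) }}

select *
from {{ ref('orders') }}
```

The same setting can instead be placed in the profile when it should apply to every model in the target.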
Use enable_serverless_backfill=true in a model config or profile to enable serverless backfills for streaming queries.
See docs/configuration.md for examples.
background_ddl=true lets supported materializations submit background DDL while still preserving dbt semantics by issuing RisingWave WAIT before dbt continues.
See docs/configuration.md for supported materializations, examples, and the cluster-wide WAIT caveat.
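A minimal sketch of opting a single model into background DDL, assuming materialized_view is among the supported materializations (docs/configuration.md has the authoritative list). The model and upstream names are hypothetical.

```sql
-- models/page_views_mv.sql (hypothetical model)
{{ config(
    materialized='materialized_view',
    background_ddl=true
) }}

select *
from {{ ref('page_views') }}
```

Because the adapter waits for the background job before dbt continues, downstream models in the same run should still see a finished object.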
materialized_view and view support swap-based zero-downtime rebuilds through zero_downtime={'enabled': true} plus the runtime flag --vars 'zero_downtime: true'.
See docs/zero-downtime-rebuilds.md for requirements, cleanup behavior, and helper commands.
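Both switches have to be on for a swap-based rebuild: the model config and the runtime variable. The sketch below shows the model side, reusing the config value quoted above; the model name user_activity_mv and the upstream model events are placeholders.

```sql
-- models/user_activity_mv.sql (hypothetical model)
{{ config(
    materialized='materialized_view',
    zero_downtime={'enabled': true}
) }}

select *
from {{ ref('events') }}
```

The run would then be invoked with the runtime flag, for example dbt run --select user_activity_mv --vars 'zero_downtime: true'. With the config alone and no runtime flag, the swap-based path is presumably not taken; docs/zero-downtime-rebuilds.md describes the exact behavior.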
dbt-risingwave now supports a first version of dbt function resources for RisingWave scalar UDFs.
Current contract:
- supported:
  - SQL scalar functions
  - JavaScript scalar functions via functions/*.sql plus config.language: javascript
  - external Python scalar functions via functions/*.sql plus config.language: python
    - with config.link: http://host:port
    - optional config.remote_name
    - optional config.always_retry_on_network_error
- materialization: CREATE FUNCTION IF NOT EXISTS
- JavaScript async options:
  - config.async: true -> WITH (async = true)
  - config.batch: true -> WITH (batch = true)
  - config.always_retry_on_network_error: true -> WITH (always_retry_on_network_error = true)
- supported volatility config:
  - deterministic -> IMMUTABLE
  - stable -> STABLE
  - non-deterministic -> VOLATILE
Current limits:
- no replace/update path for an existing function body
- no overload-family management
- no aggregate or table functions
- no default arguments
- upstream dbt-core function contracts do not yet map cleanly to RisingWave-native .js authoring or RisingWave external Python UDF authoring, so JavaScript and Python currently use adapter config on functions/*.sql
See docs/functions.md for the full first-version contract and example layout.
RisingWave indexes support INCLUDE and DISTRIBUTED BY clauses beyond what the Postgres adapter exposes. Configure them in the model config:

    {{ config(
        materialized='materialized_view',
        indexes=[
            {'columns': ['user_id'], 'include': ['name', 'email'], 'distributed_by': ['user_id']}
        ]
    ) }}

This generates:

    CREATE INDEX IF NOT EXISTS "__dbt_index_mv_user_id"
    ON mv (user_id)
    INCLUDE (name, email)
    DISTRIBUTED BY (user_id);

| Option | Description |
|---|---|
| columns | Key columns for the index (required). |
| include | Additional columns stored in the index but not part of the key (optional). |
| distributed_by | Columns used to distribute the index across nodes (optional). |

Note: RisingWave does not support unique or type (index method) options from the Postgres adapter. These options are silently ignored.
The adapter follows standard dbt model workflows, with RisingWave-specific materializations and behaviors.
Typical usage:

    {{ config(materialized='materialized_view') }}

    select *
    from {{ ref('events') }}

| Materialization | Notes |
|---|---|
| materialized_view | Creates a materialized view. This is the main streaming materialization for RisingWave. |
| materializedview | Deprecated. Kept only for backward compatibility. Use materialized_view instead. |
| ephemeral | Uses common table expressions under the hood. |
| table | Creates a table from the model query. |
| view | Creates a view from the model query. |
| incremental | Batch-style incremental updates for tables. Prefer materialized_view when a streaming MV fits the workload. |
| connection | Runs a full CREATE CONNECTION statement supplied by the model SQL. |
| source | Runs a full CREATE SOURCE statement supplied by the model SQL. |
| table_with_connector | Runs a full CREATE TABLE ... WITH (...) statement supplied by the model SQL. |
| sink | Creates a sink, either from adapter configs or from a full SQL statement. |

See docs/configuration.md for adapter-specific configuration examples, including streaming session settings and background DDL.
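For the materializations that take a full statement, the model body is the DDL itself rather than a select. The sketch below is one possible source model. It assumes that {{ this }} may be used to name the object, and it uses RisingWave's built-in datagen connector so that no external system is required. The column names and connector option values are illustrative only.

```sql
-- models/demo_source.sql (hypothetical model)
{{ config(materialized='source') }}

CREATE SOURCE {{ this }} (
    v1 int,
    v2 int
) WITH (
    connector = 'datagen',
    fields.v1.kind = 'sequence',
    fields.v1.start = '1',
    fields.v1.end = '1000',
    fields.v2.kind = 'random',
    datagen.rows.per.second = '10'
) FORMAT PLAIN ENCODE JSON
```

A downstream materialized_view model could then read from it with {{ ref('demo_source') }}.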
- docs/README.md: documentation index
- docs/configuration.md: profile options, model configs, sink settings, and background DDL usage
- docs/functions.md: first-version RisingWave scalar function support and limitations
- docs/zero-downtime-rebuilds.md: zero-downtime rebuild behavior for materialized views and views
- dbt run: creates models that do not already exist.
- dbt run --full-refresh: drops and recreates models so the deployed objects match the current dbt definitions.

Graph operators are useful when you want to rebuild only part of a project.

    dbt run --select "my_model+"   # select my_model and all children
    dbt run --select "+my_model"   # select my_model and all parents
    dbt run --select "+my_model+"  # select my_model, and all of its parents and children

dbt-risingwave extends dbt data-test failure storage to support materialized_view in addition to the upstream table and view options.
Example:

    models:
      - name: my_model
        columns:
          - name: id
            tests:
              - not_null:
                  config:
                    store_failures: true
                    store_failures_as: materialized_view

This is useful for realtime monitoring workflows where test failures should remain continuously queryable as a RisingWave materialized view.

- Official dbt example: jaffle_shop
- RisingWave example: dbt_rw_nexmark