1 change: 1 addition & 0 deletions Dockerfile
@@ -8,6 +8,7 @@ WORKDIR /usr/src/dbt/dbt_project
# Install the dbt Postgres adapter. This step will also install dbt-core
RUN pip install --upgrade pip
RUN pip install dbt-postgres==1.3.1
RUN pip install pytz

# Install dbt dependencies (as specified in packages.yml file)
# Build seeds, models and snapshots (and run tests wherever applicable)
91 changes: 17 additions & 74 deletions README.md
@@ -8,34 +8,21 @@ The `docker-compose.yml` file consists of two services:
that are used to build the data models defined in the example project into a target Postgres database.

## `postgres` service and the Sakila Database
This is an instance of a Postgres database initialised with Sakila database (and thus we are using the
`frantiseks/postgres-sakila` image which is available on Docker Hub).
This is an instance of a Postgres database initialised with Sakila database (and thus we are using the
`frantiseks/postgres-sakila` image which is available on Docker Hub).

The database models a DVD rental store and contains several normalised tables that correspond to films, payments,
The database models a DVD rental store and contains several normalised tables that correspond to films, payments,
customers and other entities.

Sakila Database was developed by Mike Hillyer, who used to be a member of the AB documentation team at MySQL. For more
information regarding Sakila Database you can refer to the
[official MySQL documentation](https://dev.mysql.com/doc/sakila/en/sakila-introduction.html).
Sakila Database was developed by Mike Hillyer, who used to be a member of the AB documentation team at MySQL. For more
information regarding Sakila Database you can refer to the
[official MySQL documentation](https://dev.mysql.com/doc/sakila/en/sakila-introduction.html).
![Sakila DB](https://www.jooq.org/img/sakila.png)


## `dbt` service
This service is built out of the `Dockerfile` and is responsible for creating dbt seeds, models and snapshots
on `postgres` service. The example dbt project contains seeds, models (staging, intermediate and mart) as well as
snapshots.

Note that this is a dummy project, meaning that some entities (including aggregations) might not make too much sense
from a business perspective. For example, even though the Sakila database contains the `customer` table already, we
construct another table called `customer_base` that corresponds to a dbt seed, and is loaded from an external
`csv` file.

Additionally, the models created may not be perfect examples of what should be considered an intermediate or
mart model. In general, if you are interested in gaining a deeper understanding of these terms, I would encourage you to
read the following articles:
- [Staging vs Intermediate vs Mart models in dbt](https://towardsdatascience.com/staging-intermediate-mart-models-dbt-2a759ecc1db1)
- [How to structure your dbt project and data models](https://towardsdatascience.com/dbt-models-structure-c31c8977b5fc)

Feel free to add, modify or remove models while cloning or forking the project in order to serve the purpose you
intend to use it for.
snapshots.


## Running the dummy dbt project
@@ -45,7 +32,7 @@ First, let's build the services defined in our `docker-compose.yml` file:
docker-compose build
```

and now let's run the services so that the dbt models are created in our target Postgres database:
and now let's run the services so that the dbt models are created in our target Postgres database:

```commandline
docker-compose up
@@ -56,13 +43,13 @@ This will spin up two containers namely `dbt` (out of the `dbt-dummy` image) and

Notes:
- For development purposes, both containers will remain up and running
- If you would like to end the `dbt` container, feel free to remove the `&& sleep infinity` in `CMD` command of the
- If you would like to end the `dbt` container, feel free to remove the `&& sleep infinity` in `CMD` command of the
`Dockerfile`


### Building additional or modified data models
Once the containers are up and running, you can still make any modifications in the existing dbt project
and re-run any command to serve the purpose of the modifications.
Once the containers are up and running, you can still make any modifications in the existing dbt project
and re-run any command to serve the purpose of the modifications.

In order to build your data models, you first need to access the container.

@@ -76,41 +63,16 @@ Then enter the running container:
docker exec -it <container-id> /bin/bash
```

And finally:

```commandline
# Install dbt deps
dbt deps

# Build seeds
dbt seed --profiles-dir profiles

# Build data models
dbt run --profiles-dir profiles

# Build snapshots
dbt snapshot --profiles-dir profiles

# Run tests
dbt test --profiles-dir profiles
```

Alternatively, you can run everything in just a single command:

```commandline
dbt build --profiles-dir profiles
```
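
The individual steps listed above can also be scripted. The following is only a minimal sketch, assuming it is run inside the `dbt` container from the project directory; the helper names `dbt_commands` and `run_all` are hypothetical and not part of this project. It prints the commands by default and only executes them when `dry_run=False`.

```python
import shlex
import subprocess

# Assumption: same profiles directory as passed via --profiles-dir above.
PROFILES_DIR = "profiles"


def dbt_commands(profiles_dir=PROFILES_DIR):
    """Return the dbt invocations in the order they are listed above."""
    # `dbt deps` is shown without --profiles-dir in the README, so it is kept that way.
    commands = [["dbt", "deps"]]
    # Note that the dbt subcommand for loading seeds is `seed` (singular).
    for step in ["seed", "run", "snapshot", "test"]:
        commands.append(["dbt", step, "--profiles-dir", profiles_dir])
    return commands


def run_all(dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for cmd in dbt_commands():
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Keeping the commands as argument lists (rather than one shell string) avoids quoting issues if the profiles directory ever contains spaces.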

### Querying seeds, models and snapshots on Postgres
In order to query and verify the seeds, models and snapshots created in the dummy dbt project, simply follow the
steps below.
In order to query and verify the seeds, models and snapshots created in the dummy dbt project, simply follow the
steps below.

Find the container id of the postgres service (`postgres`):
```commandline
docker ps
docker ps
```

Then run
Then run
```commandline
docker exec -it <container-id> /bin/bash
```
@@ -119,22 +81,3 @@ We will then use `psql`, a terminal-based interface for PostgreSQL that allows u
```commandline
psql -U postgres
```

Now you can query the tables constructed from the seeds, models and snapshots defined in the dbt project:
```sql
-- Query seed tables
SELECT * FROM customer_base;

-- Query staging views
SELECT * FROM stg_payment;

-- Query intermediate views
SELECT * FROM int_customers_per_store;
SELECT * FROM int_revenue_by_date;

-- Query mart tables
SELECT * FROM cumulative_revenue;

-- Query snapshot tables
SELECT * FROM int_stock_balances_daily_grouped_by_day_snapshot;
```
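
For a quick non-interactive check, the same relations can be queried from the host without opening a shell in the container. This is a rough sketch under assumptions: the helper names are hypothetical, `<container-id>` must be replaced with the id reported by `docker ps`, and the relation names are simply the ones used in the queries above. It only builds and prints the `docker exec ... psql -c` commands; it does not run them.

```python
import shlex

# Relation names copied from the example queries above.
RELATIONS = [
    "customer_base",
    "stg_payment",
    "int_customers_per_store",
    "int_revenue_by_date",
    "cumulative_revenue",
    "int_stock_balances_daily_grouped_by_day_snapshot",
]


def row_count_sql(relation):
    """Return a statement that counts the rows of one relation."""
    return f"SELECT COUNT(*) FROM {relation};"


def psql_query_command(container_id, sql, user="postgres"):
    """Build the argv that runs a single SQL statement via psql in a container."""
    return ["docker", "exec", container_id, "psql", "-U", user, "-c", sql]


if __name__ == "__main__":
    for relation in RELATIONS:
        cmd = psql_query_command("<container-id>", row_count_sql(relation))
        print(shlex.join(cmd))
```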
1 change: 1 addition & 0 deletions dbt_project/.gitignore
@@ -2,3 +2,4 @@
target/
dbt_packages/
logs/
.user.yml
25 changes: 2 additions & 23 deletions dbt_project/dbt_project.yml
@@ -1,38 +1,17 @@

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'test_dbt_project'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'test_profile'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target" # directory which will store compiled SQL files
clean-targets: # directories to be removed by `dbt clean`
target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

# In this example config, we tell dbt to build all models in the example/ directory
# as tables. These settings can be overridden in the individual model files
# using the `{{ config(...) }}` macro.
#models:
# test_dbt_project:
# # Config indicated by + and applies to all files under models/example/
# example:
# +materialized: view
13 changes: 0 additions & 13 deletions dbt_project/models/intermediate/_intermediate_models.yml

This file was deleted.

7 changes: 0 additions & 7 deletions dbt_project/models/intermediate/int_customers_per_store.sql

This file was deleted.

7 changes: 0 additions & 7 deletions dbt_project/models/intermediate/int_revenue_by_date.sql

This file was deleted.

5 changes: 0 additions & 5 deletions dbt_project/models/marts/_mart_models.yml

This file was deleted.

14 changes: 0 additions & 14 deletions dbt_project/models/marts/cumulative_revenue.sql

This file was deleted.

11 changes: 11 additions & 0 deletions dbt_project/models/schema.yml
@@ -0,0 +1,11 @@
version: 2

models:
  - name: my_model_name
    description: >-
      Description of the model.

    columns:
      - name: column_name
        description: >-
          Description of the column.
13 changes: 0 additions & 13 deletions dbt_project/models/staging/_staging_models.yml

This file was deleted.

6 changes: 0 additions & 6 deletions dbt_project/models/staging/_staging_sources.yml

This file was deleted.

4 changes: 0 additions & 4 deletions dbt_project/models/staging/stg_payment.sql

This file was deleted.

2 changes: 1 addition & 1 deletion dbt_project/packages.yml
@@ -1,3 +1,3 @@
 packages:
   - package: dbt-labs/dbt_utils
-    version: 1.0.0
+    version: 1.1.1
1 change: 1 addition & 0 deletions dbt_project/profiles/.user.yml
@@ -0,0 +1 @@
id: eb8bacf2-ac6f-4e63-a184-7e6a351e42a3
8 changes: 0 additions & 8 deletions dbt_project/snapshots/_snapshots.yml

This file was deleted.

20 changes: 0 additions & 20 deletions dbt_project/snapshots/int_customers_per_store_snapshot.sql

This file was deleted.