This repository contains a Fivetran Connector designed to generate synthetic data for the "Jaffle Shop" dataset (Customers, Orders, Items, Products, Payments). It mimics an e-commerce transactional database and is built using the Fivetran Connector SDK.
This connector simulates a real-world SaaS application database by generating two kinds of datasets:
- Static Data (Dimension Tables):
  - customers: A fixed set of 10 customers with Japanese and English names.
  - products: A catalog of 20 jaffles, beverages, and desserts with pricing.
  - Note: These tables are re-imported in every sync to capture any potential updates (simulating master data changes).
- Incremental Data (Fact Tables):
  - orders: Transactional records of customer purchases.
  - items: Line items associated with each order (1-3 items per order).
  - payments: Payment records linked to orders.
  - Note: These tables are generated incrementally based on the previous state (last_order_id, etc.).
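The difference between the two kinds of tables can be sketched in plain Python. This is only an illustration of the idea, not the connector's code: `run_sync`, the `CUSTOMERS` stand-in, and the row shapes are hypothetical, and only the `last_order_id` state key comes from the description above.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the static customers table (10 rows).
CUSTOMERS = [{"id": i, "name": f"Customer {i}"} for i in range(1, 11)]


def run_sync(state: Dict[str, int], limit: int) -> Tuple[Dict[str, List[dict]], Dict[str, int]]:
    """Return the rows emitted by one sync and the state to save afterwards."""
    last_order_id = state.get("last_order_id", 0)
    # Fact rows continue from the cursor saved by the previous sync.
    orders = [{"id": last_order_id + n} for n in range(1, limit + 1)]
    rows = {
        "customers": list(CUSTOMERS),  # dimension table: emitted in full every sync
        "orders": orders,              # fact table: only new rows
    }
    new_state = {"last_order_id": last_order_id + limit}
    return rows, new_state


first_rows, state = run_sync({}, limit=3)
second_rows, state = run_sync(state, limit=3)
print([o["id"] for o in first_rows["orders"]])   # [1, 2, 3]
print([o["id"] for o in second_rows["orders"]])  # [4, 5, 6]
```

The second call picks up where the first one stopped because the only thing carried between syncs is the saved state.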
The connector behavior is controlled via configuration.json. The primary configuration parameter is:
- limit: (Integer) The number of new orders to generate in a single sync run. Default is 20.
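A minimal sketch of how a connector might read this parameter with its default. The helper name `read_limit` and the non-negative check are assumptions made for illustration; the actual handling in connector.py may differ.

```python
import json

DEFAULT_LIMIT = 20  # default stated in this README


def read_limit(configuration: dict) -> int:
    """Return the number of orders to generate, falling back to the default."""
    raw = configuration.get("limit", DEFAULT_LIMIT)
    limit = int(raw)  # accepts the integer 20 as well as the string "20"
    if limit < 0:
        # Hypothetical validation, not taken from the connector.
        raise ValueError(f"limit must be non-negative, got {limit}")
    return limit


print(read_limit(json.loads('{"limit": 20}')))  # 20
print(read_limit({}))                           # 20
print(read_limit({"limit": "5"}))               # 5
```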
Example configuration.json:

    {
      "limit": 20
    }

The core logic resides in data_generator.py:
- Deterministic Generation: The generator can be seeded for reproducibility (seed=42 is used in the connector).
- Customers & Products: Hardcoded lists to ensure consistent master data.
- Orders:
  - New orders start from last_order_id + 1.
  - Timestamps are incremented sequentially from the last order time, adding a random 0-10 second delay between orders to simulate natural traffic.
  - Status is randomly assigned (served or cancelled).
- Items:
  - Each order contains 1-3 random items from the product catalog.
  - Quantity is randomized (1-2).
- Payments:
  - Calculated based on the sum of item prices.
  - Payment method is random (credit_card, cash, gift_card).
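The rules above can be sketched as a single function. This is not the actual DataGenerator class: `generate_batch`, the three-product catalog, and the column names are hypothetical. Two further assumptions are marked in the comments: items within an order are distinct products, and the payment amount multiplies each price by its quantity.

```python
import random
from datetime import datetime, timedelta

# Hypothetical three-product catalog; the real one has 20 entries.
PRODUCTS = [
    {"id": 1, "price": 500},
    {"id": 2, "price": 700},
    {"id": 3, "price": 300},
]
CUSTOMER_IDS = list(range(1, 11))


def generate_batch(last_order_id, last_order_time, limit, seed=42):
    """Generate `limit` orders plus their items and payments."""
    rng = random.Random(seed)  # seeded for reproducible output
    orders, items, payments = [], [], []
    ordered_at = last_order_time
    for order_id in range(last_order_id + 1, last_order_id + limit + 1):
        # Each order lands 0-10 seconds after the previous one.
        ordered_at += timedelta(seconds=rng.randint(0, 10))
        orders.append({
            "id": order_id,
            "customer_id": rng.choice(CUSTOMER_IDS),
            "ordered_at": ordered_at,
            "status": rng.choice(["served", "cancelled"]),
        })
        total = 0
        # Assumption: the 1-3 items of an order are distinct products.
        for product in rng.sample(PRODUCTS, rng.randint(1, 3)):
            quantity = rng.randint(1, 2)
            items.append({"order_id": order_id, "product_id": product["id"], "quantity": quantity})
            # Assumption: the amount is price times quantity, summed per order.
            total += product["price"] * quantity
        payments.append({
            "order_id": order_id,
            "amount": total,
            "method": rng.choice(["credit_card", "cash", "gift_card"]),
        })
    return orders, items, payments


start = datetime(2024, 1, 1)
orders, items, payments = generate_batch(0, start, limit=5)
print(len(orders), len(payments))  # 5 5
```

Because the random generator is seeded, calling `generate_batch` again with the same arguments returns the same rows.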
- Python 3.9+
- Fivetran Connector SDK (fivetran-connector-sdk)
Ensure you have the SDK installed:

    pip install fivetran-connector-sdk

You can run the connector locally to test the sync process and view the output (including state updates) in your terminal.

    fivetran debug

This command simulates a sync using your configuration.json and local state.
To deploy this connector to Fivetran:
- Initialize Deployment (if not already done):

      fivetran init

- Deploy:

      fivetran deploy --api-key <YOUR_API_KEY> --destination <DESTINATION_NAME> --connection <CONNECTION_NAME>
For detailed deployment instructions, refer to the Fivetran Connector SDK Documentation.
- connector.py: The entry point for the Fivetran connector. Defines the update function and schema.
- data_generator.py: Contains the DataGenerator class for creating synthetic data.
- schema_utils.py: Helper for defining the Fivetran schema.
- configuration.json: Configuration file for the connector.
- spec.json: Specification file defining the configuration schema.
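For orientation, a schema helper for the five tables might look roughly like the sketch below. The Connector SDK's schema function returns a list of table definitions with "table" and "primary_key" entries. The helper name `build_schema` and the choice of a single `id` primary key per table are assumptions, and the real schema_utils.py may be organized differently.

```python
# Assumption: every table uses a single "id" column as its primary key.
TABLES = ["customers", "products", "orders", "items", "payments"]


def build_schema(tables=TABLES, primary_key="id"):
    """Return table definitions in the list-of-dicts shape used by the SDK's schema function."""
    return [{"table": name, "primary_key": [primary_key]} for name in tables]


def schema(configuration: dict):
    # A connector exposes a function like this so the SDK can learn the tables.
    return build_schema()


for definition in schema({}):
    print(definition["table"], definition["primary_key"])
```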