Modular Architecture & Isolated Testing

Purpose

This document defines the future architectural direction for our Django monolith, focusing on:

Establishing a clear, Django-friendly folder structure for product boundaries
Using frozen dataclasses as the stable interface between products
Introducing facades as the only public interface for products
Enforcing isolation between products to avoid accidental cross-product coupling
Enabling selective testing via Turbo (task caching) and tach (import boundary enforcement)

This is a forward-looking design document, not a migration guide.

Terminology

Different tools use different names for the same concept:

Product — a self-contained feature area under products/<name>/. This is the unit of isolation, ownership, and selective testing.
Django app — the backend implementation of a product (products/<name>/backend/). Registered in INSTALLED_APPS via AppConfig.
Turbo package — the build/test unit defined by package.json. One product = one Turbo package.
tach module — the import boundary node in tach.toml. Maps 1:1 to a product (core code like posthog and ee are also tach modules).

This document uses "product" when talking about boundaries and architecture, and "Django app" only for Django-specific mechanics (models, migrations, apps.py).

1. Why Modularization?

As the codebase grows, running all tests for every change becomes expensive, and startup time of the dev server grows. Our goal:

Reduce CI time via selective testing
Make product boundaries explicit
Prevent accidental cross-product imports
Preserve developer velocity as the system grows

Turbo provides task-level caching so that:

Only tests affected by a change run
Contract files (frozen dataclasses, enums) determine whether downstream products need retesting

tach enforces Python import boundaries, ensuring dependencies are explicitly declared in tach.toml.

To benefit from selective testing, we must introduce architectural boundaries inside the Django monolith.

2. Turbo + tach (Initial Scope: Single Product)

We will begin by wiring up one product to:

Validate the folder structure
Test contract-based selective testing with Turbo
Verify import boundary enforcement with tach
Build the foundation for incremental test selection

Focus:

One product = one Turbo package with backend:test; isolated products also declare backend:contract-check
Non-isolated products must not declare backend:contract-check — turbo-discover uses this key to identify isolated products, and its presence causes selective testing to skip the full Django test suite
Facade (facade/api.py) will define the public interface
Internal files will be private implementation details
Presentation layer (DRF) will sit above the facade but remain outside the contract surface initially

Eventually this grows into:

A dependency graph across products via contract inputs
True selective test execution

But this document is about foundational structure, not full rollout.

3. Folder Structure

Each product adopts the following structure:

myproduct/
  backend/
    __init__.py
    apps.py
    models.py          # Django ORM only
    logic.py           # Business logic

    tasks/
      __init__.py
      tasks.py         # Celery entrypoints (call facade)
      schedules.py     # Celery beat / periodic config (optional)

    facade/
      __init__.py
      api.py           # Facade (the only thing other products may import)
      contracts.py     # Frozen dataclasses (+ enums if small enough)
      enums.py         # Optional: exported enums/shared types when contracts.py grows

    presentation/
      __init__.py
      serializers.py   # DRF serializers (frozen dataclasses <-> JSON)
      views.py         # DRF views (HTTP endpoints)
      urls.py          # HTTP routing

    tests/
      test_models.py
      test_logic.py
      test_api.py            # Facade tests
      test_presentation.py   # DRF integration tests
      test_tasks.py

Why this layout?

Matches Django conventions — low friction
Keeps business logic separate from HTTP concerns
Keeps the product root clean
Provides an explicit, enforced boundary (facade/)
Scales naturally with contract-based selective testing

For the broader monorepo structure (products, services, platform), see monorepo-layout.md.

4. Contracts (`contracts.py`)

Each product defines its public interface as frozen dataclasses in backend/facade/contracts.py. These are the only data structures that cross product boundaries — facades accept and return them, and other products import them.

Rules:

No Django imports
Immutable (frozen=True)
Small, hashable, stable
Facades accept them as inputs and return them as outputs

Example

@dataclass(frozen=True)
class Artifact:
    id: UUID
    project_id: int
    content_hash: str
    storage_path: str
    width: int
    height: int
    size_bytes: int
    created_at: datetime

Contracts should not depend on:

Django models
DRF serializers
Request objects

If input and output shapes are identical, reuse the same dataclass.

5. Facades: The Public Interface

Each product exposes a facade via backend/facade/api.py. This is the only file other products are allowed to import.

Responsibilities

Accept frozen dataclasses as input parameters
Call business logic (logic.py)
Convert Django models → frozen dataclasses before returning
Enforce transactions where needed
Remain thin and stable

Do NOT:

Implement business logic (use logic.py)
Import DRF, HTTP, or serializers
Expose Django models or return ORM instances

Example

class ArtifactAPI:
    @staticmethod
    def create(params: CreateArtifact) -> Artifact:
        instance = logic.create_artifact(params)
        return _to_artifact(instance)

Why explicit mappers?

Facades convert ORM models to frozen dataclasses via mapper functions. These look repetitive when fields align 1:1:

def _to_artifact(instance) -> contracts.Artifact:
    return contracts.Artifact(
        id=instance.id,
        content_hash=instance.content_hash,
        # ... more fields
    )

The value isn't the copying — it's having one place where "internal" becomes "external":

Explicit boundary — the frozen dataclass defines exactly what callers receive. Internal fields don't accidentally leak.
Transformation point — add computed fields, flatten relations, rename for consistency.
Drift absorption — when models and the exposed dataclass diverge, the mapper absorbs it instead of changes leaking everywhere.

The alternative — returning ORM objects — works until it doesn't, then you're retrofitting isolation under pressure.

6. Business Logic (backend/logic.py)

Business logic lives here: validation, calculations, business rules, ORM queries.

Examples:

Deduplication rules
Business invariants
Cross-field validations
Idempotency checks

Why separate from the facade?

Facades must stay thin and stable
Presentation should not contain business rules
Frozen dataclasses remain pure data
Logic is internal implementation — changes here don't affect other products' tests

7. Presentation Layer (DRF)

Located in backend/presentation/.

Responsibilities:

Validate incoming JSON (via DRF serializers)
Convert incoming JSON → frozen dataclasses
Call facade methods
Convert frozen dataclasses → JSON responses
No business logic

Why not mix with the facade?

Keeps HTTP concerns decoupled
Allows reusing business logic for async tasks, CLI, future services

Don't API views leak implementation?

No. Views only call facades, and facades only return frozen dataclasses. The presentation layer remains decoupled from internal details — when the facade hasn't changed, nothing outside the product is affected.

8. Isolation Rules

Forbidden

Importing another product's models.py directly
Importing anything from another product's logic.py
Importing views or serializers from another product
Returning ORM objects from facades

Allowed

Importing another product's backend.facade (the facade)
Using frozen dataclasses returned by facades
Calling business logic from within the same product
Presentation calling its own product's facade

Concrete examples

Product A needs data from Product B — use the facade:

# products/revenue_analytics/backend/logic.py
from products.data_warehouse.backend.facade import DataWarehouseAPI

# OK: calling the facade, getting back frozen dataclasses
tables = DataWarehouseAPI.list_tables(team_id=team_id)

Not this:

# WRONG: importing models directly from another product
from products.data_warehouse.backend.models.table import DataWarehouseTable
tables = DataWarehouseTable.objects.filter(team_id=team_id)

Product exposing functionality — keep the facade thin:

# products/signals/backend/facade/api.py — real example from the codebase
async def emit_signal(team_id, source_product, source_type, source_id, description, weight):
    """Other products call this. They never touch signals' models or internals."""
    ...

Using contracts from another product:

# products/other_product/backend/logic.py
from products.visual_review.backend.facade.contracts import Artifact

def process_artifact(artifact: Artifact) -> None:
    # artifact is a frozen dataclass, not an ORM object
    ...

What tach enforces

The interfaces setting in tach.toml controls which paths inside a product other products can import. This is machine-enforced — tach will reject any import that doesn't go through the declared interfaces.

During migration, existing cross-product model imports are tracked in tach.toml depends_on. The goal is to replace them with facade calls over time.

Django Foreign Keys

Django allows ForeignKey relationships across products. This is still allowed, but ForeignKey relations create implicit reverse dependencies, even if you never use them:

# visual_review/backend/models.py
project = models.ForeignKey(Project, ...)

Django will auto-generate reverse relations (project.visualreview_set), migration dependencies, and app loading order dependencies — all of which violate isolation.

Rule: a product may have ForeignKeys to core models, but other products must not reference models inside this product. Use related_name='+' to disable reverse relations. If you need reverse access, use explicit facade calls rather than ORM traversal.

9. Turbo Tasks & Contract-Based Testing

Each product is a Turborepo package with tasks defined in its package.json.

Contract files vs. implementation files

Turbo uses file-based inputs to determine cache validity. The key distinction:

Contract inputs (used by backend:contract-check):

backend/facade/contracts.py — frozen dataclasses (enums can live here too)
backend/facade/enums.py — optional, for exported enums/constants/shared types when contracts.py grows

Implementation inputs (used by backend:test):

All backend/**/*.py files

Other products depend on a product's contract files only. When contract files haven't changed, downstream products don't need retesting.

Import boundaries are enforced by tach via tach.toml. This ensures products don't accidentally import each other's internals, which would break the contract-based isolation model.

Dependency rules for contract files (keep them pure):

No Django imports (from django.*)
No DRF imports (from rest_framework.*)
Use stdlib for errors, not django.core.exceptions
No from_model() methods — put conversion in implementation code

How selective testing works

other_product tests
       | depends on
visual_review contracts  (facade/contracts.py, facade/enums.py)
       | does NOT depend on
visual_review impl       (logic.py, models.py)

Scenario: Change visual_review/logic.py

visual_review backend:test → reruns (impl files changed)
visual_review backend:contract-check → cache hit (contract files unchanged)
other_product backend:test → skipped (depends only on contracts, which didn't change)

CI commands

# Run all product tests
pnpm turbo run backend:test

# Run specific product tests
pnpm turbo run backend:test --filter=@posthog/products-visual_review

# Run contract checks
pnpm turbo run backend:contract-check

10. Summary

This document outlines the future direction of our codebase:

Django-idiomatic layout with product boundaries
Frozen dataclasses as the stable interface between products
Thin facades as the only public interface
Business logic isolated and testable
DRF presentation decoupled from core logic
Turbo for task caching and selective test execution
tach for Python import boundary enforcement

This architecture reduces coupling, enables selective testing, and keeps the system maintainable as we grow.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modular Architecture & Isolated Testing

Purpose

Terminology

1. Why Modularization?

2. Turbo + tach (Initial Scope: Single Product)

3. Folder Structure

Why this layout?

4. Contracts (`contracts.py`)

Rules:

Example

5. Facades: The Public Interface

Responsibilities

Do NOT:

Example

Why explicit mappers?

6. Business Logic (backend/logic.py)

Why separate from the facade?

7. Presentation Layer (DRF)

Why not mix with the facade?

Don't API views leak implementation?

8. Isolation Rules

Forbidden

Allowed

Concrete examples

What tach enforces

Django Foreign Keys

9. Turbo Tasks & Contract-Based Testing

Contract files vs. implementation files

How selective testing works

CI commands

10. Summary

FilesExpand file tree

architecture.md

Latest commit

History

architecture.md

File metadata and controls

Modular Architecture & Isolated Testing

Purpose

Terminology

1. Why Modularization?

2. Turbo + tach (Initial Scope: Single Product)

3. Folder Structure

Why this layout?

4. Contracts (contracts.py)

Rules:

Example

5. Facades: The Public Interface

Responsibilities

Do NOT:

Example

Why explicit mappers?

6. Business Logic (backend/logic.py)

Why separate from the facade?

7. Presentation Layer (DRF)

Why not mix with the facade?

Don't API views leak implementation?

8. Isolation Rules

Forbidden

Allowed

Concrete examples

What tach enforces

Django Foreign Keys

9. Turbo Tasks & Contract-Based Testing

Contract files vs. implementation files

How selective testing works

CI commands

10. Summary

4. Contracts (`contracts.py`)