feat(dsql): Add PostgreSQL schema conversion and migration references #168
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,9 +1,9 @@ | ||
| --- | ||
| name: dsql | ||
| description: "Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL migration, DDL operations, query plan explainability, and SQL compatibility validation. Triggers on phrases like: DSQL, Aurora DSQL, create DSQL table, DSQL schema, migrate to DSQL, distributed SQL database, serverless PostgreSQL-compatible database, DSQL query plan, DSQL EXPLAIN ANALYZE, why is my DSQL query slow." | ||
| description: "Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL migration, PostgreSQL-to-DSQL schema conversion, PL/pgSQL transpilation, FK replacement code generation, OCC retry patterns, ORM migration (Django/Hibernate/Rails), DDL operations, query plan explainability, and SQL compatibility validation. Triggers on phrases like: DSQL, Aurora DSQL, create DSQL table, DSQL schema, migrate to DSQL, convert to DSQL, PostgreSQL to DSQL, distributed SQL database, serverless PostgreSQL-compatible database, DSQL query plan, DSQL EXPLAIN ANALYZE, why is my DSQL query slow, DSQL ENUM, DSQL foreign key, DSQL PL/pgSQL, DSQL trigger, DSQL OCC retry, DSQL Django, DSQL Hibernate, DSQL Rails, DSQL multi-region, DSQL JSONB, DSQL index async, DSQL GIN index." | ||
| license: Apache-2.0 | ||
| metadata: | ||
| tags: aws, aurora, dsql, distributed-sql, distributed, distributed-database, database, serverless, serverless-database, postgresql, postgres, sql, schema, migration, multi-tenant, iam-auth, aurora-dsql, mcp, orm | ||
| tags: aws, aurora, dsql, distributed-sql, distributed, distributed-database, database, serverless, serverless-database, postgresql, postgres, sql, schema, migration, multi-tenant, iam-auth, aurora-dsql, mcp, orm, plpgsql, trigger, enum, foreign-key, occ-retry, django, hibernate, rails, multi-region, schema-conversion, type-mapping | ||
| --- | ||
|
|
||
| # Amazon Aurora DSQL Skill | ||
|
|
@@ -109,6 +109,70 @@ sampled in [.mcp.json](../../.mcp.json) | |
| **When:** Load when migrating a complete MySQL table to DSQL | ||
| **Contains:** End-to-end MySQL CREATE TABLE migration example with decision summary | ||
|
|
||
| ### PostgreSQL Migration (modular): | ||
|
|
||
| #### [pg-migrations/type-mapping.md](references/pg-migrations/type-mapping.md) | ||
|
|
||
| **When:** MUST load when migrating PostgreSQL schemas to DSQL or answering type mapping questions | ||
| **Contains:** Complete PostgreSQL → DSQL type mapping (50+ types), COLLATE "C" rules, NUMERIC precision guidance, JSON/JSONB behavior, array alternatives, migration decision matrix | ||
|
|
||
| #### [pg-migrations/plpgsql-patterns.md](references/pg-migrations/plpgsql-patterns.md) | ||
|
|
||
| **When:** MUST load when converting PL/pgSQL functions or triggers to DSQL-compatible SQL | ||
| **Contains:** 10 transpilation patterns with before/after code, detection signals, app-responsibility notes, unconvertible pattern stubs | ||
|
|
||
| #### [pg-migrations/fk-replacement.md](references/pg-migrations/fk-replacement.md) | ||
|
|
||
| **When:** MUST load when generating FK validation functions or cascade replacement code | ||
| **Contains:** validate_fk_*() templates, cascade function templates, tenant-scoped validation, ORM integration patterns (Django/SQLAlchemy/Spring) | ||
|
|
||
| #### [pg-migrations/index-conversion.md](references/pg-migrations/index-conversion.md) | ||
|
|
||
| **When:** MUST load when resolving `dsql_lint` unfixable index diagnostics (index_using, index_partial, index_expression) | ||
| **Contains:** GIN/GiST/BRIN → btree conversion, partial index removal, expression index → computed column patterns | ||
|
|
||
| #### [pg-migrations/schema-objects.md](references/pg-migrations/schema-objects.md) | ||
|
|
||
| **When:** MUST load when converting ENUM types, materialized views, extensions, roles/grants, or handling multi-schema flattening | ||
| **Contains:** ENUM → CHECK, composite types → json, materialized views → views, extension alternatives, role/IAM mapping, multi-schema consolidation | ||
|
|
||
| #### [pg-migrations/function-compatibility.md](references/pg-migrations/function-compatibility.md) | ||
|
|
||
| **When:** Load when checking if a PostgreSQL function works in DSQL or finding replacements | ||
| **Contains:** Supported/unsupported function matrix, uuid_generate_v4() → gen_random_uuid(), lastval() → currval(), COPY → batched INSERT, maintenance command removal | ||
|
|
||
| #### [pg-migrations/occ-retry-patterns.md](references/pg-migrations/occ-retry-patterns.md) | ||
|
|
||
| **When:** MUST load when generating OCC retry code for any language | ||
| **Contains:** Retry strategy, Python/Node.js/Java/Go implementations, conflict mitigation, idempotent transaction design | ||
|
|
||
| #### [pg-migrations/data-migration.md](references/pg-migrations/data-migration.md) | ||
|
|
||
| **When:** Load when planning data migration from PostgreSQL to DSQL | ||
| **Contains:** Migration order, COPY → batched INSERT patterns, Python/Node.js loaders, sequence alignment, validation queries, pre-flight checklist | ||
|
|
||
| #### [pg-migrations/multi-region.md](references/pg-migrations/multi-region.md) | ||
|
|
||
| **When:** Load when user asks about multi-region DSQL, active-active, or high availability | ||
| **Contains:** Architecture, schema deployment, geographic partitioning, OCC cross-region behavior, performance considerations | ||
|
|
||
| ### ORM Migration Guides: | ||
|
|
||
| #### [orm-guides/django.md](references/orm-guides/django.md) | ||
|
|
||
| **When:** Load when migrating a Django application to DSQL | ||
| **Contains:** aurora-dsql-django adapter setup, model changes, migration patterns, OCC retry decorator | ||
|
|
||
| #### [orm-guides/hibernate.md](references/orm-guides/hibernate.md) | ||
|
|
||
| **When:** Load when migrating a Java/Spring Boot application to DSQL | ||
| **Contains:** Hibernate dialect, HikariCP config, entity changes, Spring Retry, Liquibase patterns | ||
|
|
||
| #### [orm-guides/rails.md](references/orm-guides/rails.md) | ||
|
|
||
| **When:** Load when migrating a Ruby on Rails application to DSQL | ||
| **Contains:** IAM token initializer, model associations without FK, async indexes, OCC retry concern | ||
|
|
||
| ### Query Plan Explainability (modular): | ||
|
|
||
| **When:** MUST load all four at Workflow 8 Phase 0 — [query-plan/plan-interpretation.md](references/query-plan/plan-interpretation.md), [query-plan/catalog-queries.md](references/query-plan/catalog-queries.md), [query-plan/guc-experiments.md](references/query-plan/guc-experiments.md), [query-plan/report-format.md](references/query-plan/report-format.md) | ||
|
|
@@ -278,6 +342,54 @@ PGPASSWORD="$TOKEN" psql "host=$HOST port=5432 user=admin dbname=postgres sslmod | |
|
|
||
| **Safety.** Plan capture uses `readonly_query` exclusively — it rejects INSERT/UPDATE/DELETE/DDL at the MCP layer. Rewrite DML to SELECT (Phase 1) rather than asking `transact --allow-writes` to run it; write-mode `transact` bypasses all MCP safety checks. **MUST NOT** run arbitrary DDL/DML or pl/pgsql. | ||
|
|
||
| ### Workflow 9: Full PostgreSQL → DSQL Schema Migration | ||
|
|
||
| End-to-end conversion of a PostgreSQL schema to DSQL-compatible DDL with companion code generation. Complements `dsql_lint` by handling semantic conversions the linter cannot automate. | ||
|
Member
very long workflow additive, we're recommended to keep the top-level worth considering if instructions can be followed directly out of reference files so top-level phase instructions aren't needed?
||
|
|
||
| **Phase 0 — Load reference material.** Load [pg-migrations/type-mapping.md](references/pg-migrations/type-mapping.md) and [pg-migrations/schema-objects.md](references/pg-migrations/schema-objects.md) before starting. | ||
|
|
||
| **Phase 1 — Lint first.** Run `dsql_lint(sql=source_sql, fix=true)` per Workflow 7. This handles SERIAL, JSON, FK removal, index ASYNC, transaction splitting mechanically. | ||
|
Contributor
nit: maybe we should avoid listing specifics here. When we add support it will be easier to maintain. Instead we can say "unsupported SQL" or similar.
||
|
|
||
| **Phase 2 — Resolve unfixable diagnostics.** For each `unfixable` diagnostic: | ||
| - `index_using` → Load [pg-migrations/index-conversion.md](references/pg-migrations/index-conversion.md), convert GIN/GiST/BRIN to btree | ||
| - `index_partial` → Remove WHERE clause or add filter column to composite index | ||
| - `index_expression` → Add GENERATED ALWAYS AS STORED column + btree index | ||
| - `create_table_as` → CREATE TABLE with explicit columns + INSERT...SELECT | ||
| - `unsupported_alter_table_op` → Table Recreation Pattern per Workflow 6 | ||
|
|
||
| **Phase 3 — Semantic conversions (beyond dsql_lint).** Apply these using skill knowledge: | ||
| - ENUM types → CHECK constraints (load [pg-migrations/schema-objects.md](references/pg-migrations/schema-objects.md)) | ||
| - PL/pgSQL functions/triggers → SQL functions (load [pg-migrations/plpgsql-patterns.md](references/pg-migrations/plpgsql-patterns.md)) | ||
| - Add COLLATE "C" to all string columns | ||
| - uuid_generate_v4() → gen_random_uuid() | ||
| - lastval() → currval('explicit_name') | ||
| - Materialized views → regular views | ||
| - Extensions → alternatives | ||
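
As a rough illustration of the ENUM → CHECK conversion listed in Phase 3, the sketch below builds a CHECK clause from a list of enum labels. The helper name `enum_to_check`, the column name, and the exact clause shape are assumptions made for illustration; the pattern the skill actually prescribes lives in pg-migrations/schema-objects.md, which is not shown in this diff.

```python
def enum_to_check(column: str, labels: list[str]) -> str:
    """Return a CHECK clause restricting `column` to the given enum labels.

    Hypothetical helper. Single quotes inside a label are doubled, which is
    the standard SQL string-literal escape.
    """
    if not labels:
        raise ValueError("an ENUM needs at least one label")
    quoted = ", ".join("'" + label.replace("'", "''") + "'" for label in labels)
    return f"CHECK ({column} IN ({quoted}))"


# CREATE TYPE ticket_status AS ENUM ('open', 'closed') could become a
# plain string column carrying this constraint:
clause = enum_to_check("status", ["open", "closed"])
print(f"status VARCHAR(32) NOT NULL {clause}")
```

Generating the clause from the original label list keeps the allowed values in one place, which makes it easier to regenerate the constraint when a label is added.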
|
|
||
| **Phase 4 — Generate companion code:** | ||
| - FK validation functions (load [pg-migrations/fk-replacement.md](references/pg-migrations/fk-replacement.md)) | ||
| - Cascade functions for ON DELETE CASCADE/SET NULL | ||
| - OCC retry wrapper (load [pg-migrations/occ-retry-patterns.md](references/pg-migrations/occ-retry-patterns.md)) | ||
|
|
||
| **Phase 5 — Re-lint and deploy.** Run `dsql_lint(fix=true)` on the final converted SQL to verify. Deploy each DDL via `transact` (one per call). | ||
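
Deploying one DDL per call means the converted script has to be split into individual statements first. The sketch below is one simplified way to do that; `split_statements` is a hypothetical helper, and it only understands single-quoted string literals. Dollar-quoted bodies and comments that contain semicolons are not handled, so treat it as a starting point rather than a parser.

```python
def split_statements(script: str) -> list[str]:
    """Split a SQL script on semicolons that are outside single-quoted strings."""
    statements: list[str] = []
    current: list[str] = []
    in_string = False
    for ch in script:
        if ch == "'":
            # A doubled quote ('') toggles twice, so the state is unchanged.
            in_string = not in_string
        if ch == ";" and not in_string:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


ddl = """
CREATE TABLE tickets (id UUID PRIMARY KEY, title VARCHAR(200));
CREATE INDEX ASYNC idx_tickets_title ON tickets (title);
"""
for number, statement in enumerate(split_statements(ddl), start=1):
    # Each statement would be sent in its own call.
    print(number, statement)
```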
|
|
||
| ### Workflow 10: ORM Migration (Django/Hibernate/Rails) | ||
|
|
||
| Framework-specific migration guidance. Load the appropriate guide: | ||
| - Django → [orm-guides/django.md](references/orm-guides/django.md) | ||
| - Hibernate/Spring Boot → [orm-guides/hibernate.md](references/orm-guides/hibernate.md) | ||
| - Rails → [orm-guides/rails.md](references/orm-guides/rails.md) | ||
|
|
||
| Key steps common to all ORMs: | ||
| 1. Install DSQL adapter/dialect | ||
| 2. Configure IAM token authentication (no passwords) | ||
| 3. Replace ForeignKey/ManyToOne with plain ID fields + application validation | ||
| 4. Use UUID primary keys | ||
| 5. Add OCC retry logic (SQLSTATE 40001) | ||
| 6. Split migrations to one DDL per transaction | ||
| 7. Use ASYNC indexes (raw SQL in migrations) | ||
| 8. Set connection pool max lifetime below 1 hour | ||
|
|
||
| --- | ||
|
|
||
| ## Error Scenarios | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,171 @@ | ||
| # Django ORM Migration Guide for DSQL | ||
|
|
||
| How to run Django applications against Aurora DSQL. | ||
|
|
||
| Sources: | ||
| - [Aurora DSQL Django Adapter](https://github.com/awslabs/aurora-dsql-orms/tree/main/python/django) | ||
| - [aurora-dsql-django on PyPI](https://pypi.org/project/aurora-dsql-django/) | ||
| - [Django Pet Clinic Example](https://github.com/awslabs/aurora-dsql-orms/tree/main/python/django/examples/pet-clinic-app) | ||
|
|
||
| --- | ||
|
|
||
| ## 1. Installation | ||
|
|
||
| ```bash | ||
| pip install aurora-dsql-django boto3 | ||
| ``` | ||
|
|
||
| ## 2. Database Configuration | ||
|
|
||
| ```python | ||
| # settings.py | ||
| DATABASES = { | ||
|     'default': { | ||
|         'ENGINE': 'aurora_dsql_django',  # NOT 'django.db.backends.postgresql' | ||
|         'NAME': 'postgres',  # Always 'postgres' for DSQL | ||
|         'HOST': '<cluster-id>.<region>.dsql.amazonaws.com', | ||
|         'PORT': '5432', | ||
|         'OPTIONS': { | ||
|             'sslmode': 'require', | ||
|         }, | ||
|         'CONN_MAX_AGE': 1800,  # 30 min (below DSQL's 1-hour timeout) | ||
|     } | ||
| } | ||
| ``` | ||
|
|
||
| **Key differences:** | ||
| - Engine is `aurora_dsql_django` (handles IAM token generation automatically) | ||
| - No `USER` or `PASSWORD` — IAM token via boto3 | ||
| - Database name is always `postgres` | ||
| - SSL required | ||
|
|
||
| ## 3. Model Changes | ||
|
|
||
| ### Replace ForeignKey with Plain Fields | ||
|
|
||
| ```python | ||
| from django.core.exceptions import ValidationError | ||
| from django.db import models | ||
|
|
||
| # BAD: Django creates FK constraint (DSQL rejects) | ||
| class Ticket(models.Model): | ||
|     org = models.ForeignKey(Organization, on_delete=models.CASCADE) | ||
|
|
||
| # GOOD: Plain field + application-layer validation | ||
| class Ticket(models.Model): | ||
|     org_id = models.UUIDField(db_index=True)  # UUID, matching Organization's UUID primary key | ||
|     reporter_id = models.UUIDField(db_index=True) | ||
|
|
||
|     def clean(self): | ||
|         # Application-layer replacement for the FK constraint | ||
|         if not Organization.objects.filter(id=self.org_id).exists(): | ||
|             raise ValidationError({'org_id': 'Organization does not exist'}) | ||
|         if not User.objects.filter(id=self.reporter_id).exists(): | ||
|             raise ValidationError({'reporter_id': 'User does not exist'}) | ||
| ``` | ||
|
|
||
| ### Use UUID Primary Keys | ||
|
|
||
| ```python | ||
| import uuid | ||
| from django.db import models | ||
|
|
||
| class BaseModel(models.Model): | ||
|     id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False) | ||
|
|
||
|     class Meta: | ||
|         abstract = True | ||
|
|
||
| class Organization(BaseModel): | ||
|     name = models.CharField(max_length=200, unique=True) | ||
|     settings = models.JSONField(default=dict)  # Stored as json in DSQL | ||
| ``` | ||
|
|
||
| ### Field Mapping | ||
|
|
||
| | Django Field | DSQL Behavior | Alternative | | ||
| |---|---|---| | ||
| | `ForeignKey` | FK constraint fails | `BigIntegerField` / `UUIDField` | | ||
| | `ArrayField` | Not a stored type | `JSONField` with list | | ||
| | `HStoreField` | Not supported | `JSONField` | | ||
| | `SearchVectorField` | No FTS | External search (OpenSearch) | | ||
| | `CITextField` | No citext extension | `CharField` + `lower()` queries | | ||
|
|
||
| ## 4. Migrations (One DDL Per Transaction) | ||
|
|
||
| ```python | ||
| # Split complex migrations into separate files | ||
|
|
||
| # 0001_create_users.py | ||
| import uuid | ||
| from django.db import migrations, models | ||
| class Migration(migrations.Migration): | ||
|     operations = [ | ||
|         migrations.CreateModel(name='User', fields=[ | ||
|             ('id', models.UUIDField(primary_key=True, default=uuid.uuid4)), | ||
|             ('email', models.CharField(max_length=255)), | ||
|         ]), | ||
|     ] | ||
|
|
||
| # 0002_add_users_email_index.py (SEPARATE migration) | ||
| from django.db import migrations | ||
| class Migration(migrations.Migration): | ||
|     dependencies = [('myapp', '0001_create_users')]  # run after the table exists | ||
|     operations = [ | ||
|         migrations.RunSQL("CREATE UNIQUE INDEX ASYNC idx_users_email ON myapp_user (email)"), | ||
|     ] | ||
| ``` | ||
|
|
||
| ## 5. OCC Retry Decorator | ||
|
|
||
| ```python | ||
| import functools | ||
| import random | ||
| import time | ||
| from django.db import OperationalError, transaction | ||
|
|
||
| def with_occ_retry(max_retries=5): | ||
|     """Retry the wrapped function when DSQL reports an OCC conflict (SQLSTATE 40001).""" | ||
|     def decorator(func): | ||
|         @functools.wraps(func)  # keep the wrapped function's name and docstring | ||
|         def wrapper(*args, **kwargs): | ||
|             for attempt in range(max_retries): | ||
|                 try: | ||
|                     with transaction.atomic(): | ||
|                         return func(*args, **kwargs) | ||
|                 except OperationalError as e: | ||
|                     cause = e.__cause__ | ||
|                     # psycopg2 exposes the SQLSTATE as `pgcode`, psycopg 3 as `sqlstate` | ||
|                     code = getattr(cause, 'pgcode', None) or getattr(cause, 'sqlstate', None) | ||
|                     if code == '40001' and attempt < max_retries - 1: | ||
|                         # Exponential backoff with jitter, capped at 5 seconds | ||
|                         delay = min(0.05 * (2 ** attempt) + random.uniform(0, 0.05), 5.0) | ||
|                         time.sleep(delay) | ||
|                         continue | ||
|                     raise  # not an OCC conflict, or retries exhausted | ||
|         return wrapper | ||
|     return decorator | ||
|
|
||
| # Usage: | ||
| @with_occ_retry() | ||
| def create_ticket(org_id, reporter_id, title): | ||
|     ticket = Ticket(org_id=org_id, reporter_id=reporter_id, title=title) | ||
|     ticket.full_clean() | ||
|     ticket.save() | ||
|     return ticket | ||
| ``` | ||
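
To make the backoff in the decorator easier to reason about, the sketch below pulls the delay formula into a standalone function and prints the schedule with jitter disabled. The function name `occ_backoff_delay` and its parameters are hypothetical; the constants (0.05 s base, 5 s cap, up to 0.05 s of jitter) are the ones used in the decorator.

```python
import random


def occ_backoff_delay(attempt: int, base: float = 0.05, cap: float = 5.0,
                      jitter: float = 0.05) -> float:
    """Delay in seconds before retry number `attempt` (0-based), capped at `cap`."""
    return min(base * (2 ** attempt) + random.uniform(0, jitter), cap)


# With jitter disabled the schedule is deterministic and doubles each time.
schedule = [occ_backoff_delay(attempt, jitter=0.0) for attempt in range(5)]
print(schedule)

# With the decorator's default max_retries=5, only the first four failures
# are followed by a sleep; the fifth failure is re-raised.
print(sum(schedule[:4]))
```

Without jitter the delays are roughly 0.05, 0.1, 0.2, 0.4 and 0.8 seconds, so a call that exhausts all retries waits well under two seconds in total before the error reaches the caller.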
|
|
||
| ## 6. Collation (ORDER BY) | ||
|
|
||
| ```python | ||
| # C collation: uppercase sorts before lowercase | ||
| # For case-insensitive ordering: | ||
| from django.db.models.functions import Lower | ||
| Organization.objects.order_by(Lower('name')) | ||
| ``` | ||
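
The ordering difference can be previewed without a database. Python's default string sort compares code points, which for plain ASCII text gives the same uppercase-before-lowercase order as C collation; that equivalence is an assumption that holds for ASCII only, and the sample names are made up.

```python
names = ["banana", "Apple", "cherry", "Banana"]

# Code-point order: every uppercase ASCII letter sorts before any lowercase one.
c_like_order = sorted(names)

# Case-insensitive order, analogous to ORDER BY lower(name).
case_insensitive_order = sorted(names, key=str.lower)

print(c_like_order)
print(case_insensitive_order)
```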
|
|
||
| ## 7. Settings to Remove | ||
|
|
||
| ```python | ||
| # Remove or avoid: | ||
| # - django.contrib.postgres (ArrayField, HStoreField) | ||
| # - CONN_MAX_AGE > 3600 (DSQL timeout is 1 hour) | ||
| ``` | ||
|
|
||
| ## 8. Checklist | ||
|
|
||
| - [ ] Install `aurora-dsql-django` and `boto3` | ||
| - [ ] Change ENGINE to `aurora_dsql_django` | ||
| - [ ] Remove USER/PASSWORD from database config | ||
| - [ ] Replace all `ForeignKey` with plain ID fields | ||
| - [ ] Add `clean()` or signal-based FK validation | ||
| - [ ] Use `UUIDField` for primary keys | ||
| - [ ] Add OCC retry decorator | ||
| - [ ] Set `CONN_MAX_AGE` ≤ 1800 | ||
| - [ ] Split migrations to one DDL per file | ||
| - [ ] Use `RunSQL("CREATE INDEX ASYNC ...")` for indexes | ||
| - [ ] Test ORDER BY with C collation |
nit: some places we use MUST load, but here just load?
Generally we try to follow RFC language like MUST, SHOULD, MAY, etc.