feat(dsql): Add PostgreSQL schema conversion and migration references #168
pyraenix wants to merge 1 commit into
Extend the DSQL skill with migration knowledge that complements dsql_lint:
- PL/pgSQL transpilation (10 patterns)
- FK validation function generation
- GIN/GiST/BRIN index conversion
- ENUM to CHECK constraint conversion
- OCC retry patterns (Python/Node/Java/Go)
- ORM guides (Django, Hibernate, Rails)
- Data migration (COPY replacement)
- Multi-schema flattening
- Function compatibility matrix
- Multi-region design patterns

13 evals added, 70/70 expectations pass at 100%.
Working with Aleksandar on this.
Thanks for the contribution. There are some build failures that you will need to address.
I think we should probably adopt the tenet "dsql-lint is the source of truth": it handles everything it possibly can, and we try to remove some of the redundant conversion tables here.
For example, I think the Expression Index Conversion section is useful because that is really tough to model in a linter. But for converting type X into type Y, we should handle that in dsql-lint. If dsql-lint doesn't handle it, we should cut an issue for that; maintaining a list here sort of defeats the purpose of dsql-lint.
In general, the steering docs should act as a layer on top of dsql-lint and provide semantic guidance and tips that we cannot embed into a deterministic tool.
The main thing we want to avoid is having multiple sources of truth that drift or become redundant.
| @@ -0,0 +1,293 @@
| # Data Migration to DSQL
For all the files above 150 lines, the general suggestion is to add a table of contents at the top to make it easier to index. Can you please do that?
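One way to produce such a table of contents is to derive it from the file's own headings. The sketch below is only an illustration and is not part of the PR: `build_toc` is a hypothetical helper, and its anchor slugs only approximate GitHub's rules (duplicate headings and unusual punctuation are not handled).

```python
import re


def build_toc(markdown: str, max_level: int = 3) -> list[str]:
    """Return markdown list lines linking to each heading from level 2 to max_level."""
    toc = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore '#' lines inside fenced code blocks
            continue
        if in_fence:
            continue
        match = re.match(r"^(#{1,6})\s+(.*?)\s*$", line)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        if level == 1 or level > max_level:
            continue  # skip the document title and very deep headings
        # Approximate GitHub anchors: lowercase, drop punctuation, spaces to hyphens.
        slug = re.sub(r"[^\w\s-]", "", title.lower()).strip()
        slug = re.sub(r"\s", "-", slug)
        toc.append(f"{'  ' * (level - 2)}- [{title}](#{slug})")
    return toc


if __name__ == "__main__":
    sample = "# Data Migration to DSQL\n\n## Conversion Rules Summary\n\n### Performance\n"
    print("\n".join(build_toc(sample)))
```

The generated lines can be pasted under the file's title; regenerating them whenever headings change keeps the table from drifting.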
| **Phase 0 — Load reference material.** Load [pg-migrations/type-mapping.md](references/pg-migrations/type-mapping.md) and [pg-migrations/schema-objects.md](references/pg-migrations/schema-objects.md) before starting.
| **Phase 1 — Lint first.** Run `dsql_lint(sql=source_sql, fix=true)` per Workflow 7. This handles SERIAL, JSON, FK removal, index ASYNC, transaction splitting mechanically.
nit: maybe we should avoid listing specifics here. When we add support it will be easier to maintain. Instead we can say "unsupported SQL" or similar.
| #### [pg-migrations/data-migration.md](references/pg-migrations/data-migration.md)
| **When:** Load when planning data migration from PostgreSQL to DSQL
nit: some places we use MUST load, but here just load?
Generally we try to follow RFC language like MUST, SHOULD, MAY, etc.
| PostgreSQL Index Feature | DSQL Conversion | Notes |
|---|---|---|
| CREATE INDEX | CREATE INDEX ASYNC | `dsql_lint` handles |
| CREATE UNIQUE INDEX | CREATE UNIQUE INDEX ASYNC | Uniqueness preserved |
dsql_lint will also handle this.
| ---
| ## Conversion Rules Summary
I ran all of these with dsql-lint.
They are all transformed other than:
- WHERE (partial)
- Expression Indexes
- Operator class
Perhaps we can write a disclaimer at the top saying to try dsql-lint first? Otherwise, maybe these make sense as an issue cut to dsql-lint rather than a separate table here that will need to be maintained.
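The three untransformed cases listed above can be flagged mechanically so that only those statements are routed to manual conversion. A rough sketch, assuming a single plain `CREATE INDEX` statement as input: `manual_review_reasons` is a hypothetical helper built on string heuristics rather than a SQL parser, and it is not part of dsql-lint.

```python
import re


def manual_review_reasons(create_index_sql: str) -> list[str]:
    """List features in a CREATE INDEX statement that likely need manual conversion."""
    sql = " ".join(create_index_sql.split())  # collapse whitespace and newlines
    reasons: list[str] = []
    start = sql.find("(")
    if start == -1:
        return reasons
    # Find the ")" that closes the column list, tracking nested parentheses.
    depth, end = 0, start
    for i in range(start, len(sql)):
        if sql[i] == "(":
            depth += 1
        elif sql[i] == ")":
            depth -= 1
            if depth == 0:
                end = i
                break
    columns, tail = sql[start + 1:end], sql[end + 1:]
    if re.search(r"\bWHERE\b", tail, re.IGNORECASE):
        reasons.append("partial index")
    if "(" in columns:
        reasons.append("expression index")
    # Heuristic: operator class names conventionally end in "_ops".
    if re.search(r"\b\w+_ops\b", columns, re.IGNORECASE):
        reasons.append("operator class")
    return reasons


if __name__ == "__main__":
    print(manual_review_reasons("CREATE INDEX idx ON users (lower(email))"))
```

An empty result only means none of these three heuristics fired; the statement should still go through dsql-lint.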
| @@ -0,0 +1,293 @@
| # Data Migration to DSQL
I think we should remove/trim this given we have: https://github.com/aws-samples/aurora-dsql-loader
| @@ -0,0 +1,381 @@
| # OCC Retry Patterns for DSQL
I think this belongs in a section separate from pg-migrations. There are other actions we might want OCC retries for.
Given that this PR is focused on migrations, we should probably leave this out for now.
Also, many of our connectors have built-in, opt-in OCC retry at the pool or single-connection level.
I do not think we need the per language examples here and can instead say something like:
"Use the DSQL Connector for your language (see language.md). If no connector is available, implement the following strategy: [parameters only]."
|
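For the no-connector fallback, a parameters-only strategy could look like the sketch below. It is a generic retry loop with exponential backoff and full jitter, not code from the PR or from any DSQL connector. The parameter values are placeholders, since the thread does not state the intended ones, and `RetryableConflict` is a hypothetical stand-in for whatever error a driver raises on an OCC conflict (assumed here to surface as a serialization failure, SQLSTATE 40001).

```python
import random
import time


class RetryableConflict(Exception):
    """Hypothetical stand-in for a driver error signalling an OCC conflict."""


def run_with_occ_retry(txn, max_attempts=5, base_delay=0.05, max_delay=2.0,
                       sleep=time.sleep):
    """Run txn(), retrying on RetryableConflict with capped exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except RetryableConflict:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the conflict to the caller
            # Full jitter: wait a random time up to the capped exponential bound.
            bound = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, bound))


if __name__ == "__main__":
    attempts = {"count": 0}

    def transfer():
        # Simulated transaction that conflicts twice before committing.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RetryableConflict("simulated conflict")
        return "committed"

    print(run_with_occ_retry(transfer))
```

The callable passed as `txn` must re-run the whole transaction from the start, because a conflicted transaction cannot be resumed partway through.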
| ---
| ## Migration Decision Matrix
similar comment to what I had earlier.
I think this should be an issue cut to dsql-lint instead of a table we have to maintain separately.
Ideally this would be fully redundant.
| -- Table is immediately available in Region 2 as well
| ```
| ### Performance
There is a large volume of format errors that need to be fixed: https://github.com/awslabs/agent-plugins/actions/runs/25952230919/job/76575725027?pr=168#step:4:11. Running `mise build` should catch and capture those.
| ### Workflow 9: Full PostgreSQL → DSQL Schema Migration
| End-to-end conversion of a PostgreSQL schema to DSQL-compatible DDL with companion code generation. Complements `dsql_lint` by handling semantic conversions the linter cannot automate.
This is a very long workflow addition. We're advised to keep the top-level SKILL.md at 300 LOC or less (the build will fail otherwise).
It's worth considering whether the instructions can be followed directly out of the reference files, so that top-level phase instructions aren't needed.
Extends the DSQL skill with PostgreSQL-to-DSQL migration knowledge that complements dsql_lint.
What's added
- references/pg-migrations/ (type mapping, PL/pgSQL patterns, FK replacement, index conversion, schema objects, function compatibility, OCC retry, data migration, multi-region)
- references/orm-guides/ (Django, Hibernate, Rails)
- pg_migration_evals.json (70/70 expectations pass at 100%)

Coverage
All 16 items from the gap analysis are implemented and tested:
ENUM→CHECK, PL/pgSQL→SQL, triggers, GIN/GiST/BRIN→btree, partial indexes, expression indexes, materialized views, COLLATE C, multi-schema, FK→validation functions, roles/IAM, OCC retry, ORM adapters, COPY→INSERT, uuid_generate_v4→gen_random_uuid, lastval→currval.
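To make one of these items concrete, the ENUM→CHECK conversion can be pictured as generating a CHECK constraint from the enum's labels. This is a minimal sketch, not the PR's implementation: `enum_to_check` is a hypothetical helper, the column is assumed to already be declared with a text type, and the column name is assumed to be a safe, unquoted identifier.

```python
def enum_to_check(column: str, labels: list[str]) -> str:
    """Build a CHECK constraint limiting `column` to the given enum labels."""
    if not labels:
        raise ValueError("an enum needs at least one label")
    # Escape single quotes by doubling them, as in standard SQL string literals.
    quoted = ", ".join("'" + label.replace("'", "''") + "'" for label in labels)
    return f"CHECK ({column} IN ({quoted}))"


if __name__ == "__main__":
    # Example: CREATE TYPE order_status AS ENUM ('open', 'shipped', 'closed')
    print(enum_to_check("status", ["open", "shipped", "closed"]))
```

Unlike a PostgreSQL ENUM, a CHECK constraint does not preserve label ordering for comparisons, so queries that sort or compare by enum order would need separate handling.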
Design principle
No duplication with dsql_lint. The linter handles mechanical fixes. The skill handles semantic conversions the linter cannot automate (code generation, architectural guidance, ORM patterns).
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.