feat(dsql): Add PostgreSQL schema conversion and migration references #168

Open
pyraenix wants to merge 1 commit into awslabs:main from pyraenix:feat/dsql-pg-migration-skill-extension

@pyraenix commented May 16, 2026

Extends the DSQL skill with PostgreSQL-to-DSQL migration knowledge that complements dsql_lint.

What's added

  • 9 reference files in references/pg-migrations/ (type mapping, PL/pgSQL patterns, FK replacement, index conversion, schema objects, function compatibility, OCC retry, data migration, multi-region)
  • 3 ORM guides in references/orm-guides/ (Django, Hibernate, Rails)
  • 13 new evals in pg_migration_evals.json (70/70 expectations pass)
  • Updated SKILL.md with new workflows (9: Full PG→DSQL Migration, 10: ORM Migration)

Coverage

All 16 items from the gap analysis are implemented and tested:
ENUM→CHECK, PL/pgSQL→SQL, triggers, GIN/GiST/BRIN→btree, partial indexes, expression indexes, materialized views, COLLATE C, multi-schema, FK→validation functions, roles/IAM, OCC retry, ORM adapters, COPY→INSERT, uuid_generate_v4→gen_random_uuid, lastval→currval.
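
To illustrate the first item in the list, here is a minimal sketch of what an ENUM→CHECK conversion could produce. The helper name `enum_to_check` and the column and value names are hypothetical and not taken from the PR's reference files. The sketch only renders the constraint text; it does not validate identifiers, and the output should still be checked with dsql_lint.

```python
def enum_to_check(column: str, values: list[str]) -> str:
    """Render a CHECK constraint restricting `column` to the former ENUM labels.

    Hypothetical helper for illustration only; `column` is inserted as-is,
    so the caller is responsible for passing a safe identifier.
    """
    if not values:
        raise ValueError("an ENUM needs at least one label")
    # Escape single quotes by doubling them, per standard SQL string literals.
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return f"CHECK ({column} IN ({quoted}))"


# Example: a column that used to have an ENUM type with three labels.
constraint = enum_to_check("status", ["pending", "shipped", "cancelled"])
print(f"status TEXT NOT NULL {constraint}")
```

The column type (`TEXT` here) is also an assumption; a bounded `VARCHAR` would work the same way.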

Design principle

No duplication with dsql_lint. The linter handles mechanical fixes. The skill handles semantic conversions the linter cannot automate (code generation, architectural guidance, ORM patterns).


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

Extend the DSQL skill with migration knowledge that complements dsql_lint:
- PL/pgSQL transpilation (10 patterns)
- FK validation function generation
- GIN/GiST/BRIN index conversion
- ENUM to CHECK constraint conversion
- OCC retry patterns (Python/Node/Java/Go)
- ORM guides (Django, Hibernate, Rails)
- Data migration (COPY replacement)
- Multi-schema flattening
- Function compatibility matrix
- Multi-region design patterns

13 evals added, 70/70 expectations pass.

@pyraenix (Author)

Working with Aleksandar on this.

@amaksimo (Contributor) left a comment

Thanks for the contribution. There are some build failures that you will need to address.

I think we should adopt the tenet "dsql-lint is the source of truth": it should handle everything it can, and we should remove the redundant conversion tables from the docs.

For example, I think the "Expression Index Conversion" section is useful because that is really tough to model in a linter. Converting type X into type Y, however, is something we should handle in dsql-lint. If dsql-lint doesn't handle it, we should cut an issue for that; maintaining a list here defeats the purpose of dsql-lint.

In general, the steering docs should act as a layer on top of dsql-lint and provide semantic guidance and tips that we cannot embed into a deterministic tool.

The main thing we want to avoid is having multiple sources of truth that drift or become redundant.

@@ -0,0 +1,293 @@
# Data Migration to DSQL

For all the files above 150 lines, the general suggestion is to add a table of contents at the top to make it easier to index. Can you please do that?
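
One possible way to generate such a table of contents is sketched below. It is not part of the repository's tooling: the function name `build_toc` is hypothetical, the 150-line threshold comes from the comment above, and the slug rule only approximates GitHub's heading anchors, so generated links should be spot-checked.

```python
import re


def build_toc(markdown: str, min_lines: int = 150) -> str:
    """Return a Markdown bullet list linking to the ## and ### headings.

    Returns an empty string for files at or below `min_lines` lines.
    Headings inside fenced code blocks are skipped.
    """
    lines = markdown.splitlines()
    if len(lines) <= min_lines:
        return ""
    entries = []
    in_fence = False
    for line in lines:
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on every fence marker
            continue
        if in_fence:
            continue
        match = re.match(r"^(#{2,3})\s+(.+?)\s*$", line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            # Approximate GitHub anchors: lowercase, drop punctuation,
            # replace spaces with hyphens.
            slug = re.sub(r"[^\w\- ]", "", title.lower()).strip().replace(" ", "-")
            entries.append(f"{'  ' * (level - 2)}- [{title}](#{slug})")
    return "\n".join(entries)
```

The returned list can be pasted under the file's top-level heading; duplicate heading titles would need manual disambiguation.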


**Phase 0 — Load reference material.** Load [pg-migrations/type-mapping.md](references/pg-migrations/type-mapping.md) and [pg-migrations/schema-objects.md](references/pg-migrations/schema-objects.md) before starting.

**Phase 1 — Lint first.** Run `dsql_lint(sql=source_sql, fix=true)` per Workflow 7. This handles SERIAL, JSON, FK removal, index ASYNC, transaction splitting mechanically.

nit: maybe we should avoid listing specifics here; that will be easier to maintain as we add support. Instead we can say "unsupported SQL" or similar.


#### [pg-migrations/data-migration.md](references/pg-migrations/data-migration.md)

**When:** Load when planning data migration from PostgreSQL to DSQL

nit: in some places we use "MUST load", but here just "Load"?

Generally we try to follow RFC 2119 language (MUST, SHOULD, MAY, etc.).

| PostgreSQL Index Feature | DSQL Conversion | Notes |
|---|---|---|
| CREATE INDEX | CREATE INDEX ASYNC | `dsql_lint` handles |
| CREATE UNIQUE INDEX | CREATE UNIQUE INDEX ASYNC | Uniqueness preserved |

dsql_lint will also handle this.


---

## Conversion Rules Summary

I ran all of these with dsql-lint.

They are all transformed other than:

  • WHERE (partial)
  • Expression Indexes
  • Operator class

Perhaps we can add a disclaimer at the top saying to try dsql-lint first? Otherwise, these may make more sense as an issue cut to dsql-lint than as a separate table here that will need to be maintained.

@@ -0,0 +1,293 @@
# Data Migration to DSQL

I think we should remove/trim this given we have: https://github.com/aws-samples/aurora-dsql-loader

@@ -0,0 +1,381 @@
# OCC Retry Patterns for DSQL

I think this belongs in a section separate from pg-migrations. There are other actions we might want OCC retries for.

Given that this PR is focused on migrations, we should probably leave this out for now.

Also, many of our connectors have built-in, opt-in OCC retry at the pool or single-connection level.

I do not think we need the per-language examples here. Instead we can say something like:
"Use the DSQL Connector for your language (see language.md). If no connector is available, implement the following strategy: [parameters only]."

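For readers unfamiliar with the fallback being discussed, a minimal sketch of a connector-less retry strategy follows. Everything specific in it is an assumption rather than something taken from the PR or from a DSQL connector: the function name `run_with_occ_retry`, the `sqlstate` attribute on the exception, the parameter values (5 attempts, 50 ms base delay, 2 s cap), and the use of SQLSTATE 40001 (PostgreSQL's `serialization_failure`) as the signal for an OCC conflict.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""

    sqlstate = "40001"


def run_with_occ_retry(txn, max_attempts=5, base_delay=0.05, max_delay=2.0,
                       sleep=time.sleep):
    """Run `txn()` and retry it on SQLSTATE 40001 with capped, jittered backoff.

    `txn` must execute the whole transaction (begin through commit) so that
    each retry starts from a clean state. Parameter defaults are illustrative.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except Exception as exc:
            retryable = getattr(exc, "sqlstate", None) == "40001"
            if not retryable or attempt == max_attempts:
                raise  # non-OCC errors and exhausted retries propagate
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

The `sleep` parameter exists only so the backoff can be replaced in tests. With a real driver, the check on `exc` would need to be adapted to however that driver exposes the SQLSTATE.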

---

## Migration Decision Matrix

similar comment to what I had earlier.

I think this should be an issue cut to dsql-lint instead of a table we have to maintain separately.

Ideally this would be fully redundant.

-- Table is immediately available in Region 2 as well
```

### Performance

Do we need this section?

@anwesham-lab (Member)

There is a large volume of format errors that need to be fixed: https://github.com/awslabs/agent-plugins/actions/runs/25952230919/job/76575725027?pr=168#step:4:11

`mise build` should catch those.


### Workflow 9: Full PostgreSQL → DSQL Schema Migration

End-to-end conversion of a PostgreSQL schema to DSQL-compatible DDL with companion code generation. Complements `dsql_lint` by handling semantic conversions the linter cannot automate.

This is a very long workflow addition. We're advised to keep the top-level SKILL.md at 300 LOC or less (the build fails otherwise).

It's worth considering whether the instructions can be followed directly from the reference files, so that top-level phase instructions aren't needed.
