Add missing indexes to duplicated columns #570

kvch · 2025-01-07T12:47:17Z

Previously, when an indexes column was duplicated, the index was lost during the duplication process. This PR adds the missing indexes to the duplicated columns.

The example I added removes NOT NULL constraint from a column that has a hash index with a custom fillfactor.

The schema of the table before starting the migration:

postgres=# \d fruits
                                          Table "public.fruits"
 Column |          Type          | Collation | Nullable |                    Default
--------+------------------------+-----------+----------+------------------------------------------------
 id     | integer                |           | not null | nextval('_pgroll_new_fruits_id_seq'::regclass)
 name   | character varying(255) |           | not null |
 size   | fruit_size             |           | not null | 'small'::fruit_size
Indexes:
    "_pgroll_new_fruits_pkey" PRIMARY KEY, btree (id) REPLICA IDENTITY
    "_pgroll_new_fruits_name_key" UNIQUE CONSTRAINT, btree (name)
    "idx_fruits_id_gt_10" btree (id) WHERE id > 10
    "idx_fruits_name" hash (name) WITH (fillfactor='70')
    "idx_fruits_unique_name" UNIQUE, btree (name)

Without the fix in my PR starting the example migration is incorrect. Notice that the duplicated column _pgroll_new_name is does not have a hash index:

postgres=# \d fruits
                                               Table "public.fruits"
      Column      |          Type          | Collation | Nullable |                    Default
------------------+------------------------+-----------+----------+------------------------------------------------
 id               | integer                |           | not null | nextval('_pgroll_new_fruits_id_seq'::regclass)
 name             | character varying(255) |           | not null |
 size             | fruit_size             |           | not null | 'small'::fruit_size
 _pgroll_new_name | character varying(255) |           |          |
Indexes:
    "_pgroll_new_fruits_pkey" PRIMARY KEY, btree (id) REPLICA IDENTITY
    "_pgroll_dup__pgroll_new_fruits_name_key" UNIQUE, btree (_pgroll_new_name)
    "_pgroll_new_fruits_name_key" UNIQUE CONSTRAINT, btree (name)
    "idx_fruits_id_gt_10" btree (id) WHERE id > 10
    "idx_fruits_name" hash (name) WITH (fillfactor='70')
    "idx_fruits_unique_name" UNIQUE, btree (name)
Triggers:
    _pgroll_trigger_fruits__pgroll_new_name BEFORE INSERT OR UPDATE ON fruits FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_fruits__pgroll_new_name()
    _pgroll_trigger_fruits_name BEFORE INSERT OR UPDATE ON fruits FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_fruits_name()

After completing the migration the hash index on name disappears:

postgres=# \d fruits
                                          Table "public.fruits"
 Column |          Type          | Collation | Nullable |                    Default
--------+------------------------+-----------+----------+------------------------------------------------
 id     | integer                |           | not null | nextval('_pgroll_new_fruits_id_seq'::regclass)
 size   | fruit_size             |           | not null | 'small'::fruit_size
 name   | character varying(255) |           |          |
Indexes:
    "_pgroll_new_fruits_pkey" PRIMARY KEY, btree (id) REPLICA IDENTITY
    "_pgroll_new_fruits_name_key" UNIQUE CONSTRAINT, btree (name)
    "idx_fruits_id_gt_10" btree (id) WHERE id > 10

With my change starting the migration creates the missing hash index on the duplicated column. See _pgroll_dup_idx_fruits_name index on _pgroll_new_name:

postgres=# \d fruits
                                               Table "public.fruits"
      Column      |          Type          | Collation | Nullable |                    Default
------------------+------------------------+-----------+----------+------------------------------------------------
 id               | integer                |           | not null | nextval('_pgroll_new_fruits_id_seq'::regclass)
 name             | character varying(255) |           | not null |
 size             | fruit_size             |           | not null | 'small'::fruit_size
 _pgroll_new_name | character varying(255) |           |          |
Indexes:
    "_pgroll_new_fruits_pkey" PRIMARY KEY, btree (id) REPLICA IDENTITY
    "_pgroll_dup__pgroll_new_fruits_name_key" UNIQUE, btree (_pgroll_new_name)
    "_pgroll_dup_idx_fruits_name" hash (_pgroll_new_name) WITH (fillfactor='70')
    "_pgroll_dup_idx_fruits_unique_name" UNIQUE, btree (_pgroll_new_name)
    "_pgroll_new_fruits_name_key" UNIQUE CONSTRAINT, btree (name)
    "idx_fruits_id_gt_10" btree (id) WHERE id > 10
    "idx_fruits_name" hash (name) WITH (fillfactor='70')
    "idx_fruits_unique_name" UNIQUE, btree (name)
Triggers:
    _pgroll_trigger_fruits__pgroll_new_name BEFORE INSERT OR UPDATE ON fruits FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_fruits__pgroll_new_name()
    _pgroll_trigger_fruits_name BEFORE INSERT OR UPDATE ON fruits FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_fruits_name()

Completing the migration preserves the index on name and removes the NOT NULL constraint:

postgres=# \d fruits
                                          Table "public.fruits"
 Column |          Type          | Collation | Nullable |                    Default
--------+------------------------+-----------+----------+------------------------------------------------
 id     | integer                |           | not null | nextval('_pgroll_new_fruits_id_seq'::regclass)
 size   | fruit_size             |           | not null | 'small'::fruit_size
 name   | character varying(255) |           |          |
Indexes:
    "_pgroll_new_fruits_pkey" PRIMARY KEY, btree (id) REPLICA IDENTITY
    "_pgroll_new_fruits_name_key" UNIQUE CONSTRAINT, btree (name)
    "idx_fruits_id_gt_10" btree (id) WHERE id > 10
    "idx_fruits_name" hash (name) WITH (fillfactor='70')
    "idx_fruits_unique_name" UNIQUE CONSTRAINT, btree (name)

Related to #227

andrew-farries

This looks good 👍

I left some minor comments and one potential problem about how pgroll will now convert unique indexes on duplicated columns into unique constraints.

examples/.ledger

pkg/migrations/duplicate_test.go

pkg/migrations/rename.go

Co-authored-by: Andrew Farries <[email protected]>

kvch added 3 commits January 7, 2025 13:42

duplicate indexes on duplicated columns

b47f0b3

add storage param extraction

e67a40c

add unit tests && rm unnecessary attr

9af86cc

kvch marked this pull request as ready for review January 7, 2025 14:13

kvch requested a review from andrew-farries January 7, 2025 14:13

andrew-farries reviewed Jan 7, 2025

View reviewed changes

examples/.ledger Outdated Show resolved Hide resolved

pkg/migrations/duplicate_test.go Outdated Show resolved Hide resolved

pkg/migrations/duplicate_test.go Outdated Show resolved Hide resolved

pkg/migrations/rename.go Outdated Show resolved Hide resolved

kvch and others added 4 commits January 7, 2025 17:56

Update pkg/migrations/duplicate_test.go

e3b5290

Co-authored-by: Andrew Farries <[email protected]>

Update pkg/migrations/duplicate_test.go

65609b7

Co-authored-by: Andrew Farries <[email protected]>

update file name in ledger

2774738

do not convert indexes, just constraints

0741c44

kvch requested a review from andrew-farries January 7, 2025 17:22

kvch enabled auto-merge (squash) January 8, 2025 11:30

andrew-farries approved these changes Jan 8, 2025

View reviewed changes

kvch merged commit 67f4a08 into xataio:main Jan 8, 2025
28 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add missing indexes to duplicated columns #570

Add missing indexes to duplicated columns #570

kvch commented Jan 7, 2025 •

edited

Loading

andrew-farries left a comment

Add missing indexes to duplicated columns #570

Add missing indexes to duplicated columns #570

Conversation

kvch commented Jan 7, 2025 • edited Loading

andrew-farries left a comment

Choose a reason for hiding this comment

kvch commented Jan 7, 2025 •

edited

Loading