Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When generating multiple migrations at once, subsequent migrations that contain the same time stamp are not run. #3698

Open
sollambert opened this issue Jul 14, 2023 · 10 comments · May be fixed by #4410

Comments

@sollambert
Copy link

Setup

Versions

  • Rust: 1.71.0-nightly
  • Diesel: 2.1.0
  • Database: SQLite3
  • Operating System PopOS

Feature Flags

  • diesel: sqlite, serde_json

Problem Description

When generating multiple migrations, they may be generated with the same timestamp in the folder name.
Despite having different names following the timestamp, subsequent migrations containing the same timestamp after the first migration are ignored and will not be run.

What are you trying to accomplish?

Creating multiple migrations by chaining commands and being able to run the migrations without having to manually edit the timestamp of subsequent migrations.

What is the expected output?

Migrations are all run even if they contain the same timestamp.

What is the actual output?

Only the first migration containing a specific timestamp is run.

Are you seeing any additional errors?

No errors were reported throughout the process.

Steps to reproduce

Generate two migrations at the same time by chaining commands to generate two migrations at the same timestamp.

One could also manually edit the timestamp of the migrations to make two that are equal.

After there are two migrations, delete your database and run

diesel migration run

Only the first migration will be run and the following migration will be ignored with no error.

Checklist

  • [ x] This issue can be reproduced on Rust's stable channel. (Your issue will be
    closed if this is not the case)
  • [ x] This issue can be reproduced without requiring a third party crate
@sollambert sollambert added the bug label Jul 14, 2023
@weiznich
Copy link
Member

Thanks for reporting this issue. Diesel migration uses a {version}_{name} schema for migration folder. That's documented here. As you noticed it will only use the version part to determine whether migrations are applied or not. It's by design that we consider the version part as unique, so if no unique version is provided we only run on migration out of a set with the same version.

Now diesel migration generate generates a version identifier based on the current timestamp as that makes it easy to sort the chronologically, but that's not the only thing that can be used as version there. The migration code is completely independent from what you are using as migration version as long as you can sort the somehow. The migration generate subcomand let exposes a --version flag that allows you to specify an explicit version.

I personally see not much what can done here to improve the default behavior to generate migrations with the same version by fast subsequent runs without breaking all existing migrations. If someone has concrete proposals on how to slightly tweak the existing behavior I'm open for suggestions, but please address my concerns around existing migrations.

@sollambert
Copy link
Author

I can only think of also adding the name identifier for determining if a migration should be run, but I assume that there's something preventing that as it seems pretty obvious and I haven't peered at the actual src for where the migration version is checked.

@weiznich
Copy link
Member

This would be a breaking change as it changes what is considered to be unique. I do not see how we could do that without a major diesel release, which is not planned anytime soon.

@Mingun
Copy link
Contributor

Mingun commented Jul 19, 2023

@weiznich , if I correctly understand the situation, you should just ensure that the generated identifiers for the migrations have different timestamps. So if you generate several identifiers at once you just need one timestamp and increase second/millisecond part (I do not think that you expect some meaning from the timestamps rather than that they are unique enough).

This is also not understand why you think that this will be a breaking change. It just generates some identifiers without actual meaning of a timestamp part rather than it is sorted and unique enough and so it is useful by that means. It is not even stored anywhere except the filename, no?

@weiznich
Copy link
Member

The problem there is that we need to thing about existing migrations as well. Suppose that the user has a set of migrations using the current timestamp format as version. If we now change the version format it needs to guarantee the following things:

  • Old migrations need to have exactly the same order as before
  • New migrations should be ordered after the old ones
  • New migrations should have a meaningful order on their own.

That might be done by adding a millisecond part, but again that might not necessarily help if you create the migrations too fast. Just incrementing the timestamp would be also an idea to try, but depending on the increment interval that might raise issues about the order of subsequent bursts about of generated migrations.

Again as written above: I'm open for contributions here, I just do not want to spend a lot of time fixing that myself as there is way to control that version from user scripts.

@mikeyhew
Copy link

I guess what I would do here if I was writing a script like that would be to sleep for one second in between calls to diesel migration generate. Not sure how much of a pain that would be in your use case @sollambert

Would it be reasonable for diesel_cli to check if there's an existing migration file with the same version number as the one you are abount to create, and then report an error if that is the case? That would at least catch the error earlier and loudly instead of silently. Also, on that note, maybe it would also be good to check for migration files with the same version when running migrations too.

@weiznich
Copy link
Member

Happy to accept a PR for both suggestions. I would likely even accept a PR that just makes migration generate wait a second in such cases, but then there should be some logic to abort if the process if we already waited once/twice/...

@weiznich weiznich added the cli label Nov 19, 2023
@AndreCostaaa
Copy link
Contributor

This sounds like a fun thing to take a crack at, will send a PR soon

@AndreCostaaa AndreCostaaa linked a pull request Dec 26, 2024 that will close this issue
@sollambert
Copy link
Author

Wouldn't it make sense to have migrations version sequentially instead of by timestamp if all that's needed is a sequential unique identifier? @weiznich Is there a specific reason timestamp is being used instead? I'd think a migration naming convention along the lines of V1__{migration name}, V2__{migration name} would work as long as previously generated timestamped migrations are run before migrations created under this schema. Current timestamp identifier would be run before if we're running migrations by alphanumeric value. Would be more along the lines of other migration management tools as well (thinking flyway here).

@AndreCostaaa
Copy link
Contributor

AndreCostaaa commented Dec 27, 2024

Using timestamps provides a couple of benefits over a version number

  1. No need to look at already existing versions, when generating a new migration
  2. They are unique and in order as long as we don't try to create multiple migrations "at the same time"
  3. To add on to 1, you avoid running into concurrency issues. Imagine instance 1 creates v1, instance 2 checks that v1 doesn't exist yet and tries to create it (we would run into the same issue you described)
  4. You can read the folder name and know exactly when it was created. ls -l doesn't really solve this if the migrations were created with another computer. You would have to look at git history to see when it was created
  5. You can't really use V1, V2, you would have to use something like V01 or V001, or V0001 else version 10 would be run before version 2. So then you have to ask yourself the question, how many digits ?

Having said that my proposal does 1. and solves 2. and 3.. If we're okay losing 4. and find a good way to deal with 5., switching to V1, V2, is very easily done and doesn't really break current ordering as you said

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants