sqlx Feat/add rerun on change

fixes #3111

Mar 12 '24 13:03 Tenshock

I voluntarily break the commits into more succinct ones in order to discuss about implementation easily

Mar 12 '24 13:03 Tenshock

I don't think it makes sense to have these at the same level as the others, especially since they're explicitly not versioned. Instead of .onchange.sql, I think it should be a subfolder,

I'd propose migrations/after-migrate/ for less ambiguity (what does "change" mean? it's not as obvious as you might think). This also gives us room to extend it in the future, e.g.:

migrations/after-migrate-up/, executed only after upgrading from a previous version.
migrations/after-migrate-down/, executed only after downgrading from a previous version.
migrations/before-migrate/ and -up and -down variants, for executing scripts before migrating (maybe to kick off a backup or something).

You should also not rely on the filesystem returning entries in any defined order. It can differ between platforms and filesystems. For this reason, I'd recommend using the same filename format and applying scripts by version number just like migrations are.

Mar 18 '24 08:03 abonander

Do we discuss about the choices and implementations here or in the related issue?

About the meaning of "change"

I may have poorly explained the process and the underlying logic in the related issue. As I think about it, a change is a modification of an entity. This leads to the question of what is an entity? An entity is a thing that has a unique identifier.

In the implementation of Simple, ReverseUp and ReverseDown migrations, the identifier of a script is its version, extracted from the file name numeric prefix. And the changes are calculated from its checksum. So, in this case, a change is a checksum change for a given version.

As we agree that OnChange migrations does not have a version as it is irrelevant, the proposed implementation uses the description of the file as its identifier. So, in this case, a change is a checksum change for a given description.

For the design choice between the file suffix `.onchange.sql` and the sub-folder strategy

I think I see your point.

As I understand it in the lib, I see the migrations folder as de default location where sqlx finds its scripts. It can be overridden. I think this is the user responsibility to organise its scripts as he wants to, I wouldn't base some behaviours based in the script location.

About the logic behind the migrations/[after|before]-migrate-[up|down], if we choose to use the proposed implementation, we can think of an Always migration with a .always.sql suffix and we leave the responsibility to the user to move its scripts to relevant folder and to proceed to a potential workflow as follows:

# Say the user has this structure
.
├── after_migrate
│   ├── clean_logs.always.sql
│   └── update_ref_table.always.sql
├── before_migrate
│   ├── backup_database.always.sql
└── migrations
    ├── 0001_my_simple_script.sql
    └── 0002_another_simple_script.sql

We can perfectly do:

sqlx migrate run --source before_migrate --ignore-missing
sqlx migrate run  # --source migrations
sqlx migrate run --source after_migrate --ignore-missing

The main downfall (if it is really one) with the current capacities of the sqlx CLI is the need of --ignore-missing option to not trigger the validation process as we use different folder but it works perfectly in this use-case. As we do not track versions for OnChange or Always migrations, it is perfectly justified and fine.

Overall, SQLx already provides quite all the mechanisms to do it.

About the order to apply `OnChange` migrations

There is two orders to think about. The first is, at which time do we apply the OnChange migrations? Before or after the regular ones? I think the answer is explained above with the proposed workflow. Another possibility if all the scripts are in the same folder is to add a parameter like --run-always-first, --run-always-last, or perhaps to implement a weight mechanism for these not versioned scripts.

For the order between the OnChange (and potential Always ) migrations that do not have a version, I see three possibilities:

We enforce a sequential or timestamp version like the current behaviour
We leave the choice to the user to add or not the possibility to add a version
The order is random, like you said, order can differ between platforms and filesystems.

If we enforce a prefix version, internally, the script identifier is still the description, the version only having responsibility to enforce an order.

I hope my explanations are clear, I truly want to enhance this great lib.

By the way, I changed the implementation for the stored internal version for OnChange scripts from the last billion before the i64 max value to negative versions. As it is marked as invalid in the source code for regular migration scripts.

Mar 19 '24 16:03 Tenshock

Hello @abonander, any news about the suggestions? :)

The main points are:

.onchange.sql scripts vs subfolder strategy
scripts execution order

Apr 09 '24 09:04 Tenshock

Feat/add rerun on change

About the meaning of "change"

For the design choice between the file suffix .onchange.sql and the sub-folder strategy

About the order to apply OnChange migrations

For the design choice between the file suffix `.onchange.sql` and the sub-folder strategy

About the order to apply `OnChange` migrations