
WIP: Implement major version upgrade testing with binary files

Open anson1014 opened this issue 1 year ago • 5 comments

Description

Implement major version upgrade testing with binary files

To increase the confidence, validation and coverage of major version upgrade testing, this approach follows the pre-existing method of loading dumps from prior versions into the data directory and attempting an upgrade. Instead of using dumps of entire schemas, however, we do the same with the binary files of specific tables.

The PR to expand the versions and testing for the upgrading from dump approach can be found here: #3503.

Currently, only binary files from 10.11 are used, and only for a subset of the available tables. The intent is to validate the approach first and then extend the scope of what is covered by including binary files from other versions and the binary files of the other mysql tables.

The act of loading binary files from a prior version into the data directory and then performing operations on them was inspired by #2170. Feedback regarding this approach and on how best to implement testing for major version upgrades is welcomed and greatly appreciated.
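
For illustration, the flow amounts to roughly the following sketch; the paths, table names and commands below are placeholders rather than the actual layout of the MTR test:

```python
# Illustrative sketch only: paths, table names and checks are placeholders,
# not the actual layout of the MTR test.
import shutil
import subprocess
from pathlib import Path

OLD_BINARIES = Path("std_data/upgrade/10.11")   # pre-stored binary table files (assumed location)
DATADIR = Path("var/mysqld.1/data/mysql")       # server's mysql schema directory (assumed location)
TABLES = ["db", "plugin", "global_priv"]        # a subset of system tables, as in the description

def load_old_binaries() -> None:
    """Copy the prior-version table files (e.g. .frm/.MYD/.MYI or .ibd) into the datadir."""
    for table in TABLES:
        for f in OLD_BINARIES.glob(f"{table}.*"):
            shutil.copy(f, DATADIR / f.name)

def upgrade_and_verify() -> None:
    """Run mysql_upgrade, then do a minimal post-upgrade sanity check on each table."""
    subprocess.run(["mysql_upgrade", "--force"], check=True)
    for table in TABLES:
        subprocess.run(["mysql", "-e", f"CHECK TABLE mysql.{table}"], check=True)

if __name__ == "__main__":
    load_old_binaries()
    upgrade_and_verify()
```

In the actual test, these steps are expressed in the MTR test language rather than as standalone commands.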

Release Notes

N/A

How can this PR be tested?

The changes only affect testing; no functionality is changed. The introduced tests pass:

main.mysql_binary_upgrade                w8 [ pass ]   1355
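
Assuming the test name shown above, it can presumably be run locally through the standard mysql-test-run.pl driver; a minimal wrapper sketch (the script path depends on the build layout):

```python
# Minimal sketch for running the new test locally via the MTR driver.
# The script path assumes a source-tree layout; adjust for your build.
import subprocess

subprocess.run(
    ["perl", "mysql-test/mysql-test-run.pl", "main.mysql_binary_upgrade"],
    check=True,
)
```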

Basing the PR against the correct MariaDB version

  • [x] This is a change in testing that is targeted towards the current MariaDB LTS.

PR quality check

  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.

anson1014 avatar Oct 01 '24 19:10 anson1014

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant avatar Oct 01 '24 19:10 CLAassistant

Similar to #3503, which is also still pending review.

anson1014 avatar Oct 01 '24 21:10 anson1014

Whilst I get that these tests are way more useful than the original tests behind #3503, I don't think this is a good idea here, for several reasons. My arguments may not count for much, and I know people will try to counter them, but they are why I will abstain from reviewing this:

  1. The on-disk binary format might be platform / endian dependent.
  2. This should cover all engines and data types, not just the auth tables.
  3. The matrix for this test will end up being so wide that it is better to do it as part of a CI system that installs the old version, sets up some tables and executes the upgrade, running some post-upgrade tests on the data (a rough sketch of what I mean follows this list). Note: if this is difficult to do, then that screams that there is a usability issue for users too, who are trying to automate these things as well.
  4. Neither this nor the original tests behind #3503 really test data integrity. It is more a case of "yea, that looks like the right number of rows in a table, we'll call that good".
  5. As little binary data as possible should be in the repo. I'd prefer scripted generators instead (which I know will not be easy in this case).
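
Roughly, what I have in mind for point 3 is something along these lines (package names, versions and helper commands are placeholders, not a real pipeline):

```python
# Rough outline only: package names, versions and commands are placeholders.
import subprocess

def sh(cmd: str) -> None:
    subprocess.run(cmd, shell=True, check=True)

def upgrade_cycle(old_pkg: str, new_pkg: str) -> None:
    sh(f"apt-get install -y {old_pkg}")                             # 1. install the old major version
    sh("mysql < setup_representative_tables.sql")                   # 2. create tables covering engines and data types
    sh("mysqldump --all-databases --no-create-info > before.sql")   #    snapshot of the data
    sh(f"apt-get install -y {new_pkg}")                             # 3. in-place upgrade to the new version
    sh("mysql_upgrade --force")
    sh("mysqldump --all-databases --no-create-info > after.sql")    # 4. post-upgrade integrity check
    sh("diff before.sql after.sql")                                 #    crude; a real test would normalise the dumps

if __name__ == "__main__":
    upgrade_cycle("mariadb-server-10.11", "mariadb-server-11.4")
```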

As it is, I think these kinds of tests are as good as a chocolate fireguard without a wider overhaul of them.

LinuxJedi avatar Oct 04 '24 12:10 LinuxJedi

I don't think this is a good idea here

I agree. The goal of the original tests behind #3503 has never been to test binary compatibility or integrity. It is mysql_upgrade.test; it is meant to test mysql_upgrade -- the tool itself, that is. The major upgrade variation was added there in the scope of MDEV-22249, also with the same purpose -- to test that mysql_upgrade between the versions keeps working, which hasn't always been the case.

MTR is a bad tool for upgrade tests in general; it is one of its well-known weaknesses. Some tests can be added in some cases for specific bugfixes, but that's about it. Adding a few tables there doesn't increase confidence, it only increases the volume of code and packages.

The matrix for this test will end up being so wide that it is better to do it as part of a CI system that installs the old version, sets up some tables and executes the upgrade, running some post-upgrade tests on the data. Note: If this is difficult to do, then that screams that there is a usability issue for users too, who are trying to automate these things as well.

The reason we are not doing it in the official CIs is that the matrix is actually so big that it doesn't fit even there, at least not on a per-push basis. We need a solution for scheduled runs to do it. I mean, it can be done for default installations (we have stubs for that in package upgrade tests; it would only need a bit more meaningful data), but it won't be a comprehensive matrix either.

In addition to engines and versions, the matrix also includes encryption, compression, some important engine-specific parameters, etc. I used to run such tests outside the official CIs on a schedule, and even a limited selection was taking several hours for one build; basically, the tests were running non-stop 24/7. Currently a subset of them is run on an ad-hoc basis.

The time could be reduced if, instead of installing the old version and generating the data each time, we had pre-stored old datadirs, but that considerably increases maintenance, as new data needs to be generated often (and also separately for different platforms, etc.). I tried that too, and found it not sustainable.
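
Just to illustrate the scale, even a toy enumeration of those dimensions grows quickly (the values here are examples, not the actual set we were running):

```python
# Toy illustration of the matrix size; dimension values are examples only.
from itertools import product

def ver(s: str):
    return tuple(int(x) for x in s.split("."))

source_versions = ["10.4", "10.5", "10.6", "10.11"]
target_versions = ["10.11", "11.4"]
engines         = ["InnoDB", "Aria", "MyISAM"]
encryption      = [False, True]
compression     = [False, True]
page_sizes      = ["4k", "16k", "64k"]   # one example of an engine-specific parameter

cells = [c for c in product(source_versions, target_versions,
                            engines, encryption, compression, page_sizes)
         if ver(c[0]) < ver(c[1])]       # upgrades only

print(len(cells), "combinations, before adding platforms or data shapes")
```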

elenst avatar Oct 04 '24 13:10 elenst

The reason we are not doing it in the official CIs is that the matrix is actually so big that it doesn't fit even there, at least not on a per-push basis. We need a solution for scheduled runs to do it. I mean, it can be done for default installations (we have stubs for that in package upgrade tests; it would only need a bit more meaningful data), but it won't be a comprehensive matrix either.

I assume by this you mean there are finite compute resources available? This is something that should be easy to parallelise across hundreds of VMs or containers so that it all runs in minutes. I can even think of several optimisations that could make it quicker, particularly with containers (although it would probably work with VMs too).

LinuxJedi avatar Oct 04 '24 14:10 LinuxJedi