Risky testing of versions of provisional extensions
OpenCL CTS includes testing for the provisional extensions cl_khr_command_buffer and cl_khr_command_buffer_mutable_dispatch. Both of these contain a warning:
This is a provisional extension and must be used with caution. See the description of provisional header files for enablement and stability details.
The warning exists because, for as long as the extension has provisional status, future versions may contain both ABI-incompatible and API-incompatible changes. Any code that wishes to use this extension must pay attention to the extension's version. This is not hypothetical: ABI-incompatible and API-incompatible changes have been made earlier this year.
For cl_khr_command_buffer, the extension version is checked by https://github.com/KhronosGroup/OpenCL-CTS/blob/918d561c6c42fb7f3e2ffe1d1574c0c37bb2b15b/test_conformance/extensions/cl_khr_command_buffer/basic_command_buffer.h#L102-L112
For cl_khr_command_buffer_mutable_dispatch, the extension version is checked by https://github.com/KhronosGroup/OpenCL-CTS/blob/918d561c6c42fb7f3e2ffe1d1574c0c37bb2b15b/test_conformance/extensions/cl_khr_command_buffer/cl_khr_command_buffer_mutable_dispatch/mutable_command_basic.h#L90-L95 and https://github.com/KhronosGroup/OpenCL-CTS/blob/918d561c6c42fb7f3e2ffe1d1574c0c37bb2b15b/test_conformance/extensions/cl_khr_command_buffer/cl_khr_command_buffer_mutable_dispatch/mutable_command_info.cpp#L127-L133
All three checks have the following in common: if the reported extension version is older than what OpenCL CTS tests, skip the test.
Please consider what happens when, for whatever reason, a version of OpenCL CTS that is not fully up to date is used. Suppose today's version of OpenCL CTS is used to test a future OpenCL implementation that reports availability of version 0.9.7 of cl_khr_command_buffer. The skip logic does not trigger, because 0.9.7 is not older than the version that OpenCL CTS tests, so the current version of the test is run. Because 0.9.7 is permitted to break compatibility, we get implicitly undefined behavior that may manifest itself in any number of ways. If we are lucky, functions that are looked for will not be found. If we are unlucky, functions that are looked for will be found, but will not have the expected behavior.
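To make the scenario concrete, here is a minimal sketch of a "skip if older" check. It is not the CTS code itself: `make_version` is a hypothetical helper that mirrors the bit layout of `CL_MAKE_VERSION` (10 bits major, 10 bits minor, 12 bits patch), and `skips_with_minimum_check` is a hypothetical stand-in for the three checks linked above.

```cpp
#include <cstdint>

// Hypothetical stand-in for cl_version, so the sketch builds without OpenCL headers.
using version_t = std::uint32_t;

// Mirrors the CL_MAKE_VERSION layout: 10 bits major, 10 bits minor, 12 bits patch.
constexpr version_t make_version(std::uint32_t major, std::uint32_t minor,
                                 std::uint32_t patch)
{
    return ((major & 0x3ffu) << 22) | ((minor & 0x3ffu) << 12)
        | (patch & 0xfffu);
}

// Hypothetical model of the existing checks: skip only when the reported
// extension version is older than the version the tests were written for.
constexpr bool skips_with_minimum_check(version_t reported, version_t tested)
{
    return reported < tested;
}

// The version that this sketch assumes the tests target.
constexpr version_t tested_version = make_version(0, 9, 5);
```

With this model, an implementation reporting 0.9.3 is skipped, but one reporting 0.9.7 is not, so the 0.9.5 tests would run against an interface that may have changed incompatibly.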
In addition, although it only rarely makes sense to run older versions of OpenCL CTS[1], if people look for guidance on how to use these extensions by checking how they get tested in OpenCL CTS and the version checks that are used here are copied elsewhere, it becomes more likely that they are used in software that is not updated as frequently for newer versions of the extension.
I would like to prevent this from becoming an issue in the future, and would like to work to submit a pull request to make improvements here, but would like to request feedback on what behavior would be desired.
At time of writing, the most recent version of the cl_khr_command_buffer extension is 0.9.5.
What behavior is expected when running against an implementation that supports version 0.9.3 (an older version)?
At the moment, the result is that tests are skipped. It is possible to leave it that way, but a possible alternate behavior is to mark the test as failed.
What behavior is expected when running against an implementation that supports version 0.9.7 (a newer version)?
At the moment, the result is that tests are run and we risk undefined behavior. It is possible to leave it that way, but possible alternate behaviors are to mark the test as skipped, or to mark the test as failed.
My personal preference here is, in both cases, to mark the test as failed: the current OpenCL spec says that the version of the extension is 0.9.5, so an implementation that claims to implement this extension, but reports its version as something other than 0.9.5, does not implement the current OpenCL spec. That said, all options are justifiable.
Background: uncertainty over the API and ABI implications is why in oneAPI Construction Kit, we have not yet updated to the current version of the cl_khr_command_buffer extension. We would like to update to the current version. Doing so would break our compatibility with older versions of OpenCL CTS. We would like to say that we consider this an issue in those older versions of OpenCL CTS, and work to ensure that this issue does not arise again in the future, but we do not yet know whether this point of view will be shared.
1. A legitimate reason for running older versions of OpenCL CTS might be that a change that breaks compilation for some users is accidentally merged, and this goes unnoticed for a while.
We discussed this on the 11/11/24 teleconference, and the consensus was that using a version check for equality is preferable to the current behavior of checking for a more recent version.
We didn't discuss whether, or when, to mark the tests as failed rather than skipped if this check does not succeed, or whether the two cases of older and newer versions should be treated differently. For example, we could mark the test as skipped if the implementation is newer, but as failed if it is older.
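As a rough sketch of how an equality check could keep that open question configurable, the mismatch outcome can be passed in as a policy. All names here (`Outcome`, `MismatchPolicy`, `decide`, `make_version`) are hypothetical and do not exist in the CTS harness; `make_version` only mirrors the bit layout of `CL_MAKE_VERSION`.

```cpp
#include <cstdint>

// Hypothetical stand-in for cl_version, so the sketch builds without OpenCL headers.
using version_t = std::uint32_t;

// Mirrors the CL_MAKE_VERSION layout: 10 bits major, 10 bits minor, 12 bits patch.
constexpr version_t make_version(std::uint32_t major, std::uint32_t minor,
                                 std::uint32_t patch)
{
    return ((major & 0x3ffu) << 22) | ((minor & 0x3ffu) << 12)
        | (patch & 0xfffu);
}

enum class Outcome { Run, Skip, Fail };

// What to report when the implementation's version is not the tested one.
struct MismatchPolicy
{
    Outcome if_older;
    Outcome if_newer;
};

// Run only on an exact match; otherwise defer to the policy.
constexpr Outcome decide(version_t reported, version_t tested,
                         MismatchPolicy policy)
{
    return reported == tested ? Outcome::Run
        : reported < tested   ? policy.if_older
                              : policy.if_newer;
}
```

The "skip if newer, fail if older" variant would be `MismatchPolicy{ Outcome::Fail, Outcome::Skip }`, while "fail in both cases" would be `MismatchPolicy{ Outcome::Fail, Outcome::Fail }`.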
Something we might want to consider in this area is:
- Combine the extension check and the extension version check into a single new harness function.
- Add a harness command-line option to bypass extension checks, set to "false" by default.
- This will allow implementations that do not support an exact extension version to opt-in to testing, if desired.
- Eventually we could expand this to check for newer versions within a major non-beta version, so for example extension version 1.1 tests will run on a device that reports extension version 1.2, but not extension versions 0.9, 1.0, or 2.0.
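The points above could be combined along these lines. This is only a sketch under several assumptions: the function and parameter names are hypothetical, a major version of 0 is taken to mean "beta" (exact match required), and the bypass option is assumed to skip only the version comparison, not the check that the extension is reported at all.

```cpp
#include <cstdint>

// Hypothetical stand-in for cl_version, so the sketch builds without OpenCL headers.
using version_t = std::uint32_t;

// Mirrors the CL_MAKE_VERSION layout: 10 bits major, 10 bits minor, 12 bits patch.
constexpr version_t make_version(std::uint32_t major, std::uint32_t minor,
                                 std::uint32_t patch)
{
    return ((major & 0x3ffu) << 22) | ((minor & 0x3ffu) << 12)
        | (patch & 0xfffu);
}

constexpr std::uint32_t major_of(version_t v) { return (v >> 22) & 0x3ffu; }

// Assumed rule: beta (0.x.y) versions must match exactly; otherwise the
// reported version must share the major version and be at least the tested one.
constexpr bool version_compatible(version_t reported, version_t tested)
{
    return major_of(tested) == 0
        ? reported == tested
        : (major_of(reported) == major_of(tested) && reported >= tested);
}

// Single entry point combining the extension check and the version check.
// bypass_version_check models the proposed command-line option (default off).
constexpr bool should_run_extension_test(bool extension_reported,
                                         version_t reported, version_t tested,
                                         bool bypass_version_check = false)
{
    return extension_reported
        && (bypass_version_check || version_compatible(reported, tested));
}
```

Under these assumptions, tests written for 1.1 would run on a device reporting 1.2 but not on 0.9, 1.0, or 2.0, and an implementation reporting 0.9.7 could still opt in to the 0.9.5 tests by enabling the bypass.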
We did something kind of similar for SPIR-V version checks, see:
https://github.com/KhronosGroup/OpenCL-CTS/blob/02e99f4554fb7f353562a09d1e922f19d2ff7117/test_conformance/spirv_new/spirvInfo.hpp#L23-L41