OpenCL-ICD-Loader Improved Unit Testing Feedback

I've been experimenting with alternate ways to test the OpenCL ICD loader, with the following goals in mind:

Use a modern unit testing framework.
Test as much as possible without needing to manually set up a stub ICD.
Automatically generate as much test content as possible.

I have something working and would like feedback before going further. Each test still needs some manual setup, and I don't want to spend time on that if this isn't heading in the right direction.

Some details:

My work is staged here:

https://github.com/bashbaug/OpenCL-ICD-Loader/tree/improved-testing

Most of the new code is currently in the "test_new" directory. I can create a WIP pull request if that is easier for review.
I'm using Googletest for now, since it's the framework I am most familiar with, but I can switch to a different testing framework if desired.
I've managed to generate almost all of the "recorder" ICD, which I use for testing. The "recorder" ICD simply records each of the arguments it receives in memory before returning. The test code can then check that the recorded arguments match the arguments it passed. This is very similar to the existing "stub" ICD, except that it records the arguments in memory rather than in a file.
I've managed to generate a "template" to simplify test development for each API, but some parts of the template still need to be filled in manually. If anyone has ideas to completely generate each test I'd love to hear them!
Things this won't test: any of the ICD discovery code, such as the code to scan the registry or /etc/OpenCL/vendors.
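To make the recorder idea above concrete, here is a minimal sketch, not the code from the branch. It assumes a single made-up entry point (`recorder_SomeApi`) and made-up stand-in types so that it builds without the OpenCL headers. Plain `assert` stands in for Googletest's `EXPECT_EQ`, and the call goes straight to the recorder rather than through the loader's dispatch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for OpenCL types (assumptions, not the real cl.h types).
using fake_int = std::int32_t;
using fake_uint = std::uint32_t;
struct FakeObject {};
using fake_handle = FakeObject*;

// Arguments of one call to the hypothetical entry point below.
struct SomeApiCall {
    fake_handle handle;
    fake_uint num_entries;
    fake_uint* num_returned;
};

// In-memory log kept by the recorder side.
std::vector<SomeApiCall>& recorded_calls() {
    static std::vector<SomeApiCall> calls;
    return calls;
}

// Recorder entry point: store the arguments as received, then return 0 (success).
fake_int recorder_SomeApi(fake_handle handle, fake_uint num_entries,
                          fake_uint* num_returned) {
    recorded_calls().push_back({handle, num_entries, num_returned});
    return 0;
}

// Test side: make one call, then compare what was recorded with what was passed.
void test_arguments_round_trip() {
    recorded_calls().clear();

    FakeObject object;
    fake_uint count = 0;
    fake_int err = recorder_SomeApi(&object, 2, &count);

    assert(err == 0);
    assert(recorded_calls().size() == 1);
    const SomeApiCall& call = recorded_calls().back();
    assert(call.handle == &object);
    assert(call.num_entries == 2);
    assert(call.num_returned == &count);
}
```

In a real test the call would be made through the loader, so a mismatch between passed and recorded arguments would point at the loader's forwarding code rather than at the recorder.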

What I'm looking for:

Is this helpful? Should I keep going?
Are there any suggested tweaks to the test "template" before I start adding the remaining APIs?

Aside: I think we should be able to automatically generate a bunch of the ICD loader code itself, similar to the way I've generated the recorder ICD and test template. I plan to look at this at some point in the near future, but if someone else wants to take a look at it first, the gen_tester.py script could be a helpful place to start from.

Thanks!

Jun 20 '19 05:06 bashbaug

I like the way you are testing the loader here. I think I have an idea to completely automate the generation of 95% of the test. I have a couple of questions though, and one remark. Questions:

Is there any reason to avoid generating a mockup ICD? This would allow testing 100% of the API while staying fully generated. I imagine the problem would lie with Windows, as I don't see any obstacles on Unix-like systems.
We could also use the same approach to create a test layer, to check both the cases where a call can't be dispatched and the layer mechanism itself.
Do you mind if I pick up where you left off?

Remark:

If the same framework generates both the loader and the tests, what is the intrinsic value of the tests? Wouldn't the tests have the same blind spots as the generated loader?

May 03 '21 22:05 Kerilk

> Do you mind if I pick-up where you left it

No, not at all! I'm also happy to help if needed, just let me know what I can do.

> Using the same framework to generate both the loader and the tests, what is the intrinsic value of the tests? Wouldn't the tests have the same blind spots as the generated loader?

This is a very interesting question now that we're generating most of the loader...

May 04 '21 04:05 bashbaug

> Using the same framework to generate both the loader and the tests, what is the intrinsic value of the tests? Wouldn't the tests have the same blind spots as the generated loader?

One alternative is to test the generator itself, e.g. for input X, does it produce the expected output Y? The variety of patterns in cl.xml, however, means it would be hard to write tests with good coverage. Also, if the bug is in cl.xml itself, no test generation approach will catch it.
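As a rough illustration of the "input X, expected output Y" idea, here is a toy golden test. Everything in it is an assumption made for the sketch: the real generator is a Python script driven by cl.xml, while this stand-in is a few lines of C++ with a made-up `ApiDesc` description and a made-up `dispatch->` output format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified description of one API entry.
struct Param {
    std::string type;
    std::string name;
};

struct ApiDesc {
    std::string ret;
    std::string name;
    std::vector<Param> params;
};

// Toy generator: emits a one-line forwarding stub for the given description.
std::string generate_stub(const ApiDesc& api) {
    std::string decl;
    std::string args;
    for (std::size_t i = 0; i < api.params.size(); ++i) {
        if (i != 0) {
            decl += ", ";
            args += ", ";
        }
        decl += api.params[i].type + " " + api.params[i].name;
        args += api.params[i].name;
    }
    return api.ret + " " + api.name + "(" + decl + ") { return dispatch->" +
           api.name + "(" + args + "); }";
}

// Golden test: a fixed input must produce exactly the hand-written output.
bool golden_test_passes() {
    ApiDesc api;
    api.ret = "cl_int";
    api.name = "clFlush";
    api.params.push_back(Param{"cl_command_queue", "queue"});

    const std::string expected =
        "cl_int clFlush(cl_command_queue queue) "
        "{ return dispatch->clFlush(queue); }";
    return generate_stub(api) == expected;
}
```

The expected string is written by hand rather than generated, which is what keeps it independent of the generator. The cost is one hand-written case per pattern, which is where the coverage concern comes from.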

The current testing is pretty minimal. Pragmatically, I think generated tests would still provide more safety than that, even with the risk of blind spots. For hand-written tests there is still the CTS, which provides more coverage, but it isn't suitable for use in CI here, so it's unknown how long an issue would take to be discovered.

May 10 '21 22:05 alycm