Where should we put the code used to generate test data?
Currently the IDL code used to generate the test data lives in inline comments in the test cases. This probably won't be very sustainable, so should there be an `idl` folder in the repo specifically for the code used to generate the test case data? We can always move it to a sub/separate repo later.
Current code for the comparison plots is in this gist.
@wtbarnes has done something nice with testing data and IDL in aiapy, if I remember correctly; maybe he has thoughts.
So the aiapy approach is to provide a set of tests that will only run if a valid SSW install exists. If it does, those tests will actually run the IDL code (using hissw) and compare that output to the equivalent output in aiapy. My approach here is to just have a single fixture that is skipped if no valid SSW install is present and then that skip propagates down to all the comparison tests.
See this test as one example. I like this approach as it provides a quick and systematic way to check consistency with SSW. No manual checking needed.
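Roughly, the pattern looks like this (just a minimal sketch, assuming hissw's `Environment`/`run` interface; the IDL snippet, the availability check, and the `save_vars` usage are placeholders rather than the actual aiapy tests):

```python
import numpy as np
import pytest

hissw = pytest.importorskip("hissw")


def idl_available():
    """Return True if hissw can find a usable IDL + SSW installation."""
    try:
        hissw.Environment().run("")
        return True
    except Exception:
        return False


@pytest.fixture(scope="session")
def idl_environment():
    # Skipping here propagates to every test that requests this fixture.
    if not idl_available():
        pytest.skip("No working IDL/SSW installation found; "
                    "skipping SSW comparison tests.")
    return hissw.Environment()


def test_matches_ssw(idl_environment):
    # Trivial example: run an IDL snippet and compare with the Python equivalent.
    results = idl_environment.run("foo = findgen(10)", save_vars=["foo"])
    np.testing.assert_allclose(results["foo"], np.arange(10, dtype=float))
```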
However, it has the disadvantage that your answers depend on a given SSW installation, which could vary wildly depending on where you're running the tests (hissw does set up all of the paths, so that part is at least consistent). You also can't run these on CI unless you've figured out some way to containerize SSW and IDL.
One possible approach: use a similar approach to what we did in aiapy, but first check whether the data exist on disk locally and, if they do, return those instead of running the IDL code. If they don't, run the IDL code to generate the data and save the result. That way you preserve the code that generated the data, but avoid relying on running that IDL code every time.
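Something along these lines (a rough sketch reusing the `idl_environment` fixture from above; the cache file name, the IDL snippet, and the saved variable name are all assumptions for illustration):

```python
from pathlib import Path

import numpy as np
import pytest

CACHE_DIR = Path(__file__).parent / "data"


@pytest.fixture
def ssw_reference(request):
    cache_file = CACHE_DIR / "ssw_reference.npy"
    if cache_file.exists():
        # Reuse previously generated SSW output; no IDL/SSW needed at all.
        return np.load(cache_file)
    # Only now do we need IDL; requesting the fixture lazily means the test
    # is skipped only when the data are missing *and* IDL is unavailable.
    idl_environment = request.getfixturevalue("idl_environment")
    results = idl_environment.run("result = findgen(100)", save_vars=["result"])
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    np.save(cache_file, results["result"])
    return results["result"]
```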
Yeah, because of the issues of not having IDL/SSW available and of differing versions, I was thinking about creating and running some IDL code to generate the test case data and then uploading the resulting files (`.sav` or FITS) somewhere, maybe data.sunpy.org, for use in our tests. We really need to be able to run our tests without relying on IDL/SSW, while ensuring we maintain agreement with SSW results where applicable.
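A test could then just pull the pre-generated file and compare against it, e.g. (minimal sketch; the URL and the variable name stored in the `.sav` file are hypothetical, and the "Python result" here is only a stand-in for the real computation under test):

```python
import numpy as np
from astropy.utils.data import download_file
from scipy.io import readsav

# Hypothetical location of a pre-generated SSW output file.
REFERENCE_URL = "https://data.sunpy.org/some-package/ssw_reference.sav"


def test_against_ssw_reference():
    # download_file caches locally, so the file is only fetched once.
    sav_path = download_file(REFERENCE_URL, cache=True)
    reference = readsav(sav_path)
    # Stand-in for the real Python computation being validated.
    python_result = np.arange(reference["result"].size, dtype=float)
    np.testing.assert_allclose(python_result, reference["result"], rtol=1e-6)
```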
I like the idea of hissw providing a pytest plugin with data caching :laughing:
It could try to run the code to generate the arrays, or download them from a URL, in much the same way pytest-mpl does for matplotlib figures.