pytest gallery tests
🚀 Pull Request
Description
Uses pytest parametrization to remove the duplication from the gallery tests. We now only need one test file instead of 24 🙂. To get it working, I pulled the check_graphic method off the GraphicsTest class so we can call it as a stand-alone function.
Needs https://github.com/SciTools/iris-test-data/pull/79
~In draft for now because~
- ~Probably the context managers could be replaced with pytest fixtures.~ done
- ~I have inserted some deliberate fails into two examples so we can see that the tests work as expected.~ reverted
Relevant test output
[gw2] [ 4%] FAILED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_hovmoller]
[gw1] [ 8%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_wind_barbs]
[gw11] [ 12%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_polynomial_fit]
[gw13] [ 16%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_lineplot_with_legend]
[gw16] [ 20%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_anomaly_log_colouring]
[gw20] [ 25%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_polar_stereo]
[gw21] [ 29%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_cross_section]
[gw8] [ 33%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_load_nemo]
[gw6] [ 37%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_wind_speed]
[gw0] [ 41%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_COP_1d]
[gw23] [ 45%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_custom_aggregation]
[gw18] [ 50%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_projections_and_annotations]
[gw22] [ 54%] FAILED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_inset]
[gw19] [ 58%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_SOI_filtering]
[gw9] [ 62%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_atlantic_profiles]
[gw15] [ 66%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_rotated_pole_mapping]
[gw5] [ 70%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_TEC]
[gw17] [ 75%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_global_map]
[gw14] [ 79%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_custom_file_loading]
[gw3] [ 83%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_COP_maps]
[gw7] [ 87%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_deriving_phenomena]
[gw10] [ 91%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_orca_projection]
[gw12] [ 95%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_coriolis]
[gw4] [100%] PASSED docs/gallery_tests/test_gallery_examples.py::test_plot_example[plot_lagged_ensemble]
=================================== FAILURES ===================================
______________________ test_plot_example[plot_hovmoller] _______________________
[gw2] linux -- Python 3.8.13 /tmp/cirrus-ci-build/.nox/doctest/bin/python
example_code = 'plot_hovmoller'
@pytest.mark.parametrize("example_code", gallery_examples())
def test_plot_example(example_code):
with fail_any_deprecation_warnings():
with add_gallery_to_path():
module = importlib.import_module(example_code)
print(module.__file__)
with show_replaced_by_check_graphic(
f"gallery_tests.test_{example_code}"
):
> module.main()
docs/gallery_tests/test_gallery_examples.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
docs/gallery_code/meteorology/plot_hovmoller.py:22: in main
iris.util.approx_equal(3, 4)
lib/iris/util.py:413: in approx_equal
warn_deprecated(wmsg)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
msg = 'iris.util.approx_equal has been deprecated and will be removed, please use math.isclose instead.'
stacklevel = 2
def warn_deprecated(msg, stacklevel=2):
"""
Issue an Iris deprecation warning.
Calls :func:`warnings.warn', to emit the message 'msg' as a
:class:`warnings.warning`, of the subclass :class:`IrisDeprecationWarning`.
The 'stacklevel' keyword is passed through to warnings.warn. However by
default this is set to 2, which ensures that the identified code line is in
the caller, rather than in this routine.
See :mod:`warnings` module documentation.
For example::
>>> from iris._deprecation import warn_deprecated
>>> def arrgh():
... warn_deprecated('"arrgh" is deprecated since version 3.5.')
... return 1
...
>>> arrgh()
__main__:2: IrisDeprecation: "arrgh" is deprecated since version 3.5.
1
>>> arrgh()
1
>>>
"""
> warnings.warn(msg, IrisDeprecation, stacklevel=stacklevel)
E iris._deprecation.IrisDeprecation: iris.util.approx_equal has been deprecated and will be removed, please use math.isclose instead.
lib/iris/_deprecation.py:47: IrisDeprecation
----------------------------- Captured stdout call -----------------------------
/tmp/cirrus-ci-build/docs/gallery_code/meteorology/plot_hovmoller.py
________________________ test_plot_example[plot_inset] _________________________
[gw22] linux -- Python 3.8.13 /tmp/cirrus-ci-build/.nox/doctest/bin/python
example_code = 'plot_inset'
@pytest.mark.parametrize("example_code", gallery_examples())
def test_plot_example(example_code):
with fail_any_deprecation_warnings():
with add_gallery_to_path():
module = importlib.import_module(example_code)
print(module.__file__)
with show_replaced_by_check_graphic(
f"gallery_tests.test_{example_code}"
):
> module.main()
docs/gallery_tests/test_gallery_examples.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
docs/gallery_code/general/plot_inset.py:65: in main
qplt.show()
docs/gallery_tests/gallerytest_util.py:66: in replacement_show
check_graphic(unique_id)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
unique_id = 'gallery_tests.test_plot_inset.0'
def check_graphic(unique_id):
"""
Check the hash of the current matplotlib figure matches the expected
image hash for the current graphic test.
To create missing image test results, set the IRIS_TEST_CREATE_MISSING
environment variable before running the tests. This will result in new
and appropriately "<hash>.png" image files being generated in the image
output directory, and the imagerepo.json file being updated.
"""
from PIL import Image
import imagehash
dev_mode = os.environ.get("IRIS_TEST_CREATE_MISSING")
repo_fname = os.path.join(_RESULT_PATH, "imagerepo.json")
with open(repo_fname, "rb") as fi:
repo: Dict[str, List[str]] = json.load(codecs.getreader("utf-8")(fi))
try:
#: The path where the images generated by the tests should go.
image_output_directory = os.path.join(
os.path.dirname(__file__), "result_image_comparison"
)
if not os.access(image_output_directory, os.W_OK):
if not os.access(os.getcwd(), os.W_OK):
raise IOError(
"Write access to a local disk is required "
"to run image tests. Run the tests from a "
"current working directory you have write "
"access to to avoid this issue."
)
else:
image_output_directory = os.path.join(
os.getcwd(), "iris_image_test_output"
)
result_fname = os.path.join(
image_output_directory, "result-" + unique_id + ".png"
)
if not os.path.isdir(image_output_directory):
# Handle race-condition where the directories are
# created sometime between the check above and the
# creation attempt below.
try:
os.makedirs(image_output_directory)
except OSError as err:
# Don't care about "File exists"
if err.errno != 17:
raise
def _create_missing():
fname = "{}.png".format(phash)
base_uri = (
"https://scitools.github.io/test-iris-imagehash/"
"images/v4/{}"
)
uri = base_uri.format(fname)
hash_fname = os.path.join(image_output_directory, fname)
uris = repo.setdefault(unique_id, [])
uris.append(uri)
print("Creating image file: {}".format(hash_fname))
figure.savefig(hash_fname)
msg = "Creating imagerepo entry: {} -> {}"
print(msg.format(unique_id, uri))
lock = filelock.FileLock(
os.path.join(_RESULT_PATH, "imagerepo.lock")
)
# The imagerepo.json file is a critical resource, so ensure
# thread safe read/write behaviour via platform independent
# file locking.
with lock.acquire(timeout=600):
with open(repo_fname, "wb") as fo:
json.dump(
repo,
codecs.getwriter("utf-8")(fo),
indent=4,
sort_keys=True,
)
# Calculate the test result perceptual image hash.
buffer = io.BytesIO()
figure = plt.gcf()
figure.savefig(buffer, format="png")
buffer.seek(0)
phash = imagehash.phash(Image.open(buffer), hash_size=_HASH_SIZE)
if unique_id not in repo:
# The unique id might not be fully qualified, e.g.
# expects iris.tests.test_quickplot.TestLabels.test_contour.0,
# but got test_quickplot.TestLabels.test_contour.0
# if we find single partial match from end of the key
# then use that, else fall back to the unknown id state.
matches = [key for key in repo if key.endswith(unique_id)]
if len(matches) == 1:
unique_id = matches[0]
if unique_id in repo:
uris = repo[unique_id]
# Extract the hex basename strings from the uris.
hexes = [
os.path.splitext(os.path.basename(uri))[0] for uri in uris
]
# Create the expected perceptual image hashes from the uris.
to_hash = imagehash.hex_to_hash
expected = [to_hash(uri_hex) for uri_hex in hexes]
# Calculate hamming distance vector for the result hash.
distances = [e - phash for e in expected]
if np.all([hd > _HAMMING_DISTANCE for hd in distances]):
if dev_mode:
_create_missing()
else:
figure.savefig(result_fname)
msg = (
"Bad phash {} with hamming distance {} " "for test {}."
)
msg = msg.format(phash, distances, unique_id)
if _DISPLAY_FIGURES:
emsg = "Image comparison would have failed: {}"
print(emsg.format(msg))
else:
emsg = "Image comparison failed: {}"
> raise AssertionError(emsg.format(msg))
E AssertionError: Image comparison failed: Bad phash ebff6992b50096ad9267dac4d644943294924cdbc95d4b699d29952dcea46ad4 with hamming distance [20, 18, 6] for test gallery_tests.test_plot_inset.0.
lib/iris/tests/__init__.py:343: AssertionError
----------------------------- Captured stdout call -----------------------------
/tmp/cirrus-ci-build/docs/gallery_code/general/plot_inset.py
~Note:~
- ~this will not work so should either update or remove.~
- ~this advice may need updating (currently the Makefile in docs/src does not have gallerytest).~
Done.
~Possibly should have some version of the setUp and tearDown here.~
~[https://github.com/SciTools/iris/blob/4fa3f06caf2175b8654fdb4a2bcd83080b7a1418/lib/iris/tests/init.py#L1220-L1236]~
Done
I pulled the
check_graphicmethod off theGraphicsTestclass so we can call it as a stand-alone function.
Huh. I see @wjbenfold already did that at #4759.
I had a rethink on the test strategy: previously we created each figure and called check_graphic on it before creating the next figure. Current code creates all the figures for a particular example script and then loops through them to call check_graphic.
The new method is less efficient as we are holding more figures in memory and if the first one fails we have created the others unnecessarily. However, I think the new method is simpler/clearer (relevant commit is https://github.com/SciTools/iris/pull/4792/commits/63cdd66b9419b981284284b78bca3fa7841456ff).
Ready for review I think. For examples of how the failure would look if an image changed or if functionality was deprecated, see the CI for the penultimate commit.
@SciTools/iris-devs is this one likely to be reviewed before pillow is unpinned? I'm just wondering whether to fix the imagerep.json conflicts now or wait for it to change again.
@SciTools/iris-devs is this one likely to be reviewed before
pillowis unpinned? I'm just wondering whether to fix theimagerep.jsonconflicts now or wait for it to change again.
We're planning to unpin pillow pretty soon(tm), so I'd hang fire :)
Rebased following #4759 and #4826. recreate_imagerepo.py came in very handy :sunglasses:
Needs https://github.com/SciTools/iris-test-data/pull/79
Hi @wjbenfold great to “see” you again 😊
Thanks for the review. I didn’t see any errors when I checked locally but maybe I missed something. I suspect that #4871 needs prioritising so will wait for that to go in, rebase both branches, and check again.
OK I checked and yes, I do get failures for those examples. They match the gallery test failures in #4846. So once I rebase and pick up #4847, all should be well.
I'm kinda surprised the plot_lagged_ensemble didn't fall over in the pp-loading though :confused:
#4871 is abandoned for now, so I've gone ahead and rebased. Both gallery tests and main tests are passing for me locally. I've also added a whatsnew entry and bumped the IRIS_TEST_DATA_VERSION in anticipation of https://github.com/SciTools/iris-test-data/pull/79.
@bjlittle you're assigned to this: is this on your list before holidays or should someone else take a look?
When I run the tests I now get
lib/iris/__init__.py:104: in <module>
from ._version import version as __version__ # noqa: F401
E ModuleNotFoundError: No module named 'iris._version'
I'm sufficiently out of the loop to not understand what's broken here - I can't see a file named _version?
When I run the tests I now get
lib/iris/__init__.py:104: in <module> from ._version import version as __version__ # noqa: F401 E ModuleNotFoundError: No module named 'iris._version'I'm sufficiently out of the loop to not understand what's broken here - I can't see a file named
_version?
@wjbenfold this branch needs #4892
Thanks chaps. I have rebased again. Gallery tests all pass for me locally. In the main tests I see two failures
- the regridding one discussed at #4893
- ~something to do with a dot file, which I think I've seen before and isn't generally reproducible.~ Never mind, I think I've found the cause of that.
So I think CI should pass once https://github.com/SciTools/iris-test-data/pull/79 is merged.
Edit: I hadn't registered the _version thing either!
Rebased again. Tests all passing for me locally.
Thanks @wjbenfold! Is there more to do here or can this be merged?
Thanks @wjbenfold! Is there more to do here or can this be merged?
I was just checking through the conversations to check they were resolved. It's now in!
Thanks @wjbenfold! :tada:
Thanks so much @rcomer and @wjbenfold. We've been pretty thin on the ground during these summer weeks so your contribution is greatly appreciated 🥰