astroquery icon indicating copy to clipboard operation
astroquery copied to clipboard

MNT: tests should use temporary files and delete them

Open bsipocz opened this issue 4 years ago • 8 comments

When using pytest directly, rather than the astropy test runner, quite a few tests are creating files that are not cleaned up after the run. The repo should be left clean of these once the tests are run. I have these after a full run with --remote-data.

	J6FL25S4Q.vot
	J6FL25S4Q.vot.fits.gz
	J6FL25S4Q.vot.jpg
	X0MC5101T.vot
	astroquery/xmatch/tests/data/http_result.csv
	doctests.py
	file.png
	filename.tar
	int_data/
	moc.fits
	tarfile_lightcurves.tar
	target.xml
	w0ji0v01t_c2f.fits.gz

edit: maybe add a command at the end of the remote-data tests that lists everything that got generated during the run? maybe even test that no new files are hanging around?

bsipocz avatar Jul 06 '21 16:07 bsipocz

When trying to package astroquery 0.4.3 for Debian, some of these files end up being installed in the top-level of the Python library directory /usr/lib/python3/dist-packages, which is problematic.

ViviCoder avatar Aug 26 '21 16:08 ViviCoder

Hmm, don't think these ended up in the sdist, but let me know if that's the case. Do you run the test suite as part of the build/install process that generated the files?

bsipocz avatar Aug 26 '21 17:08 bsipocz

Yes, tests are run during the packaging process. I suppose the files come from there.

ViviCoder avatar Aug 26 '21 23:08 ViviCoder

Yes they are. If there is any chance to start from the fresh sdist again after the test are run and passed, it would be a suitable workaround.

OTOH, this is a very annoying issue, so I try to get back to it sooner rather than later unless someone else is quicker and opens a PR :)

bsipocz avatar Aug 27 '21 00:08 bsipocz

This issue also makes some tests fail with PermissionError when the package is installed in a system location. See for example https://ci.debian.net/data/autopkgtest/unstable/amd64/a/astroquery/21354351/log.gz

ViviCoder avatar May 04 '22 07:05 ViviCoder

cc @esdc-esac-esa-int

bsipocz avatar May 04 '22 15:05 bsipocz

After merge #2461 I cannot seem to recreate this dumping of files. Are you using this command to run the tests?

python setup.py test -P esa,esasky,gaia,utils.tap -R

javier-ballester avatar Jul 12 '22 07:07 javier-ballester

@javier-ballester - the files only show up when running the tests directly with pytest as it runs them directly from the repo source tree rather than from a temp copy like the astropy test runner (when you run it via setup.py)

bsipocz avatar Jul 12 '22 13:07 bsipocz

It seems that the astroquery.esasky.ESASky.get_images() method is responsible for the dumping of

$ ls int_data/scw/0042/004200100010.001/
compton_events.fits.gz		jmx2_anode_histo.fits.gz	sc_gti.fits.gz
ibis_deadtime.fits.gz		jmx2_deadtime.fits.gz		sc_hk.fits.gz
ibis_diag.fits.gz		jmx2_events.fits.gz		sc_modes.fits.gz
ibis_gti.fits.gz		jmx2_gain_corr.fits.gz		sc_orbit_param.fits.gz
ibis_hk.fits.gz			jmx2_gti.fits.gz		sc_time.fits.gz
ibis_mce.fits.gz		jmx2_hk.fits.gz			spi_acs.fits.gz
isgri_events.fits.gz		jmx2_mode.fits.gz		spi_block.fits.gz
jmx1_anode_histo.fits.gz	omc_gti.fits.gz			spi_gti.fits.gz
jmx1_deadtime.fits.gz		omc_hk.fits.gz			spi_hk.fits.gz
jmx1_events.fits.gz		omc_science_param.fits.gz	spi_oper.fits.gz
jmx1_gain_corr.fits.gz		omc_shots.fits.gz		spi_science_hk.fits.gz
jmx1_gti.fits.gz		picsit_histo_simh.fits.gz	spi_science_param.fits.gz
jmx1_hk.fits.gz			picsit_histo_sish.fits.gz	swg.fits
jmx1_mode.fits.gz		picsit_spti.fits.gz

into the cwd of the test runner, even if one uses tmp_path for the files the tests explicitly create. @imbasimba any idea where, how or why these files are produced? It's not clear to me at all in the code in esasky.core. Perhaps they are specified server side?

EDIT:

It seems that is is only produced in the test test_esasky_get_images_obs_id and not in the others that call the above method. What is it about querying using observation_ids that makes this behavior change?

Btw, this was debugged with a pytest autouse fixture I wrote to temporarily find the test that was dumping these files:

$ pytest -rsx --remote-data=any astroquery/esasky/tests/test_esasky_remote.py::TestESASky::test_esasky_get_images_obs_id -v --show-capture=no
=========== test session starts =============
platform darwin -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /Users/jdavies/miniconda3/envs/mast/bin/python
cachedir: .pytest_cache

Running tests with astroquery version 0.4.7.dev8069_testrun_testrun.

Running tests with astropy_helpers version 4.0.1.
Running tests in astroquery/esasky/tests/test_esasky_remote.py::TestESASky::test_esasky_get_images_obs_id.

Date: 2022-09-12T11:03:06

Platform: macOS-10.15.7-x86_64-i386-64bit

Executable: /Users/jdavies/miniconda3/envs/mast/bin/python

Full Python Version: 
3.10.4 (main, Mar 31 2022, 03:38:35) [Clang 12.0.0 ]

encodings: sys: utf-8, locale: UTF-8, filesystem: utf-8
byteorder: little
float info: dig: 15, mant_dig: 15

Package versions: 
Numpy: 1.23.2
Matplotlib: 3.5.3
Astropy: 5.1
APLpy: 2.1.0
pyregion: 2.1.1
regions: 0.6
pyVO: 1.3
mocpy: 0.10.1
astropy-healpix: 0.6
vamdclib: not available

Using Astropy options: remote_data: any.

hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/Users/jdavies/dev/astroquery/.hypothesis/examples')
rootdir: /Users/jdavies/dev/astroquery, configfile: setup.cfg
plugins: xdist-2.5.0, remotedata-0.3.3, forked-1.4.0, astropy-header-0.2.1, mock-3.8.2, dependency-0.5.1, astropy-0.10.0, filter-subpackage-0.1.1, hypothesis-6.54.3, openfiles-0.5.0, doctestplus-0.12.0, cov-3.0.0, arraydiff-0.5.0
collected 1 item                                                                                                            

astroquery/esasky/tests/test_esasky_remote.py::TestESASky::test_esasky_get_images_obs_id PASSED                       [100%]
astroquery/esasky/tests/test_esasky_remote.py::TestESASky::test_esasky_get_images_obs_id ERROR                        [100%]

========================================================== ERRORS ===========================================================
_______________________________ ERROR at teardown of TestESASky.test_esasky_get_images_obs_id _______________________________

    @pytest.fixture(autouse=True)
    def check_for_stragglers():
        try:
            yield
        finally:
            if os.path.exists("int_data"):
>               raise RuntimeError
E               RuntimeError

astroquery/esasky/tests/test_esasky_remote.py:22: RuntimeError
===================== 1 passed, 1 error in 7.15s ======================

EDIT 2:

It seems that only the INTEGRAL mission leaves this int_data directory laying around. Is this a feature or a bug? Perhaps it should put it into the same directory as download_dir?

jdavies-st avatar Sep 12 '22 08:09 jdavies-st

Nice troubleshooting @jdavies-st. It seems like a bug to me. I'm guessing this one is probably the offender in the case of int_data: https://github.com/astropy/astroquery/blob/a750c432d6000343157ad5d2a2a5fcd439c6fa8d/astroquery/esasky/core.py#L1495

If I'm reading it correctly, I think it should be something like: maps.append(fits.open(zip.extract(info.filename, path=mission_directory)))

imbasimba avatar Sep 12 '22 12:09 imbasimba

Thanks @imbasimba! That looks promising. I'll try it out. I have a branch I'm working on right now fixing some of these temp files that are left behind elsewhere in astroquery, so I'll give it a try.

jdavies-st avatar Sep 12 '22 13:09 jdavies-st