astroquery

TST: esasky tests take too long and download too much data

Open bsipocz opened this issue 1 year ago • 8 comments

I suspect they might be the reason for the CI timeout, too, but I don't yet have evidence.

Locally, the remote esasky tests have now been running for more than an hour (still at 50% according to pytest) and have downloaded hundreds of MB of data. So, it would be nice to do some trimming.

(The (esasky) Herschel tests also seem to be failing consistently, but I don't yet have a traceback)

cc @imbasimba

bsipocz avatar Jun 15 '24 02:06 bsipocz

After 3+ hours and 1.8 GB, I cancelled the test run; it was still at 50% and seemed to be stuck in a JWST download (using M51 is probably not the best test case).

Then I noticed that the remote-data tests in CI are run including the ones already marked with the "bigdata" pytest marker. So, as a workaround I'll skip all the bigdata tests. Nevertheless, even with that workaround, there may be room for improving the tests and cutting back on data sizes.
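
One way to express this kind of workaround is a conftest.py hook that deselects tests carrying the "bigdata" marker unless they are requested explicitly. This is only a sketch under assumptions, not necessarily how the astroquery CI is configured; the `--run-bigdata` option name is hypothetical.

```python
# conftest.py: minimal sketch; "--run-bigdata" is a hypothetical option name.

def pytest_addoption(parser):
    # Register an opt-in flag for the expensive tests.
    parser.addoption(
        "--run-bigdata",
        action="store_true",
        default=False,
        help="also run tests marked with 'bigdata'",
    )


def pytest_collection_modifyitems(config, items):
    # Leave the collection untouched when the flag is given.
    if config.getoption("--run-bigdata"):
        return

    selected, deselected = [], []
    for item in items:
        if item.get_closest_marker("bigdata") is not None:
            deselected.append(item)
        else:
            selected.append(item)

    if deselected:
        # Report the dropped tests so they are counted as "deselected".
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected
```

Without any conftest.py changes, pytest's built-in marker expression gives the same effect from the command line: pytest -m "not bigdata".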

bsipocz avatar Jun 15 '24 04:06 bsipocz

Hmm... One or more missions have most likely released new data in the areas we are downloading, resulting in these huge download sizes. We should be able to find significantly smaller download sizes than that by switching the area queried.

I should have some time to look at this in July or August, so I recommend disabling the problematic tests for now. Of course, you are more than welcome to try to address the issue as well :)

imbasimba avatar Jun 15 '24 07:06 imbasimba

I went ahead and disabled running the tests marked as "bigdata", so the immediate urgency has been resolved, and the job now runs without hitting space or timeout issues.

That being said, the test is also running really slowly, which may be a separate problem (I mean it shouldn't take 3+ hr to download ~2 GB of data).

bsipocz avatar Jun 15 '24 07:06 bsipocz

Indeed, it should be much faster than that. Did you happen to notice which missions were slow? I did some quick spot tests, including on JWST data, and did not notice any speed issues or timeouts.

@jespinosaar, are you aware of any current issues with the download speed or timeouts of the archives?

imbasimba avatar Jun 15 '24 08:06 imbasimba

It was stuck on JWST-MID-IR in the test_esasky_get_images test

bsipocz avatar Jun 15 '24 08:06 bsipocz

Oh, and here is a CI run that ended up in timeout: https://github.com/astropy/astroquery/actions/runs/9524655265/job/26257814175#step:5:2467

bsipocz avatar Jun 15 '24 08:06 bsipocz

Ok, I see the problem. There are 394 observations in that area now. 340 new observations since 2024-05-30, and another batch of 52 observations was conducted 1 year ago. When this test was created, there were only 2 observations in the area.
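
A test can guard against this kind of archive growth by checking the size of the metadata result before downloading anything. The sketch below assumes the per-mission row counts have already been obtained from a metadata query; the helper name, the default budget of 5 rows, and the second example mission are illustrative assumptions, and only the 394 figure comes from this thread.

```python
from typing import Dict, Mapping


def missions_over_budget(row_counts: Mapping[str, int], max_rows: int = 5) -> Dict[str, int]:
    """Return the missions whose result tables have more than max_rows rows."""
    return {mission: n for mission, n in row_counts.items() if n > max_rows}


# 394 is the count quoted in this thread; "EXAMPLE-MISSION" is made up.
counts = {"JWST-MID-IR": 394, "EXAMPLE-MISSION": 2}
oversized = missions_over_budget(counts)
print(oversized)  # {'JWST-MID-IR': 394}
```

A test could then skip, or trim the result table to its first few rows, for any mission reported here instead of downloading every observation.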

imbasimba avatar Jun 15 '24 08:06 imbasimba

340 new observations since 2024-05-30

That explains it! CI got messed up sometime between the 31st of May and the 7th of June :)

bsipocz avatar Jun 15 '24 08:06 bsipocz