Make CI tests a bit less expensive

Open ranocha opened this issue 5 years ago • 16 comments

For example, here is a list of examples that are relatively expensive on Windows (2D)

  • examples\2d\elixir_advection_amr.jl, 93.2s
  • examples\2d\elixir_advection_amr_nonperiodic.jl, 47.3s
  • examples\2d\elixir_hypdiff_nonperiodic.jl, 38.9s
  • examples\2d\elixir_euler_shockcapturing.jl, 46.5s
  • examples\2d\elixir_euler_blast_wave_amr.jl, 62.8s
  • examples\2d\elixir_euler_sedov_blast_wave.jl, 61.1s
  • examples\2d\elixir_euler_positivity.jl, 57.6s
  • examples\2d\elixir_mhd_alfven_wave.jl, 46.6s
  • examples\2d\elixir_mhd_alfven_wave_mortar.jl, 74.7s
  • examples\2d\elixir_mhd_orszag_tang.jl, 45.8s
  • examples\2d\elixir_lbm_lid_driven_cavity.jl, 40.6s
  • examples\2d\elixir_mhd_rotor.jl, 192s
  • examples\2d\elixir_mhd_blast_wave.jl, 158s

and Ubuntu (3D)

  • examples/3d/elixir_advection_mortar.jl, 40.7s
  • examples/3d/elixir_hypdiff_nonperiodic.jl, 51.9s
  • examples/3d/elixir_euler_amr.jl, 111s
  • examples/3d/elixir_euler_shockcapturing.jl, 67.1s
  • examples/3d/elixir_euler_sedov_blast_wave.jl, 72.0s (although it's using only 5 time steps)
  • examples/3d/elixir_eulergravity_eoc_test.jl, 44.2s (although it's using only 9 time steps)

ranocha avatar Dec 08 '20 09:12 ranocha

Are these times with (pre-)compilation? Because if I run examples\2d\elixir_advection_amr.jl on my laptop, it is finished in <5 s.

sloede avatar Dec 08 '20 09:12 sloede

These are the times reported by the summary_callback after each test.

ranocha avatar Dec 08 '20 09:12 ranocha

Related to #62

efaulhaber avatar Apr 27 '21 16:04 efaulhaber

We might also want to discuss the following questions/options.

  • Do we need to run all 2D tests on Windows and Mac OS? Would it suffice to run only the MPI and threaded tests?
  • Can we remove the restart callback save_restart from many elixirs?
  • Split some expensive test sets into more CI jobs

ranocha avatar May 07 '21 12:05 ranocha

> Do we need to run all 2D tests on Windows and Mac OS? Would it suffice to run only the MPI and threaded tests?

I think we said yes. In the past, there have sometimes been weird macOS-related issues, and I think we should make sure that we exercise most of the core functionality of Trixi on all relevant platforms. At least as long as it does not become unbearable... If we want to save time during development, what about disabling the macOS and Windows tests on Draft PRs? This way one could have faster turnaround times during most of a PR's lifetime, and only get the full checks once we are ready to merge.

> Can we remove the restart callback save_restart from many elixirs?

Yes, I have no issue with this. IMHO, we can at least remove that from all but one elixir per dimension-mesh-solver-equation combination.

> Split some expensive test sets into more CI jobs

Absolutely. In the past, I have suggested this before, but you (rightfully) warned that due to startup latency, this does not always make it faster.

sloede avatar May 08 '21 09:05 sloede

> > Split some expensive test sets into more CI jobs
>
> Absolutely. In the past, I have suggested this before, but you (rightfully) warned that due to startup latency, this does not always make it faster.

Yeah, but we have a bunch of new equation and mesh types so that we can benefit less from re-using compiled code.
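
The trade-off between startup latency and splitting can be made concrete with a small model. The sketch below is only an illustration: the 300 s startup cost is a made-up number, `split_into_jobs`, `wall_time`, and `cpu_time` are hypothetical helpers, and the timings are the four most expensive 2D entries from the list at the top of this issue.

```julia
# Hypothetical model: every CI job pays a fixed startup cost (installation,
# precompilation) before it runs its share of the tests.

# Distribute tests over `njobs` jobs with a greedy longest-first strategy:
# each test goes to the job that currently has the smallest load.
function split_into_jobs(times::AbstractVector{<:Real}, njobs::Integer)
    loads = zeros(njobs)
    for t in sort(times; rev = true)
        loads[argmin(loads)] += t
    end
    return loads
end

# Time until the slowest job has finished (what a developer waits for)
wall_time(times, njobs; startup = 300.0) =
    startup + maximum(split_into_jobs(times, njobs))

# Total runner time summed over all jobs (what the CI capacity pays for)
cpu_time(times, njobs; startup = 300.0) = njobs * startup + sum(times)

# Timings in seconds, taken from the 2D list in the first comment
times = [192.0, 158.0, 93.2, 74.7]

for njobs in 1:3
    println(njobs, " job(s): wall ", wall_time(times, njobs),
            " s, total ", cpu_time(times, njobs), " s")
end
```

In this model, more jobs shorten the wall time only until the single longest test dominates, while the total runner time grows by one startup cost per job.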

ranocha avatar May 14 '21 07:05 ranocha

Another (minor) aspect: Documenter is set up to fail when doctests fail, so we don't need to run doctests in https://github.com/trixi-framework/Trixi.jl/blob/ff549a5e67a7685f2ad3c97a0694c756160d79b4/test/test_unit.jl#L515-L517

ranocha avatar May 14 '21 07:05 ranocha

The tests really take way too long to run IMO. It has been 15 minutes so far and my tests are still running.

It would be nice to have a minimal set of tests which preserve "enough" code coverage so that any changes to Trixi base could be more quickly checked on a local machine.

jlchan avatar May 18 '21 19:05 jlchan

One possibility would be to create a testset intended for "local" testing, which could exclude some of the CI tests.
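
A minimal sketch of what such a "local" test set could look like, assuming the test runner selects files based on an environment variable. The variable name `EXAMPLE_TEST_GROUP`, the group names, and the file names are all hypothetical and do not refer to existing files in the repository.

```julia
# Hypothetical test groups; group names and file names are illustrative only.
const TEST_GROUPS = Dict(
    "local" => ["test_smoke_2d.jl"],
    "ci"    => ["test_smoke_2d.jl", "test_full_2d.jl", "test_full_3d.jl"],
)

# Return the test files for `group`; unknown groups fall back to the full CI set.
select_tests(group::AbstractString) = get(TEST_GROUPS, group, TEST_GROUPS["ci"])

# Hypothetical environment variable; defaults to the cheap local set.
group = get(ENV, "EXAMPLE_TEST_GROUP", "local")

for file in select_tests(group)
    # In a real runtests.jl this line would be `include(file)`.
    println("would include ", file)
end
```

With this layout, CI would set the variable to `ci`, while a plain local run would only execute the cheap group.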

jlchan avatar May 18 '21 19:05 jlchan

That's definitely a good point. What I usually do when modifying Trixi is to include only a subset of tests locally, say test/test_examples_2d_advection.jl when I modified some 2D stuff. That's usually a good smoke test. However, it's a bit hard to cover (nearly) everything in a cheap test set using the current way of testing, I fear.

ranocha avatar May 19 '21 05:05 ranocha

Has anything significantly changed the timings since you reported them, @ranocha? I noticed that 3d/elixir_euler_amr.jl now takes over 300s on GitHub (and for some reason over 400s on my system, maybe that's because of Windows?). 2D and 3D tests regularly take over an hour now. Is it really necessary to let the simulation run that long? Would it be sufficient to let tests like this run to t=1 instead of t=10 (maybe use a different start time to still test that the blob is running over the periodic boundaries)?
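
To illustrate the idea of a shifted start time, here is a rough sketch for a blob advected with constant velocity on a periodic 1D domain. The helpers `crossing_time` and `short_tspan` and all numbers are assumptions made for this example and are not taken from the actual elixir.

```julia
# All values are hypothetical and only illustrate the idea.

# Time at which a blob starting at `x0` first reaches the right boundary `xmax`
# when it moves with constant positive `velocity`.
crossing_time(x0, velocity, xmax) = (xmax - x0) / velocity

# Short time window of length `duration` centered on the first boundary
# crossing, so that a short run still exercises the periodic boundary.
function short_tspan(x0, velocity, xmax; duration = 1.0)
    tc = crossing_time(x0, velocity, xmax)
    t_start = max(0.0, tc - duration / 2)
    return (t_start, t_start + duration)
end

# Domain [-1, 1], blob starts at the center and moves with velocity 0.25
tspan = short_tspan(0.0, 0.25, 1.0)
println("run from t = ", tspan[1], " to t = ", tspan[2])
```

The initial condition would then have to be evaluated at the shifted start time, which is only straightforward when an analytical solution is available.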

efaulhaber avatar May 30 '21 08:05 efaulhaber

Yes, something like that is definitely a good option from my point of view. A major impact on the CI run time was our more extensive use of Polyester, StrideArrays, and LoopVectorization. This combination is really good for runtime performance, but particularly demanding for CI when collecting coverage results.

ranocha avatar May 30 '21 08:05 ranocha

Finding good tests is always tricky. "As short as possible but as long as necessary" is our yardstick, but what exactly the latter part means in practice is often hard to tell.

I fully agree that we need to reduce the amount of time it takes for testing, but from past experience (especially from tests that didn't run long enough to uncover errors that were only found much later), I feel like this is in general a non-trivial task and requires some thinking and selective tweaking. This is also the only reason this hasn't been tackled yet - a lack of developer time :-/

sloede avatar May 30 '21 09:05 sloede

We are experiencing some problems with GitHub actions in the last few days - jobs are stuck at the queued stage although we have enough free capacity. One way to reduce problems like these could be to reduce the number of tests that need to run on all three OS. From my point of view, it should be sufficient to have some basic tests on all OS (including threads, MPI, p4est, and other binary dependencies), but we definitely do not need to test every 2D setup on Windows and Mac OS.

ranocha avatar Jul 29 '21 09:07 ranocha

> We are experiencing some problems with GitHub actions in the last few days - jobs are stuck at the queued stage although we have enough free capacity. One way to reduce problems like these could be to reduce the number of tests that need to run on all three OS. From my point of view, it should be sufficient to have some basic tests on all OS (including threads, MPI, p4est, and other binary dependencies), but we definitely do not need to test every 2D setup on Windows and Mac OS.

IIRC, this has been resolved by your efforts this year, hasn't it, @ranocha?

sloede avatar Dec 22 '21 12:12 sloede

This particular problem, yes. However, I think CI is still too expensive.

ranocha avatar Dec 22 '21 15:12 ranocha