Ben Wibking
I regularly run on this cluster. I suspect it may be OS noise, rather than something related to OpenMPI itself. For instance, running the "fixed work quantum" benchmark from http://www.unixer.de/research/netgauge/osnoise...
On this system, it looks like there are timing variations of ~100s of microseconds per loop iteration: ``` $ mpirun -np 48 ./netgauge -x noise # Info: (0): Netgauge v2.4.6...
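For anyone unfamiliar with the benchmark, the underlying idea is simple: time many repetitions of a fixed amount of work and look at the spread of the per-iteration timings. Here is a minimal Python sketch of that idea (illustrative only, not the netgauge code itself; the work size and iteration count are arbitrary):

```python
# Minimal "fixed work quantum" noise probe (illustrative sketch, not netgauge):
# time many iterations of identical work and look at the spread.
# Large outliers relative to the median suggest OS/interrupt noise.
import time

def work_quantum(n=100_000):
    # A fixed amount of CPU work; the value of n is arbitrary.
    s = 0
    for i in range(n):
        s += i * i
    return s

samples = []
for _ in range(1000):
    t0 = time.perf_counter()
    work_quantum()
    samples.append(time.perf_counter() - t0)

samples.sort()
median = samples[len(samples) // 2]
print(f"median: {median * 1e6:.1f} us, max: {samples[-1] * 1e6:.1f} us")
print(f"worst-case outlier: {samples[-1] / median:.1f}x the median")
```

On a quiet node the maximum should sit close to the median; outliers of ~100s of microseconds point at the OS rather than the MPI library.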
> While our periodic benchmark jobs haven't shown any degradation in performance, they're not exactly communication and latency bound and so might not have shown this up. @benmenadue Just a...
It fails on coarse step 2. Here is the log file: [run_3d.log](https://github.com/AMReX-Astro/Castro/files/9278011/run_3d.log)
That's odd. Is there any clue in the jobinfo file? [job_info.txt](https://github.com/AMReX-Astro/Castro/files/9285958/job_info.txt)
OK, I've rerun with the same options and I get the same crash at coarse step 2, but now with an extra `Erroneous arithmetic operation` message: ``` [Level 2 step 10]...
Here's the full log and jobinfo: [run_3d_debug_trap.log](https://github.com/AMReX-Astro/Castro/files/9286019/run_3d_debug_trap.log) [job_info.txt](https://github.com/AMReX-Astro/Castro/files/9286022/job_info.txt)
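(Side note for anyone hitting this: `Erroneous arithmetic operation` is the standard description of SIGFPE, so the debug-trap build is stopping at the first invalid floating-point operation instead of letting a NaN propagate silently. A rough Python/numpy analogue of what that trapping buys you, purely illustrative since Castro itself is C++/Fortran and traps at the signal level:)

```python
# Illustrative only: how floating-point trapping turns a silently
# propagating invalid operation into a hard error at the faulting line.
import numpy as np

x = np.array([1.0, 0.0])

# Default-ish behavior: the bad operation produces inf and execution continues.
np.seterr(all="ignore")
print(1.0 / x)  # -> [ 1. inf], no error raised

# Trapping enabled: the same operation raises immediately, pinpointing
# the first bad arithmetic operation instead of a crash much later.
np.seterr(all="raise")
try:
    print(1.0 / x)
except FloatingPointError as e:
    print("trapped:", e)
```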
I don't have access to Perlmutter, but I tested it on NCSA Delta and it builds now. The amrex submodule checked out by default does not yet include this fix.
Oh, that makes more sense. BoxLib/AMReX does support particles, but my dataset does not have any. In yt==4.0.2, using the gas density instead (to avoid the vmin/vmax bug in 4.0.2),...
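For reference, a minimal sketch of the kind of rendering script involved (the plotfile path is a placeholder, not from my actual run):

```python
# Minimal volume-rendering sketch (the path is a placeholder).
import yt

ds = yt.load("plt00000")  # hypothetical BoxLib/AMReX plotfile

# Render the gas density field; this sidesteps the particle fields,
# since the dataset contains no particles.
sc = yt.create_scene(ds, field=("gas", "density"))
sc.save("render.png")
```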
Why can't this be fixed? (For instance, VTK handles this situation correctly.) How is the rendering implemented in yt?