Derecho testlist for mizuRoute
Derecho is now available for general use, so we should add Derecho tests for mizuRoute: first just assessing which tests work, and then eventually switching the Cheyenne tests over to Derecho.
There are three Intel compilers on Derecho: intel, intel-oneapi, and intel-classic. intel-classic will go away the soonest, so intel should be used for standard tests and intel-oneapi for bleeding-edge testing. There are also gnu, nvhpc, and nvhpc-gpu. The nvhpc compiler is the only one that can be used on the GPUs on Derecho, so we should also investigate running mizuRoute with GPUs on Derecho.
CTSM issue is here.
https://github.com/ESCOMP/CTSM/issues/1995
We will need a CTSM version that works with mizuRoute with updated externals for running on Derecho (and PE layouts for it).
PE layouts don't need to be done in mizuRoute, just in CTSM. The tasks important for Derecho for mizuRoute are:
- testlist for Derecho (mostly same as Cheyenne, with some tweaks for compilers)
- Update to CTSM version that can run on Derecho (there isn't a version yet)
- Also need to add Derecho for the standalone build
We figure we'll wait for CTSM to have a tag before going down this path. This would also be most efficient for @ekluzek to do. It shouldn't be too time consuming, but it's time consuming enough that it's not a dead simple task.
@nmizukami will work on the standalone build.
The tag to update to for CTSM is ctsm5.1.dev159. I've done the update, but I'm running into problems because the file:
route/settings/mizuRoute_control.py
is using the "six" module, which is not available in CIME anymore. It looks like it wasn't actually used, so removing it seems to work.
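One way to double-check that an imported module such as "six" is really unused before removing it is to scan the file's syntax tree. The following is only a minimal sketch, not what was actually done here: the helper name `module_is_used` is hypothetical, and it does not handle `from six import *` or names referenced only inside strings.

```python
import ast


def module_is_used(source: str, module: str) -> bool:
    """Return True if a name bound by importing `module` is referenced in `source`.

    Hypothetical helper for illustration; it only parses the code, it never runs it.
    """
    tree = ast.parse(source)
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                top = alias.name.split(".")[0]
                if top == module:
                    # "import six" binds "six"; "import six as s" binds "s"
                    bound.add(alias.asname or top)
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] == module:
                for alias in node.names:
                    # "from six import x" binds "x" (or its alias)
                    bound.add(alias.asname or alias.name)
    # Every plain identifier referenced anywhere in the file
    referenced = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return bool(bound & referenced)
```

It could be applied to the file above with something like `module_is_used(open("route/settings/mizuRoute_control.py").read(), "six")`, run from the top of the checkout.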
Thanks Erik. I fetched the add_mizuRoute branch from your repo and rebased it to my local add_mizuRoute. Or should I fetch ctsm5.1.dev159 from the ESCOMP/CTSM repo and then use that?
@nmizukami I haven't pushed the branches yet. I'm seeing if I can get some tests to run first before I do that. And yes you need to fetch the add_mizuRoute branch after I've pushed the updates.
Ah ok. I was wondering. I fetched and looked at add_mizuRoute, and it was last updated 8 weeks ago. So I'll just wait; I just changed the Python script (I have already merged).
OK, all of the tests fail on Derecho.
These fail because of a recognized problem with externals
Documented here: https://github.com/ESMCI/ccs_config_cesm/issues/130
ERP_D_Mmpi-serial_P1x25.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default MODEL_BUILD
ERS_D_Mmpi-serial.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default MODEL_BUILD
ERS_D_Mmpi-serial_P1x25.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default MODEL_BUILD
SMS_D_Mmpi-serial.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default MODEL_BUILD
SMS_Mmpi-serial_D_P1x25.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default MODEL_BUILD
And these fail with a timeout, despite being given an overly generous 3:40 wallclock time:
ERI.nldas2_nldas2_rHDMA_mnldas2.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
ERI_Mmpi-serial.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
ERI_PS.f19_f19_rHDMAlk_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
ERS.f09_f09_mg17.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
ERS_PS.f19_f19_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
ERS_PS.f19_f19_mg17.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
ERS_PS.f19_f19_rHDMAlk_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
ERS_PS.f19_f19_rHDMAlk_mg17.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
ERS_PS.nldas2_nldas2_rHDMA_mnldas2.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
ERS_PS.nldas2_nldas2_rHDMA_mnldas2.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
ERS_PS.nldas2_nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
ERS_PS.nldas2_nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
PET_Mmpi-serial_P1x25.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
PET_Mmpi-serial_P1x25.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
PET_P215x8.nldas2_nldas2_rHDMA_mnldas2.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
PET_P215x8.nldas2_nldas2_rHDMA_mnldas2.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
PFS.f19_f19_rHDMA_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
PFS.f19_f19_rHDMA_mg17.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
SMS.f09_f09_rHDMAlk_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
SMS.f09_f09_rMERIT_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
SMS.f19_f19_rMERIT_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
SMS.f19_f19_rMERIT_mg17.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
SMS_D.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
SMS_D.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
SMS_D.5x5_amazon_rHDMA.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
SMS_D.nldas2_nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
SMS_D_Mmpi-serial.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
SMS_Mmpi-serial_D_P1x25.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
SMS_P720x4.nldas2_nldas2_rMERIT_mnldas2.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN
SMS_P720x4.nldas2_nldas2_rMERIT_mnldas2.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
SMS_P80x18.f19_f19_rMERIT_mg17.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN
SMS_PS.hcru_hcru_mt13.I2000Clm50SpMizGs.derecho_intel.mizuroute-hcru RUN
SMS_PS.hcru_hcru_rHDMAlk_mt13.I2000Clm50SpMizGs.derecho_intel.mizuroute-hcru RUN
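To see at a glance how the failures split by phase and compiler, the status entries above can be tallied with a short script. This is a rough sketch that assumes each entry has the form "TESTNAME PHASE" and that the test name has exactly five dot-separated fields (test type, grid, compset, machine_compiler, testmod), which holds for the names listed here; `parse_test_status` is a hypothetical helper, not part of CIME.

```python
from collections import Counter


def parse_test_status(entry: str) -> dict:
    """Split one '<testname> <PHASE>' entry into its fields (hypothetical helper)."""
    name, phase = entry.split()
    # Assumes exactly five dot-separated fields, as in the names listed above
    test, grid, compset, mach_comp, testmod = name.split(".")
    # "derecho_intel" -> machine "derecho", compiler "intel"
    machine, compiler = mach_comp.split("_", 1)
    return {
        "test": test,
        "grid": grid,
        "compset": compset,
        "machine": machine,
        "compiler": compiler,
        "testmod": testmod,
        "phase": phase,
    }


# Three entries copied from the lists above, as sample input
entries = [
    "ERS_D_Mmpi-serial.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default MODEL_BUILD",
    "ERS.f09_f09_mg17.I2000Clm50SpMizGs.derecho_intel.mizuroute-default RUN",
    "PFS.f19_f19_rHDMA_mg17.I2000Clm50SpMizGs.derecho_gnu.mizuroute-default RUN",
]

# Count failures per (phase, compiler) pair
tally = Counter((r["phase"], r["compiler"]) for r in map(parse_test_status, entries))
for (phase, compiler), count in sorted(tally.items()):
    print(f"{phase:12s} {compiler:8s} {count}")
```

Feeding in the full lists would show whether the RUN failures are specific to one compiler or spread across both intel and gnu.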
Looking at /glade/derecho/scratch/erik/SMS_D.5x5_amazon_r05.I2000Clm50SpMizGs.derecho_intel.mizuroute-default.GC.mizu_c-cpln2_v211_ctsm51d159delist/run/cesm.log.2727922.desched1.240104-171614
I don't see a lot of information, but it seems to be hanging just after initialization.
I see this in the traceback in the cesm log file: line 590 of rof_comp_nuopc.F90
Line 590 is
590 Mesh = ESMF_MeshCreate(filename=trim(cvalue), fileformat=ESMF_FILEFORMAT_ESMFMESH, rc=rc)
Some problem reading the mesh file?