Update CSLAM Derecho performance issues in 2.2 release tag
Issue Type
Other (please describe below)
Issue Description
Derecho performance issues for CSLAM were resolved on the trunk https://github.com/ESCOMP/CAM/pull/845? but not on the 2.2 release tag. This is needed for running CSLAM with the 2.2 release tag.
There is also an issue with I/O via the mpich compiler, which will be resolved with a new 2.2 cime tag @fvitt. Ideally that tag would be made first and would update the externals in this issue as well.
Will this change answers?
I Don't Know
Will you be implementing this yourself?
Any CAM SE can do this
Should we include the fix to this issue as well? https://github.com/ESCOMP/CAM/issues/876 See PR https://github.com/ESCOMP/CAM/pull/878
This is my cime PR: https://github.com/ESMCI/cime/pull/4559 And cime branch: https://github.com/fvitt/cime/tree/derecho_mods
My colleague using the cesm2.2.2 release tag is getting greater than 2X slow-down just by turning on 2 tapes of 3-hourly output. These are 1/8deg SE var-res simulations.
w/o 3 hourly:
Overall Metrics:
Model Cost: 83631.84 pe-hrs/simulated_year
Model Throughput: 0.33 simulated_years/day
w/ 3-hourly
Overall Metrics:
Model Cost: 180784.32 pe-hrs/simulated_year
Model Throughput: 0.15 simulated_years/day
That seems like an unreasonable slowdown to me. @fvitt should I ask him to try making these mods https://github.com/ESMCI/cime/pull/4559 ?
Yes, try those changes
That did it! Thanks Francis.
w/ 3-hourly
Overall Metrics:
Model Cost: 90887.91 pe-hrs/simulated_year
Model Throughput: 0.30 simulated_years/day
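For reference, the slowdown implied by the timing summaries above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the `slowdown` helper and the variable names are made up for illustration, and the cost values are copied from the "Model Cost" lines in this thread.

```python
def slowdown(baseline_cost, cost):
    """Return the model cost of a run relative to the baseline run."""
    return cost / baseline_cost


# Model Cost in pe-hrs/simulated_year, copied from the timing output above.
baseline = 83631.84     # without 3-hourly output
before_fix = 180784.32  # with 3-hourly output, before the cime mods
after_fix = 90887.91    # with 3-hourly output, after the cime mods

print(f"before cime mods: {slowdown(baseline, before_fix):.2f}x")
print(f"after cime mods:  {slowdown(baseline, after_fix):.2f}x")
```

With these numbers, the two tapes of 3-hourly output cost roughly 2.16x the baseline before the cime changes and about 1.09x after them.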