CESM_postprocessing

[timeseries] integer division or modulo by zero

Open lvankampenhout opened this issue 7 years ago • 1 comments

I'm attempting to generate timeseries from a 1-year test run that I did on an external machine. I'm using one node with 24 cores and run into the following error message:

integer division or modulo by zero

I was able to trace this back to the function divide_comm() in cesm_tseries_generator.py where there is the following code:

    min_procs_per_spec = 36
    size = scomm.get_size()
    rank = scomm.get_rank()

    if l_spec == 1:
        num_of_groups = 1
    else:
        num_of_groups = size/min_procs_per_spec
(...)
        temp_color = (rank % num_of_groups)+1

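In my case l_spec equals 17, so the number of groups is 24 divided by 36, which integer division rounds down to 0. The later expression rank % num_of_groups is then a modulo by zero, which raises the error.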
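The arithmetic can be reproduced in isolation. This is only a sketch with the values from my run: it assumes the division floors (as `/` does on integers in Python 2), written here as `//` so it behaves the same under Python 3. The helper name `color_for` is made up for illustration and is not part of cesm_tseries_generator.py.

```python
# Values from this report
size = 24                 # cores available on the single node
min_procs_per_spec = 36   # hardcoded in divide_comm()

# Floor division: 24 // 36 == 0
num_of_groups = size // min_procs_per_spec


def color_for(rank, num_of_groups):
    """Hypothetical stand-in for the temp_color expression in divide_comm()."""
    return (rank % num_of_groups) + 1


try:
    color_for(0, num_of_groups)
    failed = False
except ZeroDivisionError:
    # Any rank triggers this, because the divisor is 0
    failed = True

print(num_of_groups, failed)
```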

I imagine the solution could be to raise a helpful error message ("number of cores must be >= 36"), to remove the magic number 36, or both.
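A minimal sketch of what I have in mind, not a patch against the actual code: `compute_num_of_groups` is a hypothetical helper, and passing `min_procs_per_spec` as a parameter is my assumption about how the magic number could be made configurable.

```python
def compute_num_of_groups(size, l_spec, min_procs_per_spec=36):
    """Return the number of process groups, failing early with a clear message.

    Hypothetical helper: size is the communicator size, l_spec the number of
    specifiers, min_procs_per_spec the (no longer hardcoded) minimum group size.
    """
    if min_procs_per_spec < 1:
        raise ValueError("min_procs_per_spec must be >= 1")
    if l_spec == 1:
        return 1
    if size < min_procs_per_spec:
        raise ValueError(
            "number of cores must be >= {0} (got {1})".format(
                min_procs_per_spec, size
            )
        )
    # Floor division is safe here: size >= min_procs_per_spec, so result >= 1
    return size // min_procs_per_spec
```

With 24 cores and the default of 36 this raises a ValueError instead of the ZeroDivisionError; with min_procs_per_spec=12 it returns 2. An alternative would be to fall back to a single group when there are too few cores, but that silently changes the decomposition, so an explicit error seems safer to me.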

lvankampenhout, Nov 06 '18 15:11

@lvankampenhout - the decision to hardcode min_procs_per_spec = 36 is based on optimizing the single-variable timeseries generation for the CMIP6 experiments on cheyenne. We might be able to make this setting an XML variable so it isn't restricted to NCAR machines. Thanks for bringing this up.

bertinia, Nov 07 '18 16:11