Reduce in-place cube operations.
Like numpy, we now only have a very few "general" operations in the cube API that modify an existing cube rather than returning a new one.
For comprehensibilty, I think those should be reduced to an absolute minimum.
Whereas things like add_dim_coord, and .data = clearly must be in-place operations, other operations like transpose and rename are not.
The following are currently in-place, and perhaps could usefully not be (IMHO) :
- transpose
- rename
- convert_units
FWIW, I think 'transpose' is especially undesirable and unexpected.
Ideally I would both remove in-place operations wherever possible, and make it all as clear as possible by naming, e.g. cube.transposed((2,0,1)), cube.regridded(gribcube).
- so e.g. that's a vote for renaming
regridtoregridded!
Like #3429, this has clear parallels in other processing libraries, especially numpy. Numpy has very few in-place operations, which makes processing code very much clearer, but their naming is not at all consistent.
Note: milestone set to v3.0.0, as it's breaking change, but this is probably not practical
Just stumbled over an old issue : I already noted this one ! #2615 "cube.transpose is an in-place operation"
Notably there, the absence of cube data sharing was seen as an obstacle, for performance reasons. See #2549 + its followers
In my mind, rename is just a setter for what comes out of name. So maybe it would make more sense to formalise that and make name a property. So instead of cube.rename("thing") we just do cube.name = "thing". I can’t think of a reason to want two cubes that are identical apart from their names.
Note that there are equivalent rename and convert_units methods on coordinates. In the coordinate case, I think in place operations make practical sense. Otherwise something as simple as
cube.coord("time").convert_units("days since 1970-01-01")
has to become
new_coord = cube.coord("time").convert_units("days since 1970-01-01")
dims = cube.coord_dims("time")
cube.remove_coord("time")
cube.add_dim_coord(new_coord, dims)
Adding lines never feels like a win!
Making the cube methods work inconsistently from their equivalent coord methods is not going to help with the comprehensibility though.
In the coordinate case, I think in place operations make practical sense.
Good spot + a very telling point. Hmmm.... :thinking:
@pp-mo Whilst we're in this space, it would be a major step forward IMHO if we had a clear statement and enforced strategy for immutable cubes, coords et al, and zero-copy views. For me this all goes hand-in-hand with purging all in-place operations. Just sayin'...