xeofs
Broadcasting dimensions with `xr.Dataset`
Passing an `xr.Dataset` as input together with multi-dimensional sample and feature dimensions causes the data variables to be broadcast against each other, which yields components with inflated dimensions. The broadcast entries are filled with NaN, and the results themselves appear to be correct. Ideally, however, this broadcasting should not happen at all.
In a nutshell, instead of obtaining components like the following:
xarray.Dataset
Dimensions: (sample1: 2, feature1: 2, feature2: 3)
Coordinates:
sample1 (sample1) int64 1 2
feature1 (feature1) <U1 'a' 'b'
feature2 (feature2) int64 0 1 2
Data variables:
da1 (sample1, feature1, feature2) int64 0 1 2 3 4 5 6 7 8 9 10 11
da2 (sample1, feature1) int64 0 3 6 9
Indexes: (3)
Attributes: (0)
we currently get:
xarray.Dataset
Dimensions: (sample1: 2, feature1: 2, feature2: 3)
Coordinates:
sample1 (sample1) int64 1 2
feature1 (feature1) <U1 'a' 'b'
feature2 (feature2) int64 0 1 2
Data variables:
da1 (sample1, feature1, feature2) int64 0 1 nan 3 ... 9 10 nan
da2 (sample1, feature1, feature2) int64 nan nan 0 nan ... 6 nan nan 9
Indexes: (3)
Attributes: (0)
This arises from a potential inconsistency between xarray's `Dataset.to_stacked_array()` and `DataArray.to_unstacked_dataset()` methods (see discussion).
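Until this is resolved, affected components can be spotted by comparing each variable's dimensions with those of the input. The following is only a minimal sketch: the inflated dataset is built by hand to mimic the symptom (it is not produced by xeofs), `inflated_dims` is a hypothetical helper, and the recovery step assumes that each broadcast axis holds data at exactly one label.

```python
import numpy as np
import xarray as xr

# Dataset with the same layout as the expected components shown above:
# da1 spans both feature dimensions, da2 only spans feature1.
coords = {"sample1": [1, 2], "feature1": ["a", "b"], "feature2": [0, 1, 2]}
da1 = xr.DataArray(
    np.arange(12).reshape(2, 2, 3),
    dims=("sample1", "feature1", "feature2"),
    coords=coords,
)
da2 = xr.DataArray(
    np.arange(0, 12, 3).reshape(2, 2),
    dims=("sample1", "feature1"),
    coords={k: coords[k] for k in ("sample1", "feature1")},
)
expected = xr.Dataset({"da1": da1, "da2": da2})

# Hand-made stand-in for an affected result (an assumption for illustration):
# da2 gains a feature2 axis that is NaN everywhere except at one label.
da2_inflated = (
    da2.expand_dims(feature2=[2])
    .reindex(feature2=coords["feature2"])
    .transpose("sample1", "feature1", "feature2")
)
inflated = xr.Dataset({"da1": da1, "da2": da2_inflated})


def inflated_dims(reference, result):
    """Hypothetical helper: dims each variable gained relative to `reference`."""
    return {
        name: set(result[name].dims) - set(reference[name].dims)
        for name in reference.data_vars
    }


extra = inflated_dims(expected, inflated)
print(extra)

# Undo the inflation for da2: drop all-NaN labels, then squeeze the axis away.
recovered = inflated["da2"]
for dim in extra["da2"]:
    recovered = recovered.dropna(dim, how="all").squeeze(dim, drop=True)
print(recovered.dims)
```

Note that the recovered variable has a floating-point dtype, because the NaN fill forces a conversion away from integers. If a broadcast axis carries data at more than one label, the `squeeze` call raises an error rather than silently discarding values.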