
A suggested solution to the `TypeError: Invalid value for attr:` error upon `.to_netcdf`

Open shaharkadmiel opened this issue 5 years ago • 3 comments

Even though the netCDF conventions don't allow some data types in attributes, it might be useful to simply serialize those as strings rather than raise an error. Maybe add a `force_serialization` keyword argument to the `.to_netcdf` method.

Example: Set up a DataArray with bad values in the attributes:

import numpy as np
from xarray import DataArray, load_dataset
from pandas import Timestamp

from numbers import Number

valid_types = (str, Number, np.ndarray, np.number, list, tuple)

da = DataArray(
    name='bad_values',
    attrs=dict(
        bool_value=True,
        none_value=None,
        datetime_value=Timestamp.now()
    )
)

ds = da.to_dataset()
ds.bad_values.attrs

Output:

{'bool_value': True,
 'none_value': None,
 'datetime_value': Timestamp('2020-02-03 10:53:02.350105')}

The code in the except clause below could easily be implemented in `_validate_attrs`.

try:
    ds.to_netcdf('test.nc')
    # Fails with TypeError: Invalid value for attr: ...
except TypeError as e:
    print(e.__class__.__name__, e)
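    # Fall back to string representations for unsupported attribute values.
    for variable in ds.variables.values():
        for k, v in variable.attrs.items():
            # bool is a subclass of int (and so a Number), hence the explicit check.
            if not isinstance(v, valid_types) or isinstance(v, bool):
                variable.attrs[k] = str(v)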

    ds.to_netcdf('test.nc')  # Works as expected

ds_from_file = load_dataset('test.nc')
ds_from_file.bad_values.attrs

Output:

TypeError Invalid value for attr: None must be a number, a string, an ndarray or a list/tuple of numbers/strings for serialization to netCDF files

{'bool_value': 'True',
 'none_value': 'None',
 'datetime_value': '2020-02-03 10:43:38.479866'}
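
The same fallback could be packaged as a small helper that works on a plain attribute dictionary. This is only a sketch: the name `stringify_invalid_attrs` is hypothetical, not part of xarray, and the NumPy types from `valid_types` are left out so that the block runs with the standard library alone.

```python
from numbers import Number

# Assumption: mirrors valid_types from the example above, minus np.ndarray
# and np.number, so the sketch needs no third-party packages.
VALID_TYPES = (str, Number, list, tuple)


def stringify_invalid_attrs(attrs):
    """Return a copy of attrs with unsupported values converted via str()."""
    cleaned = {}
    for key, value in attrs.items():
        # bool is a subclass of int (hence a Number), so test it explicitly.
        if isinstance(value, bool) or not isinstance(value, VALID_TYPES):
            cleaned[key] = str(value)
        else:
            cleaned[key] = value
    return cleaned


example = stringify_invalid_attrs(
    {"bool_value": True, "none_value": None, "scale": 2.5, "units": "m"}
)
print(example)
```

With a Dataset, the helper would presumably be applied to `ds.attrs` and to the `attrs` of each entry in `ds.variables.values()` before calling `.to_netcdf`.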

shaharkadmiel avatar Feb 03 '20 10:02 shaharkadmiel

Thanks for the suggestion. One issue here is that it's not round-trippable; i.e. it wouldn't get deserialized into an object on being loaded.

To the extent people don't think that's an issue, we could take a PR.

max-sixty avatar Mar 05 '20 19:03 max-sixty

This would be a handy feature, especially for writing unit tests, where it's often okay if the attributes aren't deserialized exactly.

Shaunakde avatar Jan 28 '21 14:01 Shaunakde

It would be better to wrap it and make it round-trippable (especially for simple things like `None`).

arsenovic avatar Jul 25 '22 16:07 arsenovic
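
One possible way to get the round-tripping discussed above, for simple values such as `None` and booleans, is to store them as tagged JSON strings and decode them after loading. This is a rough sketch, not an xarray feature: the `__json__:` prefix and the `encode_attrs` / `decode_attrs` names are made up for illustration, and other types (such as `Timestamp`) are not handled here.

```python
import json

# Hypothetical marker identifying attribute values that were JSON-encoded.
JSON_PREFIX = "__json__:"


def encode_attrs(attrs):
    """Replace None and bool values with tagged JSON strings."""
    encoded = {}
    for key, value in attrs.items():
        if value is None or isinstance(value, bool):
            encoded[key] = JSON_PREFIX + json.dumps(value)
        else:
            encoded[key] = value
    return encoded


def decode_attrs(attrs):
    """Restore values that encode_attrs turned into tagged strings."""
    decoded = {}
    for key, value in attrs.items():
        if isinstance(value, str) and value.startswith(JSON_PREFIX):
            decoded[key] = json.loads(value[len(JSON_PREFIX):])
        else:
            decoded[key] = value
    return decoded


original = {"bool_value": True, "none_value": None, "units": "m"}
encoded = encode_attrs(original)   # safe to store as string attributes
restored = decode_attrs(encoded)   # apply after loading the file
print(encoded)
print(restored)
```

The tagged strings are ordinary string attributes, so they should pass the existing validation. The trade-off is that files written this way carry a convention other netCDF tools will not understand.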