`hsload` fails on empty data sets (with a dimension of length 0)
Below is the error we encountered: hsload fails on data sets that have a dimension of length 0.
(While I'd expect HSDS to be able to handle this, the error was incidentally helpful to us: it alerted us to a case where simulation data unexpectedly wasn't produced due to a bug in our code.)
```
Traceback (most recent call last):
  File "h5py/h5o.pyx", line 302, in h5py.h5o.cb_obj_simple
  File "/home/FCAM/crbmapi/.local/lib/python3.6/site-packages/h5py/_hl/group.py", line 591, in proxy
    return func(name, self[name])
  File "/usr/local/lib/python3.6/site-packages/h5pyd/_apps/utillib.py", line 674, in object_create_helper
    create_dataset(obj, ctx)
  File "/usr/local/lib/python3.6/site-packages/h5pyd/_apps/utillib.py", line 459, in create_dataset
    fillvalue=fillvalue, scaleoffset=scaleoffset)
  File "/usr/local/lib/python3.6/site-packages/h5pyd/_hl/group.py", line 337, in create_dataset
    dsid = dataset.make_new_dset(self, shape=shape, dtype=dtype, **kwds)
  File "/usr/local/lib/python3.6/site-packages/h5pyd/_hl/dataset.py", line 129, in make_new_dset
    raise ValueError(errmsg)
ValueError: Chunk shape must not be greater than data shape in any dimension. (6, 256) is not compatible with (24, 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/hsload", line 33, in <module>
    sys.exit(load_entry_point('h5pyd==0.8.4', 'console_scripts', 'hsload')())
  File "/usr/local/lib/python3.6/site-packages/h5pyd/_apps/hsload.py", line 314, in main
    load_file(fin, fout, verbose=verbose, dataload=dataload, s3path=s3path, compression=compression, compression_opts=compression_opts)
  File "/usr/local/lib/python3.6/site-packages/h5pyd/_apps/utillib.py", line 714, in load_file
    fin.visititems(object_create_helper)
  File "/home/FCAM/crbmapi/.local/lib/python3.6/site-packages/h5py/_hl/group.py", line 592, in visititems
    return h5o.visit(self.id, proxy)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
```
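For context, the kind of guard needed here (a hypothetical sketch, not the actual h5pyd fix) is to clamp the requested chunk shape to the data shape, while keeping every chunk extent at least 1, since HDF5 requires chunk dimensions >= 1 even when a data dimension has length 0:

```python
def clamp_chunks(chunks, shape):
    """Clamp a requested chunk shape so that no dimension exceeds the
    data shape, but never let a chunk dimension drop below 1
    (HDF5 chunk extents must be >= 1, even for 0-length dimensions)."""
    return tuple(max(1, min(c, s)) for c, s in zip(chunks, shape))

print(clamp_chunks((6, 256), (24, 0)))  # -> (6, 1)
```

With a guard like this, the (6, 256) chunk shape from the traceback would be reduced to (6, 1) for the (24, 0) dataset instead of raising a ValueError.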
Could be linked to https://github.com/HDFGroup/h5pyd/pull/114
Yes, #114 sounds very similar.
This should be fixed in this commit: https://github.com/HDFGroup/h5pyd/commit/5a9193af6ae99a204a7d277d9983431e712f7417.
I'll update the issue when this gets merged with master.
Fix is in master now.
Closing - fix is in the 0.12.0 release on PyPI.
Sorry, I was a bit late to check.
I still get an error when running `hsload --link` on a file containing an empty dataset (`h5py.Empty`):
```
  File ".../h5pyd/_apps/utillib.py", line 737, in create_dataset
    tgt_shape.extend(dobj.shape)
TypeError: 'NoneType' object is not iterable
```
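For reference, h5py reports `shape` as `None` for such null datasets, so the fix amounts to a `None` check before iterating. A minimal sketch (the helper name and surrounding logic are hypothetical, not the actual utillib.py code):

```python
def target_shape(extent, dset_shape):
    """Build a target shape list, tolerating null datasets.

    extent: leading dimensions (e.g. from the link target);
    dset_shape: the source dataset's shape, which h5py reports
    as None for datasets created with h5py.Empty."""
    tgt_shape = list(extent)
    if dset_shape is not None:  # guard: h5py.Empty datasets have shape None
        tgt_shape.extend(dset_shape)
    return tgt_shape

print(target_shape([4], None))     # -> [4]
print(target_shape([4], (24, 0)))  # -> [4, 24, 0]
```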
Ah, I see - reopening.
This should fix it: https://github.com/HDFGroup/h5pyd/commit/866c0be4063a1d744df596a8296b95a2b505ee15.
Nope, still the same error.
Anyway, this is no big deal: hsload now works for scalar datasets and I think h5py.Empty is really uncommon.
@loichuder - were you testing from master? The commit above was in the aggregate branch. Anyway, I've merged the changes into master and pushed out a new release as 0.12.1.
Yes, I tried with the aggregate branch at the time, and now with master: still the same issue as in https://github.com/HDFGroup/h5pyd/issues/116#issuecomment-1336931275, since `dobj.shape` is `None` for `h5py.Empty`.
No big deal as I said, but for the sake of it, here is what I did to encounter the issue:
- Creation of the file containing an empty dataset:
```python
import h5py

with h5py.File("empty.h5", "w") as h5file:
    # h5py.Empty takes a dtype; this creates a "null" dataset whose shape is None
    h5file.create_dataset("empty", data=h5py.Empty("f"))
```
- Loading with `hsload --link`:
```
hsload --link [...] files/empty.h5 [...]
```
@loichuder - OK, I see. This latest check-in should really fix it now! It's on master and in PyPI as version 0.12.2.
Closing this issue as it should be fixed in 0.12.2 and later.