Overridden dataset dtype and shape are not passed from parent to child
If a schema has two groups, e.g., PatchClampSeries and CurrentClampSeries defined as such:
groups:
- neurodata_type_def: PatchClampSeries
neurodata_type_inc: TimeSeries
doc: ...
datasets:
- name: data
dtype: numeric
dims:
- num_times
shape:
- null
doc: ...
- neurodata_type_def: CurrentClampSeries
neurodata_type_inc: PatchClampSeries
doc: ...
datasets:
- name: data
doc: Recorded voltage.
where the 'data' dataset in CurrentClampSeries overrides the 'data' dataset in the parent PatchClampSeries, then the dtype, dims, and shape keys of the parent PatchClampSeries 'data' dataset are not inherited by the child CurrentClampSeries 'data' dataset.
Is this intended behavior? @oruebel @ajtritt
If so, the NWB schema needs to be amended in a few places.
I encountered this while doing some validator testing. This also applies when the datasets are defined with specific inheritance, eg:
datasets:
- data_type_def: D1
dtype: numeric
dims:
- dim1
- dim2
shape:
- null
- null
doc: a base dataset
attributes:
- name: foo
dtype: text
doc: an attribute
- data_type_def: D2
data_type_inc: D1
doc: an inherited dataset
Only the attribute foo will be validated on a builder for type D2. dtype, dims, and shape are not passed down.
It looks like the default inheritance passes down specification elements defined on the dictionary accessible via BaseStorageSpec.attributes. However, DatasetSpec sets shape, dims, and dtype in the base dict that ConstructableDict inherits from (self['shape'] self['dims'], and self['dtype'] when accessed from within DatasetSpec), rather than on the .attributes field via BaseStorageSpec.set_attribute('shape', ...), etc.
It looks like the current implementation of #321 treats this by passing down the shape, etc of child attributes when a parent is inherited (e.g. PatchClampSeries.data.shape is passed to CurentClampSeries.data.shape). However, I'm not sure that would work for the above situation, even if we defined a group holding D2 as below:
groups:
- data_type_def: G1
doc: a group container
datasets:
- name: my_d2
data_type_def: D2
doc: a D2 dataset on group G1
It looks like #321 might be a bit stale, so maybe a lot has changed between then and hdmf 2.5.1. And I'm not entirely sure I'm clear on how everything is supposed to happen with spec inheritance, so maybe there's something I did wrong with my tests. But it'd be good to validate that #321 does resolve the above situation.
To be a bit more clear, here's a test case I was working from:
- create a spec for
D1andD2and save to file.
import os
from hdmf.spec import NamespaceBuilder, export_spec, AttributeSpec, DatasetSpec, GroupSpec
namespace_name = 'foo'
ns_builder = NamespaceBuilder(doc='Test namespace', name=namespace_name, version='0.1.0')
d1_spec = DatasetSpec(doc='d1', data_type_def='D1', shape=[None, None], dims=['dim1', 'dim2'], dtype='numeric',
attributes=[AttributeSpec(name='foo', doc='an attribute', dtype='text')])
d2_spec = DatasetSpec(doc='d2', data_type_def='D2', data_type_inc='D1')
g1_spec = GroupSpec(doc='g1', data_type_def='G1', datasets=[d2_spec])
output_dir = 'test-spec'
if not os.path.exists(output_dir):
os.mkdir(output_dir)
export_spec(ns_builder, [d1_spec, d2_spec, g1_spec], str(output_dir))
- load and compare the specs for
D1,D2, andG1:
from hdmf import common
common.load_namespaces(os.path.join(output_dir, f'{namespace_name}.namespace.yaml'))
ns_catalog = common.get_manager().namespace_catalog
ns = ns_catalog.get_namespace(name=namespace_name)
print(ns.get_spec('D1'))
print(ns.get_spec('D2'))
print(ns.get_spec('G1'))
Output:
{'shape': [None, None],
'dims': ['dim1', 'dim2'],
'dtype': 'numeric',
'doc': 'd1',
'data_type_def': 'D1',
'attributes': [{'doc': 'an attribute', 'name': 'foo', 'dtype': 'text'}]}
{'doc': 'd2',
'data_type_inc': 'D1',
'data_type_def': 'D2',
'attributes': [{'doc': 'an attribute', 'name': 'foo', 'dtype': 'text'}]}
{'datasets': [{'doc': 'd2',
'data_type_inc': 'D1',
'data_type_def': 'D2',
'attributes': [{'doc': 'an attribute', 'name': 'foo', 'dtype': 'text'}]}],
'doc': 'g1',
'data_type_def': 'G1'}
I would expect D2 to have shape, dims, and dtype available. I also am not sure #321 would fix the situation of G1.
This came up again in https://github.com/NeurodataWithoutBorders/nwb-schema/issues/622
The resolve_spec behavior in HDMF needs to be adjusted so that dtype, shape, requiredness, and other properties are inherited by the child spec if not explicitly changed (and if they are explicitly changed, we should ensure they are compatible, but perhaps that is a separate issue).