Add support to write multidimensional string arrays
Motivation
Fixes https://github.com/hdmf-dev/hdmf/issues/1137 and fixes https://github.com/hdmf-dev/hdmf/issues/1096.
On build, multidimensional string lists/arrays were being converted into 1-D lists/arrays:
[['aa', 'bb'], ['cc', 'dd']] -> ["['aa', 'bb']", "['cc', 'dd']"].
On write, the shape of a multidimensional string dataset was being defined incorrectly and so the dataset could not be written. The data shape definition was being caught by this conditional that I believe is intended to check for structured arrays / compound data types:
https://github.com/hdmf-dev/hdmf/blob/7426275cacc769a10ffca89836765df1355ba9db/src/hdmf/backends/hdf5/h5tools.py#L1472-L1473
I updated the conditional and get_data_shape is now used to set the shape of the dataset for multidimensional string lists/arrays.
How to test the behavior?
This came up in https://github.com/NeurodataWithoutBorders/pynwb/pull/1924.
Checklist
- [x] Did you update
CHANGELOG.mdwith your changes? - [X] Does the PR clearly describe the problem and the solution?
- [X] Have you reviewed our Contributing Guide?
- [X] Does the PR use "Fix #XXX" notation to tell GitHub to close the relevant issue numbered XXX when the PR is merged?
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 88.89%. Comparing base (
d50db92) to head (efc3b5d). Report is 27 commits behind head on dev.
Additional details and impacted files
@@ Coverage Diff @@
## dev #1173 +/- ##
=======================================
Coverage 88.88% 88.89%
=======================================
Files 45 45
Lines 9835 9839 +4
Branches 2795 2797 +2
=======================================
+ Hits 8742 8746 +4
Misses 776 776
Partials 317 317
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
🚨 Try these New Features:
- Flaky Tests Detection - Detect and resolve failed and flaky tests
@rly I realized that I had only updated the string conversion for datasets so I updated __convert_string to accommodate multidimensional string arrays as attributes as well.
For testing, I added an additional attr_array argument to the Bar class constructor, but I'm not sure if that was the best approach or if there's a better way to test it since the Bar class seemed to be used in multiple places.