hdmf icon indicating copy to clipboard operation
hdmf copied to clipboard

Add support to write multidimensional string arrays

Open stephprince opened this issue 1 year ago • 1 comments

Motivation

Fixes https://github.com/hdmf-dev/hdmf/issues/1137 and fixes https://github.com/hdmf-dev/hdmf/issues/1096.

On build, multidimensional string lists/arrays were being converted into 1-D lists/arrays:

[['aa', 'bb'], ['cc', 'dd']] -> ["['aa', 'bb']", "['cc', 'dd']"].

On write, the shape of a multidimensional string dataset was being defined incorrectly and so the dataset could not be written. The data shape definition was being caught by this conditional that I believe is intended to check for structured arrays / compound data types:

https://github.com/hdmf-dev/hdmf/blob/7426275cacc769a10ffca89836765df1355ba9db/src/hdmf/backends/hdf5/h5tools.py#L1472-L1473

I updated the conditional and get_data_shape is now used to set the shape of the dataset for multidimensional string lists/arrays.

How to test the behavior?

This came up in https://github.com/NeurodataWithoutBorders/pynwb/pull/1924.

Checklist

  • [x] Did you update CHANGELOG.md with your changes?
  • [X] Does the PR clearly describe the problem and the solution?
  • [X] Have you reviewed our Contributing Guide?
  • [X] Does the PR use "Fix #XXX" notation to tell GitHub to close the relevant issue numbered XXX when the PR is merged?

stephprince avatar Aug 15 '24 21:08 stephprince

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 88.89%. Comparing base (d50db92) to head (efc3b5d). Report is 27 commits behind head on dev.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #1173   +/-   ##
=======================================
  Coverage   88.88%   88.89%           
=======================================
  Files          45       45           
  Lines        9835     9839    +4     
  Branches     2795     2797    +2     
=======================================
+ Hits         8742     8746    +4     
  Misses        776      776           
  Partials      317      317           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.


🚨 Try these New Features:

codecov[bot] avatar Aug 15 '24 21:08 codecov[bot]

@rly I realized that I had only updated the string conversion for datasets so I updated __convert_string to accommodate multidimensional string arrays as attributes as well.

For testing, I added an additional attr_array argument to the Bar class constructor, but I'm not sure if that was the best approach or if there's a better way to test it since the Bar class seemed to be used in multiple places.

stephprince avatar Aug 19 '24 22:08 stephprince