Writing an Array Attribute of String
To write an array of constant-size strings, one can currently do the equivalent of:
import h5py
import numpy as np
# [...]
meshes.create_group(b"E")
E = meshes[b"E"]
E.attrs["axisLabels"] = np.array([b"x", b"y", b"z"])
To write an array of three strings, here each of length 1, one can use C-style notation:
typedef char MyChar2[2];
MyChar2 *axisLabels = new MyChar2[simDim];
ColTypeString ctAxisLabels(1); // this can also be longer, but all must have the same size (?)
for( uint32_t d = 0; d < simDim; ++d )
{
/* \todo is the order correct? */
axisLabels[d][0] = char(120 + d); // x, y, z
axisLabels[d][1] = '\0'; // terminator is important!
}
params->dataCollector->writeAttribute(params->currentStep,
ctAxisLabels, recordName.c_str(),
"axisLabels",
1u, Dimensions(simDim,0,0),
axisLabels);
which works.
I am currently not 100% sure whether one can write something like the following Python equivalent, where the three fixed-length strings differ from each other in size:
import h5py
import numpy as np
# [...]
meshes.create_group(b"E")
E = meshes[b"E"]
E.attrs["axisLabels"] = np.array([b"short", b"middle long", b"very long description"])
It might be possible to pass something like a char** as the axisLabels argument of writeAttribute() and put a c-string of a different size (null-terminated as usual) in each entry. But I am not sure that will work, since we only have one ColTypeString ctAxisLabels(N) and this would mean N has to vary.
So if you can, make your labels the same length. (In your case "spatial idx" and "frequen idx" / " omega idx ", add trailing spaces before the '\0' etc. ... nasty)
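The trailing-space work-around could be sketched roughly as follows. This is only an illustration using the C++ standard library; `padToEqualLength` is a hypothetical name and not part of libSplash or PIConGPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a libSplash function): right-pads every label
// with spaces so that all labels end up as long as the longest one.
std::vector<std::string> padToEqualLength(std::vector<std::string> labels)
{
    // find the length of the longest label
    std::size_t longest = 0;
    for (const std::string& s : labels)
        longest = std::max(longest, s.size());

    // std::string::resize appends the fill character when growing
    for (std::string& s : labels)
        s.resize(longest, ' ');

    return labels;
}
```

The padded labels then all share one length N, which is what a single ColTypeString ctAxisLabels(N) seems to require; each label still needs its terminating '\0' when copied into the char array.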
CCing @PrometheusPi
Update: I just checked what h5py does in such a case and it is, surprise surprise, NULL padding :D
ATTRIBUTE "axisLabels" {
DATATYPE H5T_STRING {
STRSIZE 4;
STRPAD H5T_STR_NULLPAD;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
DATA {
(0): "x1\000\000", "y22\000", "z333"
}
}
whereas for strings of equal length (the single-character labels from above) it results in
ATTRIBUTE "axisLabels" {
DATATYPE H5T_STRING {
STRSIZE 1;
STRPAD H5T_STR_NULLPAD;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
DATA {
(0): "x", "y", "z"
}
}
That means: choose N as large as the largest size of your strings and pad with zeros, too ;)
#include <cstring>
// ...
const uint N = 14;
typedef char MyCharN[N+1]; // +1 for trailing \0
ColTypeString ctAxisLabels(N);
MyCharN *axisLabels = new MyCharN[simDim];
// pre-pad all targets with NULLs (including NULL terminator for max length string!)
for( uint32_t d = 0; d < simDim; ++d )
{
memset( axisLabels[d], '\0', N+1 );
}
strcpy( axisLabels[0], "spatial idx" ); // only 11 chars
for( uint i = 11; i <= N; ++i )
axisLabels[0][i] = '\0';
strcpy( axisLabels[1], "frequency idx" ); // 13 chars
(let us wrap away the null-padding in some helper ;) )
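Such a helper might look roughly like the sketch below. It is an assumption-laden illustration, not the helper that exists in any library: `packNullPadded` is a hypothetical name, and the only thing taken from the discussion above is the memory layout (one slot of N+1 bytes per label, pre-filled with NULLs).

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of libSplash): packs all labels into one
// contiguous buffer with labels.size() slots of (n + 1) bytes each.
// Every slot is NULL-padded, which mirrors the MyCharN[simDim] layout above.
std::vector<char> packNullPadded(const std::vector<std::string>& labels,
                                 std::size_t n)
{
    const std::size_t slot = n + 1; // +1 for the trailing '\0'

    // the whole buffer starts out as NULLs, so no per-label padding is needed
    std::vector<char> buffer(labels.size() * slot, '\0');

    for (std::size_t d = 0; d < labels.size(); ++d)
    {
        if (labels[d].size() > n)
            throw std::length_error("label longer than N: " + labels[d]);

        // copy only the characters; the rest of the slot stays NULL
        std::memcpy(buffer.data() + d * slot, labels[d].data(), labels[d].size());
    }
    return buffer;
}
```

Assuming writeAttribute() accepts a plain pointer to such a contiguous block, `buffer.data()` would take the place of axisLabels together with ColTypeString ctAxisLabels(n). Throwing on over-long labels is a design choice of this sketch; truncating would be the alternative.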
However, libSplash currently writes strings with STRPAD H5T_STR_NULLTERM, whereas h5py uses STRPAD H5T_STR_NULLPAD. We might need to change that.
ATTRIBUTE "axisLabels" {
DATATYPE H5T_STRING {
STRSIZE 2;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
DATA {
(0): "x", "y", "z"
}
}
Still, a reader of NULL-terminated strings should not care whether the bytes after the terminator are undefined. Otherwise, space padding plus a final NULL will still work as a work-around.
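On the reading side, this could be sketched as below. The function `labelFromSlot` is hypothetical and only illustrates the argument: it stops at the first NULL (or at the slot size), so it handles NULL-terminated and NULL-padded slots alike, and it strips trailing spaces left over from the space-padding work-around.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical reader-side helper (not a libSplash or HDF5 function):
// extracts one label from a fixed-size slot of slotSize bytes.
std::string labelFromSlot(const char* slot, std::size_t slotSize)
{
    // stop at the first '\0'; bytes behind it are never inspected.
    // If the slot is completely filled (no '\0'), use the whole slot.
    const void* end = std::memchr(slot, '\0', slotSize);
    const std::size_t len = end
        ? static_cast<std::size_t>(static_cast<const char*>(end) - slot)
        : slotSize;

    std::string label(slot, len);

    // drop trailing spaces that were only added to equalize the lengths
    const std::size_t last = label.find_last_not_of(' ');
    label.erase(last == std::string::npos ? 0 : last + 1);
    return label;
}
```

Note that stripping spaces would also remove trailing spaces that were meant to be part of a label, so this only fits the work-around described above.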
@ax3l Thanks for posting this information. Great work checking the result in python. :+1:
@ax3l Why do you need:
for( uint i = 11; i <= N; ++i )
axisLabels[0][i] = '\0';
You already pre-padded all targets with NULLs:
for( uint32_t d = 0; d < simDim; ++d )
{
memset( axisLabels[d], '\0', N+1 );
}
What you are referring to is just a quick hack that I needed and documented for testing.
The usage that is of interest to you is in https://github.com/ComputationalRadiationPhysics/picongpu/pull/1323 and is a bit more sophisticated :)
migrated to future: the above linked helper in PIConGPU should be ported to libSplash.