Update documentation on nullable variable attributes
Hi team,
Could you update the Python section of the documentation? I have no clue how to approach nullable variable attributes without an example. https://docs.tiledb.com/main/how-to/arrays/writing-arrays/nullable-attributes
Many thanks, Michel
Hi @mamscience, at the moment we support pandas nullable datatypes transparently in TileDB-Py. We'll get the docs there updated ASAP, but in the meantime please have a look at these tests. We're also planning to add a new API for writing and querying nullable attributes soon (it will accept or return a numpy bool vector). If you have a particular package/API of interest, please let us know. We may not be able to integrate directly in the short term, but we'll keep it in mind to try to maximize interoperability.
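To make the "numpy bool vector" idea above a bit more concrete, here is a minimal sketch of how nullable values can be represented as a data array plus a validity mask. The helper `split_nullable` and its `fill` placeholder are hypothetical names used for illustration only; they are not part of TileDB-Py, and the planned API may look different.

```python
import numpy as np


def split_nullable(values, dtype, fill=0):
    """Split a sequence that may contain None into (data, validity).

    Hypothetical helper for illustration; not a TileDB-Py function.
    validity[i] is True where a real value is present, False where it was None.
    """
    validity = np.array([v is not None for v in values], dtype=bool)
    # Missing cells still need some placeholder so the data array keeps its dtype.
    data = np.array([fill if v is None else v for v in values], dtype=dtype)
    return data, validity


data, validity = split_nullable([60, None, 64], np.int32)
print(data.dtype, data)
print(validity)
```

The point of the separate mask is that the data array can stay a plain integer array: the placeholder in a missing cell carries no meaning, and only the mask says whether the cell is null.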
Thanks. Apparently, changing the numpy type to float and passing "None" also did the trick.
import numpy as np
import tiledb

# dom (the domain) and array_name are defined in the snipped parts
schema = tiledb.ArraySchema(
    domain=dom, sparse=True, attrs=[
        tiledb.Attr(name="a", dtype=np.int32, nullable=True),
        tiledb.Attr(name="b", dtype=np.float32, nullable=True)
    ]
)
# ... snip ...
with tiledb.SparseArray(array_name, mode="w") as A:
    I, J = [1, 1, 1], [1, 2, 3]
    a = np.array([60, 65, 64])
    b = np.array([120, 122, None])
    # keys must match the attribute names declared in the schema
    A[I, J] = {"a": a, "b": b}
The underlying issue, by the way, is that I must pass all attributes when writing to an array, while (my) sparse arrays are mostly empty. Solving this issue by relaxing the constraints (https://github.com/TileDB-Inc/TileDB/issues/1162#issue-425631917) was proposed before.
But let's say I need to use an integer instead of a float attribute. How should one approach this? Did you mean that I should build and populate a pandas DataFrame and convert it to a TileDB array? Or is there another way to create a TileDB array with pandas datatypes?
Thanks in advance
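On the integer question, one possible route is pandas' nullable integer dtype, which keeps integer values and marks missing cells as `pd.NA` instead of falling back to float. The sketch below only builds such a DataFrame. The `write_frame` helper is a hypothetical wrapper, and its call to `tiledb.from_pandas` with `sparse=True` and `index_dims` is an untested assumption based on the reply above saying that pandas nullable datatypes are supported; the column names `i` and `j` are also made up for this example.

```python
import numpy as np
import pandas as pd

# Three sparse cells: coordinates (i, j) plus two attributes.
df = pd.DataFrame({
    "i": np.array([1, 1, 1], dtype=np.int64),
    "j": np.array([1, 2, 3], dtype=np.int64),
    # "Int32" (capital I) is pandas' nullable integer dtype: None becomes pd.NA.
    "a": pd.array([60, 65, None], dtype="Int32"),
    # Plain float32 can already express a missing value as NaN.
    "b": np.array([120, 122, np.nan], dtype=np.float32),
})

print(df.dtypes)
print(df["a"].isna().to_numpy())  # boolean mask of the missing integer cells


def write_frame(uri, frame):
    """Hypothetical wrapper; assumes from_pandas handles nullable columns."""
    import tiledb  # imported lazily so the DataFrame part runs without TileDB

    tiledb.from_pandas(uri, frame, sparse=True, index_dims=["i", "j"])
```

A plain `np.int32` array cannot hold `None` at all, which would explain why switching the numpy type to float made the original write work.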