edgedb-python

Add more pgvector types.

Open · vpetrovykh opened this issue 1 year ago · 1 comment

Add support for sparsevec and halfvec. halfvec is handled as a float array, converted between float32 and float16 as needed. sparsevec is handled as a dict that holds the non-zero values and has "dimensions" specified in it.
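To make the sparsevec shape concrete, here is a minimal sketch of converting between a dense list and a dict of non-zero values. It is not the codec from this change: the literal key name "dimensions", the 0-based integer index keys, and the helper names `dense_to_sparse` and `sparse_to_dense` are all assumptions made for illustration.

```python
# Illustrative only: the "dimensions" key and 0-based integer indexes
# are assumptions, not the confirmed edgedb-python representation.
DIM_KEY = "dimensions"


def dense_to_sparse(values):
    """Build a sparsevec-style dict holding only the non-zero entries."""
    sparse = {DIM_KEY: len(values)}
    for index, value in enumerate(values):
        if value != 0:
            sparse[index] = float(value)
    return sparse


def sparse_to_dense(sparse):
    """Expand a sparsevec-style dict back into a dense list of floats."""
    dim = sparse[DIM_KEY]
    dense = [0.0] * dim
    for key, value in sparse.items():
        if key == DIM_KEY:
            continue
        if not isinstance(key, int) or not 0 <= key < dim:
            raise ValueError(f"index {key!r} is outside 0..{dim - 1}")
        dense[key] = float(value)
    return dense
```

A dict keeps the cost proportional to the number of non-zero entries, at the price of having to carry the total length separately, which is why the dimension count lives inside the dict.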

vpetrovykh · Sep 27 '24 16:09

The halfvec codecs need more robust conversion for float16. http://www.fox-toolkit.org/ftp/fasthalffloatconversion.pdf seems to outline a good algo that I can implement here. As a bonus, I don't even need to handle NaN or Inf values because they are not valid in vectors (I checked with the actual pgvector types), so there are slightly fewer corner cases to cover.
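For reference, the conversion can be sketched in Python with plain bit manipulation. This is not the table-based method from the linked paper and not the codec code itself. It is a straightforward round-to-nearest-even version under the constraint above that NaN and Inf are rejected. The function names are hypothetical.

```python
import math
import struct


def float_to_half_bits(value):
    """Convert a float to IEEE 754 binary16 bits via its float32 form."""
    if math.isnan(value) or math.isinf(value):
        raise ValueError("NaN and Inf are not valid vector elements")
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = (bits >> 16) & 0x8000
    exp = (bits >> 23) & 0xFF
    mant = bits & 0x7FFFFF
    if exp == 0:
        return sign  # zero or float32 subnormal: far below float16 range
    half_exp = exp - 127 + 15  # rebias the exponent from float32 to float16
    if half_exp >= 31:
        raise OverflowError("value too large for float16")
    if half_exp <= 0:
        if half_exp < -10:
            return sign  # below half of the smallest subnormal: rounds to zero
        mant |= 0x800000  # make the implicit leading bit explicit
        shift = 14 - half_exp
        half = mant >> shift
        rem = mant & ((1 << shift) - 1)
        halfway = 1 << (shift - 1)
    else:
        half = (half_exp << 10) | (mant >> 13)
        rem = mant & 0x1FFF
        halfway = 0x1000
    # Round to nearest, ties to even; a carry may bump the exponent.
    if rem > halfway or (rem == halfway and half & 1):
        half += 1
    if half >= 0x7C00:
        raise OverflowError("value rounds to float16 infinity")
    return sign | half


def half_bits_to_float(half):
    """Convert IEEE 754 binary16 bits back to a Python float."""
    exp = (half >> 10) & 0x1F
    mant = half & 0x3FF
    if exp == 0x1F:
        raise ValueError("NaN and Inf are not valid vector elements")
    if exp == 0:
        magnitude = math.ldexp(mant, -24)  # zero or subnormal
    else:
        magnitude = math.ldexp(mant | 0x400, exp - 25)
    return -magnitude if half & 0x8000 else magnitude
```

Python's `struct` module already supports binary16 through the `e` format, so it makes a handy oracle for checking a hand-written conversion like this one.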

Sometimes _Float16 might be available via #include <float.h>, which should take care of casting between float/double and float16, but it would still need a fall-back.

vpetrovykh · Sep 27 '24 16:09