versatile-data-kit
vdk-postgres: support writing vectors to a postgres instance with pgvector installed using the VDK
What is the feature request? What problem does it solve?
The VDK SDK contains functions like send_tabular_data_for_ingestion. We need to support sending vectorized data for ingestion. This most likely won't require changing any function declarations, but it will require changes to the postgres plugin so that it can write vectors.
Definition of done
- Postgres plugin supports writing vectors using pgvector
- Functional tests using example data job
If we represent the vector column as a string, as in "[1,2,3]", pgvector would handle it automatically, so maybe no change is needed. The story might remain open just so that tests get added.
Currently you can do something like this:
data = dict(id="1", chunk="text", embedding="[1,2,3,4]")
job_input.send_object_for_ingestion(data=data, method="postgres")
As long as the embedding column is of type vector and pgvector is installed, it should work.
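If the embeddings start out as Python lists rather than strings, a small helper can produce the "[1,2,3]" text form before ingestion. The following is only a sketch of that idea: to_pgvector_literal and build_payload are hypothetical names, not part of VDK or the postgres plugin, and the run(job_input) step simply mirrors the call from the snippet above (no target table is specified, as in the original).

```python
def to_pgvector_literal(values):
    """Format a sequence of numbers as pgvector text input, e.g. "[1.0,2.0,3.0]".

    Hypothetical helper: not provided by VDK or vdk-postgres.
    """
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def build_payload(row_id, chunk, embedding):
    """Build the row dict, with the embedding serialized as a string."""
    return dict(id=row_id, chunk=chunk, embedding=to_pgvector_literal(embedding))


def run(job_input):
    # Data job step. Assumes the destination table has an "embedding"
    # column of type vector and that pgvector is installed in the database.
    payload = build_payload("1", "text", [1, 2, 3, 4])
    job_input.send_object_for_ingestion(payload, method="postgres")
```

Keeping the serialization in one place would also give a functional test a single function to check before it ever touches a database.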