Implement vector visualization in the AI tab
The AI tab would benefit from having an embedding vector visualization. That is, whenever the user enables the AI extension and sets up a vector index, we could plot the resulting vectors in the UI using UMAP.
UMAP is a dimensionality reduction technique that lets us project high-dimensional data (embeddings with, say, 2000 dimensions) down to 2 or 3 dimensions, that is, to points with 2 or 3 coordinates that we can plot. It preserves local structure, so the user can see which pieces of text the embedding model considers similar to each other.
- There is an explainer article with pictures
- Here's a live demo on real data (make sure to select UMAP in the bottom left corner)
- Here's a JS library the demo above is built on
A potential workflow such a visualization could enable:
- The user creates a vector index
- They open the visualization and select an object type to visualize
- They input a text query
- The query gets embedded via the API and then projected onto the visualization. The user can see which points in the index are closest to their query.
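The last step of the workflow above could be sketched roughly as follows. This is only an illustration, not a proposed implementation: the `IndexedPoint` shape and the `nearestPoints` helper are hypothetical names, and it assumes cosine similarity, which may not match the distance function the index is actually configured with. Similarity is computed on the original embeddings rather than on the projected 2D/3D coordinates, since the projection only approximately preserves distances.

```typescript
// Hypothetical shape of one indexed record; not part of any existing API.
interface IndexedPoint {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
// Returns 0 if either vector has zero length (similarity is undefined there).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k points most similar to the query embedding, best match first.
function nearestPoints(
  query: number[],
  points: IndexedPoint[],
  k: number
): IndexedPoint[] {
  return points
    .map((point) => ({ point, score: cosineSimilarity(query, point.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.point);
}
```

The returned points could then be highlighted in the plot next to the projected query. With the sample size capped at around 10,000 points, a brute-force scan like this should be cheap enough to run in the browser.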
Alternatively, they could browse the visualization, filter it with EdgeQL expressions, and see which points cluster together. This information would help them adjust the content of the property that gets indexed by the database.
There would have to be a cap of roughly 10,000 visualized samples, since computing a projection for many more points would take too long. Those samples would need to be drawn uniformly at random from all of the records.
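One way to draw such a capped uniform sample is reservoir sampling, which needs only a single pass and does not have to know the total record count up front. The sketch below is an assumption about how this might be done on the client side; `sampleUniform` is a hypothetical helper, and in practice the sampling may be better done in the query itself.

```typescript
// Reservoir sampling (Algorithm R): returns up to `cap` records, each record
// having the same probability of ending up in the sample.
// `random` must return a number in [0, 1); it is injectable for testing.
function sampleUniform<T>(
  records: T[],
  cap: number,
  random: () => number = Math.random
): T[] {
  const sample: T[] = [];
  for (let seen = 1; seen <= records.length; seen++) {
    const record = records[seen - 1];
    if (sample.length < cap) {
      // Fill the reservoir with the first `cap` records.
      sample.push(record);
    } else {
      // Pick a slot uniformly in [0, seen); replace only if it falls
      // inside the reservoir, which happens with probability cap / seen.
      const slot = Math.floor(random() * seen);
      if (slot < cap) {
        sample[slot] = record;
      }
    }
  }
  return sample;
}
```

Calling `sampleUniform(allRecords, 10000)` would return every record when there are fewer than 10,000 of them, and a uniform random subset of 10,000 otherwise. The order of the result is not shuffled, which should not matter for plotting.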
As with the other one, please chat to @1st1 to clarify what this feature is about.