langchain-postgres: support for sparse embeddings

The latest pgvector version supports sparsevec. However, LangChain's PGVector supports only one embedding column in the langchain_pg_embedding table. It would be great to have a sparse_embedding column in that table and a matching sparse_embedding field in PGVector.

I have considered the alternative, which is to have two PGVector stores, one for dense and one for sparse vectors. However, there are two problems with that:

- PGVector has hardcoded table names for the collection and embedding tables.
- I would like to leverage LangChain's excellent indexer with its SQL record manager.
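
For reference, a sparse_embedding column like the one proposed above would store values in pgvector's sparsevec text format, `{index:value,...}/dimensions`, where indices start at 1. Here is a minimal sketch of producing that literal from a 0-based Python mapping. The helper name `to_sparsevec_literal` is hypothetical and is not part of langchain-postgres or pgvector-python.

```python
def to_sparsevec_literal(weights: dict[int, float], dimensions: int) -> str:
    """Format 0-based {index: weight} pairs as a pgvector sparsevec text literal.

    Hypothetical helper for illustration only; pgvector expects 1-based
    indices in the text representation, so each index is shifted by one.
    """
    items = []
    for index, value in sorted(weights.items()):
        if not 0 <= index < dimensions:
            raise ValueError(f"index {index} is outside 0..{dimensions - 1}")
        if value != 0:  # zero entries are implicit in a sparse vector
            items.append(f"{index + 1}:{value}")
    return "{" + ",".join(items) + "}/" + str(dimensions)


literal = to_sparsevec_literal({0: 1.5, 3: 0.25}, dimensions=5)
print(literal)
```

The resulting string could then be passed as a query parameter and cast to sparsevec on the database side, assuming a pgvector version that includes the sparsevec type.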

Jun 11 '24 10:06 magaton

Hi @magaton, I would be interested in collaborating on this. I would also like some kind of full-text/dense feature: https://github.com/langchain-ai/langchain-postgres/issues/61

Jun 19 '24 08:06 gecBurton

Hello, I would be interested as well.

But I think each vector DB should be kept separate. So for a hybrid search, the setup would be:

- One dense embedding vector DB (using the current feature)
- One sparse vector DB (using https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search/cross_encoder.py)

Then merge and rerank the two result lists using EnsembleRetriever (for example: https://python.langchain.com/docs/how_to/ensemble_retriever/).

To achieve this, we should also bump the pgvector Python version: #82
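
To make the fusion step concrete: as far as I understand, EnsembleRetriever combines the ranked lists from its retrievers with a weighted reciprocal rank fusion. Below is a rough standalone sketch of that idea, not LangChain's actual code. The function name, the document ids, and the constant `c=60` are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, c=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores weight / (c + rank) for every list it appears in,
    with rank starting at 1. Illustrative sketch only.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ranking")

    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (c + rank)

    # Highest fused score first
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result ids from the dense store and the sparse store
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that appear in both lists ("a" and "b" here) accumulate score from each, so they end up ahead of documents found by only one store.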

Oct 23 '24 14:10 Freezaa9

Hi, I could really do with this feature. I have made a very crude PR that suggests how this might be done. I would appreciate some help, as I do not know this codebase well :) https://github.com/langchain-ai/langchain-postgres/pull/204

Apr 27 '25 16:04 gecBurton