Support FAISS in OpenSearch

Open masci opened this issue 3 years ago • 1 comments

Is your feature request related to a problem? Please describe.

FAISS comes with very efficient and reliable ANN algorithms that would allow scaling up the number of docs for dense retrieval.

Approximate KNN search can be configured via index mappings and settings: https://opensearch.org/docs/latest/search-plugins/knn/knn-index/
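
For illustration, an index body that enables approximate k-NN with a selectable engine could look roughly like the sketch below. The helper name `build_knn_index_body`, the field name `embedding`, and the HNSW parameter values are assumptions made for this example, not Haystack's actual implementation; the mapping layout follows the k-NN index documentation linked above.

```python
from typing import Any, Dict


def build_knn_index_body(
    embedding_dim: int,
    knn_engine: str = "nmslib",
    space_type: str = "l2",
    embedding_field: str = "embedding",  # hypothetical field name
) -> Dict[str, Any]:
    """Build an index body with a knn_vector field (hypothetical helper)."""
    if knn_engine not in ("nmslib", "faiss"):
        raise ValueError(f"Unsupported knn_engine: {knn_engine!r}")
    return {
        # Approximate k-NN has to be enabled at the index level.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                embedding_field: {
                    "type": "knn_vector",
                    "dimension": embedding_dim,
                    "method": {
                        "name": "hnsw",
                        "engine": knn_engine,
                        "space_type": space_type,
                        # Placeholder values, shared by both engines.
                        "parameters": {"ef_construction": 512, "m": 16},
                    },
                }
            }
        },
    }
```

With the opensearch-py client, a body like this would typically be passed as `client.indices.create(index="my-index", body=body)`; the index name here is a placeholder.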

We can split it into the following sub issues:

  • [x] support HNSW for dot product and l2 (S)
    • add param knn_engine with default "nmslib" to constructor
    • use the same hnsw params as for nmslib
  • [x] check whether existing index supports the requested method (S)
    • throw or fall back to exact knn
  • [ ] support HNSW for cosine (M)
    • normalize vectors manually
    • use dot_product space under the hood
  • [ ] support product quantization (S)
    • add param product_quantization_subvectors with default value None
    • if not None, PQ is activated with m=product_quantization_subvectors
    • omit code_size for now as it is only supported with IVF
  • [ ] support IVF (S)
    • add ivf option to index_type constructor param
    • use default values
    • support product quantization
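
The sub-issues above could be sketched roughly as follows. This is a minimal illustration, not the code that was eventually merged: `normalize`, `dot`, `faiss_method`, and `index_supports` are hypothetical helpers, and the sketch only builds and compares method definitions without talking to a cluster. Cosine similarity is obtained by normalizing vectors to unit length and using the inner product, and PQ is switched on only when `product_quantization_subvectors` is set, with `code_size` omitted as described in the list.

```python
import math
from typing import Any, Dict, List, Optional, Sequence


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


def faiss_method(
    index_type: str = "hnsw",
    space_type: str = "innerproduct",
    product_quantization_subvectors: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a FAISS method definition (hypothetical helper)."""
    if index_type not in ("hnsw", "ivf"):
        raise ValueError(f"Unsupported index_type: {index_type!r}")
    # Empty parameters means the engine's default values are used.
    method: Dict[str, Any] = {
        "name": index_type,
        "engine": "faiss",
        "space_type": space_type,
        "parameters": {},
    }
    if product_quantization_subvectors is not None:
        # code_size is deliberately omitted, as noted in the list above.
        method["parameters"]["encoder"] = {
            "name": "pq",
            "parameters": {"m": product_quantization_subvectors},
        }
    return method


def index_supports(existing: Dict[str, Any], requested: Dict[str, Any]) -> bool:
    """Check whether an existing field's method matches the requested one."""
    keys = ("name", "engine", "space_type")
    return all(existing.get(key) == requested.get(key) for key in keys)
```

If `index_supports` returns `False`, the caller can either raise an error or fall back to exact k-NN, which is the choice left open in the second sub-issue.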

masci, Jul 14 '22 09:07

We should have a look at #3102 while working on this epic. It's about allowing custom values to be set for ef_search in OpenSearchDocumentStore, as is the case for FAISSDocumentStore. Currently, ef_search is hard-coded to a value of 20.
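
As a rough sketch of what a configurable value could look like, the helper below builds an index settings body with ef_search taken from a parameter instead of a constant. The function name `ef_search_settings` is hypothetical, and the sketch assumes the index-level `knn.algo_param.ef_search` setting described in the OpenSearch k-NN documentation; the default of 20 mirrors the hard-coded value mentioned above.

```python
from typing import Any, Dict


def ef_search_settings(ef_search: int = 20) -> Dict[str, Any]:
    """Build index settings carrying a caller-supplied ef_search (hypothetical helper)."""
    if ef_search < 1:
        raise ValueError("ef_search must be a positive integer")
    return {"index": {"knn.algo_param.ef_search": ef_search}}
```

With opensearch-py, such a body could be applied to an existing index via `client.indices.put_settings(index="my-index", body=ef_search_settings(100))`; the index name and the value 100 are placeholders.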

bogdankostic, Sep 02 '22 12:09

With #3850 merged, all specified tasks have been completed. We can close this epic.

bogdankostic, Feb 20 '23 14:02