Support FAISS in OpenSearch
Is your feature request related to a problem? Please describe.
FAISS comes with very efficient and reliable ANN algorithms that would allow scaling up the number of docs for dense retrieval.
Approximate KNN search can be configured via index mappings and settings: https://opensearch.org/docs/latest/search-plugins/knn/knn-index/
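For orientation, here is a minimal sketch of what such an index definition might look like when the `faiss` engine is selected. The field name `embedding`, the dimension, and the HNSW parameter values are placeholders chosen for illustration and are not taken from this issue.

```python
import json

# Placeholder values for illustration only (not from this issue).
EMBEDDING_DIM = 768
EMBEDDING_FIELD = "embedding"

# Index body enabling approximate k-NN with an HNSW graph built by faiss.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            EMBEDDING_FIELD: {
                "type": "knn_vector",
                "dimension": EMBEDDING_DIM,
                "method": {
                    "name": "hnsw",
                    "engine": "faiss",
                    "space_type": "l2",
                    "parameters": {"ef_construction": 128, "m": 16},
                },
            }
        }
    },
}

# The body is plain JSON and could be passed to an index-creation request.
print(json.dumps(index_body, indent=2))
```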
We can split it into the following sub-issues:
- [x] support HNSW for dot product and l2 (S)
  - add param `knn_engine` with default "nmslib" to constructor
  - use the same hnsw params as for nmslib
- [x] check whether existing index supports the requested method (S)
  - throw or fall back to exact knn
- [ ] support HNSW for cosine (M)
  - normalize vectors manually
  - use dot_product space under the hood
- [ ] support product quantization (S)
  - add param `product_quantization_subvectors` with default value None
  - if not None, `pq` is activated with `m=product_quantization_subvectors`
  - omit `code_size` for now as it is only supported with IVF
- [ ] support IVF (S)
  - add `ivf` option to `index_type` constructor param
  - use default values
  - support product quantization
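The sub-issues above could be sketched roughly as follows. This is only an illustration under assumptions: `build_method` and `normalize` are hypothetical helpers, not existing Haystack functions, and the HNSW values are placeholders standing in for "the same hnsw params as for nmslib". The parameter names `knn_engine`, `index_type`, and `product_quantization_subvectors` follow the task list.

```python
import math
from typing import Any, Dict, List, Optional

# Placeholder HNSW values; the real ones would mirror the nmslib configuration.
HNSW_PARAMS = {"ef_construction": 128, "m": 16}


def normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit length so that dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # zero vectors cannot be normalized; return unchanged
    return [x / norm for x in vector]


def build_method(
    similarity: str = "dot_product",
    knn_engine: str = "nmslib",
    index_type: str = "hnsw",
    product_quantization_subvectors: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the `method` part of a knn_vector mapping."""
    if index_type not in ("hnsw", "ivf"):
        raise ValueError(f"unsupported index_type: {index_type}")
    if knn_engine == "nmslib":
        if index_type != "hnsw" or product_quantization_subvectors is not None:
            raise ValueError("IVF and product quantization require knn_engine='faiss'")
        spaces = {"dot_product": "innerproduct", "l2": "l2", "cosine": "cosinesimil"}
    elif knn_engine == "faiss":
        # Cosine is emulated: callers normalize vectors, the index uses inner product.
        spaces = {"dot_product": "innerproduct", "l2": "l2", "cosine": "innerproduct"}
    else:
        raise ValueError(f"unsupported knn_engine: {knn_engine}")

    method: Dict[str, Any] = {
        "name": index_type,
        "engine": knn_engine,
        "space_type": spaces[similarity],
    }
    parameters: Dict[str, Any] = {}
    if index_type == "hnsw":
        parameters.update(HNSW_PARAMS)
    # For IVF no parameters are set here, so the engine defaults apply.
    if product_quantization_subvectors is not None:
        # code_size is deliberately omitted, as described in the task list.
        parameters["encoder"] = {
            "name": "pq",
            "parameters": {"m": product_quantization_subvectors},
        }
    if parameters:
        method["parameters"] = parameters
    return method
```

With cosine similarity on faiss, both the indexed vectors and the query vector would need to pass through `normalize` before being sent to OpenSearch, otherwise the inner product no longer matches the cosine score.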
We should have a look at #3102 while working on this epic. It is about allowing custom values for ef_search in OpenSearchDocumentStore, as is already possible in FAISSDocumentStore. Currently, ef_search is hard-coded to 20.
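As a rough sketch of what a configurable ef_search could involve: to my understanding, nmslib reads it from the index setting `knn.algo_param.ef_search`, while faiss takes it as an HNSW method parameter in the mapping. The helper `set_ef_search` and the field name `embedding` are hypothetical and only illustrate the idea.

```python
from typing import Any, Dict


def set_ef_search(
    index_body: Dict[str, Any],
    ef_search: int,
    knn_engine: str = "nmslib",
    field: str = "embedding",  # hypothetical embedding field name
) -> Dict[str, Any]:
    """Hypothetical helper: write ef_search where the chosen engine expects it."""
    if knn_engine == "nmslib":
        # nmslib: ef_search is an index-level setting.
        index_settings = index_body.setdefault("settings", {}).setdefault("index", {})
        index_settings["knn.algo_param.ef_search"] = ef_search
    elif knn_engine == "faiss":
        # faiss: ef_search is part of the HNSW method parameters.
        method = index_body["mappings"]["properties"][field]["method"]
        method.setdefault("parameters", {})["ef_search"] = ef_search
    else:
        raise ValueError(f"unsupported knn_engine: {knn_engine}")
    return index_body


# Example body with a faiss HNSW field; values are placeholders.
body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
            }
        }
    },
}
set_ef_search(body, ef_search=100, knn_engine="faiss")
```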
With #3850 merged, all specified tasks have been completed. We can close this epic.