llmops-handbook
Extending the vector database article
[Outline] I would like to extend the existing section on vector databases. The general points I would like to add are:
- How vector databases are different from traditional databases
- Embeddings (I can just link to the already existing section)
- A section on the architecture, namely storage, indexing and query processing
- A primer on similarity metrics such as Euclidean distance and cosine similarity, and an overview of indexing techniques such as HNSW
- A small tutorial on getting started with an open-source database like ChromaDB, covering:
  - various chunking techniques
  - the importance of chunk size and other hyperparameters that affect how data is pushed into the database
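
To make the scope of the metrics primer and the chunking part more concrete, here is a rough sketch of the kind of example I have in mind. It uses only the Python standard library. The function names (`euclidean_distance`, `cosine_similarity`, `chunk_text`) and the default `chunk_size` and `overlap` values are placeholders I made up for illustration, not anything taken from the handbook or from ChromaDB.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (smaller = closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def chunk_text(text, chunk_size=200, overlap=50):
    """Fixed-size character chunking with overlap (one of several possible techniques).

    chunk_size and overlap are the hyperparameters the tutorial would discuss;
    the defaults here are arbitrary placeholders.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


if __name__ == "__main__":
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

The actual tutorial would replace the toy vectors with real embeddings and pass the chunks to the database, but the trade-off it would explain is already visible here: a larger `overlap` preserves more context across chunk boundaries at the cost of storing more chunks.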