microllama
The smallest possible LLM API
e.g.

```bash
pip install spacy
python -m spacy download en_core_web_sm
```

and in `Dockerfile`:

```dockerfile
RUN pip install spacy
RUN python -m spacy download en_core_web_sm
```

In `microllama.py`:

```python
from...
```
Try out Pinecone as an optional alternative to FAISS. Expected pros: smaller container, lower memory use. Expected cons: slower indexing and querying because of network latency, cost for large document...
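To make the tradeoff concrete, here is a minimal sketch of the kind of in-memory similarity search that FAISS performs locally (and that Pinecone would replace with a network call). The vectors and document names are made up, and brute-force cosine similarity stands in for FAISS's optimized index:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical in-memory vector store standing in for a FAISS index.
docs = {
    "doc1": [1.0, 0.0],
    "doc2": [0.7, 0.7],
    "doc3": [0.0, 1.0],
}

def search(query_vec, k=2):
    # Rank all documents by similarity to the query and return the top k.
    ranked = sorted(docs.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

print(search([1.0, 0.1]))  # → ['doc1', 'doc2']
```

With Pinecone, `search()` would become an API request, which is where the expected latency cost comes from; in exchange, the index no longer lives in the container's memory.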
- Adds `llm` and `llm-openai` as dependencies.
- Refactors `answer()` and `streaming_answer()` to use `llm.get_model()` and `model.chat()` instead of `openai.ChatCompletion`.
- Updates README, Dockerfile, and `deploy_instructions()` to reflect new dependencies,...
Currently, `microllama` uses `openai` and `langchain` directly for interacting with language models and managing embeddings/vector stores. We should investigate switching to Simon Willison's `llm` library (https://llm.datasette.io/) as a more general...
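A rough sketch of what an `llm`-backed `answer()` could look like. A stub model is used here in place of a real `llm.get_model()` call so the sketch runs without an API key; the `prompt()`/`.text()` shape follows the `llm` library's documented interface, but the model name and prompt format are assumptions, not microllama's actual code:

```python
class StubModel:
    """Stands in for an llm model object; real code would use llm.get_model(...)."""

    def prompt(self, text):
        class Response:
            def text(inner):
                # Echo the prompt back so the wiring is observable.
                return f"echo: {text}"
        return Response()

def answer(question, context, model=None):
    # In microllama this would be something like:
    #   import llm
    #   model = llm.get_model("gpt-3.5-turbo")  # model name is an assumption
    model = model or StubModel()
    response = model.prompt(f"Context: {context}\n\nQuestion: {question}")
    return response.text()

print(answer("What is microllama?", "The smallest possible LLM API"))
```

The appeal of this shape is that swapping providers becomes a one-line change to `get_model()`, rather than a rewrite of every call site that currently touches `openai` directly.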