Unified LLM Provider & Model Discovery with LiteLLM
This pull request introduces a major refactoring of the LLM provider integration by replacing direct API calls with LiteLLM. This change unifies how we interact with different LLM providers, simplifies the codebase, and adds new capabilities for model discovery.
Key Changes:
- LiteLLM Integration:
  - The `models.py` module has been completely refactored to use `litellm` as the single interface for all LLM providers.
  - This removes the need for provider-specific code (e.g., for OpenAI, Anthropic, Groq) and replaces it with a unified `get_model` function.
  - `LiteLLMChatWrapper` and `LiteLLMEmbeddingWrapper` classes have been implemented to ensure seamless compatibility with the existing LangChain-based architecture (a minimal sketch follows this list).
- Dynamic Model Picker:
  - New API endpoints (`/api/models_list` and `/api/models_all`) have been created to dynamically fetch and list available models from various providers.
  - The system can now discover models from:
    - Local installations like Ollama and LM Studio.
    - Cloud providers supported by LiteLLM (OpenAI, Anthropic, Google, etc.).
    - Static configuration lists as a fallback.
  - This feature allows users to easily select and switch between different models without manual configuration (see the discovery sketch after this list).
- Simplified Configuration:
  - The environment variable handling has been updated to support `litellm`'s configuration standards.
  - New variables for custom base URLs have been added to `example.env` for providers like Ollama, LM Studio, and OpenRouter (an illustrative snippet follows).
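To make the wrapper idea concrete, here is a minimal sketch of the shape described above. The class and function names come from this PR's description, but the bodies and signatures are illustrative assumptions, not the actual `models.py` code:

```python
# Illustrative sketch only: names mirror the PR text, bodies are assumptions.
import litellm


class LiteLLMChatWrapper:
    """Minimal chat wrapper that routes every call through litellm."""

    def __init__(self, model: str, **kwargs):
        self.model = model    # e.g. "openai/gpt-4o" or "ollama/llama3"
        self.kwargs = kwargs  # temperature, api_base, etc.

    def invoke(self, messages: list[dict]) -> str:
        # litellm.completion() accepts OpenAI-style message dicts for any provider
        response = litellm.completion(
            model=self.model, messages=messages, **self.kwargs
        )
        return response.choices[0].message.content


def get_model(provider: str, name: str, **kwargs) -> LiteLLMChatWrapper:
    # LiteLLM selects the backend from its "provider/model" prefix convention
    return LiteLLMChatWrapper(model=f"{provider}/{name}", **kwargs)


# The same call shape now works for OpenAI, Anthropic, Ollama, ...
chat = get_model("anthropic", "claude-3-5-sonnet-20240620")
print(chat.invoke([{"role": "user", "content": "Hello!"}]))
```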
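The discovery flow behind the model picker can be pictured as follows. The helper names and the `OLLAMA_BASE_URL` variable are assumptions based on this PR's text, but the `/api/tags` endpoint is the real API that local Ollama installations expose:

```python
# Hedged sketch of the logic behind /api/models_list; helper names and the
# OLLAMA_BASE_URL variable are assumptions, not the actual endpoint code.
import os

import requests

# Static fallback list, used when live discovery is unavailable
STATIC_MODELS = {"ollama": ["llama3", "mistral"]}


def list_ollama_models() -> list[str]:
    # Ollama reports locally installed models via its /api/tags endpoint
    base = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
    try:
        tags = requests.get(f"{base}/api/tags", timeout=2).json()
        return [m["name"] for m in tags.get("models", [])]
    except requests.RequestException:
        return []  # local server not running


def list_models(provider: str) -> list[str]:
    if provider == "ollama":
        # Prefer live discovery, fall back to the static configuration list
        return list_ollama_models() or STATIC_MODELS.get("ollama", [])
    return STATIC_MODELS.get(provider, [])
```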
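And an illustrative example of the kind of base-URL entries the PR adds; the exact variable names live in `example.env`, so treat these as placeholders:

```
OLLAMA_BASE_URL=http://localhost:11434
LM_STUDIO_BASE_URL=http://localhost:1234/v1
OPEN_ROUTER_BASE_URL=https://openrouter.ai/api/v1
```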
Benefits:
- Unified API: A single, consistent way to interact with a wide range of LLM providers.
- Extensibility: Adding new providers supported by LiteLLM is now trivial and requires no code changes.
- Maintainability: The `models.py` file is significantly cleaner and easier to maintain.
- Enhanced User Experience: The new model picker provides users with more flexibility and control over model selection.
How to Test:
- Ensure you have the new `litellm` dependency installed (`pip install -r requirements.txt`).
- Set up your `.env` file with API keys for the providers you want to test.
- Run the application and test the new model selection UI (if available).
- Verify that chat and embedding functionalities work as expected with different providers (e.g., OpenAI, Anthropic, Ollama).
- Check the server logs for any errors related to model loading or API calls.
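For the chat and embedding checks, a quick smoke test can be run straight from a Python shell; the model names below are examples and assume the matching API key is set in your environment:

```python
# Minimal smoke test: exercises chat and embeddings through litellm directly.
import litellm

resp = litellm.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)

emb = litellm.embedding(model="openai/text-embedding-3-small", input=["hello"])
print(len(emb.data[0]["embedding"]))  # embedding dimension
```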
This PR marks a significant step towards making Agent-Zero more flexible, powerful, and easier to maintain.
I think it is a great idea to use LiteLLM. I have a couple of suggestions:
- Remove unrelated changes from the PR, such as the removal of debug prints.
- Don't change `models.py`; implement the new logic in new files, and add a feature flag to switch between the old and new approaches (one possible shape is sketched below).
- Move the UI-side changes to a separate PR.
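A minimal sketch of the suggested feature flag, assuming a hypothetical `models_litellm` module for the new code path and an environment variable to toggle it:

```python
# Hypothetical feature flag: module and variable names are illustrative.
import os

USE_LITELLM = os.getenv("USE_LITELLM", "false").lower() == "true"


def get_model(*args, **kwargs):
    if USE_LITELLM:
        from models_litellm import get_model as get_model_new  # new LiteLLM path
        return get_model_new(*args, **kwargs)
    from models import get_model as get_model_old  # existing code path
    return get_model_old(*args, **kwargs)
```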
Merged manually.