Unified LLM Provider & Model Discovery with LiteLLM
This pull request introduces a major refactoring of the LLM provider integration by replacing direct API calls with LiteLLM. This change unifies how we interact with different LLM providers, simplifies the codebase, and adds new capabilities for model discovery.
Key Changes:
- LiteLLM Integration:
  - The `models.py` module has been completely refactored to use `litellm` as the single interface for all LLM providers.
  - This removes the need for provider-specific code (e.g., for OpenAI, Anthropic, Groq) and replaces it with a unified `get_model` function.
  - `LiteLLMChatWrapper` and `LiteLLMEmbeddingWrapper` classes have been implemented to ensure seamless compatibility with the existing LangChain-based architecture (a minimal sketch follows this list).
- Dynamic Model Picker:
  - New API endpoints (`/api/models_list` and `/api/models_all`) have been created to dynamically fetch and list available models from various providers.
  - The system can now discover models from:
    - Local installations like Ollama and LM Studio.
    - Cloud providers supported by LiteLLM (OpenAI, Anthropic, Google, etc.).
    - Static configuration lists as a fallback.
  - This feature allows users to easily select and switch between different models without manual configuration (see the discovery sketch after this list).
- Simplified Configuration:
  - The environment variable handling has been updated to support `litellm`'s configuration standards.
  - New variables for custom base URLs have been added to `example.env` for providers like Ollama, LM Studio, and OpenRouter (an illustrative snippet follows).
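To make the wrapper idea concrete, here is a minimal sketch of the shape described above. The class and function names come from this PR's description, but the bodies and signatures are illustrative assumptions, not the actual `models.py` code:

```python
# Illustrative sketch only: names mirror the PR text, bodies are assumptions.
import litellm


class LiteLLMChatWrapper:
    """Minimal chat wrapper that routes every call through litellm."""

    def __init__(self, model: str, **kwargs):
        self.model = model    # e.g. "openai/gpt-4o" or "ollama/llama3"
        self.kwargs = kwargs  # temperature, api_base, etc.

    def invoke(self, messages: list[dict]) -> str:
        # litellm.completion() accepts OpenAI-style message dicts for any provider
        response = litellm.completion(
            model=self.model, messages=messages, **self.kwargs
        )
        return response.choices[0].message.content


def get_model(provider: str, name: str, **kwargs) -> LiteLLMChatWrapper:
    # LiteLLM selects the backend from its "provider/model" prefix convention
    return LiteLLMChatWrapper(model=f"{provider}/{name}", **kwargs)


# The same call shape now works for OpenAI, Anthropic, Ollama, ...
chat = get_model("anthropic", "claude-3-5-sonnet-20240620")
print(chat.invoke([{"role": "user", "content": "Hello!"}]))
```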
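The discovery flow behind the model picker can be pictured as follows. The helper names and the `OLLAMA_BASE_URL` variable are assumptions based on this PR's text, but the `/api/tags` endpoint is the real API that local Ollama installations expose:

```python
# Hedged sketch of the logic behind /api/models_list; helper names and the
# OLLAMA_BASE_URL variable are assumptions, not the actual endpoint code.
import os

import requests

# Static fallback list, used when live discovery is unavailable
STATIC_MODELS = {"ollama": ["llama3", "mistral"]}


def list_ollama_models() -> list[str]:
    # Ollama reports locally installed models via its /api/tags endpoint
    base = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
    try:
        tags = requests.get(f"{base}/api/tags", timeout=2).json()
        return [m["name"] for m in tags.get("models", [])]
    except requests.RequestException:
        return []  # local server not running


def list_models(provider: str) -> list[str]:
    if provider == "ollama":
        # Prefer live discovery, fall back to the static configuration list
        return list_ollama_models() or STATIC_MODELS.get("ollama", [])
    return STATIC_MODELS.get(provider, [])
```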
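And an illustrative example of the kind of base-URL entries the PR adds; the exact variable names live in `example.env`, so treat these as placeholders:

```
OLLAMA_BASE_URL=http://localhost:11434
LM_STUDIO_BASE_URL=http://localhost:1234/v1
OPEN_ROUTER_BASE_URL=https://openrouter.ai/api/v1
```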
Benefits:
- Unified API: A single, consistent way to interact with a wide range of LLM providers.
- Extensibility: Adding new providers supported by LiteLLM is now trivial and requires no code changes.
- Maintainability: The `models.py` file is significantly cleaner and easier to maintain.
- Enhanced User Experience: The new model picker provides users with more flexibility and control over model selection.
How to Test:
- Ensure you have the new `litellm` dependency installed (`pip install -r requirements.txt`).
- Set up your `.env` file with API keys for the providers you want to test.
- Run the application and test the new model selection UI (if available).
- Verify that chat and embedding functionalities work as expected with different providers (e.g., OpenAI, Anthropic, Ollama).
- Check the server logs for any errors related to model loading or API calls.
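For the chat and embedding checks, a quick smoke test can be run straight from a Python shell; the model names below are examples and assume the matching API key is set in your environment:

```python
# Minimal smoke test: exercises chat and embeddings through litellm directly.
import litellm

resp = litellm.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)

emb = litellm.embedding(model="openai/text-embedding-3-small", input=["hello"])
print(len(emb.data[0]["embedding"]))  # embedding dimension
```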
This PR marks a significant step towards making Agent-Zero more flexible, powerful, and easier to maintain.
I think it is a great idea to use LiteLLM. I have a couple of suggestions:
- Remove unrelated changes from the PR, such as the removal of debug prints.
- Don't change `models.py`; implement the new logic in new files, and add a feature flag to switch between the old and new approaches (one possible shape is sketched below).
- Move the UI-side changes to a separate PR.
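A minimal sketch of the suggested feature flag, assuming a hypothetical `models_litellm` module for the new code path and an environment variable to toggle it:

```python
# Hypothetical feature flag: module and variable names are illustrative.
import os

USE_LITELLM = os.getenv("USE_LITELLM", "false").lower() == "true"


def get_model(*args, **kwargs):
    if USE_LITELLM:
        from models_litellm import get_model as get_model_new  # new LiteLLM path
        return get_model_new(*args, **kwargs)
    from models import get_model as get_model_old  # existing code path
    return get_model_old(*args, **kwargs)
```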
Merged manually.