
Error using WatsonX LiteLLM with Holmes

Open · ssthom opened this issue 1 year ago • 3 comments

I'm attempting to use a WatsonX.AI LLM with Holmes. WatsonX is supported by LiteLLM (https://docs.litellm.ai/docs/providers/watsonx), and following the reference in the LiteLLM docs I was able to get it working:

WATSONX_URL=https://us-south.ml.cloud.ibm.com
WATSONX_APIKEY=<redacted>
WATSONX_PROJECT_ID=<redacted>
WATSONX_DEPLOYMENT_SPACE_ID=<redacted>

from litellm import completion

response = completion(
  model="watsonx/ibm/granite-13b-chat-v2",
  messages=[{ "content": "what is your favorite colour?","role": "user"}]
)
print(response)

response = completion(
  model="watsonx/meta-llama/llama-3-8b-instruct",
  messages=[{ "content": "what is your favorite colour?","role": "user"}]
)
print(response)

Response:

$ python3.12 litellm.py

ModelResponse(id='chatcmpl-06cc9bb3-2fc7-467e-9594-40d354613f0a', choices=[Choices(finish_reason='length', index=0, message=Message(content="\nI don't have a favorite color. Why do you ask?\n\nI don't have", role='assistant', tool_calls=None, function_call=None))], created=1731008970, model='ibm/granite-13b-chat-v2', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=20, prompt_tokens=10, total_tokens=30, completion_tokens_details=None, prompt_tokens_details=None))

ModelResponse(id='chatcmpl-ab3c4c56-623f-4a5d-becc-3b46e8efafb2', choices=[Choices(finish_reason='length', index=0, message=Message(content="I'm just an AI, I don't have personal preferences, including favorite colors. I can provide", role='assistant', tool_calls=None, function_call=None))], created=1731008972, model='meta-llama/llama-3-8b-instruct', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=20, prompt_tokens=16, total_tokens=36, completion_tokens_details=None, prompt_tokens_details=None))

But when I try the same model with Holmes, I get an error: Exception: model watsonx/ibm/granite-13b-chat-v2 requires the following environment variables: []. Nothing is listed in the array, so I'm not sure what I'm missing. Could I get some help with this error?

$ holmes ask "what pods are in crashloopbackoff in my cluster and why?" -vvv --model=watsonx/ibm/granite-13b-chat-v2

Starting AI session with tools: ['kubectl_describe', 'kubectl_get', 'kubectl_get_all', 'kubectl_find_resource', 'kubectl_get_yaml', 'kubectl_previous_logs', 'kubectl_logs',         config.py:139
'kubectl_top_pods', 'kubectl_top_nodes', 'get_prometheus_target', 'kubectl_lineage_children', 'kubectl_lineage_parents', 'helm_list', 'helm_values', 'helm_status', 'helm_history',
'helm_manifest', 'helm_hooks', 'helm_chart', 'helm_notes']
Checking LiteLLM model watsonx/ibm/granite-13b-chat-v2                                                                                                                                   llm.py:65
╭────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ───────────────────────────────────────────────────────────────────────────────╮
│ in ask:275                                                                                                                                                                                     │
│                                                                                                                                                                                                │
│ in create_toolcalling_llm:151                                                                                                                                                                  │
│                                                                                                                                                                                                │
│ in _get_llm:273                                                                                                                                                                                │
│                                                                                                                                                                                                │
│ in __init__:62                                                                                                                                                                                 │
│                                                                                                                                                                                                │
│ in check_llm:79                                                                                                                                                                                │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: model watsonx/ibm/granite-13b-chat-v2 requires the following environment variables: []
[PYI-65847:ERROR] Failed to execute script 'holmes' due to unhandled exception

ssthom · Nov 07 '24 19:11

Hi, we're looking into this! Will update.

aantn · Nov 10 '24 13:11

@ssthom The issue you encountered comes from litellm's validate_environment method, which does not currently support WatsonX (mentioned here: https://github.com/BerriAI/litellm/issues/6664). I've updated Holmes to check the required env vars listed in litellm's documentation (https://docs.litellm.ai/docs/providers/watsonx) and updated litellm to the latest version.
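If you want to see the underlying behavior for yourself, here's a minimal sketch that calls litellm's validate_environment directly; on litellm versions affected by the issue above, it reports no missing keys for watsonx models even when no WATSONX_* variables are set (the printed output is an example of what you'd likely see, not a guaranteed result):

import litellm

# On affected litellm versions, validate_environment has no WatsonX
# entry, so missing_keys comes back empty regardless of what is set
# in the environment.
result = litellm.validate_environment(model="watsonx/ibm/granite-13b-chat-v2")
print(result)
# e.g. {'keys_in_environment': False, 'missing_keys': []}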

It’s important to note that not all WatsonX models listed in litellm's docs (https://docs.litellm.ai/docs/providers/) support the text/chat endpoint used by litellm's completion function. Models that support chat completion are detailed in WatsonX’s docs here: https://www.ibm.com/docs/en/watsonx/saas?topic=solutions-adding-generative-chat-your-apps.
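To check ahead of time how litellm classifies a given model, one option is litellm's get_model_info utility; a rough sketch (hedged: it only covers models already in litellm's model map and raises for unmapped ones):

import litellm

# "mode" indicates which endpoint litellm will use for the model;
# models missing from litellm's model map raise an exception instead.
try:
    info = litellm.get_model_info("watsonx/ibm/granite-13b-chat-v2")
    print(info.get("mode"))  # "chat" means it works with completion()'s chat flow
except Exception as exc:
    print(f"not in litellm's model map: {exc}")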

So the error should be resolved in the next Holmes release, once this PR is merged: https://github.com/robusta-dev/holmesgpt/pull/200
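For illustration only, the fix amounts to a provider-specific env check along these lines; the names below (check_watsonx_env, REQUIRED_WATSONX_ENVS) are hypothetical and not taken from the Holmes codebase:

import os

# Hypothetical sketch: verify the WATSONX_* variables from litellm's
# WatsonX docs directly, instead of trusting validate_environment.
REQUIRED_WATSONX_ENVS = ["WATSONX_URL", "WATSONX_APIKEY", "WATSONX_PROJECT_ID"]

def check_watsonx_env(model: str) -> None:
    if model.startswith("watsonx/"):
        missing = [v for v in REQUIRED_WATSONX_ENVS if v not in os.environ]
        if missing:
            raise Exception(f"model {model} requires the following environment variables: {missing}")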

itisallgood · Nov 15 '24 17:11

Hi @ssthom, does the fix work for you?

aantn · Dec 03 '24 06:12