[Bug]: Incompatibilities with OpenTelemetry LLM semantics pending release
What happened?
I work on the OpenTelemetry LLM semantics SIG and did an evaluation of this SDK based on the following sample code and what the pending 1.27.0 release of the semantic conventions will define.
Note: I'm doing this unsolicited across the various Python instrumentations for OpenAI, so this is not a specific call-out that LiteLLM is notably different here. I wanted to warn you about some drift so that, ideally, you'll be in a position to adjust once the release occurs, or to clarify if that's not a goal. You're welcome to join the #otel-llm-semconv-wg Slack channel and any SIG meetings if you find this relevant!
Sample code
```python
import os

from openai import OpenAI


def main():
    base_url = os.getenv('LITELLM_API_BASE', 'http://localhost:4000')
    client = OpenAI(base_url=base_url, api_key='unused')

    messages = [
        {
            'role': 'user',
            'content': '<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>',
        },
    ]

    chat_completion = client.chat.completions.create(model='ollama/codegemma:2b-code', messages=messages)
    print(chat_completion.choices[0].message.content)


if __name__ == '__main__':
    main()
```
proxy config.yaml
```yaml
model_list:
  - model_name: ollama/codegemma:2b-code
    litellm_params:
      model: ollama/codegemma:2b-code
      api_base: os.environ/OLLAMA_API_BASE

litellm_settings:
  drop_params: true
  callbacks: ["otel"]

environment_variables:
  OTEL_SERVICE_NAME: litellm-proxy-otel-ollama
  OTEL_EXPORTER: otlp_http
```
Evaluation
The root span (Span #2, Received Proxy Server Request) and the raw_gen_ai_request child span (Span #0) are not evaluated, as the conventions currently only provide guidance for one type of LLM span.
Semantic evaluation on spans:
compatible:
- attributes['gen_ai.system']='ollama'
- attributes['gen_ai.request.model']='codegemma:2b-code'
- attributes['gen_ai.response.id']='chatcmpl-596ade39-e659-479e-b9a1-0db80996e159'
- attributes['gen_ai.response.model']='ollama/codegemma:2b-code'
missing:
- attributes['gen_ai.operation.name']='chat'
incompatible:
- kind=internal (should be client)
- name=litellm_request (should be 'chat codegemma:2b-code')
- attributes['SpanAttributes.LLM_PROMPTS.0.role']: 'user' (should be embedded in the event attribute 'gen_ai.prompt')
- attributes['SpanAttributes.LLM_PROMPTS.0.content']: '<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>' (should be embedded in the event attribute 'gen_ai.prompt')
- attributes['SpanAttributes.LLM_COMPLETIONS.0.finish_reason']: 'stop' (should be index zero of 'gen_ai.response.finish_reasons')
- attributes['SpanAttributes.LLM_COMPLETIONS.0.role']: 'assistant' (should be embedded in the event attribute 'gen_ai.completion')
- attributes['SpanAttributes.LLM_COMPLETIONS.0.content']: 'def hello_world():' (should be embedded in the event attribute 'gen_ai.completion')
- attributes['gen_ai.usage.completion_tokens']=13 (should be 'gen_ai.usage.output_tokens')
- attributes['gen_ai.usage.prompt_tokens']=29 (should be 'gen_ai.usage.input_tokens')
not yet defined in the standard:
- attributes['llm.is_streaming']=false
- attributes['llm.usage.total_tokens']=42
vendor specific:
- attributes['metadata.user_api_key']='unused'
- attributes['metadata.litellm_api_version']='1.41.28'
- attributes['metadata.user_api_key_spend']=0
- attributes['metadata.endpoint']='http://localhost:4000/chat/completions'
- attributes['metadata.requester_ip_address']=''
- attributes['metadata.model_group']='ollama/codegemma:2b-code'
- attributes['metadata.deployment']='ollama/codegemma:2b-code'
- attributes['metadata.api_base']='http://ollama:11434'
Semantic evaluation on metrics:
N/A as no metrics are currently recorded
Relevant log output
otel-collector | 2024-07-24T11:00:39.173Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 3}
otel-collector | 2024-07-24T11:00:39.173Z info ResourceSpans #0
otel-collector | Resource SchemaURL:
otel-collector | Resource attributes:
otel-collector | -> service.name: Str(litellm-proxy-otel-ollama)
otel-collector | -> deployment.environment: Str(production)
otel-collector | -> model_id: Str(litellm-proxy-otel-ollama)
otel-collector | ScopeSpans #0
otel-collector | ScopeSpans SchemaURL:
otel-collector | InstrumentationScope litellm
otel-collector | Span #0
otel-collector | Trace ID : 4834916375d15d3b328cf52578f5de2b
otel-collector | Parent ID : 613b6d09a3915f22
otel-collector | ID : b6c19a0d5f59038b
otel-collector | Name : raw_gen_ai_request
otel-collector | Kind : Internal
otel-collector | Start time : 2024-07-24 11:00:36.983702016 +0000 UTC
otel-collector | End time : 2024-07-24 11:00:37.602364928 +0000 UTC
otel-collector | Status code : Ok
otel-collector | Status message :
otel-collector | Span #1
otel-collector | Trace ID : 4834916375d15d3b328cf52578f5de2b
otel-collector | Parent ID : 25e7762a4271e069
otel-collector | ID : 613b6d09a3915f22
otel-collector | Name : litellm_request
otel-collector | Kind : Internal
otel-collector | Start time : 2024-07-24 11:00:36.983702016 +0000 UTC
otel-collector | End time : 2024-07-24 11:00:37.602364928 +0000 UTC
otel-collector | Status code : Ok
otel-collector | Status message :
otel-collector | Attributes:
otel-collector | -> metadata.user_api_key: Str(unused)
otel-collector | -> metadata.litellm_api_version: Str(1.41.28)
otel-collector | -> metadata.user_api_key_spend: Double(0)
otel-collector | -> metadata.endpoint: Str(http://localhost:4000/chat/completions)
otel-collector | -> metadata.requester_ip_address: Str()
otel-collector | -> metadata.model_group: Str(ollama/codegemma:2b-code)
otel-collector | -> metadata.deployment: Str(ollama/codegemma:2b-code)
otel-collector | -> metadata.api_base: Str(http://ollama:11434)
otel-collector | -> gen_ai.request.model: Str(codegemma:2b-code)
otel-collector | -> gen_ai.system: Str(ollama)
otel-collector | -> llm.is_streaming: Str(False)
otel-collector | -> SpanAttributes.LLM_PROMPTS.0.role: Str(user)
otel-collector | -> SpanAttributes.LLM_PROMPTS.0.content: Str(<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>)
otel-collector | -> SpanAttributes.LLM_COMPLETIONS.0.finish_reason: Str(stop)
otel-collector | -> SpanAttributes.LLM_COMPLETIONS.0.role: Str(assistant)
otel-collector | -> SpanAttributes.LLM_COMPLETIONS.0.content: Str(def hello_world():)
otel-collector | -> gen_ai.response.id: Str(chatcmpl-596ade39-e659-479e-b9a1-0db80996e159)
otel-collector | -> gen_ai.response.model: Str(ollama/codegemma:2b-code)
otel-collector | -> llm.usage.total_tokens: Int(40)
otel-collector | -> gen_ai.usage.completion_tokens: Int(11)
otel-collector | -> gen_ai.usage.prompt_tokens: Int(29)
otel-collector | Span #2
otel-collector | Trace ID : 4834916375d15d3b328cf52578f5de2b
otel-collector | Parent ID :
otel-collector | ID : 25e7762a4271e069
otel-collector | Name : Received Proxy Server Request
otel-collector | Kind : Internal
otel-collector | Start time : 2024-07-24 11:00:36.98197504 +0000 UTC
otel-collector | End time : 2024-07-24 11:00:37.603086848 +0000 UTC
otel-collector | Status code : Unset
otel-collector | Status message :
otel-collector | {"kind": "exporter", "data_type": "traces", "name": "debug"}
Twitter / LinkedIn details
No response