[Bug]: Incompatibilities with OpenTelemetry LLM semantics pending release
What happened?
I work on the OpenTelemetry LLM semantics SIG and did an evaluation of this SDK based on the following sample code and what the pending 1.27.0 release of the semantic conventions will define.
Note: I'm doing this unsolicited across the various Python instrumentations for OpenAI, so this is not a specific call-out that LiteLLM is notably different here. I wanted to warn you about some drift so that, ideally, you'll be in a position to adjust once the release occurs, or to clarify if that's not a goal. You're welcome to join the #otel-llm-semconv-wg Slack channel and any SIG meetings if you find this relevant!
Sample code
```python
import os

from openai import OpenAI


def main():
    base_url = os.getenv('LITELLM_API_BASE', 'http://localhost:4000')
    client = OpenAI(base_url=base_url, api_key='unused')

    messages = [
        {
            'role': 'user',
            'content': '<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>',
        },
    ]

    chat_completion = client.chat.completions.create(model='ollama/codegemma:2b-code', messages=messages)
    print(chat_completion.choices[0].message.content)


if __name__ == '__main__':
    main()
```
proxy config.yaml
```yaml
model_list:
  - model_name: ollama/codegemma:2b-code
    litellm_params:
      model: ollama/codegemma:2b-code
      api_base: os.environ/OLLAMA_API_BASE

litellm_settings:
  drop_params: true
  callbacks: ["otel"]

environment_variables:
  OTEL_SERVICE_NAME: litellm-proxy-otel-ollama
  OTEL_EXPORTER: otlp_http
```
Evaluation
The root span (Span #2, Received Proxy Server Request) and the raw_gen_ai_request child span (Span #0) are not evaluated, as the conventions currently only provide guidance for one type of LLM span.
Semantic evaluation on spans:
compatible:
- attributes['gen_ai.system']='ollama'
- attributes['gen_ai.request.model']='codegemma:2b-code'
- attributes['gen_ai.response.id']='chatcmpl-596ade39-e659-479e-b9a1-0db80996e159'
- attributes['gen_ai.response.model']='ollama/codegemma:2b-code'
missing:
- attributes['gen_ai.operation.name']='chat'
incompatible:
- kind=internal (should be client)
- name=litellm_request (should be 'chat codegemma:2b-code')
- attributes['SpanAttributes.LLM_PROMPTS.0.role']: 'user' (should be embedded in the event attribute 'gen_ai.prompt')
- attributes['SpanAttributes.LLM_PROMPTS.0.content']: '<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>' (should be embedded in the event attribute 'gen_ai.prompt')
- attributes['SpanAttributes.LLM_COMPLETIONS.0.finish_reason']: 'stop' (should be index zero of 'gen_ai.response.finish_reasons')
- attributes['SpanAttributes.LLM_COMPLETIONS.0.role']: 'assistant' (should be embedded in the event attribute 'gen_ai.completion')
- attributes['SpanAttributes.LLM_COMPLETIONS.0.content']: 'def hello_world():' (should be embedded in the event attribute 'gen_ai.completion')
- attributes['gen_ai.usage.completion_tokens']=13 (should be 'gen_ai.usage.output_tokens')
- attributes['gen_ai.usage.prompt_tokens']=29 (should be 'gen_ai.usage.input_tokens')
not yet defined in the standard:
- attributes['llm.is_streaming']=false
- attributes['llm.usage.total_tokens']=42
vendor specific:
- attributes['metadata.user_api_key']='unused'
- attributes['metadata.litellm_api_version']='1.41.28'
- attributes['metadata.user_api_key_spend']=0
- attributes['metadata.endpoint']='http://localhost:4000/chat/completions'
- attributes['metadata.requester_ip_address']=''
- attributes['metadata.model_group']='ollama/codegemma:2b-code'
- attributes['metadata.deployment']='ollama/codegemma:2b-code'
- attributes['metadata.api_base']='http://ollama:11434'
Semantic evaluation on metrics:
N/A as no metrics are currently recorded
Relevant log output
otel-collector | 2024-07-24T11:00:39.173Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 3}
otel-collector | 2024-07-24T11:00:39.173Z info ResourceSpans #0
otel-collector | Resource SchemaURL:
otel-collector | Resource attributes:
otel-collector | -> service.name: Str(litellm-proxy-otel-ollama)
otel-collector | -> deployment.environment: Str(production)
otel-collector | -> model_id: Str(litellm-proxy-otel-ollama)
otel-collector | ScopeSpans #0
otel-collector | ScopeSpans SchemaURL:
otel-collector | InstrumentationScope litellm
otel-collector | Span #0
otel-collector | Trace ID : 4834916375d15d3b328cf52578f5de2b
otel-collector | Parent ID : 613b6d09a3915f22
otel-collector | ID : b6c19a0d5f59038b
otel-collector | Name : raw_gen_ai_request
otel-collector | Kind : Internal
otel-collector | Start time : 2024-07-24 11:00:36.983702016 +0000 UTC
otel-collector | End time : 2024-07-24 11:00:37.602364928 +0000 UTC
otel-collector | Status code : Ok
otel-collector | Status message :
otel-collector | Span #1
otel-collector | Trace ID : 4834916375d15d3b328cf52578f5de2b
otel-collector | Parent ID : 25e7762a4271e069
otel-collector | ID : 613b6d09a3915f22
otel-collector | Name : litellm_request
otel-collector | Kind : Internal
otel-collector | Start time : 2024-07-24 11:00:36.983702016 +0000 UTC
otel-collector | End time : 2024-07-24 11:00:37.602364928 +0000 UTC
otel-collector | Status code : Ok
otel-collector | Status message :
otel-collector | Attributes:
otel-collector | -> metadata.user_api_key: Str(unused)
otel-collector | -> metadata.litellm_api_version: Str(1.41.28)
otel-collector | -> metadata.user_api_key_spend: Double(0)
otel-collector | -> metadata.endpoint: Str(http://localhost:4000/chat/completions)
otel-collector | -> metadata.requester_ip_address: Str()
otel-collector | -> metadata.model_group: Str(ollama/codegemma:2b-code)
otel-collector | -> metadata.deployment: Str(ollama/codegemma:2b-code)
otel-collector | -> metadata.api_base: Str(http://ollama:11434)
otel-collector | -> gen_ai.request.model: Str(codegemma:2b-code)
otel-collector | -> gen_ai.system: Str(ollama)
otel-collector | -> llm.is_streaming: Str(False)
otel-collector | -> SpanAttributes.LLM_PROMPTS.0.role: Str(user)
otel-collector | -> SpanAttributes.LLM_PROMPTS.0.content: Str(<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>)
otel-collector | -> SpanAttributes.LLM_COMPLETIONS.0.finish_reason: Str(stop)
otel-collector | -> SpanAttributes.LLM_COMPLETIONS.0.role: Str(assistant)
otel-collector | -> SpanAttributes.LLM_COMPLETIONS.0.content: Str(def hello_world():)
otel-collector | -> gen_ai.response.id: Str(chatcmpl-596ade39-e659-479e-b9a1-0db80996e159)
otel-collector | -> gen_ai.response.model: Str(ollama/codegemma:2b-code)
otel-collector | -> llm.usage.total_tokens: Int(40)
otel-collector | -> gen_ai.usage.completion_tokens: Int(11)
otel-collector | -> gen_ai.usage.prompt_tokens: Int(29)
otel-collector | Span #2
otel-collector | Trace ID : 4834916375d15d3b328cf52578f5de2b
otel-collector | Parent ID :
otel-collector | ID : 25e7762a4271e069
otel-collector | Name : Received Proxy Server Request
otel-collector | Kind : Internal
otel-collector | Start time : 2024-07-24 11:00:36.98197504 +0000 UTC
otel-collector | End time : 2024-07-24 11:00:37.603086848 +0000 UTC
otel-collector | Status code : Unset
otel-collector | Status message :
otel-collector | {"kind": "exporter", "data_type": "traces", "name": "debug"}
Twitter / LinkedIn details
No response