add OpenAI responses support
Summary
This PR adds instrumentation support for the OpenAI Responses API (structured outputs) to the opentelemetry-instrumentation-openai-v2 library, following the same monkeypatching pattern used for chat completions.
Background
The OpenAI SDK introduced the Responses API (`client.responses.create`) for structured outputs in version 1.66.0. This API was not previously instrumented, meaning calls to it would not generate telemetry data (spans, logs, or metrics).
Changes
This PR instruments both synchronous and asynchronous versions of the Responses API:
```python
from openai import OpenAI

from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

OpenAIInstrumentor().instrument()

client = OpenAI()

# Now automatically instrumented!
response = client.responses.create(
    model="gpt-4o-mini",
    input="Write a short poem on open telemetry.",
)

conversation = client.conversations.create()
items = client.conversations.items.list(conversation_id=conversation.id)

# Print all the conversation items
for item in items:
    print(item)
```
Implementation Details
Version Checking:
- Added `_is_responses_api_supported()` function to detect whether the OpenAI SDK is >= 1.66.0
- Instrumentation only wraps the responses API when a supported version is detected
- Chat completions instrumentation is always enabled (no version requirement)
- Uses `packaging.version` for reliable version comparison
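The version gate described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper name comes from this description, and the real implementation reads the installed SDK's version rather than taking it as a parameter.

```python
from packaging import version

# Minimum OpenAI SDK version that ships the Responses API (per this PR).
_RESPONSES_API_MIN_VERSION = version.parse("1.66.0")


def _is_responses_api_supported(sdk_version: str) -> bool:
    """Return True when the given OpenAI SDK version ships the Responses API."""
    # Taking the version string as a parameter keeps this sketch
    # self-contained; the real helper inspects openai.__version__.
    return version.parse(sdk_version) >= _RESPONSES_API_MIN_VERSION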
New wrapper functions in `patch.py`:
- `responses_create()` - Wraps the synchronous `Responses.create` method
- `async_responses_create()` - Wraps the asynchronous `AsyncResponses.create` method
- `_set_responses_attributes()` - Sets span attributes for responses
- `_record_responses_metrics()` - Records metrics for responses API calls
Instrumentation hooks in `__init__.py`:
- Added conditional `wrap_function_wrapper` calls for `openai.resources.responses.responses.Responses.create`
- Added conditional `wrap_function_wrapper` calls for `openai.resources.responses.responses.AsyncResponses.create`
- Added corresponding conditional `unwrap` calls in the `_uninstrument()` method
Telemetry Captured
The instrumentation captures (when the responses API is available):
- Spans with attributes including operation name, model, response ID, service tier, and token usage
- Span events for input/output messages (when `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true`)
- Metrics for operation duration and token usage (input/output tokens)
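The content-capture toggle mentioned above is opt-in via an environment variable; a typical way to enable it:

```shell
# Opt in to recording input/output message content on span events.
# Content capture is typically off by default to avoid emitting
# potentially sensitive prompt and completion text.
export OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true
```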
Tests
Added comprehensive test coverage with version-aware skipping:
- `test_responses.py` - Tests for the synchronous responses API with/without content capture (skipped if OpenAI < 1.66.0)
- `test_async_responses.py` - Tests for the asynchronous responses API with/without content capture (skipped if OpenAI < 1.66.0)
- `test_conversations.py` - Tests for the synchronous conversations API with/without content capture (skipped if OpenAI < 1.101.0)
- `test_async_conversations.py` - Tests for the asynchronous conversations API with/without content capture (skipped if OpenAI < 1.101.0)
Documentation
Updated documentation to include responses API examples:
- `README.rst` - Added usage example showing both the chat completions and responses APIs
- Module docstring in `__init__.py` - Added responses API example
Bug Fixes
- Fixed ChatCompletion imports to use `openai.types.chat` instead of `openai.resources.chat.completions`
Testing
Verified that:
- All methods are correctly wrapped after instrumentation
- All methods are correctly unwrapped after uninstrumentation
- Spans capture correct attributes (model, tokens, service tier)
- Events capture input/output based on content capture setting
- Metrics are recorded for duration and token usage
- Implementation follows existing code patterns and style
- Version checking correctly detects supported/unsupported OpenAI versions (1.66.0 threshold)
- Tests are automatically skipped when OpenAI version doesn't support responses API
- ChatCompletion imports are correct and use the proper type location
Compatibility
- OpenAI SDK: >= 1.26.0 (minimum version), >= 1.66.0 (for responses API support), >= 1.101.0 (for conversations API support)
- Python: >= 3.9
- OpenTelemetry API: ~= 1.37
Backward Compatibility
This implementation maintains full backward compatibility. Users with OpenAI SDK versions < 1.66.0 will continue to have chat completions instrumented while responses API instrumentation is gracefully skipped.
Original prompt
Add support to the openai v2 instrumentation library for the openai responses API. Use the same pattern (monkeypatching) as is done for the chat completions.