Feature: Support Anthropic's Tools
Which component is this feature for?
Anthropic Instrumentation
Feature description
Anthropic released support for calling tools with Claude (https://docs.anthropic.com/claude/docs/tool-use). We should capture this in spans, similar to the OpenAI instrumentation.
Why is this feature needed?
Completeness of Anthropic instrumentation
How do you aim to achieve this?
Similar to this: https://github.com/traceloop/openllmetry/blob/9c32de87ddde8c16d653efcf0f23308acd661fb7/packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/init.py#L88
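A minimal sketch of what this could look like, assuming each tool definition is flattened into indexed span attributes. The `llm.request.functions` prefix and the `tools_to_attributes` helper are assumptions made for illustration; the names actually used by the instrumentation may differ.

```python
import json
from typing import Any, Dict, List

# Assumed attribute prefix, modeled on the OpenAI instrumentation linked above.
# The name used by the Anthropic instrumentation may differ.
TOOLS_PREFIX = "llm.request.functions"


def tools_to_attributes(tools: List[Dict[str, Any]]) -> Dict[str, str]:
    """Flatten Anthropic tool definitions into span-attribute key/value pairs."""
    attributes: Dict[str, str] = {}
    for i, tool in enumerate(tools):
        prefix = f"{TOOLS_PREFIX}.{i}"
        attributes[f"{prefix}.name"] = tool.get("name", "")
        attributes[f"{prefix}.description"] = tool.get("description", "")
        # Span attribute values must be primitives, so the JSON schema is serialized.
        attributes[f"{prefix}.input_schema"] = json.dumps(tool.get("input_schema", {}))
    return attributes


# Illustrative tool definition in the shape the Anthropic API expects.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }
]
attrs = tools_to_attributes(tools)
# With a real span: for key, value in attrs.items(): span.set_attribute(key, value)
```

Returning a plain dict keeps the flattening logic testable without a tracer; the caller decides whether and where to set the attributes.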
Additional Information
No response
Have you spent some time to check if this feature request has been raised before?
- [X] I checked and didn't find a similar issue
Are you willing to submit PR?
None
@nirga is anyone working on this? If not, can I take a shot at it?
@peachypeachyy no one is working on it, go ahead!
Going to try this task
Hey @apuwk I'm already working on this task
Gotcha, no worries. Missed that.
Hey @nirga. I am working on this and have some code ready. I need your help to clarify a few points:
- To test whether my code works correctly, I am thinking of creating a sample application using the format in [1]. Is this the correct approach?
- If I need to create a sample application, do I use `os.environ["ANTHROPIC_API_KEY"]` and make an API call to Anthropic, or do we have another mechanism? I read somewhere about using `vcr.py`, but I'm not sure how that works.
- I created a feature branch in my local repo. Should I open a pull request against the main branch of traceloop/openllmetry, or against another branch?
[1]
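from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
messages = [{}, {}, {}]  # rough structure
tools = [{}, {}, {}]  # rough structure
client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,  # required by the Messages API
    tools=tools,
    messages=messages,
)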
Hey @peachypeachyy thanks for the update!
- You can, but ideally we'll also want to add a proper test under the anthropic instrumentation package.
- vcr.py is what we use to record the HTTP calls when we run our tests. Basically, when you set `ANTHROPIC_API_KEY` locally and run `poetry run pytest --record-mode=once -vv`, it records the calls you make to Anthropic so that the next time the test can run against these pre-recorded calls.
- Yes, that's perfect!
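To make the record-then-replay idea concrete, here is a simplified, standard-library illustration. It is not how vcr.py is implemented: `replay_or_record`, the JSON "cassette" file, and the request keys are all hypothetical. It mimics the behaviour of the `once` record mode: record when no cassette exists, otherwise replay and reject unknown requests.

```python
import json
import os
import tempfile
from typing import Any, Callable


def replay_or_record(cassette_path: str, key: str, make_call: Callable[[], Any]) -> Any:
    """Replay the response stored for `key`, or record it if no cassette exists yet."""
    if os.path.exists(cassette_path):
        with open(cassette_path, "r", encoding="utf-8") as fh:
            cassette = json.load(fh)
        if key not in cassette:
            # A cassette exists but does not contain this request: refuse to hit the network.
            raise LookupError(f"no recorded response for {key!r}")
        return cassette[key]  # replay: the real call is never made
    response = make_call()  # record: make the real call once
    with open(cassette_path, "w", encoding="utf-8") as fh:
        json.dump({key: response}, fh)
    return response


calls = []


def fake_api_call():
    # Stand-in for a network request to the Anthropic API.
    calls.append(1)
    return {"text": "hello"}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cassette.json")
    first = replay_or_record(path, "POST /v1/messages", fake_api_call)   # recorded
    second = replay_or_record(path, "POST /v1/messages", fake_api_call)  # replayed
    try:
        replay_or_record(path, "GET /other", fake_api_call)
        unknown_rejected = False
    except LookupError:
        unknown_rejected = True
```

The second call returns the stored response without invoking `fake_api_call` again, which is why a recorded test can run without an API key.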
Added code against PR #1371
@nirga I have made some changes: I wrote the set_tools_attributes function and call it from the _set_input_attributes function in packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py.
I also wrote a basic test in the tests folder to check that the code works, as referenced in #1371.
However, I am getting the following error:
E TypeError: Messages.create() got an unexpected keyword argument 'tools'
Following is the full trace with the versions of packages being used:
sid@MSI:~/openllmetry/packages/opentelemetry-instrumentation-anthropic$ poetry show && poetry run pytest -s --record-mode=once -vv tests/
annotated-types 0.6.0 Reusable constraint types to use with typing.Annotated
anthropic 0.25.6 The official Python library for the anthropic API
anyio 4.3.0 High level compatibility layer for multiple asynchronous event loop implementations
autopep8 2.1.1 A tool that automatically formats Python code to conform to the PEP 8 style guide
certifi 2024.2.2 Python package for providing Mozilla's CA Bundle.
charset-normalizer 3.3.2 The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
deprecated 1.2.14 Python @deprecated decorator to deprecate old python classes, functions or methods.
distro 1.9.0 Distro - an OS platform information API
exceptiongroup 1.2.0 Backport of PEP 654 (exception groups)
filelock 3.13.4 A platform independent file lock.
flake8 7.0.0 the modular source code checker: pep8 pyflakes and co
fsspec 2024.3.1 File-system specification
h11 0.14.0 A pure-Python, bring-your-own-I/O implementation of HTTP/1.1
httpcore 1.0.5 A minimal low-level HTTP client.
httpx 0.27.0 The next generation HTTP client.
huggingface-hub 0.22.2 Client library to download and publish models, datasets and other repos on the huggingface.co hub
idna 3.7 Internationalized Domain Names in Applications (IDNA)
importlib-metadata 7.0.0 Read metadata from Python packages
iniconfig 2.0.0 brain-dead simple config-ini parsing
mccabe 0.7.0 McCabe checker, plugin for flake8
multidict 6.0.5 multidict implementation
opentelemetry-api 1.25.0 OpenTelemetry Python API
opentelemetry-instrumentation 0.46b0 Instrumentation Tools & Auto Instrumentation for OpenTelemetry Python
opentelemetry-sdk 1.25.0 OpenTelemetry Python SDK
opentelemetry-semantic-conventions 0.46b0 OpenTelemetry Semantic Conventions
opentelemetry-semantic-conventions-ai 0.2.0 OpenTelemetry Semantic Conventions Extension for Large Language Models
packaging 24.0 Core utilities for Python packages
pluggy 1.4.0 plugin and hook calling mechanisms for python
pycodestyle 2.11.1 Python style guide checker
pydantic 2.6.4 Data validation using Python type hints
pydantic-core 2.16.3
pyflakes 3.2.0 passive checker of Python programs
pytest 8.1.1 pytest: simple powerful testing with Python
pytest-asyncio 0.23.6 Pytest support for asyncio
pytest-recording 0.13.1 A pytest plugin that allows you recording of network interactions via VCR.py
pytest-sugar 1.0.0 pytest-sugar is a plugin for pytest that changes the default look and feel of pytest (e.g. progressbar, show tests that fail instantly).
pyyaml 6.0.1 YAML parser and emitter for Python
requests 2.32.0 Python HTTP for Humans.
setuptools 69.2.0 Easily download, build, install, upgrade, and uninstall Python packages
sniffio 1.3.1 Sniff out which async library your code is running under
termcolor 2.4.0 ANSI color formatting for output in terminal
tokenizers 0.15.2
tomli 2.0.1 A lil' TOML parser
tqdm 4.66.3 Fast, Extensible Progress Meter
typing-extensions 4.11.0 Backported and Experimental Type Hints for Python 3.8+
urllib3 2.2.1 HTTP library with thread-safe connection pooling, file post, and more.
vcrpy 6.0.1 Automatically mock your HTTP interactions to simplify and speed up testing
wrapt 1.16.0 Module for decorators, wrappers and monkey patching.
yarl 1.9.4 Yet another URL library
zipp 3.18.1 Backport of pathlib-compatible object wrapper for zip files
Test session starts (platform: linux, Python 3.10.14, pytest 8.1.1, pytest-sugar 1.0.0)
cachedir: .pytest_cache
rootdir: /home/sid/openllmetry/packages/opentelemetry-instrumentation-anthropic
configfile: pyproject.toml
plugins: sugar-1.0.0, recording-0.13.1, asyncio-0.23.6, anyio-4.3.0
asyncio: mode=strict
collected 7 items
tests/test_completion.py::test_anthropic_completion ✓ 14%
tests/test_completion.py::test_anthropic_message_create ✓ 29%
tests/test_completion.py::test_anthropic_multi_modal ✓ 43%
tests/test_completion.py::test_anthropic_message_streaming ✓ 57%
tests/test_completion.py::test_async_anthropic_message_create ✓ 71%
tests/test_completion.py::test_async_anthropic_message_streaming ✓ 86%
________________________________________ test_anthropic_tools ________________________________________
exporter = <opentelemetry.sdk.trace.export.in_memory_span_exporter.InMemorySpanExporter object at 0x7fa5cff6f400>, reader = <opentelemetry.sdk.metrics._internal.export.InMemoryMetricReader object at 0x7fa5cff6f640>
    @pytest.mark.vcr
    def test_anthropic_tools(exporter, reader):
        client = Anthropic()
>       client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=1024,
            tools=[
                {
                    "name": "get_weather",
                    "description": "Get the current weather in a given location",
                    "input_schema": {
                        "type": "object",
                        "properties": {
                            "location": {
                                "type": "string",
                                "description": "The city and state, e.g. San Francisco, CA"
                            },
                            "unit": {
                                "type": "string",
                                "enum": ["celsius", "fahrenheit"],
                                "description": "The unit of temperature, either 'celsius' or 'fahrenheit'"
                            }
                        },
                        "required": ["location"]
                    }
                },
                {
                    "name": "get_time",
                    "description": "Get the current time in a given time zone",
                    "input_schema": {
                        "type": "object",
                        "properties": {
                            "timezone": {
                                "type": "string",
                                "description": "The IANA time zone name, e.g. America/Los_Angeles"
                            }
                        },
                        "required": ["timezone"]
                    }
                }
            ],
            messages=[
                {
                    "role": "user",
                    "content": "What is the weather like right now in New York? Also what time is it there?"
                }
            ]
        )
tests/test_completion.py:548:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
opentelemetry/instrumentation/anthropic/__init__.py:377: in wrapper
    return func(
opentelemetry/instrumentation/anthropic/__init__.py:478: in _wrap
    raise e
opentelemetry/instrumentation/anthropic/__init__.py:464: in _wrap
    response = wrapped(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
args = (<anthropic.resources.messages.Messages object at 0x7fa5cf6a27d0>,)
kwargs = {'max_tokens': 1024, 'messages': [{'content': 'What is the weather like right now in New York? Also what time is it th...name, e.g. America/Los_Angeles', 'type': 'string'}}, 'required': ['timezone'], 'type': 'object'}, 'name': 'get_time'}]}
i = 0, _ = <anthropic.resources.messages.Messages object at 0x7fa5cf6a27d0>, key = 'messages', variant = ['max_tokens', 'messages', 'model'], matches = True
    @functools.wraps(func)
    def wrapper(*args: object, **kwargs: object) -> object:
        given_params: set[str] = set()
        for i, _ in enumerate(args):
            try:
                given_params.add(positional[i])
            except IndexError:
                raise TypeError(
                    f"{func.__name__}() takes {len(positional)} argument(s) but {len(args)} were given"
                ) from None
        for key in kwargs.keys():
            given_params.add(key)
        for variant in variants:
            matches = all((param in given_params for param in variant))
            if matches:
                break
        else:  # no break
            if len(variants) > 1:
                variations = human_join(
                    ["(" + human_join([quote(arg) for arg in variant], final="and") + ")" for variant in variants]
                )
                msg = f"Missing required arguments; Expected either {variations} arguments to be given"
            else:
                assert len(variants) > 0
                # TODO: this error message is not deterministic
                missing = list(set(variants[0]) - given_params)
                if len(missing) > 1:
                    msg = f"Missing required arguments: {human_join([quote(arg) for arg in missing])}"
                else:
                    msg = f"Missing required argument: {quote(missing[0])}"
            raise TypeError(msg)
>       return func(*args, **kwargs)
E       TypeError: Messages.create() got an unexpected keyword argument 'tools'
.venv/lib/python3.10/site-packages/anthropic/_utils/_utils.py:277: TypeError
tests/test_completion.py::test_anthropic_tools ⨯ 100%
==================================================================================================== warnings summary ====================================================================================================
.venv/lib/python3.10/site-packages/opentelemetry/instrumentation/dependencies.py:4
/home/sid/openllmetry/packages/opentelemetry-instrumentation-anthropic/.venv/lib/python3.10/site-packages/opentelemetry/instrumentation/dependencies.py:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
from pkg_resources import (
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================================ short test summary info =================================================================================================
FAILED tests/test_completion.py::test_anthropic_tools - TypeError: Messages.create() got an unexpected keyword argument 'tools'
Could you help me understand where the issue is? Am I calling the function from the wrong entry point within _set_input_attributes?
Do I need to update the anthropic package? This run uses anthropic == 0.25.6. However, when I run the code on my base machine (without the packages and pytest from this repo) with anthropic == 0.28.0, I get an output.
@peachypeachyy yes, it looks like you need to upgrade the anthropic package in pyproject.toml
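One way to confirm this kind of mismatch before running the test suite is to inspect the installed method's signature. The sketch below uses only the standard library; `accepts_keyword` and the two stand-in functions are hypothetical and written for illustration.

```python
import inspect
from typing import Callable


def accepts_keyword(func: Callable, name: str) -> bool:
    """Return True if `func` declares `name` as a keyword-capable parameter or takes **kwargs."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-ins for an older and a newer client method (hypothetical signatures).
def create_old(*, model, max_tokens, messages): ...


def create_new(*, model, max_tokens, messages, tools=None): ...


old_ok = accepts_keyword(create_old, "tools")
new_ok = accepts_keyword(create_new, "tools")
```

With the library installed, the same check could be pointed at `Messages.create` from `anthropic.resources.messages` (the class shown in the traceback), and `importlib.metadata.version("anthropic")` reports the installed version. This assumes the method's real signature is visible through its decorators, which `inspect.signature` handles when they use `functools.wraps`.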
@nirga You were right: after changing the version, things are working now.
Submitted PR #1372. Please review.
@nirga My friend Dwisha is working on the sample app for this as well, just FYI
This is the sample app:
import os
import anthropic
from traceloop.sdk import Traceloop

client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
Traceloop.init()

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    tools=[
        {
            "name": "get_property_prices",
            "description": "Get the current average property prices in France",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city, e.g. Paris"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["Euro", "Francs"],
                        "description": "The unit of property prices is either \"Euro\" or \"Francs\""
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[{"role": "user", "content": "What is the current prices in Marseille?"}]
)
print(response)
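Printing the whole response works for a smoke test, but a sample app will usually want just the tool calls. The sketch below assumes the response has been converted to a plain dict (for example via the SDK model's `model_dump()`, which is an assumption here) and that tool calls appear as content blocks with `"type": "tool_use"`. The `extract_tool_calls` helper and the example payload are hypothetical.

```python
from typing import Any, Dict, List


def extract_tool_calls(message: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the tool_use blocks from a Messages API response given as a dict."""
    return [
        {"id": block.get("id"), "name": block.get("name"), "input": block.get("input", {})}
        for block in message.get("content", [])
        if block.get("type") == "tool_use"
    ]


# Hand-written example shaped like a tool-use response; not real API output.
example = {
    "role": "assistant",
    "stop_reason": "tool_use",
    "content": [
        {"type": "text", "text": "Let me look that up."},
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "get_property_prices",
            "input": {"location": "Marseille"},
        },
    ],
}
tool_calls = extract_tool_calls(example)
```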
@nirga Since this requires an updated version of anthropic, I changed the anthropic version in pyproject.toml and ran poetry update anthropic. That updated many files, including opentelemetry-semantic-conventions-ai (0.2.0 -> 0.3.1). Because it also updated many other packages, which may affect other sample apps, I have simply pasted the code here. Let me know if it should go in a separate pull request.
@dwisha-kulkarni the updates are ok, don't worry about them - can you open a separate PR with this code?