openllmetry 🚀 Feature: Support Anthropic's Tools

Which component is this feature for?

Anthropic Instrumentation

🔖 Feature description

Anthropic released support for calling tools with Claude (https://docs.anthropic.com/claude/docs/tool-use). We should support adding this to spans, similar to what we do for OpenAI.

🎤 Why is this feature needed ?

Completeness of Anthropic instrumentation

✌️ How do you aim to achieve this?

Similar to this: https://github.com/traceloop/openllmetry/blob/9c32de87ddde8c16d653efcf0f23308acd661fb7/packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/init.py#L88

🔄️ Additional Information

No response

👀 Have you spent some time to check if this feature request has been raised before?

[X] I checked and didn't find similar issue

Are you willing to submit PR?

None

Apr 22 '24 17:04 nirga

@nirga is anyone working on this? If not, can I take a shot at it?

Apr 29 '24 07:04 peachypeachyy

@peachypeachyy no one is working on it, go ahead! 🙌🏼

Apr 29 '24 07:04 nirga

Going to try this task

May 13 '24 01:05 apuwk

Hey @apuwk I'm already working on this task

May 13 '24 01:05 peachypeachyy

Gotcha, no worries. Missed that.

May 13 '24 04:05 apuwk

Hey @nirga, I am working on this and have some code ready. I need your help to clarify a few things:

1. To test whether my code works correctly, I am thinking of creating a sample application using the format in [1]. Is this the correct approach?
2. If I need to create a sample application, do I use os.environ["ANTHROPIC_API_KEY"] and make an API call to Anthropic, or is there another mechanism? I read somewhere about using vcr.py, but I'm not sure how that works.
3. I created a feature branch in my local repo. Should I open a pull request against the main branch of traceloop/openllmetry, or against another branch?

[1]

messages = [{}, {}, {}]  # rough structure
tools = [{}, {}, {}]  # rough structure
client.messages.create(
    model="claude-3-opus-20240229",
    tools=tools,
    messages=messages,
)

Jun 02 '24 08:06 peachypeachyy

Hey @peachypeachyy thanks for the update!

1. You can, but ideally we'll also want to add a proper test under the anthropic instrumentation package.
2. vcr.py is what we use to record the HTTP calls when we run our tests. Basically, when you set ANTHROPIC_API_KEY locally and run poetry run pytest --record-mode=once -vv, it will record the calls you make to Anthropic so that next time the test can run against these pre-recorded calls.
3. Yes, that's perfect!
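To make the record-then-replay idea concrete, here is a toy sketch using only the standard library. It is not how vcr.py is implemented: the call_with_cassette helper, the JSON cassette layout, and fake_send are all hypothetical names made up for illustration, and the sketch stores a single interaction per cassette. It mimics the "once" behaviour described above: record when no cassette exists, replay when one does.

```python
import json
import tempfile
from pathlib import Path


def call_with_cassette(cassette_path, request, send):
    """Replay a recorded response if the cassette exists, else call `send` and record."""
    path = Path(cassette_path)
    if path.exists():
        # Replay: look the request up in the recording; no network call is made.
        for entry in json.loads(path.read_text()):
            if entry["request"] == request:
                return entry["response"]
        raise LookupError("request not found in existing cassette")
    # First run: perform the real call and write the cassette.
    response = send(request)
    path.write_text(json.dumps([{"request": request, "response": response}], indent=2))
    return response


calls = []


def fake_send(request):
    # Hypothetical stand-in for the real HTTP call to the provider.
    calls.append(request)
    return {"status": 200, "body": "hello"}


cassette = Path(tempfile.mkdtemp()) / "test_anthropic_tools.json"
request = {"method": "POST", "path": "/v1/messages"}

first = call_with_cassette(cassette, request, fake_send)   # records the call
second = call_with_cassette(cassette, request, fake_send)  # replays from disk
```

In the real test suite the cassette handling is done by pytest-recording via the @pytest.mark.vcr marker, so a helper like this would not be written by hand.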

Jun 03 '24 09:06 nirga

Added code against PR #1371

@nirga I have made some changes: I wrote the set_tools_attributes function and call it from the _set_input_attributes function in packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py.
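For readers following along, a function of this kind could look roughly like the sketch below. This is not the code from the PR: the attribute prefix is an assumption modelled on how function definitions are commonly flattened onto spans, and RecordingSpan is a hypothetical stand-in for an OpenTelemetry span so the example runs without any dependencies.

```python
import json


class RecordingSpan:
    """Hypothetical stand-in for an OpenTelemetry span; only set_attribute is modelled."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


# Assumed attribute prefix; the real instrumentation may use a different key.
TOOLS_PREFIX = "llm.request.functions"


def set_tools_attributes(span, tools):
    """Flatten each tool definition into indexed span attributes."""
    for i, tool in enumerate(tools or []):
        prefix = f"{TOOLS_PREFIX}.{i}"
        for field in ("name", "description"):
            if tool.get(field) is not None:
                span.set_attribute(f"{prefix}.{field}", tool[field])
        if tool.get("input_schema") is not None:
            # Span attribute values must be primitives, so serialize the schema.
            span.set_attribute(f"{prefix}.input_schema", json.dumps(tool["input_schema"]))


span = RecordingSpan()
set_tools_attributes(
    span,
    [
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {"type": "object", "required": ["location"]},
        }
    ],
)
```

Serializing input_schema to a JSON string keeps the nested schema in a single attribute instead of spreading it over many keys.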

I wrote a basic test in the tests folder to check that the code works, as referenced in #1371. However, I am getting the following error: E TypeError: Messages.create() got an unexpected keyword argument 'tools'

Following is the full trace with the versions of packages being used:

sid@MSI:~/openllmetry/packages/opentelemetry-instrumentation-anthropic$ poetry show && poetry run pytest -s --record-mode=once -vv tests/
annotated-types                       0.6.0    Reusable constraint types to use with typing.Annotated
anthropic                             0.25.6   The official Python library for the anthropic API
anyio                                 4.3.0    High level compatibility layer for multiple asynchronous event loop implementations
autopep8                              2.1.1    A tool that automatically formats Python code to conform to the PEP 8 style guide
certifi                               2024.2.2 Python package for providing Mozilla's CA Bundle.
charset-normalizer                    3.3.2    The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
deprecated                            1.2.14   Python @deprecated decorator to deprecate old python classes, functions or methods.
distro                                1.9.0    Distro - an OS platform information API
exceptiongroup                        1.2.0    Backport of PEP 654 (exception groups)
filelock                              3.13.4   A platform independent file lock.
flake8                                7.0.0    the modular source code checker: pep8 pyflakes and co
fsspec                                2024.3.1 File-system specification
h11                                   0.14.0   A pure-Python, bring-your-own-I/O implementation of HTTP/1.1
httpcore                              1.0.5    A minimal low-level HTTP client.
httpx                                 0.27.0   The next generation HTTP client.
huggingface-hub                       0.22.2   Client library to download and publish models, datasets and other repos on the huggingface.co hub
idna                                  3.7      Internationalized Domain Names in Applications (IDNA)
importlib-metadata                    7.0.0    Read metadata from Python packages
iniconfig                             2.0.0    brain-dead simple config-ini parsing
mccabe                                0.7.0    McCabe checker, plugin for flake8
multidict                             6.0.5    multidict implementation
opentelemetry-api                     1.25.0   OpenTelemetry Python API
opentelemetry-instrumentation         0.46b0   Instrumentation Tools & Auto Instrumentation for OpenTelemetry Python
opentelemetry-sdk                     1.25.0   OpenTelemetry Python SDK
opentelemetry-semantic-conventions    0.46b0   OpenTelemetry Semantic Conventions
opentelemetry-semantic-conventions-ai 0.2.0    OpenTelemetry Semantic Conventions Extension for Large Language Models
packaging                             24.0     Core utilities for Python packages
pluggy                                1.4.0    plugin and hook calling mechanisms for python
pycodestyle                           2.11.1   Python style guide checker
pydantic                              2.6.4    Data validation using Python type hints
pydantic-core                         2.16.3   
pyflakes                              3.2.0    passive checker of Python programs
pytest                                8.1.1    pytest: simple powerful testing with Python
pytest-asyncio                        0.23.6   Pytest support for asyncio
pytest-recording                      0.13.1   A pytest plugin that allows you recording of network interactions via VCR.py
pytest-sugar                          1.0.0    pytest-sugar is a plugin for pytest that changes the default look and feel of pytest (e.g. progressbar, show tests that fail instantly).
pyyaml                                6.0.1    YAML parser and emitter for Python
requests                              2.32.0   Python HTTP for Humans.
setuptools                            69.2.0   Easily download, build, install, upgrade, and uninstall Python packages
sniffio                               1.3.1    Sniff out which async library your code is running under
termcolor                             2.4.0    ANSI color formatting for output in terminal
tokenizers                            0.15.2   
tomli                                 2.0.1    A lil' TOML parser
tqdm                                  4.66.3   Fast, Extensible Progress Meter
typing-extensions                     4.11.0   Backported and Experimental Type Hints for Python 3.8+
urllib3                               2.2.1    HTTP library with thread-safe connection pooling, file post, and more.
vcrpy                                 6.0.1    Automatically mock your HTTP interactions to simplify and speed up testing
wrapt                                 1.16.0   Module for decorators, wrappers and monkey patching.
yarl                                  1.9.4    Yet another URL library
zipp                                  3.18.1   Backport of pathlib-compatible object wrapper for zip files
Test session starts (platform: linux, Python 3.10.14, pytest 8.1.1, pytest-sugar 1.0.0)
cachedir: .pytest_cache
rootdir: /home/sid/openllmetry/packages/opentelemetry-instrumentation-anthropic
configfile: pyproject.toml
plugins: sugar-1.0.0, recording-0.13.1, asyncio-0.23.6, anyio-4.3.0
asyncio: mode=strict
collected 7 items                                                                                                                                                                                                        

 tests/test_completion.py::test_anthropic_completion ✓                                                                                                                                                      14% █▌        
 tests/test_completion.py::test_anthropic_message_create ✓                                                                                                                                                  29% ██▉       
 tests/test_completion.py::test_anthropic_multi_modal ✓                                                                                                                                                     43% ████▍     
 tests/test_completion.py::test_anthropic_message_streaming ✓                                                                                                                                               57% █████▊    
 tests/test_completion.py::test_async_anthropic_message_create ✓                                                                                                                                            71% ███████▎  
 tests/test_completion.py::test_async_anthropic_message_streaming ✓                                                                                                                                         86% ████████▋ 

―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― test_anthropic_tools ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――

exporter = <opentelemetry.sdk.trace.export.in_memory_span_exporter.InMemorySpanExporter object at 0x7fa5cff6f400>, reader = <opentelemetry.sdk.metrics._internal.export.InMemoryMetricReader object at 0x7fa5cff6f640>

    @pytest.mark.vcr
    def test_anthropic_tools(exporter, reader):
        client = Anthropic()
>       client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=1024,
            tools=[
                {
                    "name": "get_weather",
                    "description": "Get the current weather in a given location",
                    "input_schema": {
                        "type": "object",
                        "properties": {
                            "location": {
                                "type": "string",
                                "description": "The city and state, e.g. San Francisco, CA"
                            },
                            "unit": {
                                "type": "string",
                                "enum": ["celsius", "fahrenheit"],
                                "description": "The unit of temperature, either 'celsius' or 'fahrenheit'"
                            }
                        },
                        "required": ["location"]
                    }
                },
                {
                    "name": "get_time",
                    "description": "Get the current time in a given time zone",
                    "input_schema": {
                        "type": "object",
                        "properties": {
                            "timezone": {
                                "type": "string",
                                "description": "The IANA time zone name, e.g. America/Los_Angeles"
                            }
                        },
                        "required": ["timezone"]
                    }
                }
            ],
            messages=[
                {
                    "role": "user",
                    "content": "What is the weather like right now in New York? Also what time is it there?"
                }
            ]
        )

tests/test_completion.py:548: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
opentelemetry/instrumentation/anthropic/__init__.py:377: in wrapper
    return func(
opentelemetry/instrumentation/anthropic/__init__.py:478: in _wrap
    raise e
opentelemetry/instrumentation/anthropic/__init__.py:464: in _wrap
    response = wrapped(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<anthropic.resources.messages.Messages object at 0x7fa5cf6a27d0>,)
kwargs = {'max_tokens': 1024, 'messages': [{'content': 'What is the weather like right now in New York? Also what time is it th...name, e.g. America/Los_Angeles', 'type': 'string'}}, 'required': ['timezone'], 'type': 'object'}, 'name': 'get_time'}]}
i = 0, _ = <anthropic.resources.messages.Messages object at 0x7fa5cf6a27d0>, key = 'messages', variant = ['max_tokens', 'messages', 'model'], matches = True

    @functools.wraps(func)
    def wrapper(*args: object, **kwargs: object) -> object:
        given_params: set[str] = set()
        for i, _ in enumerate(args):
            try:
                given_params.add(positional[i])
            except IndexError:
                raise TypeError(
                    f"{func.__name__}() takes {len(positional)} argument(s) but {len(args)} were given"
                ) from None
    
        for key in kwargs.keys():
            given_params.add(key)
    
        for variant in variants:
            matches = all((param in given_params for param in variant))
            if matches:
                break
        else:  # no break
            if len(variants) > 1:
                variations = human_join(
                    ["(" + human_join([quote(arg) for arg in variant], final="and") + ")" for variant in variants]
                )
                msg = f"Missing required arguments; Expected either {variations} arguments to be given"
            else:
                assert len(variants) > 0
    
                # TODO: this error message is not deterministic
                missing = list(set(variants[0]) - given_params)
                if len(missing) > 1:
                    msg = f"Missing required arguments: {human_join([quote(arg) for arg in missing])}"
                else:
                    msg = f"Missing required argument: {quote(missing[0])}"
            raise TypeError(msg)
>       return func(*args, **kwargs)
E       TypeError: Messages.create() got an unexpected keyword argument 'tools'

.venv/lib/python3.10/site-packages/anthropic/_utils/_utils.py:277: TypeError

 tests/test_completion.py::test_anthropic_tools ⨯                                                                                                                                                          100% ██████████
==================================================================================================== warnings summary ====================================================================================================
.venv/lib/python3.10/site-packages/opentelemetry/instrumentation/dependencies.py:4
  /home/sid/openllmetry/packages/opentelemetry-instrumentation-anthropic/.venv/lib/python3.10/site-packages/opentelemetry/instrumentation/dependencies.py:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    from pkg_resources import (

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================================ short test summary info =================================================================================================
FAILED tests/test_completion.py::test_anthropic_tools - TypeError: Messages.create() got an unexpected keyword argument 'tools'

I'd appreciate your help in understanding where the issue is. Am I calling the function from the wrong entry point within _set_input_attributes?

Do I need to update the anthropic package? This run uses anthropic == 0.25.6. However, when I run the code on my base machine (not using the packages and pytest from this repo) with anthropic == 0.28.0, I get an output.
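One way to tell a version mismatch apart from an instrumentation bug is to check whether the installed client's method accepts the keyword at all. The sketch below does this with inspect.signature on two dummy functions; accepts_keyword, create_old, and create_new are hypothetical names for illustration, not part of the anthropic SDK.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    for param in inspect.signature(func).parameters.values():
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True  # **kwargs accepts any keyword
        if param.name == name and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        ):
            return True
    return False


def create_old(*, model, max_tokens, messages):
    # Dummy modelled on a client method that predates tool support.
    return None


def create_new(*, model, max_tokens, messages, tools=None):
    # Dummy modelled on a client method that accepts tools.
    return None


old_supports_tools = accepts_keyword(create_old, "tools")
new_supports_tools = accepts_keyword(create_new, "tools")
```

Against the real SDK, the same check could be pointed at anthropic.resources.messages.Messages.create. Whether signature inspection sees through the SDK's decorators and the instrumentation's wrappers is an assumption that would need verifying.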

Jun 20 '24 08:06 peachypeachyy

@peachypeachyy yes, it looks like you need to upgrade the anthropic package in pyproject.toml

Jun 20 '24 11:06 nirga

@nirga You were right, after changing the version, things are working now.

Submitted PR #1372 Please review.

Jun 21 '24 04:06 peachypeachyy

@nirga My friend Dwisha is working on the sample app for this as well, just FYI

Jun 21 '24 10:06 peachypeachyy

This is the sample app:

import os
import anthropic
from traceloop.sdk import Traceloop

client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

Traceloop.init()

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    tools=[
        {
            "name": "get_property_prices",
            "description": "Get the current average property prices in France",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city, e.g. Paris"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["Euro", "Francs"],
                        "description": "The unit of property prices is either \"Euro\" or \"Francs\""
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[{"role": "user", "content": "What are the current prices in Marseille?"}]
)

print(response)

@nirga Since this requires an updated version of anthropic, I changed the anthropic version in pyproject.toml and ran poetry update anthropic. That updated many files and also bumped opentelemetry-semantic-conventions-ai (0.2.0 -> 0.3.1). Because it updated many other packages, which may affect other sample apps as well, I have simply pasted the code here. Let me know if it should be provided in a separate pull request.

Jun 21 '24 14:06 dwisha-kulkarni

@dwisha-kulkarni the updates are ok, don't worry about them - can you open a separate PR with this code?

Jun 21 '24 16:06 nirga