
How to use OpenRouter API for DeepSeek?

Open agn-7 opened this issue 11 months ago • 12 comments

Bug Description

I'm trying to use a DeepSeek model with Browser-use. However, I don't have an API key from DeepSeek directly; instead, I have an OpenRouter API key, which works with many LLM providers. Based on the OpenRouter API docs and the browser-use docs on customizing the LLM, I arrived at the code below. browser-use opens a browser instance, but nothing happens and the page stays empty!

Reproduction Steps

Here are the corresponding logs:

INFO     [browser_use] BrowserUse logging setup complete with level info
INFO     [root] Anonymized telemetry enabled. See https://docs.browser-use.com/development/telemetry for more information.
INFO     [agent] 🚀 Starting task: 1. Go to https://www.reddit.com/r/LocalLLaMA 2. Search for 'browser use' in the search bar3. Click on first result4. Return the first comment
INFO     [agent] 📍 Step 1
ERROR    [agent] ❌ Result failed 1/3 times:
 Could not parse response.
INFO     [agent] 📍 Step 1
ERROR    [agent] ❌ Result failed 2/3 times:
 {'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': '{\n  "error_msg": "Authentication Fails. Multiple 401 errors detected. Please wait for 1 minute before trying again."\n}', 'provider_name': 'DeepSeek'}}
INFO     [agent] 📍 Step 1
ERROR    [agent] ❌ Result failed 3/3 times:
 {'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': '{\n  "error_msg": "Authentication Fails. Multiple 401 errors detected. Please wait for 1 minute before trying again."\n}', 'provider_name': 'DeepSeek'}}
ERROR    [agent] ❌ Stopping due to 3 consecutive failures
INFO     [agent] Created GIF at agent_history.gif

Code Sample

Here is the code:

import asyncio
import os

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

from browser_use import Agent

# dotenv
load_dotenv()

api_key = "sk-or-v1-xxx"
if not api_key:
	raise ValueError('DEEPSEEK_API_KEY is not set')


async def run_search():
	agent = Agent(
		task=(
			'1. Go to https://www.reddit.com/r/LocalLLaMA '
			"2. Search for 'browser use' in the search bar"
			'3. Click on first result'
			'4. Return the first comment'
		),
		llm=ChatOpenAI(
			base_url='https://openrouter.ai/api/v1',
			model='deepseek/deepseek-chat',
			api_key=api_key,
		),
		use_vision=False,
	)

	await agent.run()


if __name__ == '__main__':
	asyncio.run(run_search())

Version

0.1.35

LLM Model

DeepSeek Coder

Operating System

Ubuntu 22.04

agn-7 avatar Feb 05 '25 09:02 agn-7

I am facing the same issue. Any help would be appreciated.

techlism avatar Feb 06 '25 03:02 techlism

I am facing a similar issue, with both openai/chatgpt-4o-latest and deepseek/deepseek-r1:

Error code: 404 - {'error': {'message': 'No endpoints found that support tool use. To learn more about provider routing, visit: https://openrouter.ai/docs/provider-routing', 'code': 404}}

deepseek/deepseek-chat was working yesterday, but today it fails with this:

ERROR    [agent] ❌ Result failed 1/3 times:
 Could not parse response.

prigozhinfan avatar Feb 06 '25 17:02 prigozhinfan

I think Gemini works well and is free. The other models don't seem to offer any advantage.

kadavilrahul avatar Feb 06 '25 20:02 kadavilrahul

@kadavilrahul

  • Is the Gemini API key really free?
  • What's your opinion of gpt-4o vs. Gemini?
  • Does Gemini support vision?

agn-7 avatar Feb 07 '25 09:02 agn-7

Gemini is completely free. You don't need to think much about the model, because this script only scrapes and processes data; any lightweight model works well. I have made a ready-made script with terminal inputs for Gemini. You may check it: https://github.com/kadavilrahul/browser-use-shell

kadavilrahul avatar Feb 07 '25 09:02 kadavilrahul

Hi, is there any solution for this? I am facing the same issue. Is it possible to use another API endpoint, like DeepInfra or OpenRouter? I can't find any reference among the supported models here: https://docs.browser-use.com/customize/supported-models

machinedev avatar Feb 13 '25 07:02 machinedev

I ran into the same problem, haha. I can't use it from China without extra gymnastics, so I'm trying to go the OpenRouter route. Have you found anything?

> Hi, is there any solution for this? I am facing the same issue. Is it possible to use another API endpoint, like DeepInfra or OpenRouter? I can't find any reference among the supported models here: https://docs.browser-use.com/customize/supported-models

kazGuido avatar Feb 14 '25 02:02 kazGuido

same here.

firstcomeuropeag avatar Feb 17 '25 13:02 firstcomeuropeag

same here. My findings:

  • disable vision (use_vision=False); for some reason it doesn't work with DeepSeek, but I tried Gemini Flash 2.0 and it worked
  • use another model from OpenRouter

kazGuido avatar Feb 17 '25 13:02 kazGuido

> same here. My findings:
>
>   • disable vision (use_vision=False); for some reason it doesn't work with DeepSeek, but I tried Gemini Flash 2.0 and it worked
>   • use another model from OpenRouter

thanks mate @kazGuido, will try it

machinedev avatar Feb 20 '25 02:02 machinedev

@kazGuido can you share your code for integrating OpenRouter with browser-use to use Gemini 2.0 Flash? I tested that model and got the same error I get with the DeepSeek model!

agn-7 avatar Mar 03 '25 07:03 agn-7

I had the same problem and eventually found this workaround.

Required Modifications

To integrate OpenRouter with browser-use, several modifications are necessary:

1. Installing Dependencies

pip install langchain-community

2. Configuring the Model in Your Main Script

Replace the import and initialization of the LLM model with OpenRouter integration via the OpenAI-compatible API:

# Replace this:
from langchain_google_genai import ChatGoogleGenerativeAI

# With this:
from langchain_community.chat_models import ChatOpenAI

# Then, instead of this:
llm = ChatGoogleGenerativeAI(
    model='gemini-2.0-flash-exp',
    api_key=SecretStr(api_key),
    temperature=0.0
)

# Use this:
llm = ChatOpenAI(
    model="google/gemini-2.0-pro-exp-02-05:free",  # Or any other model available on OpenRouter
    openai_api_key=openrouter_api_key,
    openai_api_base="https://openrouter.ai/api/v1",
    temperature=0.0,
    max_tokens=4096,
    model_kwargs={
        "tool_choice": "auto"  # Allows the model to decide when to use tools
    }
)

3. Modifying Message Handling in browser-use

For browser-use to work correctly with OpenRouter, you need to modify how messages are processed. Here are the changes to make to the agent/message_manager/utils.py file:

def convert_input_messages(input_messages: list[BaseMessage], model_name: Optional[str]) -> list[BaseMessage]:
    """Convert input messages to a format that is compatible with the planner model"""
    if model_name is None:
        return input_messages
    # Handle deepseek models
    if model_name == 'deepseek-reasoner' or model_name.startswith('deepseek-r1'):
        converted_input_messages = _convert_messages_for_non_function_calling_models(input_messages)
        merged_input_messages = _merge_successive_messages(converted_input_messages, HumanMessage)
        merged_input_messages = _merge_successive_messages(merged_input_messages, AIMessage)
        return merged_input_messages
    # Handle OpenRouter/Gemini models
    if 'gemini' in str(model_name).lower() or str(model_name).startswith('google/'):
        logger.info(f"Converting messages for OpenRouter/Gemini model: {model_name}")
        converted_input_messages = _convert_messages_for_non_function_calling_models(input_messages)
        return converted_input_messages
    return input_messages
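To illustrate what `_merge_successive_messages` is doing conceptually: some providers (DeepSeek's reasoner endpoint among them) reject conversations with consecutive messages from the same role, so runs of same-role messages get collapsed into one. This is an illustrative stand-in working on plain `(role, content)` tuples, not the actual browser-use helper, which operates on LangChain message objects.

```python
def merge_successive_messages(messages):
	"""Merge runs of consecutive messages with the same role into one.

	Illustrative sketch only; browser-use's real helper works on
	LangChain BaseMessage instances, not (role, content) tuples.
	"""
	merged = []
	for role, content in messages:
		if merged and merged[-1][0] == role:
			# Fold content into the previous message instead of adding a new one
			merged[-1] = (role, merged[-1][1] + '\n' + content)
		else:
			merged.append((role, content))
	return merged


messages = [
	('human', 'Current page: reddit.com'),
	('human', 'Task: search for browser use'),
	('ai', '{"action": "click"}'),
]
print(merge_successive_messages(messages))
# The two successive human messages are merged into one
```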

4. Modifying the get_next_action Method in the Agent Class

To avoid errors related to function calling, modify the get_next_action method in the agent/service.py file:

@time_execution_async('--get_next_action (agent)')
async def get_next_action(self, input_messages: list[BaseMessage]) -> AgentOutput:
    """Get next action from LLM based on current state"""
    input_messages = self._convert_input_messages(input_messages)

    # Special handling for OpenRouter/Gemini models
    if 'gemini' in str(self.model_name).lower() or str(self.model_name).startswith('google/'):
        logger.info(f"Using special handling for OpenRouter/Gemini model: {self.model_name}")
        # Use raw mode for OpenRouter/Gemini
        output = self.llm.invoke(input_messages)
        output.content = self._remove_think_tags(str(output.content))
        try:
            parsed_json = extract_json_from_model_output(output.content)
            parsed = self.AgentOutput(**parsed_json)
        except (ValueError, ValidationError) as e:
            logger.warning(f'Failed to parse model output: {output} {str(e)}')
            raise ValueError('Could not parse response.')
    elif self.tool_calling_method == 'raw':
        ...  # existing code for raw mode
    elif self.tool_calling_method is None:
        ...  # existing code for tool_calling_method None
    else:
        ...  # existing code for other methods
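For context, `extract_json_from_model_output` has to cope with models that wrap their JSON in markdown fences or surrounding prose. A minimal sketch of that idea (this is not browser-use's actual implementation, just the general technique) could be:

```python
import json
import re


def extract_json_from_model_output(content: str) -> dict:
	"""Pull a JSON object out of a model reply that may wrap it in
	```json fences or surrounding prose. Illustrative sketch only."""
	# Strip markdown code fences if present
	content = re.sub(r'```(?:json)?', '', content)
	# Take the outermost brace pair; handles nested objects, but not
	# stray braces in surrounding prose
	start, end = content.find('{'), content.rfind('}')
	if start == -1 or end == -1:
		raise ValueError('No JSON object found in model output')
	return json.loads(content[start:end + 1])


reply = 'Sure! Here is the action:\n```json\n{"action": "click", "index": 3}\n```'
print(extract_json_from_model_output(reply))
# -> {'action': 'click', 'index': 3}
```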

Environment Configuration

Create or modify your .env file to include your OpenRouter API key:

# Configuration for the OpenRouter API
# Replace the value below with your OpenRouter API key
# You can get an API key at https://openrouter.ai/keys
OPENROUTER_API_KEY=your_openrouter_api_key_here

Then, in your main script, make sure to load this key:

from dotenv import load_dotenv
load_dotenv()

# Get the OpenRouter API key from .env file
openrouter_api_key = os.getenv("OPENROUTER_API_KEY")
if not openrouter_api_key:
    raise ValueError("OPENROUTER_API_KEY not found in environment variables. Please add it to your .env file.")

Known Limitations

  1. Some models via OpenRouter may not support all browser-use features, particularly complex function calls.
  2. Performance may vary depending on the chosen model.
  3. Some errors may occur when processing model responses, requiring additional error handling.
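Regarding point 3, one simple hedge is to wrap the agent call in a small retry loop with backoff for transient provider errors such as the 429s shown earlier in this thread. This is a generic sketch with a hypothetical `run_with_retries` helper; the exception handling is deliberately broad and should be narrowed to whatever your stack actually raises.

```python
import asyncio


async def run_with_retries(coro_factory, attempts: int = 3, base_delay: float = 2.0):
	"""Retry a coroutine with exponential backoff on failure.

	coro_factory is a zero-argument callable returning a fresh coroutine,
	e.g. lambda: agent.run(). Catching bare Exception is only for the
	sketch; narrow it to your provider's actual error types in real code.
	"""
	for attempt in range(1, attempts + 1):
		try:
			return await coro_factory()
		except Exception as exc:
			if attempt == attempts:
				raise
			delay = base_delay * 2 ** (attempt - 1)
			print(f'Attempt {attempt} failed ({exc}); retrying in {delay:.2f}s')
			await asyncio.sleep(delay)


# Demo with a flaky coroutine that fails twice, then succeeds
calls = {'n': 0}


async def flaky():
	calls['n'] += 1
	if calls['n'] < 3:
		raise RuntimeError('Provider returned error 429')
	return 'ok'


result = asyncio.run(run_with_retries(lambda: flaky(), base_delay=0.01))
print(result)  # -> ok
```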

UnixSafe avatar Mar 10 '25 12:03 UnixSafe

It would be great to see this OOTB, since OpenRouter offers free models plus a quick way to test different models.

mtworth avatar Jun 24 '25 23:06 mtworth

https://github.com/browser-use/browser-use/blob/main/examples/models/openrouter.py

MagMueller avatar Sep 07 '25 02:09 MagMueller