Enabling `to_dist=True` does not significantly enhance the response speed of multiple agents
Hello! I'm working on incorporating distributed design into my project to enhance speed, as multiple agents independently respond to a single issue during an interaction. A minimal reproducible example is:
```python
import os
import time

from dotenv import load_dotenv

from agentscope.agents import AgentBase
from agentscope.message import Msg
from agentscope.models.openai_model import OpenAIChatWrapper
from agentscope.parsers.json_object_parser import MarkdownJsonDictParser

load_dotenv()
api_key = os.getenv("DEEPSEEK_API_KEY")
base_url = os.getenv("DEEPSEEK_API_BASE_URL")


class TestAgent(AgentBase):
    def __init__(self, name: str) -> None:
        super().__init__(name=name)
        self._set_model()
        self._set_parser()

    def _set_model(self) -> None:
        assert api_key is not None, "API key is not available"
        self.model = OpenAIChatWrapper(
            config_name=f"config_{self.name}",
            model_name="deepseek-chat",
            api_key=api_key,
            client_args={"base_url": base_url},
        )

    def _set_parser(self) -> None:
        self.parser = MarkdownJsonDictParser(
            content_hint={"number": "Integer between 1 and 1000"},
            required_keys=["number"],
            keys_to_memory=["number"],
            keys_to_content=["number"],
        )

    def reply(self, msg: Msg) -> Msg:
        prompt = self.model.format(
            msg, Msg("system", self.parser.format_instruction, "system")
        )
        response = self.model(prompt)
        parsed_response = self.parser.parse(response).parsed
        if parsed_response is None:
            raise ValueError("Response parsing failed")
        return Msg(
            self.name,
            content=str(self.parser.to_content(parsed_response)),
            role="assistant",
        )


def run(agents: list[TestAgent], query: Msg) -> list[Msg]:
    start = time.time()
    responses = []
    for investor in agents:
        print(f"Investor: {investor.name}")
        responses.append(investor(query))
    for response in responses:
        print(response.content)
    end = time.time()
    print(f"Time taken: {end - start:.2f} seconds")
    return responses


if __name__ == "__main__":
    dist_agents = [TestAgent(name=f"test_agent_{i}", to_dist=True) for i in range(5)]
    no_dist_agents = [
        TestAgent(name=f"test_agent_{i}", to_dist=False) for i in range(5)
    ]
    query = Msg(
        name="Moderator",
        role="assistant",
        content="give me a random number between 1 and 1000, do not output any other text",
    )
    print("Running with distributed agents:")
    run(dist_agents, query)
    print("\nRunning with non-distributed agents:")
    run(no_dist_agents, query)
```
Running this code produced the following output:
```text
Running with distributed agents:
Investor: test_agent_0
Investor: test_agent_1
Investor: test_agent_2
Investor: test_agent_3
Investor: test_agent_4
{'number': 742}
{'number': 427}
2025-04-13 15:55:25.481 | DEBUG | agentscope.rpc.retry_strategy:retry:89 - Attempt 1 at [D:\WorkShop\Projects\250331_AgentSimu\code\agentscope\src\agentscope\rpc\rpc_client.py:242] failed:
<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Deadline Exceeded", grpc_status:4, created_time:"2025-04-13T07:55:25.3159091+00:00"}"
>.
Retrying in 6.73 seconds...
{'number': 427}
{'number': 427}
{'number': 742}
Time taken: 20.44 seconds

Running with non-distributed agents:
Investor: test_agent_0
Investor: test_agent_1
Investor: test_agent_2
Investor: test_agent_3
Investor: test_agent_4
{'number': 427}
{'number': 427}
{'number': 427}
{'number': 427}
{'number': 427}
Time taken: 26.04 seconds
```
I’m feeling a bit puzzled by a few things:
- With five agents I expected roughly a fivefold speedup, but the actual improvement is minimal.
- The DEADLINE_EXCEEDED error in the middle of the distributed run is concerning. I'm not very familiar with gRPC, so I'm unsure why it occurs.
- I read the "Distributed" section of the documentation, and its example code works fine on my machine. The only significant difference I see is that my agent calls a real LLM, while the example only simulates that step; I'm not sure why this would lead to such a discrepancy.
I apologize for bringing up such a lengthy and complex issue. For reference, I'm developing on Windows 11 with agentscope version 0.1.3.dev0, installed from source together with the components needed for distributed functionality. The LLM I use is DeepSeek-V3.
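
One way to check whether the bottleneck is on the API side rather than in agentscope is to time a handful of concurrent raw calls against the same endpoint. The sketch below is not from the original report: it assumes the same environment variables as above, talks to the OpenAI-compatible endpoint through the `openai` client directly, and the helper name `one_call` and the choice of 5 workers are purely illustrative.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url=os.getenv("DEEPSEEK_API_BASE_URL"),
)


def one_call(_: int) -> float:
    """Time a single chat completion against the same model and endpoint."""
    start = time.time()
    client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "user", "content": "give me a random number between 1 and 1000"}
        ],
    )
    return time.time() - start


if __name__ == "__main__":
    start = time.time()
    # Fire 5 requests at once and compare wall-clock time with per-call latency.
    with ThreadPoolExecutor(max_workers=5) as pool:
        latencies = list(pool.map(one_call, range(5)))
    print(f"Per-call latencies: {[f'{t:.2f}' for t in latencies]}")
    print(f"Wall-clock for 5 concurrent calls: {time.time() - start:.2f} s")
```

If the five concurrent calls finish in roughly the time of a single call, the provider allows that much parallelism; if the wall-clock time is close to five times a single call, the requests are being serialized on the API side regardless of what agentscope does.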
The main reason seems to be the API's concurrency limit. Providers of remote APIs generally cap the request rate, so beyond a certain point concurrency cannot be increased any further; in extreme cases, parallel access can even be slower than sequential access. In this situation you generally need to limit the number of parallel agents to the concurrency the API can actually sustain, or use a locally deployed LLM service instead.
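
For the first suggestion, here is a minimal sketch of what capping the number of in-flight distributed agents could look like, reusing the setup from the `run()` function above. The name `run_batched` and the value of `MAX_PARALLEL` are illustrative, not part of the original report; the right batch size depends on the provider's rate limits.

```python
MAX_PARALLEL = 2  # illustrative; tune to what the API provider actually allows


def run_batched(agents: list[TestAgent], query: Msg) -> list[str]:
    """Dispatch distributed agents in small batches instead of all at once."""
    results = []
    for i in range(0, len(agents), MAX_PARALLEL):
        batch = agents[i : i + MAX_PARALLEL]
        # Dispatch the whole batch first; reading .content afterwards blocks
        # until each remote result is available, so one batch finishes before
        # the next is sent to the API.
        pending = [agent(query) for agent in batch]
        results.extend(msg.content for msg in pending)
    return results
```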
Which version is this?
The agentscope version was 0.1.3.dev0, as I mentioned. I think @pan-x-c is correct: it doesn't seem to be an issue with agentscope itself. By the way, my example code in dist mode is now working properly and gives a speedup of about 7 to 8 times, so perhaps there have been some policy changes on DeepSeek's side.