Rate limit tokens per minute
Hi!
First of all, thanks for what seems like a great framework! So much thought clearly went into this, well done!
Secondly, I'm hitting a rate limit quota when using Google Gemini with the Playwright MCP. The moment the rate limit is hit, execution stops. Is there a way to throttle the agent's execution so that it stays under the tokens-per-minute ceiling?
Agent setup:
Agent(
    instructions="You are an autonomous agent...",
    api_key=api_key,
    llm="gemini/gemini-2.5-flash",
    tools=MCP(
        command="npx",
        args=["@playwright/mcp@latest"],
        timeout=120,
    ),
    self_reflect=True,
    verbose=True,
)
The error I'm getting:
{
  "error": {
    "code": 429,
    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_paid_tier_input_token_count",
            "quotaId": "GenerateContentPaidTierInputTokensPerModelPerMinute",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-flash"
            },
            "quotaValue": "1000000"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Learn more about Gemini API quotas",
            "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "58s"
      }
    ]
  }
}
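The error payload above includes a RetryInfo entry with "retryDelay": "58s", so one possible workaround is to wait that long and retry instead of stopping. The following is only a rough sketch of the idea, not something the framework is known to provide: retry_delay_seconds, call_with_retry, and RateLimitError are hypothetical names, and how the 429 body actually surfaces from the LLM client is an assumption.

```python
import json
import re
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever exception carries the 429 body."""

    def __init__(self, body):
        super().__init__("rate limited")
        self.body = body


def retry_delay_seconds(error_body, default=60.0):
    """Return the RetryInfo delay (in seconds) from a 429 payload, else `default`."""
    try:
        payload = json.loads(error_body) if isinstance(error_body, str) else error_body
        details = payload["error"].get("details", [])
    except (ValueError, KeyError, TypeError, AttributeError):
        return default
    for detail in details:
        if not isinstance(detail, dict):
            continue
        if detail.get("@type", "").endswith("google.rpc.RetryInfo"):
            # The delay is a string such as "58s" in the payload shown above.
            match = re.fullmatch(r"(\d+(?:\.\d+)?)s", detail.get("retryDelay", ""))
            if match:
                return float(match.group(1))
    return default


def call_with_retry(call, max_attempts=3, sleep=time.sleep):
    """Run `call`; on RateLimitError, wait for the server-suggested delay and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise
            sleep(retry_delay_seconds(exc.body))
```

Because the wait only happens after a 429 has already been raised, this adds no delay to runs that stay under the quota.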
@claude review this issue, do a detailed analysis, and fix it if the existing code doesn't already implement a solution. Make sure the change is backward compatible and removes no existing features. After making the changes, review them again. Use @web to search for anything you don't know, or to find the latest documentation or version. Run the code if you need to test it. Start with a minimal code change, if any change is required. MAINLY, IT SHOULD NOT IMPACT THE CURRENT SPEED OF EXECUTION of existing features or increase overhead. Please create a PR with your changes using the gh tool.
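To make the request more concrete, here is a minimal sketch of the kind of opt-in throttle I have in mind, assuming the caller can estimate the input tokens of each request before sending it. TokenBudget and its parameters are hypothetical names, not part of the framework. With no limit configured it returns immediately, so existing behaviour and speed would stay the same.

```python
import time
from collections import deque


class TokenBudget:
    """Sliding-window limiter for tokens per minute (illustrative sketch only)."""

    def __init__(self, tokens_per_minute=None, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = tokens_per_minute  # None disables throttling entirely
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.events = deque()  # (timestamp, tokens) for requests still in the window
        self.used = 0

    def _expire(self, now):
        # Drop requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.used -= tokens

    def acquire(self, tokens):
        """Block until `tokens` fit in the window; return the seconds waited."""
        if self.limit is None:
            return 0.0
        waited = 0.0
        now = self.clock()
        self._expire(now)
        # A request larger than the whole limit is let through once the window is empty.
        while self.events and self.used + tokens > self.limit:
            delay = self.window - (now - self.events[0][0])
            self.sleep(delay)
            waited += delay
            now = self.clock()
            self._expire(now)
        self.events.append((now, tokens))
        self.used += tokens
        return waited
```

In use, something like budget = TokenBudget(tokens_per_minute=1_000_000) followed by budget.acquire(estimated_tokens) before each LLM call would match the quotaValue in the error above. The token estimate itself is the caller's responsibility and will only approximate what the API counts.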