
[Bug]: `RateLimiter` cannot modify backoff behaviour of `perform_completion_with_backoff`

Open lance6716 opened this issue 7 months ago • 7 comments

crawl4ai version

current master

Expected Behavior

https://github.com/unclecode/crawl4ai/blob/02f3127deda707b948e1970699fceae214677b86/crawl4ai/utils.py#L1645C5-L1645C17

The retry/backoff values used at the linked line should be configurable by the caller rather than hardcoded.

Current Behavior

The backoff values are hardcoded inside `perform_completion_with_backoff` and cannot be changed from outside the function.

Is this reproducible?

Yes

Inputs Causing the Bug

No response

Steps to Reproduce

No response

Code snippets

No response
OS

not related

Python version

not related

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response

lance6716 avatar Jul 04 '25 05:07 lance6716

Hey, thanks for raising this issue!

The RateLimiter class and perform_completion_with_backoff serve different purposes and are designed for different types of requests:

  • RateLimiter class: Designed for HTTP web crawling with domain-based state tracking and async operations
  • perform_completion_with_backoff: Designed for LLM API calls (via litellm) with provider-based retry logic and sync operations

These systems have incompatible interfaces:

  • Different delay mechanisms (asyncio.sleep() vs time.sleep())
  • Different error handling (HTTP status codes vs RateLimitError from litellm)
  • Different state management (per-domain vs no persistent state)
  • Different configuration approaches (configurable vs hardcoded values)

Proposed Solution

We can enhance perform_completion_with_backoff to accept rate limiting configuration parameters, similar to how RateLimiter works:

def perform_completion_with_backoff(
    provider,
    prompt_with_variables,
    api_token,
    json_response=False,
    base_url=None,
    # New rate limiting parameters
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    max_retries: int = 3,
    exponential_factor: float = 2.0,
    **kwargs,
):
    # ... existing implementation with configurable parameters
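
For illustration, here is a minimal sketch of how the body might apply these parameters, assuming it wraps litellm.completion and retries on litellm's RateLimitError (the exact call shape below is illustrative, not a copy of utils.py):

import time

import litellm
from litellm.exceptions import RateLimitError

def perform_completion_with_backoff(
    provider,
    prompt_with_variables,
    api_token,
    json_response=False,
    base_url=None,
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    max_retries: int = 3,
    exponential_factor: float = 2.0,
    **kwargs,
):
    for attempt in range(max_retries):
        try:
            return litellm.completion(
                model=provider,
                messages=[{"role": "user", "content": prompt_with_variables}],
                api_key=api_token,
                base_url=base_url,
                response_format={"type": "json_object"} if json_response else None,
                **kwargs,
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # Capped exponential backoff: 2s, 4s, 8s, ... up to max_delay
            delay = min(base_delay * (exponential_factor ** attempt), max_delay)
            time.sleep(delay)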

This approach would:

  1. Maintain backward compatibility
  2. Allow users to configure rate limiting behavior (see the example below)
  3. Provide similar flexibility to the RateLimiter class
  4. Keep the systems separate (as they should be)
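
A caller hitting stricter provider limits could then tune the behaviour at the call site, for example (model string and values purely illustrative):

response = perform_completion_with_backoff(
    provider="openai/gpt-4o-mini",
    prompt_with_variables=prompt,
    api_token=api_key,
    base_delay=5.0,    # wait longer before the first retry
    max_delay=120.0,   # allow up to two minutes between attempts
    max_retries=6,     # retry more times before giving up
)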

Alternative: Dedicated LLM Rate Limiter

We could also create a dedicated LLMRateLimiter class specifically for LLM API calls, but this might be overkill for the current use case.
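
For completeness, a minimal sketch of what such a class might look like (the LLMRateLimiter name, interface, and defaults are all hypothetical; nothing like this exists in the codebase today):

import random
import time

from litellm.exceptions import RateLimitError

class LLMRateLimiter:
    """Hypothetical synchronous rate limiter for LLM API calls."""

    def __init__(self, base_delay=2.0, max_delay=60.0,
                 max_retries=3, exponential_factor=2.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.max_retries = max_retries
        self.exponential_factor = exponential_factor

    def call(self, fn, *args, **kwargs):
        """Invoke fn, retrying with capped exponential backoff on RateLimitError."""
        for attempt in range(self.max_retries):
            try:
                return fn(*args, **kwargs)
            except RateLimitError:
                if attempt == self.max_retries - 1:
                    raise
                delay = min(self.base_delay * self.exponential_factor ** attempt,
                            self.max_delay)
                # Add jitter so concurrent callers don't retry in lockstep
                time.sleep(delay + random.uniform(0, 1))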

SohamKukreti avatar Aug 03 '25 07:08 SohamKukreti

Hi, any movement on this issue? I'd like to configure retry & wait parameters when making LLM API calls. Thanks.

smajoseph avatar Nov 21 '25 19:11 smajoseph

@SohamKukreti tagging for visibility

smajoseph avatar Nov 21 '25 20:11 smajoseph

Hey @smajoseph, we'll add this in the next release, thanks!

SohamKukreti avatar Nov 24 '25 16:11 SohamKukreti

@smajoseph Are you currently making LLM API calls through a crawl4ai feature such as LLMExtraction or LLMContentFilter, or are you making the calls directly?

SohamKukreti avatar Nov 26 '25 16:11 SohamKukreti

@SohamKukreti

I'm using the LLMExtraction feature, and the hardcoded backoff values are too low to alleviate the rate limit errors I'm experiencing.

smajoseph avatar Nov 26 '25 17:11 smajoseph

@SohamKukreti

It also may be worth noting that I'm experiencing rate limiting errors due in part to this issue: https://github.com/unclecode/crawl4ai/issues/1178

But regardless, it would be helpful to have control over backoff parameters for LLM API calls.

smajoseph avatar Nov 26 '25 17:11 smajoseph