
[Bug]: `RateLimiter` cannot modify backoff behaviour of `perform_completion_with_backoff`

Open lance6716 opened this issue 7 months ago • 7 comments

crawl4ai version

current master

Expected Behavior

https://github.com/unclecode/crawl4ai/blob/02f3127deda707b948e1970699fceae214677b86/crawl4ai/utils.py#L1645C5-L1645C17

The retry/backoff values used at the linked line should be configurable by the caller rather than hardcoded.

Current Behavior

The backoff values are hardcoded inside `perform_completion_with_backoff` and cannot be changed from outside the function.

Is this reproducible?

Yes

Inputs Causing the Bug

No response

Steps to Reproduce

No response

Code snippets

No response
OS

not related

Python version

not related

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response

lance6716 avatar Jul 04 '25 05:07 lance6716

Hey, thanks for raising this issue!

The RateLimiter class and perform_completion_with_backoff serve different purposes and are designed for different types of requests:

  • RateLimiter class: Designed for HTTP web crawling with domain-based state tracking and async operations
  • perform_completion_with_backoff: Designed for LLM API calls (via litellm) with provider-based retry logic and sync operations

These systems have incompatible interfaces:

  • Different delay mechanisms (asyncio.sleep() vs time.sleep())
  • Different error handling (HTTP status codes vs RateLimitError from litellm)
  • Different state management (per-domain vs no persistent state)
  • Different configuration approaches (configurable vs hardcoded values)

Proposed Solution

We can enhance perform_completion_with_backoff to accept rate limiting configuration parameters, similar to how RateLimiter works:

def perform_completion_with_backoff(
    provider,
    prompt_with_variables,
    api_token,
    json_response=False,
    base_url=None,
    # New rate limiting parameters
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    max_retries: int = 3,
    exponential_factor: float = 2.0,
    **kwargs,
):
    # ... existing implementation with configurable parameters
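
For illustration, here is a minimal sketch of how the body might apply these parameters, assuming it wraps litellm.completion and retries on litellm's RateLimitError (the exact call shape below is illustrative, not a copy of utils.py):

import time

import litellm
from litellm.exceptions import RateLimitError

def perform_completion_with_backoff(
    provider,
    prompt_with_variables,
    api_token,
    json_response=False,
    base_url=None,
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    max_retries: int = 3,
    exponential_factor: float = 2.0,
    **kwargs,
):
    for attempt in range(max_retries):
        try:
            return litellm.completion(
                model=provider,
                messages=[{"role": "user", "content": prompt_with_variables}],
                api_key=api_token,
                base_url=base_url,
                response_format={"type": "json_object"} if json_response else None,
                **kwargs,
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # Capped exponential backoff: 2s, 4s, 8s, ... up to max_delay
            delay = min(base_delay * (exponential_factor ** attempt), max_delay)
            time.sleep(delay)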

This approach would:

  1. Maintain backward compatibility
  2. Allow users to configure rate limiting behavior (see the example below)
  3. Provide similar flexibility to the RateLimiter class
  4. Keep the systems separate (as they should be)
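
A caller hitting stricter provider limits could then tune the behaviour at the call site, for example (model string and values purely illustrative):

response = perform_completion_with_backoff(
    provider="openai/gpt-4o-mini",
    prompt_with_variables=prompt,
    api_token=api_key,
    base_delay=5.0,    # wait longer before the first retry
    max_delay=120.0,   # allow up to two minutes between attempts
    max_retries=6,     # retry more times before giving up
)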

Alternative: Dedicated LLM Rate Limiter

We could also create a dedicated LLMRateLimiter class specifically for LLM API calls, but this might be overkill for the current use case.
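
For completeness, a minimal sketch of what such a class might look like (the LLMRateLimiter name, interface, and defaults are all hypothetical; nothing like this exists in the codebase today):

import random
import time

from litellm.exceptions import RateLimitError

class LLMRateLimiter:
    """Hypothetical synchronous rate limiter for LLM API calls."""

    def __init__(self, base_delay=2.0, max_delay=60.0,
                 max_retries=3, exponential_factor=2.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.max_retries = max_retries
        self.exponential_factor = exponential_factor

    def call(self, fn, *args, **kwargs):
        """Invoke fn, retrying with capped exponential backoff on RateLimitError."""
        for attempt in range(self.max_retries):
            try:
                return fn(*args, **kwargs)
            except RateLimitError:
                if attempt == self.max_retries - 1:
                    raise
                delay = min(self.base_delay * self.exponential_factor ** attempt,
                            self.max_delay)
                # Add jitter so concurrent callers don't retry in lockstep
                time.sleep(delay + random.uniform(0, 1))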

SohamKukreti avatar Aug 03 '25 07:08 SohamKukreti

Hi, any movement on this issue? I'd like to configure retry & wait parameters when making LLM API calls. Thanks.

smajoseph avatar Nov 21 '25 19:11 smajoseph

@SohamKukreti tagging for visibility

smajoseph avatar Nov 21 '25 20:11 smajoseph

Hey @smajoseph, we'll add this in the next release, thanks!

SohamKukreti avatar Nov 24 '25 16:11 SohamKukreti

@smajoseph Are you currently making LLM API calls through a crawl4ai feature such as LLMExtraction or LLMContentFilter, or are you making the calls directly?

SohamKukreti avatar Nov 26 '25 16:11 SohamKukreti

@SohamKukreti

I'm using the LLMExtraction feature, and the hardcoded backoff values are too low to alleviate the rate limit errors I'm experiencing.

smajoseph avatar Nov 26 '25 17:11 smajoseph

@SohamKukreti

It also may be worth noting that I'm experiencing rate limiting errors due in part to this issue: https://github.com/unclecode/crawl4ai/issues/1178

But regardless, it would be helpful to have control over backoff parameters for LLM API calls.

smajoseph avatar Nov 26 '25 17:11 smajoseph