[Bug]: arun_many error when using BFSDeepCrawlStrategy

Open egianetto opened this issue 10 months ago • 3 comments

crawl4ai version

0.5.0.post8

Expected Behavior

As far as I understand, arun_many() should work with any strategy. I've tested it, and it works with the default strategy.

Current Behavior

But when I add deep_crawl_strategy=BFSDeepCrawlStrategy(...) to the config, it throws this error: Error Message: 'async_generator' object has no attribute 'status_code'
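
The wording of the message suggests a type mix-up: some code path seems to receive an async generator where it expects a single result object with a status_code attribute. That is only a reading of the error text, not something confirmed in this thread or in crawl4ai's source. A minimal, library-free sketch (fake_deep_crawl and read_status are hypothetical names, not crawl4ai APIs) reproduces the same kind of error:

```python
async def fake_deep_crawl(url: str):
    """Hypothetical stand-in for a streaming deep crawl that yields results."""
    yield {"url": url, "status_code": 200}


def read_status(obj) -> str:
    """Treat obj as a single result; return the error text if it is not one."""
    try:
        return str(obj.status_code)
    except AttributeError as exc:
        return str(exc)


# Calling an async generator function returns an async generator object,
# not a result, so reading .status_code from it fails.
message = read_status(fake_deep_crawl("https://crawl4ai.com"))
print(message)
```

If this reading is right, the fix belongs wherever arun_many() consumes the per-URL return value, which would need to iterate the stream rather than treat it as one result.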

Is this reproducible?

Yes

Inputs Causing the Bug


Steps to Reproduce

To see the expected behaviour, run: python arun_many_bug.py

Then to see the error, run: python arun_many_bug.py bfs

Code snippets

import asyncio

from crawl4ai import (
    AsyncWebCrawler,
    CrawlerRunConfig,
    BFSDeepCrawlStrategy,
    CacheMode,
)

async def basic_deep_crawl(bfs: bool = False):

    if bfs:
        print('-- BFS config --')
        config = CrawlerRunConfig(
            deep_crawl_strategy=BFSDeepCrawlStrategy(
                max_depth=1,
                max_pages=2,
                include_external=False,
            ),
            cache_mode=CacheMode.BYPASS,
            stream=True,
        )
    else:
        print('-- Default config --')
        config = CrawlerRunConfig(
            cache_mode=CacheMode.BYPASS,
            stream=True,
        )

    async with AsyncWebCrawler() as crawler:

        async for result in await crawler.arun_many(urls=['https://crawl4ai.com'], config=config):
            print('len(HTML): ', len(result.html))
            print('Success: ', result.success)
            print('Error Message: ', result.error_message)


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1 and sys.argv[1] == 'bfs':
        asyncio.run(basic_deep_crawl(True))
    else:
        asyncio.run(basic_deep_crawl(False))

OS

macOS

Python version

3.12.4

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

When running with default strategy:

Image

When running with BFSDeepCrawlStrategy:

Image

egianetto avatar Apr 04 '25 20:04 egianetto

Hi @egianetto Thanks for the heads-up! Just a quick note, we’ve already fixed this bug in the next branch.

ntohidi avatar Apr 07 '25 15:04 ntohidi

Awesome! Thanks!

egianetto avatar Apr 07 '25 15:04 egianetto

> Hi @egianetto Thanks for the heads-up! Just a quick note, we’ve already fixed this bug in the next branch.

Good, I was just confused by this.

DongDong20050214 avatar Apr 07 '25 17:04 DongDong20050214

This is not fixed in 0.6.2, nor on the next branch (installed via pip install git+https://github.com/unclecode/crawl4ai.git@next).

Still getting 'async_generator' object has no attribute 'status_code' when using a deep_crawl_strategy with arun_many

Brenndoerfer avatar May 09 '25 16:05 Brenndoerfer

I'm also still seeing this error in v0.6.2

greatwitenorth avatar May 10 '25 20:05 greatwitenorth

I'm also seeing this on v0.6.2. Looks like Crawl4AI could do with some unit tests and CI/CD pipelines. I'd be happy to help build some!

DevJake avatar Jun 22 '25 08:06 DevJake

Yes, I can confirm that it's still happening in 0.6.3. I will open the issue @aravindkarnam

ntohidi avatar Jul 03 '25 10:07 ntohidi

See https://github.com/unclecode/crawl4ai/issues/1205 for details on the issue.

peterkostadinov avatar Jul 08 '25 14:07 peterkostadinov
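
Until a fix lands, one possible workaround is to skip arun_many() when a deep crawl strategy is set and call arun() once per start URL instead. The sketch below has not been run against crawl4ai: crawl_each and fake_arun are hypothetical names, and it assumes that awaiting crawler.arun(url, config=config) with stream=True and a deep crawl strategy resolves to an async generator of results.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Iterable


async def crawl_each(
    crawl_one: Callable[[str], Awaitable[AsyncIterator[Any]]],
    urls: Iterable[str],
) -> AsyncIterator[Any]:
    """Run one streaming crawl per start URL and yield every result in order."""
    for url in urls:
        stream = await crawl_one(url)  # assumed to resolve to an async generator
        async for result in stream:
            yield result


async def fake_arun(url: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for crawler.arun(url, config=config)."""

    async def pages() -> AsyncIterator[str]:
        yield url
        yield url + "/docs"

    return pages()


async def collect(urls: Iterable[str]) -> list[str]:
    """Gather all streamed results into a list."""
    return [page async for page in crawl_each(fake_arun, urls)]


crawled = asyncio.run(collect(["https://crawl4ai.com"]))
print(crawled)
```

With the real library, crawl_one would be something like lambda url: crawler.arun(url, config=config). This crawls the start URLs one after another, so it gives up whatever concurrency arun_many() would provide across URLs.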