[Bug]: arun_many error when using BFSDeepCrawlStrategy
crawl4ai version
0.5.0.post8
Expected Behavior
As far as I understand, arun_many() should work with any strategy. I've tested it, and it works when running with the default strategy.
Current Behavior
But when I add deep_crawl_strategy=BFSDeepCrawlStrategy(...) to the config, it fails with the error: Error Message: 'async_generator' object has no attribute 'status_code'
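To illustrate how this kind of message can arise, here is a minimal, stdlib-only sketch. It is not crawl4ai's actual code: `FakeResult`, `fake_deep_crawl`, `fake_fetch`, `naive_dispatch`, and `tolerant_dispatch` are all hypothetical names. The assumption is that a dispatcher expects a single result object, but a streaming deep crawl hands back an async generator instead.

```python
import asyncio
import inspect
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for a crawl result (not the crawl4ai class)."""
    url: str
    status_code: int = 200


async def fake_deep_crawl(url: str):
    # Hypothetical streaming deep crawl: yields results lazily.
    for suffix in ("", "/a"):
        yield FakeResult(url + suffix)


async def fake_fetch(url: str, deep: bool):
    # Returns a single result, or an async generator when deep crawling.
    return fake_deep_crawl(url) if deep else FakeResult(url)


async def naive_dispatch(url: str, deep: bool) -> int:
    result = await fake_fetch(url, deep)
    # Assumes a single result; raises AttributeError on an async generator.
    return result.status_code


async def tolerant_dispatch(url: str, deep: bool) -> list[int]:
    result = await fake_fetch(url, deep)
    if inspect.isasyncgen(result):
        # Drain the generator instead of treating it as one result.
        return [r.status_code async for r in result]
    return [result.status_code]


async def main() -> None:
    print(await tolerant_dispatch("https://example.com", deep=True))
    try:
        await naive_dispatch("https://example.com", deep=True)
    except AttributeError as exc:
        print("Error Message:", exc)


if __name__ == "__main__":
    asyncio.run(main())
```

If the real cause is similar, checking for an async generator before reading attributes (as in `tolerant_dispatch`) would be one possible way to handle both return shapes.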
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
To see the expected behaviour, run: python arun_many_bug.py
Then to see the error, run: python arun_many_bug.py bfs
Code snippets
import asyncio
from crawl4ai import (
    AsyncWebCrawler,
    CrawlerRunConfig,
    BFSDeepCrawlStrategy,
    CacheMode,
)


async def basic_deep_crawl(bfs: bool = False):
    if bfs:
        print('-- BFS config --')
        config = CrawlerRunConfig(
            deep_crawl_strategy=BFSDeepCrawlStrategy(
                max_depth=1,
                max_pages=2,
                include_external=False,
            ),
            cache_mode=CacheMode.BYPASS,
            stream=True,
        )
    else:
        print('-- Default config --')
        config = CrawlerRunConfig(
            cache_mode=CacheMode.BYPASS,
            stream=True,
        )
    async with AsyncWebCrawler() as crawler:
        async for result in await crawler.arun_many(urls=['https://crawl4ai.com'], config=config):
            print('len(HTML): ', len(result.html))
            print('Success: ', result.success)
            print('Error Message: ', result.error_message)


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1 and sys.argv[1] == 'bfs':
        asyncio.run(basic_deep_crawl(True))
    else:
        asyncio.run(basic_deep_crawl(False))
OS
macOS
Python version
3.12.4
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
When running with default strategy:
When running with BFSDeepCrawlStrategy:
Hi @egianetto
Thanks for the heads-up! Just a quick note, we’ve already fixed this bug in the next branch.
Awesome! Thanks!
Good, I was just confused by this.
This is not fixed in 0.6.2, nor on the next branch (installed via pip install git+https://github.com/unclecode/crawl4ai.git@next)
Still getting 'async_generator' object has no attribute 'status_code' when using a deep_crawl_strategy with arun_many
I'm also still seeing this error in v0.6.2
I'm also seeing this on v0.6.2. Looks like Crawl4AI could do with some unit tests and CI/CD pipelines. I'd be happy to help build some!
Yes, I can confirm that it's still happening in 0.6.3. I will open the issue. @aravindkarnam
See https://github.com/unclecode/crawl4ai/issues/1205 for details on the issue.