[Bug]: Remote Browser Control with CDP
crawl4ai version
0.4.248
Expected Behavior
In the readme it says crawl4ai supports: 🔄 Remote Browser Control: Connect to Chrome Developer Tools Protocol for remote, large-scale data extraction.
However, none of the documentation mentions how to use crawl4ai with a remote browser via CDP.
@jlia0 You are right. There's no proper documentation for this. The good news is that we are working on a command-line utility that makes managing remote browser control with Crawl4AI a breeze. I'll update the docs as soon as it's ready and comment here too.
Hi @aravindkarnam, is there any progress? Thanks!!
@jlia0 Still testing it! Will update here soon as it's ready for release.
Can I use the remote browser feature with steel.dev?
Hey @aravindkarnam! Is this released yet?
@jlia0 I think you can pass cdp_url in BrowserConfig.
- Ref: https://github.com/unclecode/crawl4ai/blob/897e0173618d20fea5d8952ccdbcdad0febc0fee/crawl4ai/async_configs.py#L344
I haven't tried it yet; I'll try it and reply here with sample code if it works.
Edit: Here is sample code that worked for me:
import asyncio

from crawl4ai import AsyncWebCrawler, BrowserConfig

# Replace with your actual CDP WebSocket URL (e.g. from browserless)
CDP_URL = "wss://your-remote-host.com/devtools/browser/<id>?token=your-token"


async def main():
    # Attach to the remote browser over CDP instead of launching a local one
    async with AsyncWebCrawler(
        config=BrowserConfig(
            cdp_url=CDP_URL,
            headless=True
        )
    ) as crawler:
        result = await crawler.arun(url="https://www.google.com")
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
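If you run Chrome yourself rather than using a hosted provider, you may not have the WebSocket URL up front. A minimal sketch, assuming Chrome was started with --remote-debugging-port=9222 and is reachable over HTTP: Chrome serves a /json/version endpoint whose JSON body includes a webSocketDebuggerUrl field. The helper names (extract_ws_url, discover_cdp_url) and the default host and port are hypothetical, chosen only for illustration; this is not part of crawl4ai.

```python
import json
from urllib.request import urlopen


def extract_ws_url(version_payload: str) -> str:
    """Return the browser-level WebSocket URL from a /json/version response body."""
    info = json.loads(version_payload)
    return info["webSocketDebuggerUrl"]


def discover_cdp_url(host: str = "localhost", port: int = 9222, timeout: float = 5.0) -> str:
    """Ask a Chrome instance for its CDP WebSocket URL.

    Assumes Chrome is running with --remote-debugging-port=<port>
    and that host:port is reachable from this machine.
    """
    with urlopen(f"http://{host}:{port}/json/version", timeout=timeout) as resp:
        return extract_ws_url(resp.read().decode("utf-8"))
```

The returned string could then be used as the CDP_URL value in the sample above. Hosted services typically hand you the WebSocket URL directly, in which case this step is unnecessary.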