crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [@typescript-eslint/eslint-plugin](https://typescript-eslint.io/packages/eslint-plugin) ([source](https://togithub.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin)) | [`7.15.0` -> `7.16.1`](https://renovatebot.com/diffs/npm/@typescript-eslint%2feslint-plugin/7.15.0/7.16.1) |...
- Implement `SqliteStorageClient` as an alternative to `MemoryStorageClient`.
- Some reasoning can be found in https://github.com/apify/crawlee-python/issues/860
- It should be implemented via an ORM, to support multiple dialects out of...
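
To make the idea of a persistent alternative to in-memory storage concrete, here is a minimal sketch of a SQLite-backed key-value store. It is an assumption-laden illustration: the class name `SqliteKeyValueStore` and its methods are hypothetical and do not reflect Crawlee's storage client interface, and it uses the standard-library `sqlite3` module directly instead of an ORM, so it does not address the multi-dialect requirement.

```python
import json
import sqlite3
from typing import Any, Optional


class SqliteKeyValueStore:
    """Hypothetical SQLite-backed key-value store (not Crawlee's API)."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the sketch self-contained; pass a file path to persist.
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS records ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def set_value(self, key: str, value: Any) -> None:
        """Store a JSON-serializable value, overwriting any existing record."""
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
                (key, json.dumps(value)),
            )

    def get_value(self, key: str) -> Optional[Any]:
        """Return the stored value, or None if the key is missing."""
        row = self._conn.execute(
            "SELECT value FROM records WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else json.loads(row[0])

    def close(self) -> None:
        self._conn.close()
```

Swapping the raw SQL for an ORM model would be the step that opens the door to other database dialects; that part is left out here.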
Describe how to use curl-impersonate and how to wrap it so that it can be used as an alternative to httpx.
This update enhances the JSON format for the extracted data, providing more structure and additional details. Changes include:
- Added meta description extraction.
- Included all page links.
- Added...
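
As a rough illustration of the first two changes, the following sketch pulls the meta description and all link targets out of an HTML string using only the standard library. The names `PageSummaryParser` and `summarize_page`, and the keys of the returned dictionary, are hypothetical; the actual JSON format used in the PR is not shown in the source.

```python
from __future__ import annotations

from html.parser import HTMLParser


class PageSummaryParser(HTMLParser):
    """Collects the meta description and every <a href> value from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.meta_description: str | None = None
        self.links: list[str] = []

    def handle_starttag(self, tag: str, attrs: list[tuple[str, str | None]]) -> None:
        attr_map = dict(attrs)
        if tag == "meta" and (attr_map.get("name") or "").lower() == "description":
            self.meta_description = attr_map.get("content")
        elif tag == "a" and attr_map.get("href"):
            # Anchors without an href are skipped.
            self.links.append(attr_map["href"])


def summarize_page(url: str, html: str) -> dict:
    """Return a JSON-serializable summary (hypothetical shape) of one page."""
    parser = PageSummaryParser()
    parser.feed(html)
    return {
        "url": url,
        "meta_description": parser.meta_description,
        "links": parser.links,
    }
```

Links are kept exactly as they appear in the markup; resolving relative links against the page URL would be a separate step.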
### Description
This PR fixes the indentation of the statistics logging. It was originally indented by 8 spaces, but now it is changed to be all on...
Currently we only support Poetry, but it would be beneficial to also have a "raw pip" flavor of the templates.
## Goal state
- Correctly render docstrings in Docusaurus.
- We use [Google style](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html).
- Types should be rendered from type annotations (not docstrings).

### Arguments of (public) functions &...
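
For reference, a Google-style docstring that leaves types to the annotations might look like the sketch below. The function `count_unique_urls` and the helper `annotated_types` are made up for illustration and are not part of Crawlee; the helper only shows that the types a documentation tool needs can be read from the annotations with `typing.get_type_hints`.

```python
from typing import Callable, get_type_hints


def count_unique_urls(urls: list[str], *, ignore_case: bool = False) -> int:
    """Count the distinct URLs in a list.

    Args:
        urls: URLs to inspect. Duplicates are counted once.
        ignore_case: Whether URLs that differ only by letter case
            are treated as the same URL.

    Returns:
        The number of distinct URLs.
    """
    if ignore_case:
        return len({url.lower() for url in urls})
    return len(set(urls))


def annotated_types(func: Callable) -> dict:
    """Return parameter and return types taken from the annotations only."""
    return get_type_hints(func)
```

Note that the `Args` section names each parameter without repeating its type, so the annotation stays the single source of truth.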
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [mypy](https://www.mypy-lang.org/) ([source](https://togithub.com/python/mypy), [changelog](https://mypy-lang.blogspot.com/)) | `~1.10.0` -> `~1.11.0`...