actor-scraper
House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
From customer: > While we were working on a task created from the apify/puppeteer-scraper (build 2.0.8), we encountered a possible bug in the preNavigationHooks. Although we were setting a "pageLoadTimeoutSecs"...
Happens more often with web-scraper: some people try to return a jQuery instance (like `$('selector')` instead of `$('selector').text()`) or a DOM node in the returned object, and it either fails...
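To illustrate the difference between returning `$('selector')` and `$('selector').text()`, here is a minimal sketch of a check that flags non-plain values in a page function result. The helper `findNonPlainValues` and the `FakeJQuery` stand-in are hypothetical and exist only for this example; they are not part of any Apify scraper, and the real failure mode may differ.

```javascript
// Hypothetical helper (not part of any Apify scraper): returns the paths in a
// page function result that hold something other than plain JSON data.
function findNonPlainValues(value, path = 'result') {
    if (value === null) return [];
    const type = typeof value;
    if (type === 'string' || type === 'number' || type === 'boolean') return [];
    if (type !== 'object') return [path]; // functions, symbols, undefined, bigint
    if (Array.isArray(value)) {
        return value.flatMap((item, i) => findNonPlainValues(item, `${path}[${i}]`));
    }
    // Class instances (a jQuery collection, a DOM node) have a custom prototype.
    const proto = Object.getPrototypeOf(value);
    if (proto !== Object.prototype && proto !== null) return [path];
    return Object.entries(value).flatMap(([key, v]) => findNonPlainValues(v, `${path}.${key}`));
}

// Stand-in for a jQuery collection, assumed here so the sketch runs without jQuery.
class FakeJQuery {
    constructor(text) { this.innerText = text; }
    text() { return this.innerText; }
}
const $ = () => new FakeJQuery('Example title');

const bad = { title: $('h1') };         // returns the wrapper object itself
const good = { title: $('h1').text() }; // returns a plain string

console.log(findNonPlainValues(bad));  // [ 'result.title' ]
console.log(findNonPlainValues(good)); // []
```

A check like this could run on the returned object before it is stored, so the user gets a message naming the offending field.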
https://apify.com/apify/cheerio-scraper - links to https://sdk.apify.com/docs/api/autoscaledpool
https://apify.com/apify/puppeteer-scraper - links to https://sdk.apify.com/docs/api/autoscaledpool and https://sdk.apify.com/docs/api/puppeteerpool
Web Scraper and Cheerio Scraper already have READMEs and schemas of sufficient quality, but Puppeteer Scraper is lacking. The structure and format should be exactly the same as the existing...
https://www.tripadvisor.com.au/Restaurant_Review-g255060-d12544090-Reviews-The_Meat_Wine_Co-Sydney_New_South_Wales.html https://my.apify.com/view/runs/czTCEblwfeqrA5r33
JSON input is not user-friendly. This needs to be implemented on the platform first and then enabled in the input schema.
* The getting started guide doesn't mention how to enqueue requests manually; I found it somewhere in the SDK docs. Maybe giving an example of enqueueing requests manually in the...
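A minimal sketch of what such an example might look like, assuming the page function receives a `context` object that exposes `request` and an async `enqueueRequest` method. The URL, the `DETAIL` label, and the stub context below are hypothetical and only make the sketch runnable outside the platform.

```javascript
// Page function sketch: enqueue a follow-up request by hand.
async function pageFunction(context) {
    // Hypothetical URL and userData label, for illustration only.
    await context.enqueueRequest({
        url: 'https://www.example.com/detail',
        userData: { label: 'DETAIL' },
    });
    return { url: context.request.url };
}

// Stub standing in for the context the scraper would pass in (an assumption,
// not the real object), so the sketch can run locally.
const queued = [];
const stubContext = {
    request: { url: 'https://www.example.com' },
    enqueueRequest: async (req) => { queued.push(req); },
};

pageFunction(stubContext).then((result) => {
    console.log(result.url, queued.length);
});
```

In a real run the scraper supplies the context, so only the `pageFunction` body would go into the actor input.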
Bumps [http-cache-semantics](https://github.com/kornelski/http-cache-semantics) from 4.1.0 to 4.1.1.
Commits
* 2449650 Update mocha
* 560b2d8 Don't use regex to trim whitespace
* b1bdb92 Remove linting package zoo
* c20dc7e Cache 308
* See full diff in compare...