
Allow user to choose which pages to extract links from

inglesp opened this issue 9 years ago · 0 comments

We currently extract links from all pages that are on the same domain as the original URL that is passed to crawl.

This might be too narrow (for instance, a site may be spread over several subdomains) or too broad (for instance, somebody might be only interested in pages that are children of a particular URL).

We should find a way to let the user configure which pages links are extracted from.
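One possible shape for this, sketched below under assumed names (nothing here is part of http-crawler's current API): `crawl` could accept a `should_extract` predicate that takes a URL and returns whether links should be extracted from that page. The library could ship a few ready-made predicates covering the cases above, with the current same-domain behaviour as the default.

```python
from urllib.parse import urlparse


def same_domain(base_url):
    """Current behaviour: only pages on exactly the same domain."""
    base = urlparse(base_url).netloc
    return lambda url: urlparse(url).netloc == base


def same_site(base_url):
    """Broader: also accept subdomains (naive dot-suffix check)."""
    base = urlparse(base_url).netloc
    suffix = "." + base

    def predicate(url):
        netloc = urlparse(url).netloc
        return netloc == base or netloc.endswith(suffix)

    return predicate


def under_path(base_url):
    """Narrower: only pages whose URL starts with the given prefix."""
    return lambda url: url.startswith(base_url)
```

A caller would then pass one of these (or their own function) to a hypothetical `crawl(url, should_extract=...)` parameter, e.g. `crawl("https://example.com/docs/", should_extract=under_path("https://example.com/docs/"))`.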

inglesp · Jun 09 '16 11:06