http-crawler
Allow user to choose how links are extracted from responses
We currently extract links from HTML by looking for src and href attributes, and from CSS by looking for @import rules and URI tokens.
A user might want to extract links from other places, such as data-* attributes in HTML or string values in JSON objects.
We should find a way to allow the user to configure how links are extracted from responses.
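One possible shape for this, as a rough sketch only: let the caller pass a mapping from content type to an extractor function, where each extractor takes the response body and returns a list of link strings. Everything named here (html_extractor, json_extractor, extract_links, and the extractors mapping) is hypothetical and not part of the current http-crawler API. The sketch uses only the standard library and leaves CSS extraction and URL resolution out.

```python
import json
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Collects attribute values whose (lowercased) name passes `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        for name, value in attrs:
            if value and self.wanted(name):
                self.links.append(value)


def html_extractor(wanted=lambda name: name in ("src", "href")):
    """Hypothetical factory: build an HTML extractor from an attribute predicate."""
    def extract(body):
        parser = _AttrCollector(wanted)
        parser.feed(body)
        parser.close()
        return parser.links
    return extract


def json_extractor(keys):
    """Hypothetical factory: collect string values stored under any key in `keys`."""
    def extract(body):
        found = []

        def walk(node):
            if isinstance(node, dict):
                for key, value in node.items():
                    if key in keys and isinstance(value, str):
                        found.append(value)
                    else:
                        walk(value)
            elif isinstance(node, list):
                for item in node:
                    walk(item)

        walk(json.loads(body))
        return found
    return extract


def extract_links(content_type, body, extractors):
    """Dispatch on the media type, ignoring parameters such as charset."""
    media_type = content_type.split(";")[0].strip().lower()
    extractor = extractors.get(media_type)
    return extractor(body) if extractor else []


# Example configuration: also follow data-* attributes and JSON "url" values.
extractors = {
    "text/html": html_extractor(
        lambda name: name in ("src", "href") or name.startswith("data-")
    ),
    "application/json": json_extractor({"url"}),
}
```

With a mapping like this, the default behaviour would stay as it is when the user passes nothing, and a user who needs something unusual could supply their own function for a given content type.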
I would like to tackle this issue. 😃