
exclude content based on a list of tags, instead of removeStyleAndScriptTags

Open • mariusa opened this issue 4 years ago • 1 comment

There are a few use cases which aren't covered by the removeStyleAndScriptTags option:

  1. exclude other noise content, e.g. <svg>
  2. exclude style, but leave scripts
  3. exclude scripts which don't have

Would you please consider adding a more generic option, removeTags? Ideally, it would also support attributes, but just tag names would be very useful anyway.

Usage:

const config = {
    removeTags: ['style', 'link', 'script', 'svg']
}
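
To illustrate the behaviour being requested, here is a minimal sketch of what a removeTags option could do to the HTML before it is processed. The helper name and signature are hypothetical and are not part of nodejs-web-scraper's API. It uses naive regex matching only to stay dependency-free; a real implementation would more likely use an HTML parser, and this version does not handle nested elements of the same tag name.

```javascript
// Hypothetical helper, NOT part of nodejs-web-scraper: strips every
// element whose tag name appears in tagNames, together with its content.
// Assumption: naive regex matching is good enough for illustration.
function removeTags(html, tagNames) {
    let result = html;
    for (const name of tagNames) {
        // Escape the tag name in case it contains regex metacharacters.
        const tag = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        // Paired elements such as <script ...>...</script>, content included.
        const paired = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}\\s*>`, 'gi');
        // Leftover void or self-closing elements such as <link ...>.
        const single = new RegExp(`<${tag}\\b[^>]*/?>`, 'gi');
        result = result.replace(paired, '').replace(single, '');
    }
    return result;
}

// Example: use case 2 from the list above (drop styles, keep scripts).
const sample = '<head><style>p{}</style><script src="x.js"></script></head>';
console.log(removeTags(sample, ['style']));
```

Passing ['style', 'script'] would roughly correspond to what removeStyleAndScriptTags covers today, while any other list covers the cases above.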

Thanks

mariusa, Oct 07 '21 11:10

Hey, feel free to maybe do it yourself, and make a pull request, thus contributing :D

ibrod83, Oct 07 '21 15:10