nodejs-web-scraper
CollectContent: option to get attributes, besides node content
Hi, thanks for this useful library!
I'm trying to get the href value of this element:
<a class="spec" href="spec.pdf"><svg>download icon</svg></a>
I tried:
const spec = new CollectContent('a[class="spec"]', { name: 'spec', contentType: 'html' })
but this returns only the inner content of the <a> node (<svg> ...), without its attributes.
Cheerio has .attr('href'). Would it be possible to support this in CollectContent, similar to contentType: 'text'?
For example:
const spec = new CollectContent('a[class="spec"]', { name: 'spec', contentType: 'text', attributes: ['href'] })
This would include the listed attributes in the result, alongside name, data, etc.
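To illustrate the idea, here is a rough sketch of what such an option might do for each matched element. The helper `collectAttributes` and the `attributes` option are hypothetical and not part of nodejs-web-scraper; the sketch only assumes the matched element is a Cheerio-style wrapper exposing `.attr(name)`. A small stand-in object is used so it runs without Cheerio installed.

```javascript
// Hypothetical helper: NOT part of nodejs-web-scraper.
// `element` is assumed to be a Cheerio-style wrapper exposing .attr(name).
function collectAttributes(element, names) {
  const result = {};
  for (const name of names) {
    const value = element.attr(name);
    // A missing attribute comes back as undefined; skip it.
    if (value !== undefined) {
      result[name] = value;
    }
  }
  return result;
}

// Stand-in for the matched <a class="spec" href="spec.pdf"> element,
// so the sketch runs without any dependencies.
const fakeAnchor = {
  attr: (name) => ({ class: 'spec', href: 'spec.pdf' })[name],
};

console.log(collectAttributes(fakeAnchor, ['href', 'title']));
// prints { href: 'spec.pdf' }
```

With a real Cheerio element in place of `fakeAnchor`, the same call would return the requested attributes that are actually present on the node.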
Thanks!
@mariusa
Maybe this helps: https://github.com/ibrod83/nodejs-web-scraper/issues/29#issue-1303198018