nodejs-web-scraper

CollectContent: option to get attributes, besides node content

Open: mariusa opened this issue 4 years ago • 1 comment

Hi, thanks for this useful library!

I'm trying to get the href value of this element:

<a class="spec" href="spec.pdf"><svg>download icon</svg></a>

I tried:

const spec = new CollectContent('a[class="spec"]', { name: 'spec', contentType: 'html' })

but this returns the inner content of the a node (<svg> ...), without its attributes. Cheerio has .attr('href'). Would it be possible to support this in CollectContent, alongside contentType: 'text'? For example:

const spec = new CollectContent('a[class="spec"]', { name: 'spec', contentType: 'text', attributes: ['href'] })

This would include the listed attributes in the result, alongside name, data...
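To make the request concrete, here is a minimal sketch of what such an option could return. It is not part of nodejs-web-scraper's API: the helper name pickAttributes is hypothetical, and it only assumes the element exposes a Cheerio-style .attr(name) method. A plain object stands in for the real element so the snippet runs without any dependency.

```javascript
// Hypothetical helper, NOT part of nodejs-web-scraper's API.
// `element` is anything exposing a Cheerio-style `.attr(name)` method.
function pickAttributes(element, attributeNames) {
  const picked = {};
  for (const attributeName of attributeNames) {
    const value = element.attr(attributeName);
    // Cheerio returns undefined for a missing attribute; skip those.
    if (value !== undefined) {
      picked[attributeName] = value;
    }
  }
  return picked;
}

// Stand-in for the <a class="spec" href="spec.pdf"> element above,
// used here only so the sketch is runnable on its own.
const fakeAnchor = {
  attributes: { class: 'spec', href: 'spec.pdf' },
  attr(name) {
    return this.attributes[name];
  },
};

const result = pickAttributes(fakeAnchor, ['href', 'title']);
console.log(result); // { href: 'spec.pdf' }
```

Attributes that are not present on the element (title here) are simply left out, so the result only carries values that actually exist in the page.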

Thanks!

mariusa, Oct 07 '21 10:10

@mariusa

Maybe this helps: https://github.com/ibrod83/nodejs-web-scraper/issues/29#issue-1303198018

LydiaF, Jul 13 '22 09:07