node-open-graph

Requests for og being tagged as robot requests

Open YasharF opened this issue 9 years ago • 4 comments

Some sites recognize the requests from this module as robot requests and respond with a stripped-down version of the page that is missing the og data.

Example: https://www.sciencedaily.com/releases/2016/03/160310082606.htm

Mar 29 '16 17:03 YasharF

I think setting a user agent in the requests may resolve this problem, but I'm not sure.

Mar 29 '16 17:03 YasharF

Hmm, interesting. I haven't thought about this. Is it good for the internet to pretend not to be a robot? Is the module a robot or not? I guess it depends on how it's used.

Maybe an option to set a user agent would help with this. I'll have to think about it.

Aug 25 '16 20:08 samholmes

Do we have a user agent defined for requests going through this?

I am working with a site that does not provide the og tags unless the user agent matches one of those listed below:

"googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator|Viber|WhatsApp|Telegram"

Could the module set its user agent to one of these, or perhaps define a new one?

Oct 06 '18 10:10 theunexpected1

Until new APIs for customizing requests are settled on, you can fetch the HTML yourself using a library like request or axios. Once you have the HTML, you can pass it to the parse function:

og.parse(html)

This synchronous function returns the open graph metadata object without making any HTTP requests.
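A rough sketch of that workaround, for anyone landing here. It uses the global fetch in Node 18+ rather than request or axios, and the helper names (buildRequestOptions, getOpenGraph) are made up for illustration and are not part of this module. It assumes the package is installed as open-graph and that the target site returns og tags for the user agent you choose, which I have not verified for any particular site.

```javascript
// Sketch only: fetch the page yourself with a chosen User-Agent, then hand
// the HTML to a parser such as og.parse. Assumes Node 18+ (global fetch).

// Hypothetical helper: request options carrying the User-Agent header.
function buildRequestOptions(userAgent) {
  return { headers: { 'User-Agent': userAgent }, redirect: 'follow' };
}

// Hypothetical helper: downloads the HTML and passes it to `parse`.
// `fetchImpl` is injectable so the function can be tried without a network.
async function getOpenGraph(url, { userAgent, parse, fetchImpl = globalThis.fetch }) {
  const response = await fetchImpl(url, buildRequestOptions(userAgent));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const html = await response.text();
  return parse(html);
}

// Usage (assumption: the module is required as 'open-graph'):
// const og = require('open-graph');
// getOpenGraph('https://www.sciencedaily.com/releases/2016/03/160310082606.htm', {
//   userAgent: 'facebookexternalhit/1.1', // pick whatever the site accepts
//   parse: og.parse,
// }).then(console.log, console.error);
```

Passing parse in as an argument keeps the HTTP side separate from the parsing side, so you can swap in axios or another client without touching the og.parse call.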

Oct 08 '18 20:10 samholmes