Unhandled JSON.parse errors
Describe the bug
JSON.parse errors are not handled correctly, so the error result is thrown rather than returned by the Promise.
To Reproduce
Run ogs on the HTML output of, e.g., https://www.headlightsdepot.com. One of the JSON-LD script tags on that page contains line breaks, which causes JSON.parse to throw.
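For illustration, here is a minimal sketch of why a raw line break inside a JSON string makes JSON.parse throw. The payload and the tryParse helper are made up for this example and are not taken from the page or from ogs. The escaping step is only safe here because the single line break sits inside a string value.

```javascript
// Hypothetical JSON-LD payload: the "\n" below is a real line break character
// inside a JSON string literal, which the JSON grammar does not allow.
const rawJsonLd = '{"name": "Headlights\nDepot"}';

// Hypothetical helper: report the outcome of JSON.parse instead of throwing.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err };
  }
}

// Parsing the raw payload fails with a SyntaxError.
const strict = tryParse(rawJsonLd);

// Escaping the line break first lets the same payload parse.
const escaped = tryParse(rawJsonLd.replace(/\n/g, '\\n'));

console.log('raw parses:', strict.ok);
console.log('escaped parses:', escaped.ok);
```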
Expected behavior
I expected to be able to handle the error through the returned error result object. Instead, I had to wrap the call to await ogs in try/catch and handle the error result in the catch branch.
Actual behavior
The call to await ogs({html}) throws the ErrorResult
ogs version: 6.8.1
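The workaround described above might look roughly like this. To keep the sketch self-contained, fakeOgs is a hypothetical stand-in that rejects with an ErrorResult-shaped object, mirroring the reported behaviour. It is not the library's actual code.

```javascript
// Hypothetical stand-in for `ogs`: rejects with an ErrorResult-shaped object.
async function fakeOgs(options) {
  const errorResult = {
    success: false,
    requestUrl: undefined,
    error: 'Bad control character in string literal in JSON',
  };
  throw errorResult;
}

async function main() {
  let branch;
  try {
    await fakeOgs({ html: '<html></html>' });
    branch = 'returned';
  } catch (errorResult) {
    // The ErrorResult arrives here instead of being the resolved value.
    branch = 'thrown';
    console.log('handled in catch:', errorResult.error);
  }
  return branch;
}

const outcome = main();
```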
I'm getting UND_ERR_HEADERS_OVERFLOW (HeadersOverflowError: Headers Overflow Error) errors from this page. Are these the errors you are seeing? If so, you will need to open an issue on undici.
Example code:
const { fetch } = require('undici');

const getAPI = async () => {
  // Fetch the page and print its HTML body
  const request = await fetch('https://www.headlightsdepot.com');
  const text = await request.text();
  console.log('text:', text);
};

getAPI();
No, I'm seeing the error below. I'm running this from within a headless browser where I've already loaded the URL, so I'm passing the html option rather than the url:
{
  success: false,
  requestUrl: undefined,
  error: 'Bad control character in string literal in JSON at position 1004',
  errorDetails: SyntaxError: Bad control character in string literal in JSON at position 1004
      at JSON.parse (<anonymous>)
      at Element.<anonymous> (/Users/cjr/dev/node_modules/open-graph-scraper/dist/esm/lib/extract.js:105:30)
      at LoadedCheerio.each (/Users/cjr/dev/node_modules/cheerio/lib/api/traversing.js:519:26)
      at extractMetaTags (/Users/cjr/dev/node_modules/open-graph-scraper/dist/esm/lib/extract.js:97:21)
      at setOptionsAndReturnOpenGraphResults (/Users/cjr/dev/node_modules/open-graph-scraper/dist/esm/lib/openGraphScraper.js:24:48)
      at run (/Users/cjr/dev/node_modules/open-graph-scraper/dist/esm/index.js:26:56)
}
To be honest, I think my confusion came from the type of the main ogs function, which is Promise<SuccessResult | ErrorResult>. I can see that when there is an ErrorResult you throw it, so the behaviour is not quite what the return type suggests?
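For callers who would rather branch on the success field than use try/catch, one possible approach is a small wrapper that turns a rejection carrying a result object back into a resolved value. settleResult is a hypothetical helper written for this sketch and is not part of ogs. Anything that does not look like an error result is rethrown unchanged.

```javascript
// Hypothetical helper: resolve with the rejected value when it looks like an
// ErrorResult (success === false); rethrow anything else unchanged.
function settleResult(promise) {
  return promise.catch((reason) => {
    if (reason && reason.success === false) {
      return reason;
    }
    throw reason;
  });
}

// A rejection shaped like the ErrorResult shown in this thread.
const rejecting = Promise.reject({ success: false, error: 'parse failed' });

const settled = settleResult(rejecting).then((result) => {
  if (!result.success) {
    console.log('handled without try/catch:', result.error);
  }
  return result;
});
```

In real use the argument would be the promise returned by the scraper call, for example settleResult(ogs({ html })).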
Hello, thanks for the full error. This should be fixed in [email protected].
Closing this for now. Please open a new ticket if you see any other issues.