Improvements in data structure (keywords and image added) and corrections in href logic
Hello, @thomastuts,
I made some changes to article-extractor to improve the data structure returned by its main extractArticle function.
First, I added two new attributes:
- keywords: obtained from tags with the name "keywords" and from the "swiftype > keys" variant (commonly used on article pages; see Engadget.com and all Vox Media article pages).
- image: obtained from one of two sources: a scored ranking of all <img> elements in the <body> or <main> section or, failing that, tags related to the "swiftype > image" variant.
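To make the two new attributes concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes the page's tags have already been parsed into plain objects, and the helper names (extractKeywords, pickImage), the Swiftype tag shape (class "swiftype" plus a name) and the area-based image score are all illustrative assumptions; the actual scoring used in the PR may differ.

```javascript
// Sketch only: plain objects stand in for parsed <meta> and <img> tags.

// Collect keywords from <meta name="keywords"> and Swiftype "keys" tags,
// splitting comma-separated values and dropping duplicates.
function extractKeywords(metaTags) {
  const keywords = [];
  for (const tag of metaTags) {
    const isStandard = tag.name === "keywords";
    const isSwiftype = tag.className === "swiftype" && tag.name === "keys";
    if (!isStandard && !isSwiftype) continue;
    for (const raw of String(tag.content || "").split(",")) {
      const keyword = raw.trim();
      if (keyword && !keywords.includes(keyword)) keywords.push(keyword);
    }
  }
  return keywords;
}

// Assumed heuristic: score each image by pixel area and keep the largest.
// If no image scores above zero, fall back to the Swiftype "image" tag.
function pickImage(images, metaTags) {
  let best = null;
  let bestScore = 0;
  for (const img of images) {
    const score = (img.width || 0) * (img.height || 0);
    if (img.src && score > bestScore) {
      best = img.src;
      bestScore = score;
    }
  }
  if (best) return best;
  const fallback = metaTags.find(
    (tag) => tag.className === "swiftype" && tag.name === "image"
  );
  return fallback ? fallback.content : null;
}

// Example input (made-up values).
const sampleMeta = [
  { name: "keywords", content: "node, scraping" },
  { className: "swiftype", name: "keys", content: "scraping" },
  { className: "swiftype", name: "keys", content: "articles" },
  { className: "swiftype", name: "image", content: "https://example.com/fallback.jpg" },
];
const sampleImages = [
  { src: "icon.png", width: 10, height: 10 },
  { src: "hero.png", width: 600, height: 400 },
];

console.log(extractKeywords(sampleMeta));
console.log(pickImage(sampleImages, sampleMeta));
```

With scored images present, the ranked result wins; the Swiftype tag is only consulted when no usable <img> is found, which mirrors the "otherwise" ordering described above.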
I also changed how author data is obtained, using tags related to the "swiftype > blogger_name" variant.
Note: the author image can be obtained from blogger_image and may be pushed to a new metadata property in a future improvement.
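As a rough illustration of the author lookup and the possible future metadata property, the sketch below again works on plain objects standing in for parsed tags. The helper names (swiftypeValue, extractAuthor) and the shape of the returned object, including where metadata lives, are hypothetical and not taken from the PR.

```javascript
// Sketch only: plain objects stand in for parsed <meta> tags.

// Return the trimmed content of the first Swiftype tag with the given name.
function swiftypeValue(metaTags, name) {
  const tag = metaTags.find(
    (t) => t.className === "swiftype" && t.name === name
  );
  return tag && tag.content ? tag.content.trim() : null;
}

// Build an author object from the "blogger_name" variant.
function extractAuthor(metaTags) {
  const author = { name: swiftypeValue(metaTags, "blogger_name") };
  // Hypothetical future field: the note above only suggests a "metadata" property.
  const image = swiftypeValue(metaTags, "blogger_image");
  if (image) author.metadata = { image };
  return author;
}

// Example input (made-up values).
console.log(
  extractAuthor([
    { className: "swiftype", name: "blogger_name", content: " Jane Doe " },
    { className: "swiftype", name: "blogger_image", content: "https://example.com/jane.png" },
  ])
);
```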
The documentation was updated slightly to cover these new fields, and the minor version was incremented to 1.1.0.