corpuscrawler
Portuguese: question about the corpus result
I was analyzing the output file and noticed that the text for each "news" item contains only the title, the headline, and the first paragraph. Is this the expected behavior? I'm using the crawler for the "pt" language.
Please don’t hesitate to make changes to improve the current state! Your pull requests would certainly be welcome.