
the number of articles

Open eckolemon opened this issue 6 years ago • 2 comments

Hi, I wonder how many articles you have extracted using this Python file?

eckolemon avatar Nov 08 '19 08:11 eckolemon

@jeffheaton

eckolemon avatar Nov 08 '19 08:11 eckolemon

@eckolemon

I ran this script over the English corpus dump of 2019/01/01 and got these results:

Total pages: 19,096,287
Template pages: 639,391
Article pages: 8,788,731
Redirect pages: 9,668,165
Elapsed time: 0:37:17.66

ianomad avatar Nov 17 '19 01:11 ianomad
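
For anyone who wants to reproduce a tally like the one above (the three categories do add up to the reported total: 639,391 + 8,788,731 + 9,668,165 = 19,096,287), here is a minimal sketch of how such counts could be produced by streaming a MediaWiki XML export. It is not the script from this repository. The classification rule is an assumption: a page is counted as a template when its `<ns>` is `10`, as a redirect when it has a `<redirect>` child, and as an article otherwise. The names `count_pages`, `local_name`, and `SAMPLE` are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def count_pages(source):
    """Stream a MediaWiki XML export and tally pages by kind.

    Assumed rule (not taken from the original script):
    <ns>10</ns> is a template, a <redirect> child marks a redirect,
    and every other page is counted as an article.
    """
    counts = Counter()
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        ns = None
        is_redirect = False
        for child in elem:
            name = local_name(child.tag)
            if name == "ns":
                ns = (child.text or "").strip()
            elif name == "redirect":
                is_redirect = True
        counts["total"] += 1
        if ns == "10":
            counts["template"] += 1
        elif is_redirect:
            counts["redirect"] += 1
        else:
            counts["article"] += 1
        # Release the page's children; the emptied element stays on the root.
        elem.clear()
    return counts


# Tiny hand-written sample standing in for a real dump.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Python</title><ns>0</ns>
    <revision><text>body</text></revision></page>
  <page><title>Py</title><ns>0</ns><redirect title="Python" />
    <revision><text>#REDIRECT [[Python]]</text></revision></page>
  <page><title>Template:Infobox</title><ns>10</ns>
    <revision><text>body</text></revision></page>
</mediawiki>"""

counts = count_pages(io.BytesIO(SAMPLE.encode("utf-8")))
print(dict(counts))
```

For a real compressed dump, the same function should accept the file object returned by `bz2.open(path, "rb")`, where `path` is wherever you saved the dump. Counts from this sketch may differ from the numbers above if the original script classifies pages differently.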