document that pagecount files which pass metadata still might not parse

Open reidpr opened this issue 12 years ago • 0 comments

Guaranteeing that every pagecount file that passes metadata will also parse 100% correctly would mean excluding quite a lot of files, such as all of February and half of March 2013.

For example, see line 83681 of pagecounts-20130201-000000.gz.

wp-update-metadata has been updated, but the documentation has not.

Note that one could put in a simple input filter, e.g. based on grep, to filter out non-parsing files before they hit Python.
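
A rough sketch of such a filter follows, written in Python for illustration rather than as a grep one-liner. It assumes each pagecount line has four space-separated fields (project, page title, request count, bytes), which is the published pagecounts layout but may not match exactly what the parser here requires. `LINE_RE`, `first_bad_line`, and `parseable_files` are hypothetical names, not part of quac.

```python
import gzip
import re

# Assumed line shape: "<project> <page title> <request count> <bytes>".
# This is an illustration; the real parser may be stricter or looser.
LINE_RE = re.compile(rb"^\S+ \S+ \d+ \d+$")


def first_bad_line(path):
    """Return the 1-based number of the first line in a gzipped file that
    does not match LINE_RE, or None if every line matches."""
    with gzip.open(path, "rb") as fp:
        for lineno, line in enumerate(fp, start=1):
            if not LINE_RE.match(line.rstrip(b"\r\n")):
                return lineno
    return None


def parseable_files(paths):
    """Yield only the paths whose every line matches LINE_RE."""
    for path in paths:
        if first_bad_line(path) is None:
            yield path
```

Calling `first_bad_line` on a suspect file reports where the first non-matching line is, which could also help when documenting which files are affected.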

reidpr, Apr 02 '14 18:04