
Meaning of file name in masks

Open veronica320 opened this issue 6 years ago • 0 comments

Hello again,

Another question about the mask file: in the following line from the readme,

{"_id":{"$oid":"56bdbf0fe405a41c1f8b4569"},"0":0,"1":2,"2":"1462817114-25","3":"727911955-101"}

what does 25 in 1462817114-25 denote (and similarly, 101 in 727911955-101)? The readme says these are file names, but I checked the corpus and found only file names such as 1462817114 and 727911955, without the trailing part. Do 25 and 101 refer to a line index? If so, the number often exceeds the total number of lines in the file.
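To make the question concrete, here is a minimal sketch of how the two fields can be split into the part that matches a corpus file name and the trailing part. The helper name `split_mask_id` is hypothetical and not part of the dataset's tooling, and the sketch makes no assumption about what the trailing number means, since that is exactly what is being asked.

```python
import json

# The example mask line quoted from the readme.
line = ('{"_id":{"$oid":"56bdbf0fe405a41c1f8b4569"},"0":0,"1":2,'
        '"2":"1462817114-25","3":"727911955-101"}')


def split_mask_id(value):
    """Split 'NAME-SUFFIX' at the last hyphen (hypothetical helper).

    Returns (name, suffix). If there is no hyphen, the whole value is
    treated as the name and the suffix is an empty string.
    """
    name, sep, suffix = value.rpartition("-")
    if not sep:
        return value, ""
    return name, suffix


record = json.loads(line)
for key in ("2", "3"):
    name, suffix = split_mask_id(record[key])
    # 'name' is the part that matches a file name in the corpus;
    # the meaning of 'suffix' is the open question.
    print(key, name, suffix)
```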

Thanks again for your help!

veronica320 · Apr 08 '19 07:04