_read_data function skips the last sentence
Dear authors,
While going over the code that prepares the input sentences for NER fine-tuning, I noticed what I think is a small bug in the _read_data function: the last sentence is never added to the lines array. Here is the current output for the first two sentences of the BC2GM train split:
[['O O O O O B I I O O O O O O O O B I I O O O O O O O O O O O', 'Immunohistochemical staining was positive for S - 100 in all 9 cases stained , positive for HMB - 45 in 9 ( 90 % ) of 10 , and negative'], ['O B O O O O O O O O O O O O O O O O', 'for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections .']]
Expected output after fixing the bug:
[['O O O O O B I I O O O O O O O O B I I O O O O O O O O O O O', 'Immunohistochemical staining was positive for S - 100 in all 9 cases stained , positive for HMB - 45 in 9 ( 90 % ) of 10 , and negative'], ['O B O O O O O O O O O O O O O O O O', 'for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections .'], ['B I O O O O O B O O O O O O O B I O O O B I I O O O O O O O', 'Chloramphenicol acetyltransferase assays examining the ability of IE86 to repress activity from the HCMV major IE promoter or activate the HCMV early promoter for the 2 . 2 - kb'], ['O O O O O O O O O B I O', 'class of RNAs demonstrated the functional integrity of the IE86 protein .']]
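My guess is that the word and label buffers are only flushed when a blank line is read, so a file that does not end with a blank line loses its final sentence. I have not confirmed this against every code path, so treat the following as a minimal sketch of the idea rather than a patch: read_bio_sentences and its helper flush are hypothetical names, the input is assumed to hold one token per line with the label in the last column, and the splitting of long sentences into 30-token chunks seen in the output above is left out.

```python
import io


def read_bio_sentences(handle):
    """Collect [labels, words] pairs from token-per-line BIO text.

    Hypothetical stand-in for _read_data, not the repository's code.
    """
    lines, words, labels = [], [], []

    def flush():
        # Emit the buffered sentence, if any, and reset both buffers.
        if words:
            lines.append([" ".join(labels), " ".join(words)])
            words.clear()
            labels.clear()

    for raw in handle:
        parts = raw.split()
        if not parts:
            # A blank line marks the end of a sentence.
            flush()
        else:
            words.append(parts[0])    # token is the first column
            labels.append(parts[-1])  # label is the last column

    # Flush once more after the loop: without this call, the last sentence
    # is dropped whenever the file has no trailing blank line.
    flush()
    return lines


# Two short sentences, deliberately without a trailing blank line.
sample = "IE86\tB\nprotein\tO\n\nS\tB\n-\tI\n100\tI\n"
result = read_bio_sentences(io.StringIO(sample))
print(result)
```

With the final flush() call in place, both sentences of the sample are returned. Removing that call reproduces the behaviour described above, where only the first sentence comes back.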