MedKhem comments

Results: 14 comments by MedKhem

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

Hi! Thank you for pointing this out. It shouldn't be an issue for the training, though :)

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

Could you please upload the data for both cases to the GitHub repo so I can have a look?

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

Please check out the latest version and let me know if I can close this issue.

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

OK, so this means the wrongly escaped line breaks are fixed ;) Can you send me the corresponding PDF?

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

Yeah. The fix was for issue #24. Sorry for the confusion. Regarding this "issue", it's not actually an issue. In fact, for the annotation, the training files are not...

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

This shouldn't introduce noise into the training. Have you annotated the same files in two modes (pre-annotated and manually annotated) and noticed that there is a difference in...

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

@gabays Same question: do you have the same data annotated in the two modes, so I can use it to reproduce the problem?

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

@PonteIneptique @gabays The way GROBID is designed, line breaks are not used as characters in the training. Only the text is used for the training and the line...

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

It depends on where you add them :) For the **lexical entry** level, if you add a new line between elements of a lexical entry (e.g. \, \,..), that's fine. But...

createAnnotatedTrainingDictionaryBodySegmentation generates linebreak in the resulting TEI file

No, I do use it :) But as I told you, this is done in a previous stage. We cannot draw a general conclusion about the performance of a...