Is there a way to learn off-topic data?
My question is whether it is useful to train on sentences that contain no named entities in order to increase precision and recall. That way, the model could learn which sentences/contexts contain named entities and which do not (like off-topic data). Or should I only provide training data that contains named entities?
I don't understand what you are trying to ask.
If I understand correctly, you are trying to extend your dataset with data that has no entity labels. In this case, your precision and recall will only increase for the "O" tag (BILOU). It may also make your named-entity scores worse. Give it a try and run the conlleval script; it will clarify what I'm trying to explain.
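To make the evaluation side of this concrete, here is a minimal sketch of entity-level scoring over BILOU tags. It is not the conlleval script and not part of the MITIE API; the function names (`extract_spans`, `entity_precision_recall`, `token_accuracy`) and the toy tag sequences are made up for illustration, and the span extraction is simplified (it does not validate "I-" tags).

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index inclusive, entity type)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BILOU tag sequence; "O" tokens yield nothing."""
    spans: Set[Span] = set()
    start = None  # index of the currently open "B-" tag, if any
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "U":
            spans.add((i, i, label))
            start = None
        elif prefix == "B":
            start = i
        elif prefix == "L":
            # Close the span only if it was opened with the same entity type.
            if start is not None and tags[start].partition("-")[2] == label:
                spans.add((start, i, label))
            start = None
        elif prefix == "O":
            start = None
        # "I-" tags simply continue an open span (not validated in this sketch).
    return spans


def entity_precision_recall(gold: List[str], pred: List[str]) -> Tuple[float, float]:
    """Exact-match precision and recall over entity spans, ignoring "O" tokens."""
    gold_spans = extract_spans(gold)
    pred_spans = extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    return precision, recall


def token_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of tokens whose tag matches, "O" tokens included."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Toy data: one sentence with two entities, one of which the model misses.
gold = ["B-PER", "L-PER", "O", "U-LOC"]
pred = ["B-PER", "L-PER", "O", "O"]

# The same data plus an entity-free sentence that the model tags correctly.
gold_plus = gold + ["O", "O", "O"]
pred_plus = pred + ["O", "O", "O"]

print(entity_precision_recall(gold, pred), token_accuracy(gold, pred))
print(entity_precision_recall(gold_plus, pred_plus), token_accuracy(gold_plus, pred_plus))
```

In this toy case the correctly tagged entity-free sentence raises token-level accuracy but leaves entity-level precision and recall unchanged. Entity scores on such sentences only move when the model predicts an entity where there is none, which lowers precision. Whether adding these sentences to the training set helps or hurts the trained model is a separate question that needs to be measured on your own data.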