BERT-Relation-Extraction

evaluation error

Open GhadaAlfattni opened this issue 5 years ago • 3 comments

Hi

I was trying to run your code on another task. I have processed the data to match the SemEval format, updated the number of relations, and edited the two evaluation files to use my relation types. However, I keep getting the error below. Any idea why?

[Epoch: 1, 3216/ 32185 points] total loss, accuracy per batch: 0.703, 0.733
[Epoch: 1, 6432/ 32185 points] total loss, accuracy per batch: 0.610, 0.774
[Epoch: 1, 9648/ 32185 points] total loss, accuracy per batch: 0.567, 0.794
[Epoch: 1, 12864/ 32185 points] total loss, accuracy per batch: 0.577, 0.780
[Epoch: 1, 16080/ 32185 points] total loss, accuracy per batch: 0.561, 0.785
[Epoch: 1, 19296/ 32185 points] total loss, accuracy per batch: 0.536, 0.793
[Epoch: 1, 22512/ 32185 points] total loss, accuracy per batch: 0.545, 0.799
[Epoch: 1, 25728/ 32185 points] total loss, accuracy per batch: 0.540, 0.792
[Epoch: 1, 28944/ 32185 points] total loss, accuracy per batch: 0.522, 0.799
[Epoch: 1, 32160/ 32185 points] total loss, accuracy per batch: 0.489, 0.810
06/21/2020 12:24:46 AM [INFO]: Evaluating test samples...
0%| | 1/1644 [00:15<7:00:27, 15.35s/it]
Traceback (most recent call last):
  File "main_task.py", line 48, in <module>
    net = train_and_fit(args)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/tasks/trainer.py", line 159, in train_and_fit
    results = evaluate_results(net, test_loader, pad_id, cuda)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/tasks/train_funcs.py", line 93, in evaluate_results
    e1_e2_start=e1_e2_start)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/model/BERT/modeling_bert.py", line 734, in forward
    embedding_output = self.embeddings(input_ids=input_ids, position_ids=position_ids, token_type_ids=token_type_ids, inputs_embeds=inputs_embeds)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/model/BERT/modeling_bert.py", line 177, in forward
    position_embeddings = self.position_embeddings(position_ids)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/functional.py", line 1724, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

GhadaAlfattni, Jun 21 '20 10:06

This probably has to do with your test_loader outputs.
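
One way to check this, sketched under assumptions: the traceback ends in `self.position_embeddings(position_ids)`, and an embedding lookup raises `IndexError: index out of range in self` when an index is at least as large as the number of rows in the table. For position embeddings that can happen when a sequence is longer than the table allows; for token embeddings, when a token id exceeds the vocabulary size. The helper `find_bad_sequences` below is hypothetical and not part of the repository. It takes batches as plain lists of token ids, and the defaults `max_positions=512` and `vocab_size=30522` are assumed values that should be read from your own model config instead.

```python
from typing import Iterable, List, Tuple


def find_bad_sequences(
    batches: Iterable[List[List[int]]],
    max_positions: int = 512,  # assumed limit; read it from your model config
    vocab_size: int = 30522,   # assumed size; larger if special tokens were added
) -> List[Tuple[int, int, str]]:
    """Return (batch_idx, row_idx, reason) for every sequence that would
    index past the position or token embedding table."""
    problems = []
    for b, batch in enumerate(batches):
        for r, token_ids in enumerate(batch):
            if len(token_ids) > max_positions:
                problems.append((b, r, f"length {len(token_ids)} > {max_positions}"))
            if token_ids and max(token_ids) >= vocab_size:
                problems.append((b, r, f"token id {max(token_ids)} >= {vocab_size}"))
    return problems


# Toy data: one sequence is too long, another contains an id that is too large.
toy_batches = [
    [[101, 2023, 102], [101] + [2023] * 600 + [102]],
    [[101, 40000, 102]],
]
report = find_bad_sequences(toy_batches)
for item in report:
    print(item)
```

With a real loader, the input-id tensor of each batch can be converted with `.tolist()` before being passed in. Any row reported here would be a candidate for truncation or for a closer look at the preprocessing.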

plkmo, Aug 19 '20 23:08

It seems the number of relation types in your data differs from the default. Note that the order of the named entities within a relation matters: the same relation with its entities in the reverse order counts as a separate relation type.
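
To illustrate the point about entity order, here is a minimal sketch that counts classes in a list of labels written in the SemEval-2010 Task 8 style, where the argument order is part of the label string. The label list and the helper `count_relation_classes` are made up for illustration; substitute the label column of your own data.

```python
from collections import Counter

# Hypothetical labels; the (e1,e2) / (e2,e1) suffix encodes the direction.
labels = [
    "Cause-Effect(e1,e2)",
    "Cause-Effect(e2,e1)",
    "Component-Whole(e1,e2)",
    "Other",
    "Cause-Effect(e1,e2)",
]


def count_relation_classes(labels):
    """Count distinct label strings; each direction is its own class."""
    return Counter(label.strip() for label in labels)


class_counts = count_relation_classes(labels)
num_classes = len(class_counts)

# Relation names with the direction suffix removed, for comparison.
undirected = {label.split("(")[0] for label in labels}

print(num_classes)      # directed classes the model must predict
print(len(undirected))  # relation names ignoring direction
```

The first number is the one the configured number of relations has to match, since the two directions of a relation are distinct classes.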

VahidehReshadat, Nov 06 '20 10:11

If you want to run the task on your own data, note that in preprocessing_funcs.py, line 70, rm = Relations_Mapper(df_train['relations']), the relation classes are mapped using the train set only. So you will need either to ensure that all relation classes appear in the train set, or to modify the code so that the relation classes from both the train and test sets are captured. This may be what is causing your errors.
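
A minimal sketch of the second option, under assumptions: `build_relation_maps` is a hypothetical stand-in written for this example, not the repository's `Relations_Mapper`, and the label lists are invented. It builds the label-to-index map from the union of every column passed in, and first shows how a train-only map can miss a class that appears only in the test set.

```python
def build_relation_maps(*label_columns):
    """Build label<->index maps from the union of all given label columns."""
    classes = sorted({label.strip() for column in label_columns for label in column})
    rel2idx = {rel: i for i, rel in enumerate(classes)}
    idx2rel = {i: rel for rel, i in rel2idx.items()}
    return rel2idx, idx2rel


# Hypothetical data: one direction occurs only in the test set.
train_relations = ["treats(e1,e2)", "causes(e1,e2)", "Other"]
test_relations = ["treats(e1,e2)", "causes(e2,e1)"]

# Mapping built from the train set alone, as in the line quoted above.
train_only, _ = build_relation_maps(train_relations)
missing = sorted(set(test_relations) - set(train_only))
print(missing)  # test labels with no index in the train-only map

# Mapping built from train + test covers every label.
rel2idx, idx2rel = build_relation_maps(train_relations, test_relations)
print(len(rel2idx))
```

If the mapping is widened this way, the configured number of relation classes would need to equal `len(rel2idx)` as well.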

plkmo, Nov 07 '20 00:11