TypeError: forward() missing 2 required positional arguments: 'document_batch' and 'document_sequence_lengths'
I don't understand this error message. Has anybody experienced anything similar?
I am running the training script provided in the examples on my own dataset, making sure to pass the data in the same format as the examples. I was getting a CUDA error before, but after lowering the batch size, this is the error that prevents the code from running.
Here is the whole error log:
File "/disk/ocean/zein/neural/BERT_4_doc_class_training.py", line 141, in <module>
model.fit((train_documents, train_labels), (dev_documents,dev_labels))
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/bert_document_classification/document_bert.py", line 185, in fit
batch_document_sequence_lengths, device=self.args['device'])
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
output.reraise()
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/torch/_utils.py", line 394, in reraise
raise self.exc_type(msg)
TypeError: Caught TypeError in replica 3 on device 3.
Original Traceback (most recent call last):
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
output = module(*input, **kwargs)
File "/disk/ocean/zein/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
TypeError: forward() missing 2 required positional arguments: 'document_batch' and 'document_sequence_lengths'
@sjmielke @AndriyMulyar could this have anything to do with a package version? Could you provide a requirements file with the exact package versions used to successfully run the code?
For more context, I was initially facing the same issue as the one described in this post https://github.com/AndriyMulyar/bert_document_classification/issues/18 and have downgraded torch to version 1.4.0 as described in the comments.
I managed to get the code to run by reducing the number of GPUs used. I was initially using all 4 of the GPUs I had available, but after limiting it to only 1 GPU the code runs. I don't know what might be causing the issue with multiple GPUs.
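For anyone else hitting this, here is a minimal sketch of one way to restrict a script to a single GPU, assuming the standard `CUDA_VISIBLE_DEVICES` environment variable is used. The helper name `limit_visible_gpus` is hypothetical and not part of bert_document_classification, and this is not necessarily the exact method used above. The variable has to be set before torch initializes CUDA, so the call belongs at the very top of the training script.

```python
import os


def limit_visible_gpus(gpu_ids):
    """Expose only the given GPU indices to CUDA.

    Hypothetical helper for illustration. It must run before torch
    (or any other library) initializes CUDA, otherwise it has no effect.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Keep only the first GPU visible; import torch and build the model after this.
limit_visible_gpus([0])
```

The same effect can be had from the shell by prefixing the command with `CUDA_VISIBLE_DEVICES=0`. As for the cause, one possible explanation (an assumption, not verified against this run) is that `DataParallel` splits the positional tensors along the batch dimension while copying non-tensor keyword arguments such as `device` to every replica, so a batch smaller than the number of GPUs could leave the last replica with keyword arguments but no positional inputs.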