Failing on training; trainer.train()

Open rohullaa opened this issue 3 years ago • 1 comments

Hey,

I am trying to run a simple text classification on IPUs with PopTorch and Optimum. When I start the training with:

trainer.train()

I get the following error:

Traceback (most recent call last):
  File "IPUs/train.py", line 117, in <module>
    trainer.train()
  File "FOLDER/env/lib/python3.6/site-packages/optimum/graphcore/trainer.py", line 904, in train
    self._compile_model(model, next(iter(train_dataloader)), log=True)
  File "FOLDER/env/lib/python3.6/site-packages/optimum/graphcore/trainer.py", line 375, in _compile_model
    model.compile(**sample_batch)
  File "FOLDER/env/lib/python3.6/site-packages/poptorch/_poplar_executor.py", line 651, in compile
    self._compile(in_tensors)
  File "FOLDER/env/lib/python3.6/site-packages/poptorch/_impl.py", line 259, in wrapper
    return func(self, *args, **kwargs)
  File "FOLDER/env/lib/python3.6/site-packages/poptorch/_poplar_executor.py", line 569, in _compile
    self._executable = poptorch_core.compileWithTrace(*trace_args)
poptorch.poptorch_core.Error: In poptorch/python/poptorch.cpp:1371: 'std::out_of_range': basic_string::replace: __pos (which is 5) > this->size() (which is 0)
Error raised in:
  [0] Compiler::initSession
  [1] LowerToPopart::compile
  [2] compileWithTrace

Can someone please help me with this error?

rohullaa • Dec 21 '22 18:12

Hi @rohullaa, sorry for not seeing this earlier. Optimum and Transformers no longer support Python 3.6, so the error might be related to that. If you still encounter it on Python 3.8, I would need to know which model from Optimum you are trying to run when the failure occurs.
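
To rule out the interpreter version quickly, a check like the following could go at the top of the training script. This is only a minimal sketch: the (3, 8) minimum is an assumption taken from this reply rather than from an official support matrix, and `python_is_supported` is a hypothetical helper name, not part of Optimum or PopTorch.

```python
import sys

# Assumption: 3.8 is the version suggested in this thread, not an official minimum.
MIN_SUPPORTED = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_SUPPORTED):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if python_is_supported():
        print("Python " + current + " meets the assumed minimum.")
    else:
        # The traceback above shows python3.6 in the site-packages path,
        # which would land in this branch.
        print("Python " + current + " is older than the assumed minimum of 3.8.")
```

If the check passes and the same `std::out_of_range` error still appears, the Python version is probably not the cause, and the model name becomes the more useful piece of information.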

payoto • Jan 27 '23 13:01