Is it not possible to `load` a transformer model on CPU only?
What happened?
I need to run inference with BERT4Rec on a CPU-only instance, but I can't.
When I try to load the fitted model, I get a PyTorch error:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
This occurs in __setstate__ at rectools/models/nn/transformers/base.py:596, and it’s currently impossible to pass the map_location parameter when loading the model.
Is there some workaround for this?
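For reference, this is the pattern the error message itself suggests in plain PyTorch: pass `map_location` so all storages are remapped to CPU at load time. The file name and saved object here are purely illustrative; the point of this issue is that the RecTools loader does not currently expose this parameter.

```python
import os
import tempfile

import torch

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.pt")
    # Save a small state dict (stand-in for real model weights).
    torch.save({"w": torch.ones(3)}, path)
    # map_location remaps all storages to CPU, avoiding the CUDA
    # deserialization error on a CPU-only machine.
    state = torch.load(path, map_location=torch.device("cpu"))

print(state["w"].device)
```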
Expected behavior
No response
Relevant logs and/or screenshots
No response
Operating System
Ubuntu
Python Version
3.11
RecTools version
0.13.0
Hi!
Thank you for highlighting this.
We ourselves usually use PyTorch Lightning checkpoints in practice for saving and loading models. E.g. you can create a callback that saves a model checkpoint on the last epoch during training, then load that checkpoint with the `load_from_checkpoint` method and pass `map_location` there.
Is it ready for production use?
As I understand it, when using `load_from_checkpoint`, I can't use the `recommend` method without fitting the model again.
And I don’t want to fit the model every time the server starts :)
We do have a known issue where a loaded model can't be saved again without refitting, due to pytorch-lightning-specific behaviour. But we haven't had any problems with recommending. Which problems do you face in this scenario?