Error when exporting model weights, and how to fix it
While exporting model weights today, I ran into the following error:
PS E:\selfplay\KataGo\python> python ./export_model_pytorch.py -checkpoint "E:\selfplay\train\checkpoint.ckpt" -export-dir E:\selfplay\models -filename-prefix b1c6nbt -model-name b1c6nbt
['./export_model_pytorch.py', '-checkpoint', 'E:\selfplay\train\checkpoint.ckpt', '-export-dir', 'E:\selfplay\models', '-filename-prefix', 'b1c6nbt', '-model-name', 'b1c6nbt']
Traceback (most recent call last):
File "E:\selfplay\KataGo\python\export_model_pytorch.py", line 461, in weights_only argument in torch.load from False to True. Re-running torch.load with weights_only set to False will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
(2) Alternatively, to load with weights_only=True please check the recommended steps in the following error message.
WeightsUnpickler error: Unsupported global: GLOBAL collections.defaultdict was not an allowed global by default. Please use torch.serialization.add_safe_globals([defaultdict]) or the torch.serialization.safe_globals([defaultdict]) context manager to allowlist this global if you trust this class/function.
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
How to solve this problem:
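Add the two calls torch.serialization.add_safe_globals([defaultdict]) and torch.serialization.add_safe_globals([float]) to export_model_pytorch.py, somewhere before the checkpoint is loaded with torch.load. Note that defaultdict has to be imported from collections for the first call to work.
After that, no further errors occurred and the weight file was exported successfully.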
@lightvector
Thanks for the report. It looks like PyTorch 2.6 was released this year; I'll see about incorporating this fix.
Not only does export_model_pytorch.py have this bug, but train.py does as well. The solution is the same as mentioned above.
['./train.py', '-traindir', 'E:\selfplay\train\b2c16', '-datadir', 'E:\selfplay\trainingdata\train2', '-exportdir', 'E:\selfplay\export', '-exportprefix', 'b2c16', '-pos-len', '19', '-batch-size', '128', '-model-kind', 'b2c16nbt', '-samples-per-epoch', '60000', '-swa-period-samples', '80000', '-quit-if-no-data', '-no-repeat-files', '-lr-scale', '8', '-export-prob', '1']
Using GPU device: NVIDIA GeForce GTX 1660 SUPER
Seeding torch with 25451309904342131
Traceback (most recent call last):
File "E:\selfplay\KataGo\python\train.py", line 1373, in weights_only argument in torch.load from False to True. Re-running torch.load with weights_only set to False will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
(2) Alternatively, to load with weights_only=True please check the recommended steps in the following error message.
WeightsUnpickler error: Unsupported global: GLOBAL collections.defaultdict was not an allowed global by default. Please use torch.serialization.add_safe_globals([defaultdict]) or the torch.serialization.safe_globals([defaultdict]) context manager to allowlist this global if you trust this class/function.
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
@lightvector
Thanks.