
Static-graph training of the recall model fails when using the pre-trained movie_model archive from the tutorial

Open Chgocn opened this issue 4 years ago • 5 comments

Tutorial URL: https://aistudio.baidu.com/aistudio/projectdetail/1816335?channelType=0&channel=0

paddlepaddle: 2.2.1, PaddleRec: 2.2.0

I have already changed use_gpu to False in models/demo/movie_recommand/recall/movie.yaml.

Command run:

cd PaddleRec/models/demo/movie_recommand && python -u ../../../tools/static_trainer.py -m recall/movie.yaml

The error output is as follows:

Lenovo@DESKTOP-91MGDB0 MINGW64 /c/workspace/github/PaddleRec/models/demo/movie_recommand (release/2.2.0)
$ python -u ../../../tools/static_trainer.py -m recall/movie.yaml
C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\fluid\layers\math_op_patch.py:341: UserWarning: C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\nn\functional\common.py:1423
The behavior of expression A / B has been unified with elementwise_div(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_div(X, Y, axis=0) instead of A / B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
2021-12-08 16:24:39,401 - INFO - cpu_num: None
2021-12-08 16:24:39,402 - INFO - **************common.configs**********
2021-12-08 16:24:39,402 - INFO - use_gpu: False, use_xpu: False, use_visual: False, train_batch_size: 1, train_data_dir: ../data/train, epochs: 5, print_interval: 20, model_save_path: movie_model
2021-12-08 16:24:39,402 - INFO - **************common.configs**********
2021-12-08 16:24:39,860 - INFO - reader path:reader
Traceback (most recent call last):
  File "../../../tools/static_trainer.py", line 282, in <module>
    main(args)
  File "../../../tools/static_trainer.py", line 149, in main
    config, use_visual, log_visual, step_num)
  File "../../../tools/static_trainer.py", line 249, in dataloader_train
    fetch_list=[var for _, var in fetch_vars.items()])
  File "C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\fluid\executor.py", line 1262, in run
    six.reraise(*sys.exc_info())
  File "C:\workspace\github\PaddleRec\venv\lib\site-packages\six.py", line 719, in reraise
    raise value
  File "C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\fluid\executor.py", line 1260, in run
    return_merged=return_merged)
  File "C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\fluid\executor.py", line 1402, in _run_impl
    use_program_cache=use_program_cache)
  File "C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\fluid\executor.py", line 1479, in _run_program
    self._feed_data(program, feed, feed_var_name, scope)
  File "C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\fluid\executor.py", line 800, in _feed_data
    check_feed_shape_type(var, cur_feed)
  File "C:\workspace\github\PaddleRec\venv\lib\site-packages\paddle\fluid\executor.py", line 247, in check_feed_shape_type
    (var.name, var_dtype_format, feed_dtype_format))
ValueError: The data type of fed Variable 'label' must be 'int64', but received 'int32'

Chgocn, Dec 08 '21 08:12

After going through the source code, I found that changing every int64 to int32 in models/demo/movie_recommand/recall/static_model.py resolves the problem.
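The reason this workaround helps can be illustrated with a small sketch: the executor compares the dtype declared for a variable with the dtype of the array that is fed, and raises when they differ. The function `check_feed_dtype` below is a hypothetical, simplified stand-in written for illustration; it is not Paddle's actual `check_feed_shape_type`, and only the error wording is taken from the traceback above.

```python
import numpy as np


def check_feed_dtype(var_name, declared_dtype, fed_array):
    """Hypothetical stand-in for the executor's feed check (not Paddle code).

    Raises ValueError when the fed array's dtype differs from the dtype
    declared for the variable in the model definition.
    """
    fed_dtype = np.asarray(fed_array).dtype
    if fed_dtype != np.dtype(declared_dtype):
        raise ValueError(
            "The data type of fed Variable %r must be %r, but received %r"
            % (var_name, str(np.dtype(declared_dtype)), str(fed_dtype)))


# A label array as the reader would produce it on a platform whose
# default integer is 32-bit (forced to int32 here so the demo is portable).
label_int32 = np.array([1], dtype="int32")

# Declaring the variable as int32 (the workaround) passes the check.
check_feed_dtype("label", "int32", label_int32)

# Declaring it as int64 (the original model definition) fails.
try:
    check_feed_dtype("label", "int64", label_int32)
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

print(mismatch_message)
```

Changing the declared dtype makes the two sides agree, but so would converting the fed data to int64, which leaves the model definition untouched.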

Chgocn, Dec 08 '21 08:12

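In the aistudio tutorial, following the steps you describe, everything runs normally and I could not reproduce the error. It looks like you are training in your own environment. Could you share some details about that environment?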

yinhaofeng, Dec 08 '21 09:12

I just confirmed that it runs fine in aistudio. I ran the training on Windows 10.

Chgocn, Dec 09 '21 01:12

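Could you confirm your Python and numpy versions? In the reader we explicitly convert the values to the int type. In your environment the default int type may be int32, which would cause this problem.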
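A minimal sketch of how this could be checked and avoided, assuming the reader builds its arrays with NumPy. In NumPy releases before 2.0 the default integer type follows the platform's C long, which is 32-bit on Windows, so an array built without an explicit dtype can come out as int32 there. The helper name `to_int64_feed` is hypothetical and is not part of PaddleRec.

```python
import numpy as np


def to_int64_feed(values):
    """Hypothetical helper: build a feed array with an explicit int64 dtype."""
    return np.asarray(values, dtype="int64")


# Without an explicit dtype the result depends on the platform and the
# NumPy version (for example int32 on Windows with NumPy 1.x).
default_dtype = np.array([1, 0, 1]).dtype

# With an explicit dtype the result is int64 everywhere.
label = to_int64_feed([1, 0, 1])

print("default integer dtype:", default_dtype)
print("explicit dtype:", label.dtype)
```

Fixing the dtype where the data is produced keeps the model definition at int64, so the same code should behave the same on Linux and on Windows.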

yinhaofeng, Dec 09 '21 03:12

python: 3.7.0

Package            Version
astor              0.8.1
certifi            2021.10.8
charset-normalizer 2.0.9
decorator          5.1.0
gast               0.3.3
grpcio             1.37.1
grpcio-tools       1.37.1
idna               3.3
importlib          1.0.4
numpy              1.19.3
paddlepaddle       2.0.2
Pillow             8.4.0
pip                21.3.1
protobuf           3.19.1
py27hash           1.0.2
pymilvus           1.1.2
PyYAML             6.0
requests           2.26.0
setuptools         58.3.0
six                1.16.0
ujson              4.3.0
urllib3            1.26.7
wheel              0.37.0

Chgocn, Dec 10 '21 09:12