
Cannot create a tensor proto whose content is larger than 2GB

Open · Rohit-Satyam opened this issue 1 year ago • 1 comment

Hi @patrickbryant1

A few of my runs fail with the following error. Could you help me work out what is going wrong?

/ibex/scratch/projects/c2077/rohit/miniconda/envs/speed_ppi/lib/python3.12/site-packages/Bio/Data/SCOPData.py:18: BiopythonDeprecationWarning: The 'Bio.Data.SCOPData' module will be deprecated in a future release of Biopython in favor of 'Bio.Data.PDBData.
  warnings.warn(
2024-12-24 19:28:37.274749: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1735057717.321417  544152 cuda_dnn.cc:8321] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1735057717.333927  544152 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Traceback (most recent call last):
  File "/ibex/scratch/projects/c2077/rohit/SpeedPPI/src/run_alphafold_single.py", line 258, in <module>
    main(num_ensemble=1,
  File "/ibex/scratch/projects/c2077/rohit/SpeedPPI/src/run_alphafold_single.py", line 220, in main
    processed_feature_dict = model_runner.process_features(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ibex/scratch/projects/c2077/rohit/SpeedPPI/src/alphafold/model/model.py", line 102, in process_features
    return features.np_example_to_features(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ibex/scratch/projects/c2077/rohit/SpeedPPI/src/alphafold/model/features.py", line 91, in np_example_to_features
    tensor_dict = proteins_dataset.np_to_tensor_dict(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ibex/scratch/projects/c2077/rohit/SpeedPPI/src/alphafold/model/tf/proteins_dataset.py", line 160, in np_to_tensor_dict
    tensor_dict = {k: tf.constant(v) for k, v in np_example.items()
                      ^^^^^^^^^^^^^^
  File "/ibex/scratch/projects/c2077/rohit/miniconda/envs/speed_ppi/lib/python3.12/site-packages/tensorflow/python/framework/constant_op.py", line 173, in constant_v1
    return _constant_impl(value, dtype, shape, name, verify_shape=verify_shape,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ibex/scratch/projects/c2077/rohit/miniconda/envs/speed_ppi/lib/python3.12/site-packages/tensorflow/python/framework/constant_op.py", line 291, in _constant_impl
    const_tensor = ops._create_graph_constant(  # pylint: disable=protected-access
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ibex/scratch/projects/c2077/rohit/miniconda/envs/speed_ppi/lib/python3.12/site-packages/tensorflow/python/framework/ops.py", line 284, in _create_graph_constant
    tensor_util.make_tensor_proto(
  File "/ibex/scratch/projects/c2077/rohit/miniconda/envs/speed_ppi/lib/python3.12/site-packages/tensorflow/python/framework/tensor_util.py", line 596, in make_tensor_proto
    raise ValueError(
ValueError: Cannot create a tensor proto whose content is larger than 2GB.

real	2m18.457s
user	1m55.609s
sys	0m8.566s

Rohit-Satyam avatar Dec 25 '24 13:12 Rohit-Satyam
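The traceback above ends in `tf.constant(v)` being called on each array in `np_example`, so a first diagnostic step is to find which feature array is too large. The following is a minimal sketch, not part of SpeedPPI or AlphaFold: `find_oversized_features` and `PROTO_LIMIT_BYTES` are hypothetical names, and the 2 GiB threshold is an assumption taken from the wording of the error message.

```python
import numpy as np

# Assumed threshold, based on the "larger than 2GB" wording of the ValueError.
PROTO_LIMIT_BYTES = 1 << 31


def find_oversized_features(np_example, limit=PROTO_LIMIT_BYTES):
    """Return (name, shape, dtype, nbytes) for every array at or above `limit`.

    `np_example` is expected to be a dict of NumPy arrays, like the one
    iterated over in np_to_tensor_dict in the traceback.
    """
    report = []
    for name, value in np_example.items():
        arr = np.asarray(value)
        if arr.nbytes >= limit:
            report.append((name, arr.shape, str(arr.dtype), arr.nbytes))
    # Largest offender first.
    return sorted(report, key=lambda row: row[3], reverse=True)
```

Calling this on the feature dictionary just before it is passed to `model_runner.process_features` should show which key (most likely one of the MSA-derived features) pushes past the limit, and what shape it has.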

Hi, sorry for the late reply. This is a bug within AlphaFold (AF), and we're not sure what causes it. Try clearing your GPU memory and rerunning, or analyse your MSA(s).

patrickbryant1 avatar May 19 '25 20:05 patrickbryant1
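One way to act on the suggestion to analyse the MSA(s) is to check how deep each alignment is, since the MSA feature arrays grow with the number of sequences times the number of aligned columns. Below is a rough sketch, assuming the MSAs are stored as A3M (or plain FASTA-style) text files; `summarize_a3m` is a hypothetical helper, not something SpeedPPI provides.

```python
def summarize_a3m(path):
    """Return (number of sequences, number of match columns in the first sequence).

    Assumes A3M/FASTA-style text: header lines start with '>', and lowercase
    letters in a sequence mark insertions that do not occupy a match column.
    """
    n_seqs = 0
    query_cols = 0
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            if line.startswith(">"):
                n_seqs += 1
            elif n_seqs == 1:
                # Still inside the first (query) record: count match columns.
                query_cols += sum(1 for ch in line if not ch.islower())
    return n_seqs, query_cols
```

An unusually large sequence count for the failing runs, compared with runs that succeed, would point to the MSA size as the trigger rather than the GPU state.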