
Job was killed with `/app/run_alphafold.sh: line 3: 8 Killed python /app/alphafold/run_alphafold.py "$@"`

Open songyinys opened this issue 3 years ago • 4 comments

Hi, I ran into an issue: my job was killed with `/app/run_alphafold.sh: line 3: 8 Killed python /app/alphafold/run_alphafold.py "$@"`. This is the command I ran and its full output:

python3 /home/song/alphafold/docker/run_docker.py --fasta_paths=HM.fasta --max_template_date=2030-03-10 --model_preset=monomer --db_preset=reduced_dbs --data_dir=/home/song/harddrive/alphafold_dbs
I0407 10:01:52.360448 140644131185024 run_docker.py:113] Mounting /home/song/alphafold/fasta -> /mnt/fasta_path_0
I0407 10:01:52.360547 140644131185024 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/uniref90 -> /mnt/uniref90_database_path
I0407 10:01:52.360589 140644131185024 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/mgnify -> /mnt/mgnify_database_path
I0407 10:01:52.360618 140644131185024 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs -> /mnt/data_dir
I0407 10:01:52.360645 140644131185024 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/pdb_mmcif/mmcif_files -> /mnt/template_mmcif_dir
I0407 10:01:52.360675 140644131185024 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/pdb_mmcif -> /mnt/obsolete_pdbs_path
I0407 10:01:52.360708 140644131185024 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/pdb70 -> /mnt/pdb70_database_path
I0407 10:01:52.360738 140644131185024 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/small_bfd -> /mnt/small_bfd_database_path
I0407 10:01:54.346276 140644131185024 run_docker.py:255] I0407 15:01:54.345562 140657684973376 templates.py:857] Using precomputed obsolete pdbs /mnt/obsolete_pdbs_path/obsolete.dat.
I0407 10:01:56.070738 140644131185024 run_docker.py:255] I0407 15:01:56.069675 140657684973376 tpu_client.py:54] Starting the local TPU driver.
I0407 10:01:56.071013 140644131185024 run_docker.py:255] I0407 15:01:56.070167 140657684973376 xla_bridge.py:212] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
I0407 10:01:56.190624 140644131185024 run_docker.py:255] I0407 15:01:56.190001 140657684973376 xla_bridge.py:212] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
I0407 10:02:04.227958 140644131185024 run_docker.py:255] I0407 15:02:04.227193 140657684973376 run_alphafold.py:377] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
I0407 10:02:04.228071 140644131185024 run_docker.py:255] I0407 15:02:04.227301 140657684973376 run_alphafold.py:393] Using random seed 598696474148977303 for the data pipeline
I0407 10:02:04.228105 140644131185024 run_docker.py:255] I0407 15:02:04.227442 140657684973376 run_alphafold.py:161] Predicting HM
I0407 10:02:04.228142 140644131185024 run_docker.py:255] I0407 15:02:04.227701 140657684973376 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpolo2mr4d/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/HM.fasta /mnt/uniref90_database_path/uniref90.fasta"
I0407 10:02:04.283227 140644131185024 run_docker.py:255] I0407 15:02:04.282486 140657684973376 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0407 10:06:29.987746 140644131185024 run_docker.py:255] I0407 15:06:29.987046 140657684973376 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 265.704 seconds
I0407 10:06:30.251781 140644131185024 run_docker.py:255] I0407 15:06:30.251073 140657684973376 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpzqrgg89o/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/HM.fasta /mnt/mgnify_database_path/mgy_clusters_2018_12.fa"
I0407 10:06:30.303924 140644131185024 run_docker.py:255] I0407 15:06:30.303090 140657684973376 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0407 10:10:59.346460 140644131185024 run_docker.py:255] I0407 15:10:59.345720 140657684973376 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 269.042 seconds
I0407 10:11:02.534446 140644131185024 run_docker.py:255] I0407 15:11:02.533800 140657684973376 hhsearch.py:85] Launching subprocess "/usr/bin/hhsearch -i /tmp/tmpi7u_y8pt/query.a3m -o /tmp/tmpi7u_y8pt/output.hhr -maxseq 1000000 -d /mnt/pdb70_database_path/pdb70"
I0407 10:11:02.591353 140644131185024 run_docker.py:255] I0407 15:11:02.590603 140657684973376 utils.py:36] Started HHsearch query
I0407 10:12:32.815087 140644131185024 run_docker.py:255] I0407 15:12:32.814265 140657684973376 utils.py:40] Finished HHsearch query in 90.223 seconds
I0407 10:12:35.013796 140644131185024 run_docker.py:255] I0407 15:12:35.006426 140657684973376 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmp12i9k_rf/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/HM.fasta /mnt/small_bfd_database_path/bfd-first_non_consensus_sequences.fasta"
I0407 10:12:35.058195 140644131185024 run_docker.py:255] I0407 15:12:35.057407 140657684973376 utils.py:36] Started Jackhmmer (bfd-first_non_consensus_sequences.fasta) query
I0407 10:13:46.664667 140644131185024 run_docker.py:255] I0407 15:13:46.663670 140657684973376 utils.py:40] Finished Jackhmmer (bfd-first_non_consensus_sequences.fasta) query in 71.606 seconds
I0407 10:13:47.202137 140644131185024 run_docker.py:255] I0407 15:13:47.201551 140657684973376 templates.py:878] Searching for template for: 
I0407 10:13:47.533718 140644131185024 run_docker.py:255] I0407 15:13:47.533015 140657684973376 templates.py:268] Found an exact template match 5dzt_A.
I0407 10:13:47.686305 140644131185024 run_docker.py:255] I0407 15:13:47.685786 140657684973376 templates.py:268] Found an exact template match 3t33_A.
I0407 10:13:48.182532 140644131185024 run_docker.py:255] I0407 15:13:48.181832 140657684973376 templates.py:268] Found an exact template match 3e6u_A.
I0407 10:13:48.406836 140644131185024 run_docker.py:255] I0407 15:13:48.406119 140657684973376 templates.py:268] Found an exact template match 3e73_A.
I0407 10:13:48.543771 140644131185024 run_docker.py:255] I0407 15:13:48.543182 140657684973376 templates.py:268] Found an exact template match 2g0d_A.
I0407 10:13:48.553284 140644131185024 run_docker.py:255] I0407 15:13:48.552886 140657684973376 templates.py:268] Found an exact template match 3e6u_A.
I0407 10:13:48.563110 140644131185024 run_docker.py:255] I0407 15:13:48.562769 140657684973376 templates.py:268] Found an exact template match 3e73_A.
I0407 10:13:48.897225 140644131185024 run_docker.py:255] I0407 15:13:48.896682 140657684973376 templates.py:268] Found an exact template match 4v1r_B.
I0407 10:13:49.276198 140644131185024 run_docker.py:255] I0407 15:13:49.275689 140657684973376 templates.py:268] Found an exact template match 4v1s_A.
I0407 10:13:49.507065 140644131185024 run_docker.py:255] I0407 15:13:49.506518 140657684973376 templates.py:268] Found an exact template match 4c1s_B.
I0407 10:13:49.516093 140644131185024 run_docker.py:255] I0407 15:13:49.515664 140657684973376 templates.py:268] Found an exact template match 4v1r_B.
I0407 10:13:49.525147 140644131185024 run_docker.py:255] I0407 15:13:49.524823 140657684973376 templates.py:268] Found an exact template match 4v1s_A.
I0407 10:13:49.534539 140644131185024 run_docker.py:255] I0407 15:13:49.534040 140657684973376 templates.py:268] Found an exact template match 4c1s_B.
I0407 10:13:49.543745 140644131185024 run_docker.py:255] I0407 15:13:49.543389 140657684973376 templates.py:268] Found an exact template match 3t33_A.
I0407 10:13:50.003878 140644131185024 run_docker.py:255] I0407 15:13:50.003276 140657684973376 templates.py:268] Found an exact template match 4mu9_A.
I0407 10:13:50.012879 140644131185024 run_docker.py:255] I0407 15:13:50.012417 140657684973376 templates.py:268] Found an exact template match 4mu9_B.
I0407 10:13:50.021894 140644131185024 run_docker.py:255] I0407 15:13:50.021365 140657684973376 templates.py:268] Found an exact template match 3e6u_A.
I0407 10:13:50.031929 140644131185024 run_docker.py:255] I0407 15:13:50.031504 140657684973376 templates.py:268] Found an exact template match 3e73_A.
I0407 10:13:50.041911 140644131185024 run_docker.py:255] I0407 15:13:50.041568 140657684973376 templates.py:268] Found an exact template match 2g0d_A.
I0407 10:13:50.557740 140644131185024 run_docker.py:255] I0407 15:13:50.557202 140657684973376 templates.py:268] Found an exact template match 4wu0_B.
I0407 10:13:51.248248 140644131185024 run_docker.py:255] I0407 15:13:51.247645 140657684973376 pipeline.py:234] Uniref90 MSA size: 4104 sequences.
I0407 10:13:51.248351 140644131185024 run_docker.py:255] I0407 15:13:51.247781 140657684973376 pipeline.py:235] BFD MSA size: 1366 sequences.
I0407 10:13:51.248383 140644131185024 run_docker.py:255] I0407 15:13:51.247805 140657684973376 pipeline.py:236] MGnify MSA size: 501 sequences.
I0407 10:13:51.248409 140644131185024 run_docker.py:255] I0407 15:13:51.247826 140657684973376 pipeline.py:238] Final (deduplicated) MSA size: 5641 sequences.
I0407 10:13:51.248434 140644131185024 run_docker.py:255] I0407 15:13:51.247973 140657684973376 pipeline.py:241] Total number of templates (NB: this can include bad templates and is later filtered to top 4): 20.
I0407 10:13:51.310783 140644131185024 run_docker.py:255] I0407 15:13:51.310043 140657684973376 run_alphafold.py:190] Running model model_1_pred_0 on HM
I0407 10:13:53.805705 140644131185024 run_docker.py:255] 2022-04-07 15:13:53.804967: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 46268640 exceeds 10% of free system memory.
I0407 10:13:53.816205 140644131185024 run_docker.py:255] 2022-04-07 15:13:53.815266: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 491443920 exceeds 10% of free system memory.
I0407 10:13:53.817507 140644131185024 run_docker.py:255] 2022-04-07 15:13:53.817032: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 467513640 exceeds 10% of free system memory.
I0407 10:13:53.818194 140644131185024 run_docker.py:255] 2022-04-07 15:13:53.817859: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 44256960 exceeds 10% of free system memory.
I0407 10:13:53.985317 140644131185024 run_docker.py:255] 2022-04-07 15:13:53.984431: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 46268640 exceeds 10% of free system memory.
I0407 10:13:55.190978 140644131185024 run_docker.py:255] I0407 15:13:55.189868 140657684973376 model.py:166] Running predict with shape(feat) = {'aatype': (4, 990), 'residue_index': (4, 990), 'seq_length': (4,), 'template_aatype': (4, 4, 990), 'template_all_atom_masks': (4, 4, 990, 37), 'template_all_atom_positions': (4, 4, 990, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 990), 'msa_mask': (4, 508, 990), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 990, 3), 'template_pseudo_beta_mask': (4, 4, 990), 'atom14_atom_exists': (4, 990, 14), 'residx_atom14_to_atom37': (4, 990, 14), 'residx_atom37_to_atom14': (4, 990, 37), 'atom37_atom_exists': (4, 990, 37), 'extra_msa': (4, 5120, 990), 'extra_msa_mask': (4, 5120, 990), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 990), 'true_msa': (4, 508, 990), 'extra_has_deletion': (4, 5120, 990), 'extra_deletion_value': (4, 5120, 990), 'msa_feat': (4, 508, 990, 49), 'target_feat': (4, 990, 22)}
I0407 10:22:21.857693 140644131185024 run_docker.py:255] I0407 15:22:21.856382 140657684973376 model.py:176] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (990, 990, 64)}, 'experimentally_resolved': {'logits': (990, 37)}, 'masked_msa': {'logits': (508, 990, 23)}, 'predicted_lddt': {'logits': (990, 50)}, 'structure_module': {'final_atom_mask': (990, 37), 'final_atom_positions': (990, 37, 3)}, 'plddt': (990,), 'ranking_confidence': ()}
I0407 10:22:21.858033 140644131185024 run_docker.py:255] I0407 15:22:21.856493 140657684973376 run_alphafold.py:204] Total JAX model model_1_pred_0 on HM predict time (includes compilation time, see --benchmark): 506.7s
I0407 10:22:40.344387 140644131185024 run_docker.py:255] I0407 15:22:40.343476 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 989 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:22:41.369135 140644131185024 run_docker.py:255] I0407 15:22:41.368305 140657684973376 amber_minimize.py:408] Minimizing protein, attempt 1 of 100.
I0407 10:22:42.601895 140644131185024 run_docker.py:255] I0407 15:22:42.601060 140657684973376 amber_minimize.py:69] Restraining 7934 / 15723 particles.
I0407 10:22:56.900804 140644131185024 run_docker.py:255] I0407 15:22:56.899905 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:23:07.534636 140644131185024 run_docker.py:255] I0407 15:23:07.533953 140657684973376 amber_minimize.py:500] Iteration completed: Einit 4763421.61 Efinal -18665.63 Time 3.53 s num residue violations 0 num residue exclusions 0
I0407 10:23:21.027155 140644131185024 run_docker.py:255] I0407 15:23:21.026266 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 989 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:23:22.952107 140644131185024 run_docker.py:255] I0407 15:23:22.951289 140657684973376 run_alphafold.py:190] Running model model_2_pred_0 on HM
I0407 10:23:25.586225 140644131185024 run_docker.py:255] I0407 15:23:25.585409 140657684973376 model.py:166] Running predict with shape(feat) = {'aatype': (4, 990), 'residue_index': (4, 990), 'seq_length': (4,), 'template_aatype': (4, 4, 990), 'template_all_atom_masks': (4, 4, 990, 37), 'template_all_atom_positions': (4, 4, 990, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 990), 'msa_mask': (4, 508, 990), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 990, 3), 'template_pseudo_beta_mask': (4, 4, 990), 'atom14_atom_exists': (4, 990, 14), 'residx_atom14_to_atom37': (4, 990, 14), 'residx_atom37_to_atom14': (4, 990, 37), 'atom37_atom_exists': (4, 990, 37), 'extra_msa': (4, 1024, 990), 'extra_msa_mask': (4, 1024, 990), 'extra_msa_row_mask': (4, 1024), 'bert_mask': (4, 508, 990), 'true_msa': (4, 508, 990), 'extra_has_deletion': (4, 1024, 990), 'extra_deletion_value': (4, 1024, 990), 'msa_feat': (4, 508, 990, 49), 'target_feat': (4, 990, 22)}
I0407 10:30:00.731788 140644131185024 run_docker.py:255] I0407 15:30:00.724386 140657684973376 model.py:176] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (990, 990, 64)}, 'experimentally_resolved': {'logits': (990, 37)}, 'masked_msa': {'logits': (508, 990, 23)}, 'predicted_lddt': {'logits': (990, 50)}, 'structure_module': {'final_atom_mask': (990, 37), 'final_atom_positions': (990, 37, 3)}, 'plddt': (990,), 'ranking_confidence': ()}
I0407 10:30:00.733442 140644131185024 run_docker.py:255] I0407 15:30:00.732738 140657684973376 run_alphafold.py:204] Total JAX model model_2_pred_0 on HM predict time (includes compilation time, see --benchmark): 395.1s
I0407 10:30:16.300615 140644131185024 run_docker.py:255] I0407 15:30:16.299608 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 989 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:30:16.988835 140644131185024 run_docker.py:255] I0407 15:30:16.987960 140657684973376 amber_minimize.py:408] Minimizing protein, attempt 1 of 100.
I0407 10:30:18.402831 140644131185024 run_docker.py:255] I0407 15:30:18.401961 140657684973376 amber_minimize.py:69] Restraining 7934 / 15723 particles.
I0407 10:30:32.590772 140644131185024 run_docker.py:255] I0407 15:30:32.583527 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:30:57.097590 140644131185024 run_docker.py:255] I0407 15:30:57.096977 140657684973376 amber_minimize.py:500] Iteration completed: Einit 11861470.36 Efinal -18421.26 Time 3.81 s num residue violations 0 num residue exclusions 0
I0407 10:31:13.040065 140644131185024 run_docker.py:255] I0407 15:31:13.039160 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 989 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:31:14.855728 140644131185024 run_docker.py:255] I0407 15:31:14.854954 140657684973376 run_alphafold.py:190] Running model model_3_pred_0 on HM
I0407 10:31:17.374645 140644131185024 run_docker.py:255] I0407 15:31:17.371672 140657684973376 model.py:166] Running predict with shape(feat) = {'aatype': (4, 990), 'residue_index': (4, 990), 'seq_length': (4,), 'is_distillation': (4,), 'seq_mask': (4, 990), 'msa_mask': (4, 512, 990), 'msa_row_mask': (4, 512), 'random_crop_to_size_seed': (4, 2), 'atom14_atom_exists': (4, 990, 14), 'residx_atom14_to_atom37': (4, 990, 14), 'residx_atom37_to_atom14': (4, 990, 37), 'atom37_atom_exists': (4, 990, 37), 'extra_msa': (4, 5120, 990), 'extra_msa_mask': (4, 5120, 990), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 512, 990), 'true_msa': (4, 512, 990), 'extra_has_deletion': (4, 5120, 990), 'extra_deletion_value': (4, 5120, 990), 'msa_feat': (4, 512, 990, 49), 'target_feat': (4, 990, 22)}
I0407 10:37:45.255258 140644131185024 run_docker.py:255] I0407 15:37:45.254414 140657684973376 model.py:176] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (990, 990, 64)}, 'experimentally_resolved': {'logits': (990, 37)}, 'masked_msa': {'logits': (512, 990, 23)}, 'predicted_lddt': {'logits': (990, 50)}, 'structure_module': {'final_atom_mask': (990, 37), 'final_atom_positions': (990, 37, 3)}, 'plddt': (990,), 'ranking_confidence': ()}
I0407 10:37:45.264786 140644131185024 run_docker.py:255] I0407 15:37:45.263966 140657684973376 run_alphafold.py:204] Total JAX model model_3_pred_0 on HM predict time (includes compilation time, see --benchmark): 387.9s
I0407 10:38:12.306217 140644131185024 run_docker.py:255] I0407 15:38:12.305210 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 989 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:38:13.627728 140644131185024 run_docker.py:255] I0407 15:38:13.626933 140657684973376 amber_minimize.py:408] Minimizing protein, attempt 1 of 100.
I0407 10:38:15.155689 140644131185024 run_docker.py:255] I0407 15:38:15.154838 140657684973376 amber_minimize.py:69] Restraining 7934 / 15723 particles.
I0407 10:38:29.867942 140644131185024 run_docker.py:255] I0407 15:38:29.867018 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:38:48.780642 140644131185024 run_docker.py:255] I0407 15:38:48.779655 140657684973376 amber_minimize.py:500] Iteration completed: Einit 1463943.83 Efinal -18591.48 Time 4.98 s num residue violations 0 num residue exclusions 0
I0407 10:39:02.920470 140644131185024 run_docker.py:255] I0407 15:39:02.919537 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 989 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:39:04.851090 140644131185024 run_docker.py:255] I0407 15:39:04.849699 140657684973376 run_alphafold.py:190] Running model model_4_pred_0 on HM
I0407 10:39:07.390420 140644131185024 run_docker.py:255] I0407 15:39:07.388259 140657684973376 model.py:166] Running predict with shape(feat) = {'aatype': (4, 990), 'residue_index': (4, 990), 'seq_length': (4,), 'is_distillation': (4,), 'seq_mask': (4, 990), 'msa_mask': (4, 512, 990), 'msa_row_mask': (4, 512), 'random_crop_to_size_seed': (4, 2), 'atom14_atom_exists': (4, 990, 14), 'residx_atom14_to_atom37': (4, 990, 14), 'residx_atom37_to_atom14': (4, 990, 37), 'atom37_atom_exists': (4, 990, 37), 'extra_msa': (4, 5120, 990), 'extra_msa_mask': (4, 5120, 990), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 512, 990), 'true_msa': (4, 512, 990), 'extra_has_deletion': (4, 5120, 990), 'extra_deletion_value': (4, 5120, 990), 'msa_feat': (4, 512, 990, 49), 'target_feat': (4, 990, 22)}
I0407 10:45:44.214986 140644131185024 run_docker.py:255] I0407 15:45:44.207410 140657684973376 model.py:176] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (990, 990, 64)}, 'experimentally_resolved': {'logits': (990, 37)}, 'masked_msa': {'logits': (512, 990, 23)}, 'predicted_lddt': {'logits': (990, 50)}, 'structure_module': {'final_atom_mask': (990, 37), 'final_atom_positions': (990, 37, 3)}, 'plddt': (990,), 'ranking_confidence': ()}
I0407 10:45:44.218248 140644131185024 run_docker.py:255] I0407 15:45:44.217665 140657684973376 run_alphafold.py:204] Total JAX model model_4_pred_0 on HM predict time (includes compilation time, see --benchmark): 396.8s
I0407 10:46:03.766776 140644131185024 run_docker.py:255] I0407 15:46:03.766033 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 989 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:46:04.454188 140644131185024 run_docker.py:255] I0407 15:46:04.453492 140657684973376 amber_minimize.py:408] Minimizing protein, attempt 1 of 100.
I0407 10:46:06.094631 140644131185024 run_docker.py:255] I0407 15:46:06.093979 140657684973376 amber_minimize.py:69] Restraining 7934 / 15723 particles.
I0407 10:46:18.494993 140644131185024 run_docker.py:255] I0407 15:46:18.494155 140657684973376 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0407 10:47:47.292335 140644131185024 run_docker.py:255] /app/run_alphafold.sh: line 3:     8 Killed                  python /app/alphafold/run_alphafold.py "$@"

I noticed that by the time it reached the 4th prediction and minimization, GPU memory was almost full (11992MiB of 12288MiB in the nvidia-smi output), and then the job was killed.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02    Driver Version: 510.60.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 81%   82C    P2   331W / 350W |  11992MiB / 12288MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1141      G   /usr/lib/xorg/Xorg                 96MiB |
|    0   N/A  N/A      1642      G   /usr/lib/xorg/Xorg                318MiB |
|    0   N/A  N/A      1772      G   /usr/bin/gnome-shell               63MiB |
|    0   N/A  N/A    230121      G   ...veSuggestionsOnlyOnDemand       62MiB |
|    0   N/A  N/A    377645      G   ...592484650516736366,131072      193MiB |
|    0   N/A  N/A    480339      C   python                            939MiB |
+-----------------------------------------------------------------------------+
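To put a number on "almost full", here is a minimal sketch that pulls the used/total memory field out of the nvidia-smi row above and computes the fraction in use. The helper name `gpu_memory_usage` is hypothetical and only for illustration; it assumes the row contains a field of the form `<used>MiB / <total>MiB`, as in the pasted output.

```python
import re
from typing import Tuple


def gpu_memory_usage(smi_row: str) -> Tuple[int, int, float]:
    """Parse '<used>MiB / <total>MiB' from one nvidia-smi row.

    Returns (used_mib, total_mib, fraction_used).
    Hypothetical helper, not part of AlphaFold or nvidia-smi.
    """
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_row)
    if match is None:
        raise ValueError("no 'used / total' memory field found in row")
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, used / total


# The memory row copied from the nvidia-smi output in this report.
row = "| 81%   82C    P2   331W / 350W |  11992MiB / 12288MiB |    100%      Default |"
used, total, fraction = gpu_memory_usage(row)
print(f"{used} MiB of {total} MiB used ({fraction:.1%}), {total - used} MiB free")
```

With the row above, that leaves only 296 MiB of GPU memory free at the moment the snapshot was taken.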

I also tried the method mentioned in issue #197 and commented out the following lines in run_docker.py:

'TF_FORCE_UNIFIED_MEMORY': '1',
'XLA_PYTHON_CLIENT_MEM_FRACTION': '4.0',
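As a rough illustration of what that change amounts to, here is a minimal sketch, assuming the two settings are passed to the container as entries in a plain environment dictionary. The function `build_container_env` and its parameters are hypothetical and are not the actual code in run_docker.py; only the two key/value pairs are taken from the lines above.

```python
from typing import Dict


def build_container_env(gpu_devices: str = "all",
                        unified_memory: bool = True) -> Dict[str, str]:
    """Build the environment passed to the container (illustrative only).

    With unified_memory=False the two memory settings are left out,
    which mirrors commenting them out by hand.
    """
    # Assumption: GPU selection is also passed as an environment variable.
    env = {"NVIDIA_VISIBLE_DEVICES": gpu_devices}
    if unified_memory:
        # The two entries quoted in this report.
        env["TF_FORCE_UNIFIED_MEMORY"] = "1"
        env["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "4.0"
    return env


# Default: both memory settings present (first run in this report).
print(build_container_env())
# Settings omitted (second run in this report).
print(build_container_env(unified_memory=False))
```

A flag like this would make it easier to switch between the two configurations than editing the file each time, though that is a design suggestion rather than something run_docker.py is known to offer.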

After that change, the run showed the same error that @Ikajiro reported in #197, perhaps because my sequence is long (~900 residues). Here is the command and output from that run:

python3 /home/song/alphafold/docker/run_docker.py --fasta_paths=HM.fasta --max_template_date=2030-03-10 --model_preset=monomer --db_preset=reduced_dbs --data_dir=/home/song/harddrive/alphafold_dbs
I0407 10:56:48.221208 139844004888960 run_docker.py:113] Mounting /home/song/alphafold/fasta -> /mnt/fasta_path_0
I0407 10:56:49.324587 139844004888960 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/uniref90 -> /mnt/uniref90_database_path
I0407 10:56:49.337784 139844004888960 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/mgnify -> /mnt/mgnify_database_path
I0407 10:56:49.338311 139844004888960 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs -> /mnt/data_dir
I0407 10:56:49.338590 139844004888960 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/pdb_mmcif/mmcif_files -> /mnt/template_mmcif_dir
I0407 10:56:49.339505 139844004888960 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/pdb_mmcif -> /mnt/obsolete_pdbs_path
I0407 10:56:49.340553 139844004888960 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/pdb70 -> /mnt/pdb70_database_path
I0407 10:56:49.346934 139844004888960 run_docker.py:113] Mounting /home/song/harddrive/alphafold_dbs/small_bfd -> /mnt/small_bfd_database_path
I0407 10:56:54.050762 139844004888960 run_docker.py:255] I0407 15:56:54.049874 140462593042240 templates.py:857] Using precomputed obsolete pdbs /mnt/obsolete_pdbs_path/obsolete.dat.
I0407 10:56:55.568237 139844004888960 run_docker.py:255] I0407 15:56:55.567192 140462593042240 tpu_client.py:54] Starting the local TPU driver.
I0407 10:56:55.570917 139844004888960 run_docker.py:255] I0407 15:56:55.569945 140462593042240 xla_bridge.py:212] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
I0407 10:56:55.677566 139844004888960 run_docker.py:255] I0407 15:56:55.677165 140462593042240 xla_bridge.py:212] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
I0407 10:57:03.503803 139844004888960 run_docker.py:255] I0407 15:57:03.503148 140462593042240 run_alphafold.py:377] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
I0407 10:57:03.503915 139844004888960 run_docker.py:255] I0407 15:57:03.503263 140462593042240 run_alphafold.py:393] Using random seed 1268619124088369252 for the data pipeline
I0407 10:57:03.503951 139844004888960 run_docker.py:255] I0407 15:57:03.503389 140462593042240 run_alphafold.py:161] Predicting HM
I0407 10:57:03.507074 139844004888960 run_docker.py:255] I0407 15:57:03.506884 140462593042240 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpcez7kft8/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/HM.fasta /mnt/uniref90_database_path/uniref90.fasta"
I0407 10:57:03.532036 139844004888960 run_docker.py:255] I0407 15:57:03.531312 140462593042240 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0407 11:01:29.486995 139844004888960 run_docker.py:255] I0407 16:01:29.485760 140462593042240 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 265.954 seconds
I0407 11:01:29.757087 139844004888960 run_docker.py:255] I0407 16:01:29.756328 140462593042240 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmp3tz2iwv8/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/HM.fasta /mnt/mgnify_database_path/mgy_clusters_2018_12.fa"
I0407 11:01:29.780973 139844004888960 run_docker.py:255] I0407 16:01:29.780000 140462593042240 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0407 11:05:58.770235 139844004888960 run_docker.py:255] I0407 16:05:58.768591 140462593042240 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 268.988 seconds
I0407 11:06:01.998098 139844004888960 run_docker.py:255] I0407 16:06:01.997191 140462593042240 hhsearch.py:85] Launching subprocess "/usr/bin/hhsearch -i /tmp/tmpmflo69re/query.a3m -o /tmp/tmpmflo69re/output.hhr -maxseq 1000000 -d /mnt/pdb70_database_path/pdb70"
I0407 11:06:02.025695 139844004888960 run_docker.py:255] I0407 16:06:02.024801 140462593042240 utils.py:36] Started HHsearch query
I0407 11:07:30.042175 139844004888960 run_docker.py:255] I0407 16:07:30.041380 140462593042240 utils.py:40] Finished HHsearch query in 88.016 seconds
I0407 11:07:32.267431 139844004888960 run_docker.py:255] I0407 16:07:32.266582 140462593042240 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpmgl82w8p/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/HM.fasta /mnt/small_bfd_database_path/bfd-first_non_consensus_sequences.fasta"
I0407 11:07:32.294763 139844004888960 run_docker.py:255] I0407 16:07:32.293882 140462593042240 utils.py:36] Started Jackhmmer (bfd-first_non_consensus_sequences.fasta) query
I0407 11:08:43.829909 139844004888960 run_docker.py:255] I0407 16:08:43.828865 140462593042240 utils.py:40] Finished Jackhmmer (bfd-first_non_consensus_sequences.fasta) query in 71.535 seconds
I0407 11:08:44.381553 139844004888960 run_docker.py:255] I0407 16:08:44.380667 140462593042240 templates.py:878] Searching for template for: 
I0407 11:08:44.724613 139844004888960 run_docker.py:255] I0407 16:08:44.723805 140462593042240 templates.py:268] Found an exact template match 5dzt_A.
I0407 11:08:44.878182 139844004888960 run_docker.py:255] I0407 16:08:44.877429 140462593042240 templates.py:268] Found an exact template match 3t33_A.
I0407 11:08:45.386653 139844004888960 run_docker.py:255] I0407 16:08:45.385856 140462593042240 templates.py:268] Found an exact template match 3e6u_A.
I0407 11:08:45.622028 139844004888960 run_docker.py:255] I0407 16:08:45.621242 140462593042240 templates.py:268] Found an exact template match 3e73_A.
I0407 11:08:45.781338 139844004888960 run_docker.py:255] I0407 16:08:45.780581 140462593042240 templates.py:268] Found an exact template match 2g0d_A.
I0407 11:08:45.790633 139844004888960 run_docker.py:255] I0407 16:08:45.790066 140462593042240 templates.py:268] Found an exact template match 3e6u_A.
I0407 11:08:45.800488 139844004888960 run_docker.py:255] I0407 16:08:45.799908 140462593042240 templates.py:268] Found an exact template match 3e73_A.
I0407 11:08:46.149326 139844004888960 run_docker.py:255] I0407 16:08:46.148638 140462593042240 templates.py:268] Found an exact template match 4v1r_B.
I0407 11:08:46.537471 139844004888960 run_docker.py:255] I0407 16:08:46.536653 140462593042240 templates.py:268] Found an exact template match 4v1s_A.
I0407 11:08:46.787966 139844004888960 run_docker.py:255] I0407 16:08:46.787169 140462593042240 templates.py:268] Found an exact template match 4c1s_B.
I0407 11:08:46.796962 139844004888960 run_docker.py:255] I0407 16:08:46.796271 140462593042240 templates.py:268] Found an exact template match 4v1r_B.
I0407 11:08:46.806015 139844004888960 run_docker.py:255] I0407 16:08:46.805377 140462593042240 templates.py:268] Found an exact template match 4v1s_A.
I0407 11:08:46.815184 139844004888960 run_docker.py:255] I0407 16:08:46.814651 140462593042240 templates.py:268] Found an exact template match 4c1s_B.
I0407 11:08:46.824412 139844004888960 run_docker.py:255] I0407 16:08:46.823835 140462593042240 templates.py:268] Found an exact template match 3t33_A.
I0407 11:08:47.299077 139844004888960 run_docker.py:255] I0407 16:08:47.298307 140462593042240 templates.py:268] Found an exact template match 4mu9_A.
I0407 11:08:47.307845 139844004888960 run_docker.py:255] I0407 16:08:47.307226 140462593042240 templates.py:268] Found an exact template match 4mu9_B.
I0407 11:08:47.316535 139844004888960 run_docker.py:255] I0407 16:08:47.316039 140462593042240 templates.py:268] Found an exact template match 3e6u_A.
I0407 11:08:47.326386 139844004888960 run_docker.py:255] I0407 16:08:47.325797 140462593042240 templates.py:268] Found an exact template match 3e73_A.
I0407 11:08:47.336363 139844004888960 run_docker.py:255] I0407 16:08:47.335795 140462593042240 templates.py:268] Found an exact template match 2g0d_A.
I0407 11:08:47.867216 139844004888960 run_docker.py:255] I0407 16:08:47.859856 140462593042240 templates.py:268] Found an exact template match 4wu0_B.
I0407 11:08:48.581511 139844004888960 run_docker.py:255] I0407 16:08:48.580777 140462593042240 pipeline.py:234] Uniref90 MSA size: 4104 sequences.
I0407 11:08:48.581789 139844004888960 run_docker.py:255] I0407 16:08:48.580890 140462593042240 pipeline.py:235] BFD MSA size: 1366 sequences.
I0407 11:08:48.581957 139844004888960 run_docker.py:255] I0407 16:08:48.580915 140462593042240 pipeline.py:236] MGnify MSA size: 501 sequences.
I0407 11:08:48.582135 139844004888960 run_docker.py:255] I0407 16:08:48.580936 140462593042240 pipeline.py:238] Final (deduplicated) MSA size: 5641 sequences.
I0407 11:08:48.582287 139844004888960 run_docker.py:255] I0407 16:08:48.581155 140462593042240 pipeline.py:241] Total number of templates (NB: this can include bad templates and is later filtered to top 4): 20.
I0407 11:08:48.647255 139844004888960 run_docker.py:255] I0407 16:08:48.646430 140462593042240 run_alphafold.py:190] Running model model_1_pred_0 on HM
I0407 11:08:51.183267 139844004888960 run_docker.py:255] 2022-04-07 16:08:51.182694: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 46268640 exceeds 10% of free system memory.
I0407 11:08:51.211383 139844004888960 run_docker.py:255] 2022-04-07 16:08:51.210625: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 491443920 exceeds 10% of free system memory.
I0407 11:08:51.212475 139844004888960 run_docker.py:255] 2022-04-07 16:08:51.211673: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 44256960 exceeds 10% of free system memory.
I0407 11:08:51.212612 139844004888960 run_docker.py:255] 2022-04-07 16:08:51.212198: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 467513640 exceeds 10% of free system memory.
I0407 11:08:51.377755 139844004888960 run_docker.py:255] 2022-04-07 16:08:51.377100: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 46268640 exceeds 10% of free system memory.
I0407 11:08:52.583285 139844004888960 run_docker.py:255] I0407 16:08:52.582467 140462593042240 model.py:166] Running predict with shape(feat) = {'aatype': (4, 990), 'residue_index': (4, 990), 'seq_length': (4,), 'template_aatype': (4, 4, 990), 'template_all_atom_masks': (4, 4, 990, 37), 'template_all_atom_positions': (4, 4, 990, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 990), 'msa_mask': (4, 508, 990), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 990, 3), 'template_pseudo_beta_mask': (4, 4, 990), 'atom14_atom_exists': (4, 990, 14), 'residx_atom14_to_atom37': (4, 990, 14), 'residx_atom37_to_atom14': (4, 990, 37), 'atom37_atom_exists': (4, 990, 37), 'extra_msa': (4, 5120, 990), 'extra_msa_mask': (4, 5120, 990), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 990), 'true_msa': (4, 508, 990), 'extra_has_deletion': (4, 5120, 990), 'extra_deletion_value': (4, 5120, 990), 'msa_feat': (4, 508, 990, 49), 'target_feat': (4, 990, 22)}
I0407 11:10:03.927810 139844004888960 run_docker.py:255] 2022-04-07 16:10:03.919715: W external/org_tensorflow/tensorflow/core/common_runtime/bfc_allocator.cc:457] Allocator (GPU_0_bfc) ran out of memory trying to allocate 8.30GiB (rounded to 8913849088)requested by op
I0407 11:10:03.932932 139844004888960 run_docker.py:255] 2022-04-07 16:10:03.931572: W external/org_tensorflow/tensorflow/core/common_runtime/bfc_allocator.cc:468] ****************************________________________________________________________________________
I0407 11:10:03.933348 139844004888960 run_docker.py:255] 2022-04-07 16:10:03.931752: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2040] Execution of replica 0 failed: Resource exhausted: Out of memory while trying to allocate 8913848992 bytes.
I0407 11:10:03.949940 139844004888960 run_docker.py:255] Traceback (most recent call last):
I0407 11:10:03.950385 139844004888960 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 422, in <module>
I0407 11:10:03.950693 139844004888960 run_docker.py:255] app.run(main)
I0407 11:10:03.951089 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
I0407 11:10:03.951364 139844004888960 run_docker.py:255] _run_main(main, args)
I0407 11:10:03.951620 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
I0407 11:10:03.951720 139844004888960 run_docker.py:255] sys.exit(main(argv))
I0407 11:10:03.951772 139844004888960 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 406, in main
I0407 11:10:03.951828 139844004888960 run_docker.py:255] random_seed=random_seed)
I0407 11:10:03.951884 139844004888960 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 199, in predict_structure
I0407 11:10:03.951936 139844004888960 run_docker.py:255] random_seed=model_random_seed)
I0407 11:10:03.951988 139844004888960 run_docker.py:255] File "/app/alphafold/alphafold/model/model.py", line 167, in predict
I0407 11:10:03.952040 139844004888960 run_docker.py:255] result = self.apply(self.params, jax.random.PRNGKey(random_seed), feat)
I0407 11:10:03.952091 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/_src/traceback_util.py", line 183, in reraise_with_filtered_traceback
I0407 11:10:03.952144 139844004888960 run_docker.py:255] return fun(*args, **kwargs)
I0407 11:10:03.952196 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/_src/api.py", line 427, in cache_miss
I0407 11:10:03.952247 139844004888960 run_docker.py:255] donated_invars=donated_invars, inline=inline)
I0407 11:10:03.952299 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/core.py", line 1560, in bind
I0407 11:10:03.952349 139844004888960 run_docker.py:255] return call_bind(self, fun, *args, **params)
I0407 11:10:03.952400 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/core.py", line 1551, in call_bind
I0407 11:10:03.952451 139844004888960 run_docker.py:255] outs = primitive.process(top_trace, fun, tracers, params)
I0407 11:10:03.952502 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/core.py", line 1563, in process
I0407 11:10:03.952553 139844004888960 run_docker.py:255] return trace.process_call(self, fun, tracers, params)
I0407 11:10:03.952604 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/core.py", line 606, in process_call
I0407 11:10:03.952850 139844004888960 run_docker.py:255] return primitive.impl(f, *tracers, **params)
I0407 11:10:03.953096 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/interpreters/xla.py", line 595, in _xla_call_impl
I0407 11:10:03.953342 139844004888960 run_docker.py:255] return compiled_fun(*args)
I0407 11:10:03.953590 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/interpreters/xla.py", line 893, in _execute_compiled
I0407 11:10:03.953839 139844004888960 run_docker.py:255] out_bufs = compiled.execute(input_bufs)
I0407 11:10:03.954083 139844004888960 run_docker.py:255] jax._src.traceback_util.UnfilteredStackTrace: RuntimeError: Resource exhausted: Out of memory while trying to allocate 8913848992 bytes.
I0407 11:10:03.954333 139844004888960 run_docker.py:255] 
I0407 11:10:03.954578 139844004888960 run_docker.py:255] The stack trace below excludes JAX-internal frames.
I0407 11:10:03.954821 139844004888960 run_docker.py:255] The preceding is the original exception that occurred, unmodified.
I0407 11:10:03.955081 139844004888960 run_docker.py:255] 
I0407 11:10:03.955332 139844004888960 run_docker.py:255] --------------------
I0407 11:10:03.955577 139844004888960 run_docker.py:255] 
I0407 11:10:03.955823 139844004888960 run_docker.py:255] The above exception was the direct cause of the following exception:
I0407 11:10:03.956065 139844004888960 run_docker.py:255] 
I0407 11:10:03.956306 139844004888960 run_docker.py:255] Traceback (most recent call last):
I0407 11:10:03.956411 139844004888960 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 422, in <module>
I0407 11:10:03.956462 139844004888960 run_docker.py:255] app.run(main)
I0407 11:10:03.956513 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
I0407 11:10:03.956564 139844004888960 run_docker.py:255] _run_main(main, args)
I0407 11:10:03.956614 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
I0407 11:10:03.956665 139844004888960 run_docker.py:255] sys.exit(main(argv))
I0407 11:10:03.956715 139844004888960 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 406, in main
I0407 11:10:03.956765 139844004888960 run_docker.py:255] random_seed=random_seed)
I0407 11:10:03.956816 139844004888960 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 199, in predict_structure
I0407 11:10:03.956866 139844004888960 run_docker.py:255] random_seed=model_random_seed)
I0407 11:10:03.956917 139844004888960 run_docker.py:255] File "/app/alphafold/alphafold/model/model.py", line 167, in predict
I0407 11:10:03.956967 139844004888960 run_docker.py:255] result = self.apply(self.params, jax.random.PRNGKey(random_seed), feat)
I0407 11:10:03.957017 139844004888960 run_docker.py:255] File "/opt/conda/lib/python3.7/site-packages/jax/interpreters/xla.py", line 893, in _execute_compiled
I0407 11:10:03.957067 139844004888960 run_docker.py:255] out_bufs = compiled.execute(input_bufs)
I0407 11:10:03.957118 139844004888960 run_docker.py:255] RuntimeError: Resource exhausted: Out of memory while trying to allocate 8913848992 bytes.

The same issue is also reported in #130. Could anyone help me with this? Thanks.
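As a quick sanity check on the log above, the byte count in the final `RuntimeError` can be converted to GiB to confirm it is the same allocation the `GPU_0_bfc` allocator warned about. This is only an arithmetic sketch; the helper name `bytes_to_gib` is made up for illustration and is not part of AlphaFold or JAX.

```python
def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


# Value copied from "Out of memory while trying to allocate 8913848992 bytes."
failed_alloc_bytes = 8913848992

print(f"{bytes_to_gib(failed_alloc_bytes):.2f} GiB")  # prints "8.30 GiB"
```

So the model run failed on a single allocation of roughly 8.3 GiB, which matches the "8.30GiB" figure in the allocator warning.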

songyinys avatar Apr 07 '22 16:04 songyinys

I saw your issue because I had the same problem. I followed this tutorial to increase my swap size to 40 GB (I have 32 GB of RAM), and my runs work now. I'm very new to Linux, so I could be wrong.
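To see how much RAM and swap a Linux machine currently has before and after changing the swap size, something like the following sketch may help. It assumes a Linux host where `/proc/meminfo` reports `MemTotal` and `SwapTotal` in kB; the function names `parse_meminfo` and `ram_and_swap_gib` are hypothetical helpers, not part of AlphaFold.

```python
def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo field names to their numeric values (kB for sizes)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def ram_and_swap_gib(text: str) -> tuple:
    """Return (total RAM, total swap) in GiB from /proc/meminfo-style text."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 ** 2
    ram = info.get("MemTotal", 0) / kib_per_gib
    swap = info.get("SwapTotal", 0) / kib_per_gib
    return ram, swap


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            ram, swap = ram_and_swap_gib(fh.read())
        print(f"RAM: {ram:.1f} GiB, swap: {swap:.1f} GiB")
    except OSError:
        # Not a Linux host, or /proc is unavailable.
        print("Could not read /proc/meminfo")
```

If the reported swap is much smaller than the allocation that failed in the log, running out of host memory is at least a plausible reason for the `Killed` message, though this check alone does not prove it.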

MMMJoey avatar Apr 17 '22 05:04 MMMJoey

I have the same question. How did you solve it?

lbw124765283 avatar Jun 21 '22 09:06 lbw124765283

"I have the same questions. So how do you solve it?" So do I. Does anybody know how to deal with it?

suzejie avatar Jul 23 '22 16:07 suzejie

You can try the solution given by @MMMJoey.

kbrunnerLXG avatar Jun 04 '23 23:06 kbrunnerLXG