"indexdb died" error when creating colabfold_envdb_202108_db with MMseqs2
Hi,
I was trying to set up the database, but it breaks when I run this command:
mmseqs createindex colabfold_envdb_202108_db tmp2 --remove-tmp-files 1
The error message I get is this:
MMseqs Version: edb8223d1ea07385ffe63d4f103af0eb12b2058e
Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out
k-mer length 0
Alphabet size aa:21,nucl:5
Compositional bias 1
Max sequence length 65535
Max results per query 300
Mask residues 1
Mask lower case residues 0
Spaced k-mers 1
Spaced k-mer pattern
Sensitivity 7.5
k-score seq:0,prof:0
Check compatible 0
Search type 0
Split database 0
Split memory limit 0
Verbosity 3
Threads 8
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Compressed 0
Add orf stop false
Overlap between sequences 0
Sequence split mode 1
Header split mode 0
Strand selection 1
Remove temporary files true
indexdb colabfold_envdb_202108_db colabfold_envdb_202108_db --seed-sub-mat aa:VTML80.out,nucl:nucleotide.out -k 0 --alph-size aa:21,nucl:5 --comp-bias-corr 1 --max-seq-len 65535 --max-seqs 300 --mask 1 --mask-lower-case 0 --spaced-kmer-mode 1 -s 7.5 --k-score seq:0,prof:0 --check-compatible 0 --search-type 0 --split 0 --split-memory-limit 0 -v 3 --threads 8
Target split mode. Searching through 34 splits
Estimated memory consumption: 29G
Write VERSION (0)
Write META (1)
Write SCOREMATRIX3MER (4)
Write SCOREMATRIX2MER (3)
Write SCOREMATRIXNAME (2)
Write SPACEDPATTERN (23)
Write GENERATOR (22)
Write DBR1INDEX (5)
Write DBR1DATA (6)
Write DBR2INDEX (7)
Killed
Error: indexdb died
It works fine with the uniref30_2103.tar.gz file, though.
How can I resolve this problem?
G.V.
I assume your computer does not have enough RAM. How much RAM does your server have?
I am using an AWS p3.2xlarge instance. It has around 61GB of RAM.

Online searches: our ColabFold server has ~760GB RAM and keeps the full database and index in memory. Batch searches: a batch search requires less memory, but it's still approximately 1 byte per residue, so I would assume you need at least 90GB. We still need to figure out what the lower bound is for this database.
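Based on the sizing above, one way to keep the index step inside available RAM is MMseqs2's `--split-memory-limit` option combined with `--split 0` (automatic split selection). The snippet below is only a sketch, assuming a Linux host with /proc/meminfo; the "use half of free RAM" heuristic is my own assumption, not an official recommendation:

```shell
# Rough sizing helper (sketch, assumes Linux /proc/meminfo): pick a
# --split-memory-limit that leaves headroom for the OS and other processes.
avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo 2>/dev/null)
avail_kb=${avail_kb:-67108864}               # fall back to 64G if /proc is absent
limit_gb=$(( avail_kb / 1024 / 1024 / 2 ))   # use roughly half of available RAM
[ "$limit_gb" -lt 1 ] && limit_gb=1
echo "suggested: --split 0 --split-memory-limit ${limit_gb}G"
# Uncomment to run the actual index build (needs mmseqs on PATH and lots of disk):
# mmseqs createindex colabfold_envdb_202108_db tmp2 \
#     --remove-tmp-files 1 --split 0 --split-memory-limit ${limit_gb}G
```

Note that even with splitting, the 61GB on a p3.2xlarge may be below the ~90GB floor estimated above for this database.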
I have 128G RAM, but I get the same error.
The error:
Estimated memory consumption: 560G
Process needs more than 38G main memory.
Increase the size of --split or set it to 0 to automatically optimize target database split.
Write VERSION (0)
Write META (1)
Write SCOREMATRIX3MER (4)
Write SCOREMATRIX2MER (3)
Write SCOREMATRIXNAME (2)
Write SPACEDPATTERN (23)
Write GENERATOR (22)
Write DBR1INDEX (5)
Write DBR1DATA (6)
Write DBR2INDEX (7)
Write DBR2DATA (8)
Write HDR1INDEX (18)
Write HDR1DATA (19)
Write ALNINDEX (24)
Write ALNDATA (25)
Index table: counting k-mers
[=================================================================] 100.00% 209.34M 7m 34s 698ms
Index table: Masked residues: 1117805658
Can not allocate entries memory in IndexTable::initMemory
Error: indexdb died
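The log itself suggests the retry: increase `--split` or set it to 0 so MMseqs2 optimizes the target database split automatically. A sketch of that retry for a 128G machine follows; the 100G cap is an assumption chosen to leave headroom for the OS and the k-mer counting phase, not a verified lower bound:

```shell
#!/bin/sh
# Sketch of the retry suggested by the "Increase the size of --split or set it
# to 0" message above: automatic splitting plus an explicit per-split memory cap.
DB=colabfold_envdb_202108_db
LIMIT=100G   # assumption for a 128G host; adjust to your free RAM
CMD="mmseqs createindex $DB tmp2 --remove-tmp-files 1 --split 0 --split-memory-limit $LIMIT"
echo "$CMD"
# Uncomment to actually run (requires mmseqs on PATH):
# $CMD
```

If even the split build fails at the "counting k-mers" stage, the practical alternative is to skip `createindex` entirely and let `mmseqs search` split the target database at query time, at the cost of slower searches.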