Excessively long easy-taxonomy against NR

Open mgabriell1 opened this issue 4 years ago • 1 comments

Hi, I am trying to assign taxonomy to the contigs in a multi-FASTA file, but I'm having issues with the easy-taxonomy command: using NR as the reference database, it has not finished assigning about 804K contigs after 24 h on 16 threads. Due to the limits of the machine I'm using (the partition with an external connection allows only a single core and a rather short time limit), I set up the database using a mix of the databases command and the other commands shown in the user guide. I changed the number of threads between steps because, for example, createdb seemed to work only with the same number of threads with which databases was initially run. Is this to be expected, or have I done something wrong during the database setup?

Thanks in advance for your help and, also, for making this tool!

These are the commands that I've used:

mmseqs databases NR refDB/NR tmp --threads 1 -v 3 --force-reuse 1
mmseqs createdb tmp/11117391383852458210/nr.gz refDB/NR --compressed 0 -v 3
mmseqs createtaxdb refDB/NR tmp
mmseqs createindex refDB/NR tmp --split-memory-limit 100G
mmseqs easy-taxonomy contigs.fasta refDB/NR alnRes tmp --split-memory-limit 100G --threads 16
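
One way to narrow down whether the runtime is simply a matter of scale is to time the same easy-taxonomy call on a small subset of the contigs. The sketch below is only an illustration under stated assumptions: the helper names `fasta_count` and `fasta_head`, the subset size of 1000, and the output names `subset.fasta` and `alnSubset` are hypothetical, while the mmseqs flags are the ones from the commands above. Extrapolating from a subset is rough at best, since fixed costs such as loading the index do not scale with the number of queries.

```shell
#!/bin/sh
# Hypothetical helper: count the records (header lines) in a FASTA file.
fasta_count() {
    grep -c '^>' "$1"
}

# Hypothetical helper: print the first N records of a FASTA file.
# $1 = input FASTA, $2 = number of records to keep
fasta_head() {
    awk -v n="$2" '/^>/ { c++ } c > n { exit } { print }' "$1"
}

# Only run the timing if mmseqs and the input file are actually available.
if command -v mmseqs >/dev/null 2>&1 && [ -f contigs.fasta ]; then
    n=1000                                  # assumed subset size
    total=$(fasta_count contigs.fasta)
    fasta_head contigs.fasta "$n" > subset.fasta

    start=$(date +%s)
    # Same flags as the full run above; only input and output names differ.
    mmseqs easy-taxonomy subset.fasta refDB/NR alnSubset tmp \
        --split-memory-limit 100G --threads 16
    end=$(date +%s)

    echo "subset of $n out of $total contigs took $(( end - start ))s"
fi
```

If the subset run is already slow, the time is probably going into per-run overhead rather than into the number of contigs, which would point back at the database setup.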

This is the output of createdb:

createdb tmp/11117391383852458210/nr.gz refDB/NR --compressed 0 -v 3 

MMseqs Version:       	13.45111
Database type         	0
Shuffle input database	true
Createdb mode         	0
Write lookup file     	1
Offset of numeric ids 	0
Compressed            	0
Verbosity             	3

Converting sequences
[===================================================================================================	1 Mio. sequences processed
===================================================================================================	2 Mio. sequences processed
===================================================================================================	3 Mio. sequences processed
===================================================================================================	4 Mio. sequences processed
===================================================================================================	5 Mio. sequences processed
===================================================================================================	6 Mio. sequences processed
===================================================================================================	7 Mio. sequences processed
===================================================================================================	8 Mio. sequences processed
===================================================================================================	9 Mio. sequences processed
===================================================================================================	10 Mio. sequences processed
===================================================================================================	11 Mio. sequences processed
===================================================================================================	12 Mio. sequences processed
===================================================================================================	13 Mio. sequences processed
===================================================================================================	14 Mio. sequences processed
===================================================================================================	15 Mio. sequences processed
===================================================================================================	16 Mio. sequences processed
===================================================================================================	17 Mio. sequences processed
===================================================================================================	18 Mio. sequences processed
===================================================================================================	19 Mio. sequences processed
===================================================================================================	20 Mio. sequences processed
===================================================================================================	21 Mio. sequences processed
===================================================================================================	22 Mio. sequences processed
===================================================================================================	23 Mio. sequences processed
===================================================================================================	24 Mio. sequences processed
===================================================================================================	25 Mio. sequences processed
===================================================================================================	26 Mio. sequences processed
===================================================================================================	27 Mio. sequences processed
===================================================================================================	28 Mio. sequences processed
===================================================================================================	29 Mio. sequences processed
===================================================================================================	30 Mio. sequences processed
===================================================================================================	31 Mio. sequences processed
===================================================================================================	32 Mio. sequences processed
===================================================================================================	33 Mio. sequences processed
===================================================================================================	34 Mio. sequences processed
===================================================================================================	35 Mio. sequences processed
===================================================================================================	36 Mio. sequences processed
===================================================================================================	37 Mio. sequences processed
===================================================================================================	38 Mio. sequences processed
===================================================================================================	39 Mio. sequences processed
===================================================================================================	40 Mio. sequences processed
===================================================================================================	41 Mio. sequences processed
===================================================================================================	42 Mio. sequences processed
===================================================================================================	43 Mio. sequences processed
===================================================================================================	44 Mio. sequences processed
===================================================================================================	45 Mio. sequences processed
===================================================================================================	46 Mio. sequences processed
===================================================================================================	47 Mio. sequences processed
===================================================================================================	48 Mio. sequences processed
===================================================================================================	49 Mio. sequences processed
===================================================================================================	50 Mio. sequences processed
===================================================================================================	51 Mio. sequences processed
===================================================================================================	52 Mio. sequences processed
===================================================================================================	53 Mio. sequences processed
===================================================================================================	54 Mio. sequences processed
===================================================================================================	55 Mio. sequences processed
===================================================================================================	56 Mio. sequences processed
===================================================================================================	57 Mio. sequences processed
===================================================================================================	58 Mio. sequences processed
===================================================================================================	59 Mio. sequences processed
===================================================================================================	60 Mio. sequences processed
===================================================================================================	61 Mio. sequences processed
===================================================================================================	62 Mio. sequences processed
===================================================================================================	63 Mio. sequences processed
===================================================================================================	64 Mio. sequences processed
===================================================================================================	65 Mio. sequences processed
===================================================================================================	66 Mio. sequences processed
===================================================================================================	67 Mio. sequences processed
===================================================================================================	68 Mio. sequences processed
===================================================================================================	69 Mio. sequences processed
===================================================================================================	70 Mio. sequences processed
===================================================================================================	71 Mio. sequences processed
===================================================================================================	72 Mio. sequences processed
===================================================================================================	73 Mio. sequences processed
===================================================================================================	74 Mio. sequences processed
===================================================================================================	75 Mio. sequences processed
===================================================================================================	76 Mio. sequences processed
===================================================================================================	77 Mio. sequences processed
===================================================================================================	78 Mio. sequences processed
===================================================================================================	79 Mio. sequences processed
===================================================================================================	80 Mio. sequences processed
===================================================================================================	81 Mio. sequences processed
===================================================================================================	82 Mio. sequences processed
===================================================================================================	83 Mio. sequences processed
===================================================================================================	84 Mio. sequences processed
===================================================================================================	85 Mio. sequences processed
===================================================================================================	86 Mio. sequences processed
===================================================================================================	87 Mio. sequences processed
===================================================================================================	88 Mio. sequences processed
===================================================================================================	89 Mio. sequences processed
===================================================================================================	90 Mio. sequences processed
===================================================================================================	91 Mio. sequences processed
===================================================================================================	92 Mio. sequences processed
===================================================================================================	93 Mio. sequences processed
===================================================================================================	94 Mio. sequences processed
===================================================================================================	95 Mio. sequences processed
===================================================================================================	96 Mio. sequences processed
===================================================================================================	97 Mio. sequences processed
===================================================================================================	98 Mio. sequences processed
===================================================================================================	99 Mio. sequences processed
===================================================================================================	100 Mio. sequences processed
===================================================================================================	101 Mio. sequences processed
===================================================================================================	102 Mio. sequences processed
===================================================================================================	103 Mio. sequences processed
===================================================================================================	104 Mio. sequences processed
===================================================================================================	105 Mio. sequences processed
===================================================================================================	106 Mio. sequences processed
===================================================================================================	107 Mio. sequences processed
===================================================================================================	108 Mio. sequences processed
===================================================================================================	109 Mio. sequences processed
===================================================================================================	110 Mio. sequences processed
===================================================================================================	111 Mio. sequences processed
===================================================================================================	112 Mio. sequences processed
===================================================================================================	113 Mio. sequences processed
===================================================================================================	114 Mio. sequences processed
===================================================================================================	115 Mio. sequences processed
===================================================================================================	116 Mio. sequences processed
===================================================================================================	117 Mio. sequences processed
===================================================================================================	118 Mio. sequences processed
===================================================================================================	119 Mio. sequences processed
===================================================================================================	120 Mio. sequences processed
===================================================================================================	121 Mio. sequences processed
===================================================================================================	122 Mio. sequences processed
===================================================================================================	123 Mio. sequences processed
===================================================================================================	124 Mio. sequences processed
===================================================================================================	125 Mio. sequences processed
===================================================================================================	126 Mio. sequences processed
===================================================================================================	127 Mio. sequences processed
===================================================================================================	128 Mio. sequences processed
===================================================================================================	129 Mio. sequences processed
===================================================================================================	130 Mio. sequences processed
===================================================================================================	131 Mio. sequences processed
===================================================================================================	132 Mio. sequences processed
===================================================================================================	133 Mio. sequences processed
===================================================================================================	134 Mio. sequences processed
===================================================================================================	135 Mio. sequences processed
===================================================================================================	136 Mio. sequences processed
===================================================================================================	137 Mio. sequences processed
===================================================================================================	138 Mio. sequences processed
===================================================================================================	139 Mio. sequences processed
===================================================================================================	140 Mio. sequences processed
===================================================================================================	141 Mio. sequences processed
===================================================================================================	142 Mio. sequences processed
===================================================================================================	143 Mio. sequences processed
===================================================================================================	144 Mio. sequences processed
===================================================================================================	145 Mio. sequences processed
===================================================================================================	146 Mio. sequences processed
===================================================================================================	147 Mio. sequences processed
===================================================================================================	148 Mio. sequences processed
===================================================================================================	149 Mio. sequences processed
===================================================================================================	150 Mio. sequences processed
===================================================================================================	151 Mio. sequences processed
===================================================================================================	152 Mio. sequences processed
===================================================================================================	153 Mio. sequences processed
===================================================================================================	154 Mio. sequences processed
===================================================================================================	155 Mio. sequences processed
===================================================================================================	156 Mio. sequences processed
===================================================================================================	157 Mio. sequences processed
===================================================================================================	158 Mio. sequences processed
===================================================================================================	159 Mio. sequences processed
===================================================================================================	160 Mio. sequences processed
===================================================================================================	161 Mio. sequences processed
===================================================================================================	162 Mio. sequences processed
===================================================================================================	163 Mio. sequences processed
===================================================================================================	164 Mio. sequences processed
===================================================================================================	165 Mio. sequences processed
===================================================================================================	166 Mio. sequences processed
===================================================================================================	167 Mio. sequences processed
===================================================================================================	168 Mio. sequences processed
===================================================================================================	169 Mio. sequences processed
===================================================================================================	170 Mio. sequences processed
===================================================================================================	171 Mio. sequences processed
===================================================================================================	172 Mio. sequences processed
===================================================================================================	173 Mio. sequences processed
===================================================================================================	174 Mio. sequences processed
===================================================================================================	175 Mio. sequences processed
===================================================================================================	176 Mio. sequences processed
===================================================================================================	177 Mio. sequences processed
===================================================================================================	178 Mio. sequences processed
===================================================================================================	179 Mio. sequences processed
===================================================================================================	180 Mio. sequences processed
===================================================================================================	181 Mio. sequences processed
===================================================================================================	182 Mio. sequences processed
===================================================================================================	183 Mio. sequences processed
===================================================================================================	184 Mio. sequences processed
===================================================================================================	185 Mio. sequences processed
===================================================================================================	186 Mio. sequences processed
===================================================================================================	187 Mio. sequences processed
===================================================================================================	188 Mio. sequences processed
===================================================================================================	189 Mio. sequences processed
===================================================================================================	190 Mio. sequences processed
===================================================================================================	191 Mio. sequences processed
===================================================================================================	192 Mio. sequences processed
===================================================================================================	193 Mio. sequences processed
===================================================================================================	194 Mio. sequences processed
===================================================================================================	195 Mio. sequences processed
===================================================================================================	196 Mio. sequences processed
===================================================================================================	197 Mio. sequences processed
===================================================================================================	198 Mio. sequences processed
===================================================================================================	199 Mio. sequences processed
===================================================================================================	200 Mio. sequences processed
===================================================================================================	201 Mio. sequences processed
===================================================================================================	202 Mio. sequences processed
===================================================================================================	203 Mio. sequences processed
===================================================================================================	204 Mio. sequences processed
===================================================================================================	205 Mio. sequences processed
===================================================================================================	206 Mio. sequences processed
===================================================================================================	207 Mio. sequences processed
===================================================================================================	208 Mio. sequences processed
===================================================================================================	209 Mio. sequences processed
===================================================================================================	210 Mio. sequences processed
===================================================================================================	211 Mio. sequences processed
===================================================================================================	212 Mio. sequences processed
===================================================================================================	213 Mio. sequences processed
===================================================================================================	214 Mio. sequences processed
===================================================================================================	215 Mio. sequences processed
===================================================================================================	216 Mio. sequences processed
===================================================================================================	217 Mio. sequences processed
===================================================================================================	218 Mio. sequences processed
===================================================================================================	219 Mio. sequences processed
===================================================================================================	220 Mio. sequences processed
===================================================================================================	221 Mio. sequences processed
===================================================================================================	222 Mio. sequences processed
===================================================================================================	223 Mio. sequences processed
===================================================================================================	224 Mio. sequences processed
===================================================================================================	225 Mio. sequences processed
===================================================================================================	226 Mio. sequences processed
===================================================================================================	227 Mio. sequences processed
===================================================================================================	228 Mio. sequences processed
===================================================================================================	229 Mio. sequences processed
===================================================================================================	230 Mio. sequences processed
===================================================================================================	231 Mio. sequences processed
===================================================================================================	232 Mio. sequences processed
===================================================================================================	233 Mio. sequences processed
===================================================================================================	234 Mio. sequences processed
===================================================================================================	235 Mio. sequences processed
===================================================================================================	236 Mio. sequences processed
===================================================================================================	237 Mio. sequences processed
===================================================================================================	238 Mio. sequences processed
===================================================================================================	239 Mio. sequences processed
===================================================================================================	240 Mio. sequences processed
===================================================================================================	241 Mio. sequences processed
===================================================================================================	242 Mio. sequences processed
===================================================================================================	243 Mio. sequences processed
===================================================================================================	244 Mio. sequences processed
===================================================================================================	245 Mio. sequences processed
===================================================================================================	246 Mio. sequences processed
===================================================================================================	247 Mio. sequences processed
===================================================================================================	248 Mio. sequences processed
===================================================================================================	249 Mio. sequences processed
===================================================================================================	250 Mio. sequences processed
===================================================================================================	251 Mio. sequences processed
===================================================================================================	252 Mio. sequences processed
===================================================================================================	253 Mio. sequences processed
===================================================================================================	254 Mio. sequences processed
===================================================================================================	255 Mio. sequences processed
===================================================================================================	256 Mio. sequences processed
===================================================================================================	257 Mio. sequences processed
===================================================================================================	258 Mio. sequences processed
===================================================================================================	259 Mio. sequences processed
===================================================================================================	260 Mio. sequences processed
===================================================================================================	261 Mio. sequences processed
===================================================================================================	262 Mio. sequences processed
===================================================================================================	263 Mio. sequences processed
===================================================================================================	264 Mio. sequences processed
===================================================================================================	265 Mio. sequences processed
===================================================================================================	266 Mio. sequences processed
===================================================================================================	267 Mio. sequences processed
===================================================================================================	268 Mio. sequences processed
===================================================================================================	269 Mio. sequences processed
===================================================================================================	270 Mio. sequences processed
===================================================================================================	271 Mio. sequences processed
===================================================================================================	272 Mio. sequences processed
===================================================================================================	273 Mio. sequences processed
===================================================================================================	274 Mio. sequences processed
===================================================================================================	275 Mio. sequences processed
===================================================================================================	276 Mio. sequences processed
===================================================================================================	277 Mio. sequences processed
===================================================================================================	278 Mio. sequences processed
===================================================================================================	279 Mio. sequences processed
===================================================================================================	280 Mio. sequences processed
===================================================================================================	281 Mio. sequences processed
===================================================================================================	282 Mio. sequences processed
===================================================================================================	283 Mio. sequences processed
===================================================================================================	284 Mio. sequences processed
===================================================================================================	285 Mio. sequences processed
===================================================================================================	286 Mio. sequences processed
===================================================================================================	287 Mio. sequences processed
===================================================================================================	288 Mio. sequences processed
===================================================================================================	289 Mio. sequences processed
===================================================================================================	290 Mio. sequences processed
===================================================================================================	291 Mio. sequences processed
===================================================================================================	292 Mio. sequences processed
===================================================================================================	293 Mio. sequences processed
===================================================================================================	294 Mio. sequences processed
===================================================================================================	295 Mio. sequences processed
===================================================================================================	296 Mio. sequences processed
===================================================================================================	297 Mio. sequences processed
===================================================================================================	298 Mio. sequences processed
===================================================================================================	299 Mio. sequences processed
===================================================================================================	300 Mio. sequences processed
===================================================================================================	301 Mio. sequences processed
===================================================================================================	302 Mio. sequences processed
===================================================================================================	303 Mio. sequences processed
===================================================================================================	304 Mio. sequences processed
===================================================================================================	305 Mio. sequences processed
===================================================================================================	306 Mio. sequences processed
===================================================================================================	307 Mio. sequences processed
===================================================================================================	308 Mio. sequences processed
===================================================================================================	309 Mio. sequences processed
===================================================================================================	310 Mio. sequences processed
===================================================================================================	311 Mio. sequences processed
===================================================================================================	312 Mio. sequences processed
===================================================================================================	313 Mio. sequences processed
===================================================================================================	314 Mio. sequences processed
===================================================================================================	315 Mio. sequences processed
===================================================================================================	316 Mio. sequences processed
===================================================================================================	317 Mio. sequences processed
===================================================================================================	318 Mio. sequences processed
===================================================================================================	319 Mio. sequences processed
===================================================================================================	320 Mio. sequences processed
===================================================================================================	321 Mio. sequences processed
===================================================================================================	322 Mio. sequences processed
===================================================================================================	323 Mio. sequences processed
===================================================================================================	324 Mio. sequences processed
===================================================================================================	325 Mio. sequences processed
===================================================================================================	326 Mio. sequences processed
===================================================================================================	327 Mio. sequences processed
===================================================================================================	328 Mio. sequences processed
===================================================================================================	329 Mio. sequences processed
===================================================================================================	330 Mio. sequences processed
===================================================================================================	331 Mio. sequences processed
===================================================================================================	332 Mio. sequences processed
===================================================================================================	333 Mio. sequences processed
===================================================================================================	334 Mio. sequences processed
===================================================================================================	335 Mio. sequences processed
===================================================================================================	336 Mio. sequences processed
===================================================================================================	337 Mio. sequences processed
===================================================================================================	338 Mio. sequences processed
===================================================================================================	339 Mio. sequences processed
===================================================================================================	340 Mio. sequences processed
===================================================================================================	341 Mio. sequences processed
===================================================================================================	342 Mio. sequences processed
===================================================================================================	343 Mio. sequences processed
===================================================================================================	344 Mio. sequences processed
===================================================================================================	345 Mio. sequences processed
===================================================================================================	346 Mio. sequences processed
===================================================================================================	347 Mio. sequences processed
===================================================================================================	348 Mio. sequences processed
===================================================================================================	349 Mio. sequences processed
===================================================================================================	350 Mio. sequences processed
===================================================================================================	351 Mio. sequences processed
===================================================================================================	352 Mio. sequences processed
===================================================================================================	353 Mio. sequences processed
===================================================================================================	354 Mio. sequences processed
===================================================================================================	355 Mio. sequences processed
===================================================================================================	356 Mio. sequences processed
===================================================================================================	357 Mio. sequences processed
===================================================================================================	358 Mio. sequences processed
===================================================================================================	359 Mio. sequences processed
===================================================================================================	360 Mio. sequences processed
===================================================================================================	361 Mio. sequences processed
===================================================================================================	362 Mio. sequences processed
===================================================================================================	363 Mio. sequences processed
===================================================================================================	364 Mio. sequences processed
===================================================================================================	365 Mio. sequences processed
===================================================================================================	366 Mio. sequences processed
===================================================================================================	367 Mio. sequences processed
===================================================================================================	368 Mio. sequences processed
===================================================================================================	369 Mio. sequences processed
===================================================================================================	370 Mio. sequences processed
===================================================================================================	371 Mio. sequences processed
===================================================================================================	372 Mio. sequences processed
===================================================================================================	373 Mio. sequences processed
===================================================================================================	374 Mio. sequences processed
===================================================================================================	375 Mio. sequences processed
===================================================================================================	376 Mio. sequences processed
===================================================================================================	377 Mio. sequences processed
===================================================================================================	378 Mio. sequences processed
===================================================================================================	379 Mio. sequences processed
===================================================================================================	380 Mio. sequences processed
===================================================================================================	381 Mio. sequences processed
===================================================================================================	382 Mio. sequences processed
===================================================================================================	383 Mio. sequences processed
===================================================================================================	384 Mio. sequences processed
===================================================================================================	385 Mio. sequences processed
===================================================================================================	386 Mio. sequences processed
===================================================================================================	387 Mio. sequences processed
===================================================================================================	388 Mio. sequences processed
===================================================================================================	389 Mio. sequences processed
===================================================================================================	390 Mio. sequences processed
===================================================================================================	391 Mio. sequences processed
===================================================================================================	392 Mio. sequences processed
===================================================================================================	393 Mio. sequences processed
===================================================================================================	394 Mio. sequences processed
===================================================================================================	395 Mio. sequences processed
===================================================================================================	396 Mio. sequences processed
===================================================================================================	397 Mio. sequences processed
===================================================================================================	398 Mio. sequences processed
===================================================================================================	399 Mio. sequences processed
===================================================================================================	400 Mio. sequences processed
===================================================================================================	401 Mio. sequences processed
===================================================================================================	402 Mio. sequences processed
===================================================================================================	403 Mio. sequences processed
===================================================================================================	404 Mio. sequences processed
===================================================================================================	405 Mio. sequences processed
===================================================================================================	406 Mio. sequences processed
===================================================================================================	407 Mio. sequences processed
===================================================================================================	408 Mio. sequences processed
===================================================================================================	409 Mio. sequences processed
===================================================================================================	410 Mio. sequences processed
===================================================================================================	411 Mio. sequences processed
===================================================================================================	412 Mio. sequences processed
===================================================================================================	413 Mio. sequences processed
===================================================================================================	414 Mio. sequences processed
===================================================================================================	415 Mio. sequences processed
===================================================================================================	416 Mio. sequences processed
===================================================================================================	417 Mio. sequences processed
===================================================================================================	418 Mio. sequences processed
===================================================================================================	419 Mio. sequences processed
===================================================================================================	420 Mio. sequences processed
===================================================================================================	421 Mio. sequences processed
===================================================================================================	422 Mio. sequences processed
===================================================================================================	423 Mio. sequences processed
===================================================================================================	424 Mio. sequences processed
===================================================================================================	425 Mio. sequences processed
===================================================================================================	426 Mio. sequences processed
===================================================================================================	427 Mio. sequences processed
===================================================================================================	428 Mio. sequences processed
===================================================================================================	429 Mio. sequences processed
===================================================================================================	430 Mio. sequences processed
===================================================================================================	431 Mio. sequences processed
===================================================================================================	432 Mio. sequences processed
===================================================================================================	433 Mio. sequences processed
===================================================================================================	434 Mio. sequences processed
===================================================================================================	435 Mio. sequences processed
===================================================================================================	436 Mio. sequences processed
===================================================================================================	437 Mio. sequences processed
===================================================================================================	438 Mio. sequences processed
===================================================================================================	439 Mio. sequences processed
===================================================================================================	440 Mio. sequences processed
===================================================================================================	441 Mio. sequences processed
===================================================================================================	442 Mio. sequences processed
===================================================================================================	443 Mio. sequences processed
===================================================================================================	444 Mio. sequences processed
============================================================
Time for merging to NR_h: 0h 3m 55s 886ms
Time for merging to NR: 0h 7m 40s 283ms
Database type: Aminoacid
Time for processing: 1h 17m 9s 618ms

This is the output for createindex:

createindex refDB/NR tmp --split-memory-limit 100G 

MMseqs Version:          	13.45111
Seed substitution matrix 	nucl:nucleotide.out,aa:VTML80.out
k-mer length             	0
Alphabet size            	nucl:5,aa:21
Compositional bias       	1
Max sequence length      	65535
Max results per query    	300
Mask residues            	1
Mask lower case residues 	0
Spaced k-mers            	1
Spaced k-mer pattern     	
Sensitivity              	7.5
k-score                  	0
Check compatible         	0
Search type              	0
Split database           	0
Split memory limit       	100G
Verbosity                	3
Threads                  	48
Min codons in orf        	30
Max codons in length     	32734
Max orf gaps             	2147483647
Contig start mode        	2
Contig end mode          	2
Orf start mode           	1
Forward frames           	1,2,3
Reverse frames           	1,2,3
Translation table        	1
Translate orf            	0
Use all table starts     	false
Offset of numeric ids    	0
Create lookup            	0
Compressed               	0
Add orf stop             	false
Overlap between sequences	0
Sequence split mode      	1
Header split mode        	0
Strand selection         	1
Remove temporary files   	false

indexdb refDB/NR refDB/NR --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -k 0 --alph-size nucl:5,aa:21 --comp-bias-corr 1 --max-seq-len 65535 --max-seqs 300 --mask 1 --mask-lower-case 0 --spaced-kmer-mode 1 -s 7.5 --k-score 0 --check-compatible 0 --search-type 0 --split 0 --split-memory-limit 100G -v 3 --threads 48 

Target split mode. Searching through 41 splits
Estimated memory consumption: 79G
Write VERSION (0)
Write META (1)
Write SCOREMATRIX3MER (4)
Write SCOREMATRIX2MER (3)
Write SCOREMATRIXNAME (2)
Write SPACEDPATTERN (23)
Write GENERATOR (22)
Write DBR1INDEX (5)
Write DBR1DATA (6)
Write HDR1INDEX (18)
Write HDR1DATA (19)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 920ms
Index table: Masked residues: 61238522
Index table: fill
[=================================================================] 10.84M 1m 25s 193ms
Index statistics
Entries:          3850121923
DB size:          31796 MB
Avg k-mer size:   3.007908
Top 10 k-mers
    SGQQRIA	33175
    FLLLLLA	30439
    ATQAYAV	30261
    LAYGSGV	30200
    CYGPSYQ	30190
    SVAYNPS	30179
    ACNSPVY	30160
    GSLGSSV	30151
    HALLFPS	30146
    ISEQEGT	30145
Write ENTRIES (9)
Write ENTRIESOFFSETS (10)
Write SEQINDEXDATASIZE (15)
Write SEQINDEXSEQOFFSET (16)
Write SEQINDEXDATA (14)
Write ENTRIESNUM (12)
Write SEQCOUNT (13)
Index table: counting k-mers
[=================================================================] 10.85M 1m 3s 858ms
Index table: Masked residues: 61454634
Index table: fill
[=================================================================] 10.85M 1m 22s 65ms
Index statistics
Entries:          3849611059
DB size:          31793 MB
Avg k-mer size:   3.007509
Top 10 k-mers
    SGQQRIA	33182
    FLLLLLA	29650
    ATQAYAV	29520
    GLGTVAK	29423
    KLKLNKS	29407
    LAYGSGV	29406
    GSLGSSV	29390
    MLYKVMT	29388
    ACNSPVY	29374
    NEQILVS	29366
Write ENTRIES (1009)
Write ENTRIESOFFSETS (1010)
Write SEQINDEXDATASIZE (1015)
Write SEQINDEXSEQOFFSET (1016)
Write SEQINDEXDATA (1014)
Write ENTRIESNUM (1012)
Write SEQCOUNT (1013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 9s 665ms
Index table: Masked residues: 61188721
Index table: fill
[=================================================================] 10.84M 1m 30s 911ms
Index statistics
Entries:          3850232186
DB size:          31796 MB
Avg k-mer size:   3.007994
Top 10 k-mers
    SGQQRIA	33408
    FLLLLLA	30301
    ATQAYAV	30153
    AVNDSVL	30055
    DNALQAS	30055
    LAYGSGV	30055
    SVAYNPS	30029
    GSLGSSV	30023
    ISEQEGT	30012
    ACNSPVY	30011
Write ENTRIES (2009)
Write ENTRIESOFFSETS (2010)
Write SEQINDEXDATASIZE (2015)
Write SEQINDEXSEQOFFSET (2016)
Write SEQINDEXDATA (2014)
Write ENTRIESNUM (2012)
Write SEQCOUNT (2013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 736ms
Index table: Masked residues: 61279535
Index table: fill
[=================================================================] 10.84M 1m 21s 843ms
Index statistics
Entries:          3850105067
DB size:          31796 MB
Avg k-mer size:   3.007895
Top 10 k-mers
    SGQQRIA	32981
    FLLLLLA	30126
    ATQAYAV	29941
    GSLGSSV	29847
    EKVLLLL	29841
    KLKLNKS	29837
    DNALQAS	29818
    HALLFPS	29817
    SVAYNPS	29814
    MLYKVMT	29808
Write ENTRIES (3009)
Write ENTRIESOFFSETS (3010)
Write SEQINDEXDATASIZE (3015)
Write SEQINDEXSEQOFFSET (3016)
Write SEQINDEXDATA (3014)
Write ENTRIESNUM (3012)
Write SEQCOUNT (3013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 501ms
Index table: Masked residues: 61136706
Index table: fill
[=================================================================] 10.84M 1m 21s 674ms
Index statistics
Entries:          3850166774
DB size:          31796 MB
Avg k-mer size:   3.007943
Top 10 k-mers
    SGQQRIA	33368
    FLLLLLA	30128
    ATQAYAV	29916
    VLCNGSG	29834
    LAYGSGV	29833
    SVAYNPS	29819
    GSLGSSV	29814
    FSLCYSP	29805
    ILSISKQ	29801
    TELKAKV	29800
Write ENTRIES (4009)
Write ENTRIESOFFSETS (4010)
Write SEQINDEXDATASIZE (4015)
Write SEQINDEXSEQOFFSET (4016)
Write SEQINDEXDATA (4014)
Write ENTRIESNUM (4012)
Write SEQCOUNT (4013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 3s 676ms
Index table: Masked residues: 61264052
Index table: fill
[=================================================================] 10.85M 1m 22s 163ms
Index statistics
Entries:          3850288340
DB size:          31797 MB
Avg k-mer size:   3.008038
Top 10 k-mers
    SGQQRIA	33315
    FLLLLLA	29996
    ATQAYAV	29786
    LAYGSGV	29736
    AVNDSVL	29728
    GSLGSSV	29722
    KLKLNKS	29704
    SVAYNPS	29704
    ACNSPVY	29692
    GQFVLYN	29673
Write ENTRIES (5009)
Write ENTRIESOFFSETS (5010)
Write SEQINDEXDATASIZE (5015)
Write SEQINDEXSEQOFFSET (5016)
Write SEQINDEXDATA (5014)
Write ENTRIESNUM (5012)
Write SEQCOUNT (5013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 230ms
Index table: Masked residues: 61371917
Index table: fill
[=================================================================] 10.84M 1m 21s 243ms
Index statistics
Entries:          3850040390
DB size:          31795 MB
Avg k-mer size:   3.007844
Top 10 k-mers
    SGQQRIA	33009
    FLLLLLA	30239
    ATQAYAV	30076
    LAYGSGV	29994
    GSLGSSV	29988
    SVAYNPS	29975
    MVVCGTL	29966
    FSLCYSP	29963
    KLKLNKS	29958
    HALLFPS	29956
Write ENTRIES (6009)
Write ENTRIESOFFSETS (6010)
Write SEQINDEXDATASIZE (6015)
Write SEQINDEXSEQOFFSET (6016)
Write SEQINDEXDATA (6014)
Write ENTRIESNUM (6012)
Write SEQCOUNT (6013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 3s 405ms
Index table: Masked residues: 61034741
Index table: fill
[=================================================================] 10.85M 1m 21s 828ms
Index statistics
Entries:          3850317055
DB size:          31797 MB
Avg k-mer size:   3.008060
Top 10 k-mers
    SGQQRIA	32887
    FLLLLLA	30184
    ATQAYAV	29964
    LAYGSGV	29853
    GSLGSSV	29847
    KLKLNKS	29837
    HALLFPS	29834
    SVAYNPS	29827
    ACNSPVY	29817
    FLPLAAY	29796
Write ENTRIES (7009)
Write ENTRIESOFFSETS (7010)
Write SEQINDEXDATASIZE (7015)
Write SEQINDEXSEQOFFSET (7016)
Write SEQINDEXDATA (7014)
Write ENTRIESNUM (7012)
Write SEQCOUNT (7013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 797ms
Index table: Masked residues: 61311938
Index table: fill
[=================================================================] 10.84M 1m 21s 46ms
Index statistics
Entries:          3850086594
DB size:          31795 MB
Avg k-mer size:   3.007880
Top 10 k-mers
    SGQQRIA	33346
    FLLLLLA	30182
    ATQAYAV	30024
    KLKLNKS	29930
    AVNDSVL	29924
    LAYGSGV	29921
    MLYKVMT	29906
    GSLGSSV	29905
    ACNSPVY	29878
    LTNVETP	29872
Write ENTRIES (8009)
Write ENTRIESOFFSETS (8010)
Write SEQINDEXDATASIZE (8015)
Write SEQINDEXSEQOFFSET (8016)
Write SEQINDEXDATA (8014)
Write ENTRIESNUM (8012)
Write SEQCOUNT (8013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 400ms
Index table: Masked residues: 61287007
Index table: fill
[=================================================================] 10.84M 1m 21s 849ms
Index statistics
Entries:          3850445130
DB size:          31798 MB
Avg k-mer size:   3.008160
Top 10 k-mers
    SGQQRIA	33244
    FLLLLLA	30250
    ATQAYAV	30105
    GLGTVAK	30034
    KLKLNKS	30017
    LAYGSGV	30007
    GSLGSSV	29989
    ACNSPVY	29970
    HALLFPS	29959
    ISEQEGT	29956
Write ENTRIES (9009)
Write ENTRIESOFFSETS (9010)
Write SEQINDEXDATASIZE (9015)
Write SEQINDEXSEQOFFSET (9016)
Write SEQINDEXDATA (9014)
Write ENTRIESNUM (9012)
Write SEQCOUNT (9013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 9s 678ms
Index table: Masked residues: 61466528
Index table: fill
[=================================================================] 10.85M 1m 30s 622ms
Index statistics
Entries:          3849908410
DB size:          31794 MB
Avg k-mer size:   3.007741
Top 10 k-mers
    SGQQRIA	33047
    FLLLLLA	30087
    ATQAYAV	29938
    KLKLNKS	29845
    LAYGSGV	29839
    SVAYNPS	29821
    GSLGSSV	29801
    ACNSPVY	29799
    KHFCLLP	29784
    VVLVLLR	29783
Write ENTRIES (10009)
Write ENTRIESOFFSETS (10010)
Write SEQINDEXDATASIZE (10015)
Write SEQINDEXSEQOFFSET (10016)
Write SEQINDEXDATA (10014)
Write ENTRIESNUM (10012)
Write SEQCOUNT (10013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 921ms
Index table: Masked residues: 61076649
Index table: fill
[=================================================================] 10.84M 1m 21s 691ms
Index statistics
Entries:          3850338479
DB size:          31797 MB
Avg k-mer size:   3.008077
Top 10 k-mers
    SGQQRIA	32957
    FLLLLLA	30300
    ATQAYAV	30150
    VLCNGSG	30032
    LAYGSGV	30032
    AVNDSVL	30028
    CYGPSYQ	30023
    TELKAKV	30017
    SVAYNPS	30014
    GSLGSSV	30004
Write ENTRIES (11009)
Write ENTRIESOFFSETS (11010)
Write SEQINDEXDATASIZE (11015)
Write SEQINDEXSEQOFFSET (11016)
Write SEQINDEXDATA (11014)
Write ENTRIESNUM (11012)
Write SEQCOUNT (11013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 9s 855ms
Index table: Masked residues: 61187843
Index table: fill
[=================================================================] 10.85M 1m 30s 773ms
Index statistics
Entries:          3850149201
DB size:          31796 MB
Avg k-mer size:   3.007929
Top 10 k-mers
    SGQQRIA	33023
    FLLLLLA	30135
    ATQAYAV	29963
    LAYGSGV	29880
    SVAYNPS	29853
    GSLGSSV	29842
    HALLFPS	29838
    ACNSPVY	29836
    KLKLNKS	29820
    ISEQEGT	29805
Write ENTRIES (12009)
Write ENTRIESOFFSETS (12010)
Write SEQINDEXDATASIZE (12015)
Write SEQINDEXSEQOFFSET (12016)
Write SEQINDEXDATA (12014)
Write ENTRIESNUM (12012)
Write SEQCOUNT (12013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 214ms
Index table: Masked residues: 61302946
Index table: fill
[=================================================================] 10.84M 1m 22s 633ms
Index statistics
Entries:          3850002684
DB size:          31795 MB
Avg k-mer size:   3.007815
Top 10 k-mers
    SGQQRIA	33277
    FLLLLLA	30092
    ATQAYAV	29927
    MVVCGTL	29836
    KLKLNKS	29833
    LAYGSGV	29827
    GSLGSSV	29825
    ILSISKQ	29800
    LKTNVKN	29795
    ACNSPVY	29795
Write ENTRIES (13009)
Write ENTRIESOFFSETS (13010)
Write SEQINDEXDATASIZE (13015)
Write SEQINDEXSEQOFFSET (13016)
Write SEQINDEXDATA (13014)
Write ENTRIESNUM (13012)
Write SEQCOUNT (13013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 6s 563ms
Index table: Masked residues: 61272135
Index table: fill
[=================================================================] 10.85M 1m 21s 448ms
Index statistics
Entries:          3850117980
DB size:          31796 MB
Avg k-mer size:   3.007905
Top 10 k-mers
    SGQQRIA	33363
    FLLLLLA	29998
    ATQAYAV	29857
    AVNDSVL	29755
    LAYGSGV	29740
    GSLGSSV	29722
    MVVCGTL	29711
    MLYKVMT	29710
    HALLFPS	29694
    ACNSPVY	29694
Write ENTRIES (14009)
Write ENTRIESOFFSETS (14010)
Write SEQINDEXDATASIZE (14015)
Write SEQINDEXSEQOFFSET (14016)
Write SEQINDEXDATA (14014)
Write ENTRIESNUM (14012)
Write SEQCOUNT (14013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 11s 175ms
Index table: Masked residues: 61180635
Index table: fill
[=================================================================] 10.84M 1m 31s 883ms
Index statistics
Entries:          3850138116
DB size:          31796 MB
Avg k-mer size:   3.007920
Top 10 k-mers
    SGQQRIA	33160
    FLLLLLA	30415
    ATQAYAV	30219
    LAYGSGV	30142
    SVAYNPS	30130
    GSLGSSV	30128
    ACNSPVY	30105
    MLYKVMT	30094
    FLPLAAY	30091
    KLKLNKS	30076
Write ENTRIES (15009)
Write ENTRIESOFFSETS (15010)
Write SEQINDEXDATASIZE (15015)
Write SEQINDEXSEQOFFSET (15016)
Write SEQINDEXDATA (15014)
Write ENTRIESNUM (15012)
Write SEQCOUNT (15013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 779ms
Index table: Masked residues: 61262358
Index table: fill
[=================================================================] 10.84M 1m 22s 983ms
Index statistics
Entries:          3849957767
DB size:          31795 MB
Avg k-mer size:   3.007780
Top 10 k-mers
    SGQQRIA	33057
    FLLLLLA	30065
    ATQAYAV	29891
    LAYGSGV	29796
    VLCNGSG	29781
    KLKLNKS	29780
    SVAYNPS	29774
    ACNSPVY	29763
    GSLGSSV	29756
    MLYKVMT	29752
Write ENTRIES (16009)
Write ENTRIESOFFSETS (16010)
Write SEQINDEXDATASIZE (16015)
Write SEQINDEXSEQOFFSET (16016)
Write SEQINDEXDATA (16014)
Write ENTRIESNUM (16012)
Write SEQCOUNT (16013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 149ms
Index table: Masked residues: 61004416
Index table: fill
[=================================================================] 10.84M 1m 21s 354ms
Index statistics
Entries:          3850452900
DB size:          31798 MB
Avg k-mer size:   3.008166
Top 10 k-mers
    SGQQRIA	33588
    FLLLLLA	30144
    ATQAYAV	29993
    LAYGSGV	29895
    MVVCGTL	29874
    AVNDSVL	29868
    CYGPSYQ	29867
    GSLGSSV	29864
    ACNSPVY	29854
    ISEQEGT	29838
Write ENTRIES (17009)
Write ENTRIESOFFSETS (17010)
Write SEQINDEXDATASIZE (17015)
Write SEQINDEXSEQOFFSET (17016)
Write SEQINDEXDATA (17014)
Write ENTRIESNUM (17012)
Write SEQCOUNT (17013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 9s 890ms
Index table: Masked residues: 61440134
Index table: fill
[=================================================================] 10.85M 1m 31s 477ms
Index statistics
Entries:          3849779316
DB size:          31794 MB
Avg k-mer size:   3.007640
Top 10 k-mers
    SGQQRIA	33287
    FLLLLLA	29845
    ATQAYAV	29665
    LAYGSGV	29575
    KLKLNKS	29567
    GSLGSSV	29566
    FSLCYSP	29555
    SVAYNPS	29551
    MLYKVMT	29550
    ACNSPVY	29542
Write ENTRIES (18009)
Write ENTRIESOFFSETS (18010)
Write SEQINDEXDATASIZE (18015)
Write SEQINDEXSEQOFFSET (18016)
Write SEQINDEXDATA (18014)
Write ENTRIESNUM (18012)
Write SEQCOUNT (18013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 12s 514ms
Index table: Masked residues: 61281590
Index table: fill
[=================================================================] 10.84M 1m 31s 295ms
Index statistics
Entries:          3850348785
DB size:          31797 MB
Avg k-mer size:   3.008085
Top 10 k-mers
    SGQQRIA	33176
    FLLLLLA	30272
    ATQAYAV	30107
    AVNDSVL	29995
    KLKLNKS	29989
    LAYGSGV	29986
    MVVCGTL	29961
    GSLGSSV	29957
    ACNSPVY	29952
    MLYKVMT	29936
Write ENTRIES (19009)
Write ENTRIESOFFSETS (19010)
Write SEQINDEXDATASIZE (19015)
Write SEQINDEXSEQOFFSET (19016)
Write SEQINDEXDATA (19014)
Write ENTRIESNUM (19012)
Write SEQCOUNT (19013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 347ms
Index table: Masked residues: 61054807
Index table: fill
[=================================================================] 10.84M 1m 21s 327ms
Index statistics
Entries:          3850437386
DB size:          31798 MB
Avg k-mer size:   3.008154
Top 10 k-mers
    SGQQRIA	33395
    FLLLLLA	30061
    ATQAYAV	29933
    LAYGSGV	29830
    KLKLNKS	29820
    SVAYNPS	29801
    ACNSPVY	29795
    MLYKVMT	29785
    GSLGSSV	29781
    GQFVLYN	29758
Write ENTRIES (20009)
Write ENTRIESOFFSETS (20010)
Write SEQINDEXDATASIZE (20015)
Write SEQINDEXSEQOFFSET (20016)
Write SEQINDEXDATA (20014)
Write ENTRIESNUM (20012)
Write SEQCOUNT (20013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 10s 948ms
Index table: Masked residues: 61358532
Index table: fill
[=================================================================] 10.85M 1m 29s 524ms
Index statistics
Entries:          3849836671
DB size:          31794 MB
Avg k-mer size:   3.007685
Top 10 k-mers
    SGQQRIA	33178
    FLLLLLA	29948
    ATQAYAV	29740
    LAYGSGV	29648
    AVNDSVL	29635
    CYGPSYQ	29631
    SVAYNPS	29630
    GSLGSSV	29623
    ACNSPVY	29604
    FLPLAAY	29581
Write ENTRIES (21009)
Write ENTRIESOFFSETS (21010)
Write SEQINDEXDATASIZE (21015)
Write SEQINDEXSEQOFFSET (21016)
Write SEQINDEXDATA (21014)
Write ENTRIESNUM (21012)
Write SEQCOUNT (21013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 273ms
Index table: Masked residues: 61202841
Index table: fill
[=================================================================] 10.84M 1m 19s 228ms
Index statistics
Entries:          3850254812
DB size:          31796 MB
Avg k-mer size:   3.008012
Top 10 k-mers
    SGQQRIA	33182
    FLLLLLA	30118
    ATQAYAV	29943
    VLCNGSG	29851
    LAYGSGV	29851
    SVAYNPS	29837
    GSLGSSV	29834
    HALLFPS	29812
    ACNSPVY	29806
    ISEQEGT	29802
Write ENTRIES (22009)
Write ENTRIESOFFSETS (22010)
Write SEQINDEXDATASIZE (22015)
Write SEQINDEXSEQOFFSET (22016)
Write SEQINDEXDATA (22014)
Write ENTRIESNUM (22012)
Write SEQCOUNT (22013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 11s 694ms
Index table: Masked residues: 61145173
Index table: fill
[=================================================================] 10.85M 1m 27s 632ms
Index statistics
Entries:          3850176462
DB size:          31796 MB
Avg k-mer size:   3.007950
Top 10 k-mers
    SGQQRIA	33446
    FLLLLLA	30080
    ATQAYAV	29847
    GSLGSSV	29771
    AVNDSVL	29749
    CYGPSYQ	29749
    SVAYNPS	29744
    HALLFPS	29718
    ACNSPVY	29716
    KHFCLLP	29702
Write ENTRIES (23009)
Write ENTRIESOFFSETS (23010)
Write SEQINDEXDATASIZE (23015)
Write SEQINDEXSEQOFFSET (23016)
Write SEQINDEXDATA (23014)
Write ENTRIESNUM (23012)
Write SEQCOUNT (23013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 8s 743ms
Index table: Masked residues: 61136999
Index table: fill
[=================================================================] 10.85M 1m 17s 938ms
Index statistics
Entries:          3850256482
DB size:          31796 MB
Avg k-mer size:   3.008013
Top 10 k-mers
    SGQQRIA	33137
    FLLLLLA	29781
    ATQAYAV	29580
    LAYGSGV	29521
    CYGPSYQ	29506
    SVAYNPS	29500
    FSLCYSP	29491
    GSLGSSV	29490
    ACNSPVY	29486
    ILSISKQ	29461
Write ENTRIES (24009)
Write ENTRIESOFFSETS (24010)
Write SEQINDEXDATASIZE (24015)
Write SEQINDEXSEQOFFSET (24016)
Write SEQINDEXDATA (24014)
Write ENTRIESNUM (24012)
Write SEQCOUNT (24013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 7s 705ms
Index table: Masked residues: 61196311
Index table: fill
[=================================================================] 10.85M 1m 18s 198ms
Index statistics
Entries:          3850220763
DB size:          31796 MB
Avg k-mer size:   3.007985
Top 10 k-mers
    SGQQRIA	33140
    FLLLLLA	29995
    ATQAYAV	29827
    LAYGSGV	29771
    MVVCGTL	29759
    CYGPSYQ	29753
    KLKLNKS	29751
    SVAYNPS	29748
    ACNSPVY	29735
    MLYKVMT	29712
Write ENTRIES (25009)
Write ENTRIESOFFSETS (25010)
Write SEQINDEXDATASIZE (25015)
Write SEQINDEXSEQOFFSET (25016)
Write SEQINDEXDATA (25014)
Write ENTRIESNUM (25012)
Write SEQCOUNT (25013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 11s 929ms
Index table: Masked residues: 61047096
Index table: fill
[=================================================================] 10.85M 1m 27s 703ms
Index statistics
Entries:          3850450523
DB size:          31798 MB
Avg k-mer size:   3.008164
Top 10 k-mers
    SGQQRIA	33254
    FLLLLLA	30111
    ATQAYAV	29941
    LAYGSGV	29869
    CYGPSYQ	29850
    SVAYNPS	29847
    GSLGSSV	29830
    ACNSPVY	29828
    KLKLNKS	29823
    HALLFPS	29811
Write ENTRIES (26009)
Write ENTRIESOFFSETS (26010)
Write SEQINDEXDATASIZE (26015)
Write SEQINDEXSEQOFFSET (26016)
Write SEQINDEXDATA (26014)
Write ENTRIESNUM (26012)
Write SEQCOUNT (26013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 57ms
Index table: Masked residues: 61463986
Index table: fill
[=================================================================] 10.84M 1m 17s 662ms
Index statistics
Entries:          3849969010
DB size:          31795 MB
Avg k-mer size:   3.007788
Top 10 k-mers
    SGQQRIA	33231
    FLLLLLA	30254
    ATQAYAV	30083
    MVVCGTL	29995
    LAYGSGV	29994
    KLKLNKS	29983
    GSLGSSV	29978
    ILSISKQ	29956
    TELKAKV	29954
    ACNSPVY	29953
Write ENTRIES (27009)
Write ENTRIESOFFSETS (27010)
Write SEQINDEXDATASIZE (27015)
Write SEQINDEXSEQOFFSET (27016)
Write SEQINDEXDATA (27014)
Write ENTRIESNUM (27012)
Write SEQCOUNT (27013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 421ms
Index table: Masked residues: 61447173
Index table: fill
[=================================================================] 10.84M 1m 17s 628ms
Index statistics
Entries:          3850043049
DB size:          31795 MB
Avg k-mer size:   3.007846
Top 10 k-mers
    SGQQRIA	33530
    FLLLLLA	29878
    ATQAYAV	29693
    VLCNGSG	29651
    LAYGSGV	29644
    CYGPSYQ	29636
    GSLGSSV	29614
    ACNSPVY	29613
    KLKLNKS	29597
    MVVCGTL	29592
Write ENTRIES (28009)
Write ENTRIESOFFSETS (28010)
Write SEQINDEXDATASIZE (28015)
Write SEQINDEXSEQOFFSET (28016)
Write SEQINDEXDATA (28014)
Write ENTRIESNUM (28012)
Write SEQCOUNT (28013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 265ms
Index table: Masked residues: 61304785
Index table: fill
[=================================================================] 10.84M 1m 17s 421ms
Index statistics
Entries:          3849995941
DB size:          31795 MB
Avg k-mer size:   3.007809
Top 10 k-mers
    SGQQRIA	33071
    FLLLLLA	30126
    ATQAYAV	29984
    LAYGSGV	29870
    GLGTVAK	29855
    VVLVLLR	29854
    DNALQAS	29854
    SVAYNPS	29854
    GSLGSSV	29851
    ACNSPVY	29835
Write ENTRIES (29009)
Write ENTRIESOFFSETS (29010)
Write SEQINDEXDATASIZE (29015)
Write SEQINDEXSEQOFFSET (29016)
Write SEQINDEXDATA (29014)
Write ENTRIESNUM (29012)
Write SEQCOUNT (29013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 542ms
Index table: Masked residues: 61389881
Index table: fill
[=================================================================] 10.84M 1m 17s 448ms
Index statistics
Entries:          3849817877
DB size:          31794 MB
Avg k-mer size:   3.007670
Top 10 k-mers
    SGQQRIA	33369
    FLLLLLA	30183
    ATQAYAV	30042
    VLCNGSG	29941
    MVVCGTL	29937
    LAYGSGV	29936
    GSLGSSV	29920
    ACNSPVY	29901
    TELKAKV	29890
    TLGWLVV	29887
Write ENTRIES (30009)
Write ENTRIESOFFSETS (30010)
Write SEQINDEXDATASIZE (30015)
Write SEQINDEXSEQOFFSET (30016)
Write SEQINDEXDATA (30014)
Write ENTRIESNUM (30012)
Write SEQCOUNT (30013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 5s 69ms
Index table: Masked residues: 61266593
Index table: fill
[=================================================================] 10.85M 1m 17s 362ms
Index statistics
Entries:          3849915772
DB size:          31795 MB
Avg k-mer size:   3.007747
Top 10 k-mers
    SGQQRIA	33329
    FLLLLLA	30187
    ATQAYAV	30023
    GLGTVAK	29954
    LAYGSGV	29930
    CYGPSYQ	29910
    HALLFPS	29907
    SVAYNPS	29904
    GSLGSSV	29900
    ACNSPVY	29883
Write ENTRIES (31009)
Write ENTRIESOFFSETS (31010)
Write SEQINDEXDATASIZE (31015)
Write SEQINDEXSEQOFFSET (31016)
Write SEQINDEXDATA (31014)
Write ENTRIESNUM (31012)
Write SEQCOUNT (31013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 4s 938ms
Index table: Masked residues: 61324289
Index table: fill
[=================================================================] 10.85M 1m 17s 470ms
Index statistics
Entries:          3850149373
DB size:          31796 MB
Avg k-mer size:   3.007929
Top 10 k-mers
    SGQQRIA	32987
    FLLLLLA	29953
    ATQAYAV	29771
    LAYGSGV	29658
    GSLGSSV	29657
    KHHFLFL	29638
    EKVLLLL	29637
    CYGPSYQ	29636
    HALLFPS	29633
    SVAYNPS	29626
Write ENTRIES (32009)
Write ENTRIESOFFSETS (32010)
Write SEQINDEXDATASIZE (32015)
Write SEQINDEXSEQOFFSET (32016)
Write SEQINDEXDATA (32014)
Write ENTRIESNUM (32012)
Write SEQCOUNT (32013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 4s 962ms
Index table: Masked residues: 61229032
Index table: fill
[=================================================================] 10.85M 1m 17s 193ms
Index statistics
Entries:          3850133104
DB size:          31796 MB
Avg k-mer size:   3.007916
Top 10 k-mers
    SGQQRIA	33206
    FLLLLLA	29981
    ATQAYAV	29773
    VLCNGSG	29656
    KLKLNKS	29654
    LAYGSGV	29650
    AVNDSVL	29630
    GSLGSSV	29622
    DNALQAS	29621
    ACNSPVY	29612
Write ENTRIES (33009)
Write ENTRIESOFFSETS (33010)
Write SEQINDEXDATASIZE (33015)
Write SEQINDEXSEQOFFSET (33016)
Write SEQINDEXDATA (33014)
Write ENTRIESNUM (33012)
Write SEQCOUNT (33013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 4s 860ms
Index table: Masked residues: 61307069
Index table: fill
[=================================================================] 10.85M 1m 17s 150ms
Index statistics
Entries:          3849878129
DB size:          31794 MB
Avg k-mer size:   3.007717
Top 10 k-mers
    SGQQRIA	32843
    FLLLLLA	30212
    ATQAYAV	30033
    VLCNGSG	29957
    KLKLNKS	29939
    LAYGSGV	29937
    ILSISKQ	29921
    ISEQEGT	29919
    GSLGSSV	29913
    ACNSPVY	29909
Write ENTRIES (34009)
Write ENTRIESOFFSETS (34010)
Write SEQINDEXDATASIZE (34015)
Write SEQINDEXSEQOFFSET (34016)
Write SEQINDEXDATA (34014)
Write ENTRIESNUM (34012)
Write SEQCOUNT (34013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 319ms
Index table: Masked residues: 61203280
Index table: fill
[=================================================================] 10.84M 1m 17s 218ms
Index statistics
Entries:          3850317290
DB size:          31797 MB
Avg k-mer size:   3.008060
Top 10 k-mers
    SGQQRIA	33293
    FLLLLLA	30047
    ATQAYAV	29922
    KLKLNKS	29793
    LAYGSGV	29790
    HALLFPS	29770
    MVVCGTL	29767
    SVAYNPS	29766
    MLYKVMT	29766
    GSLGSSV	29766
Write ENTRIES (35009)
Write ENTRIESOFFSETS (35010)
Write SEQINDEXDATASIZE (35015)
Write SEQINDEXSEQOFFSET (35016)
Write SEQINDEXDATA (35014)
Write ENTRIESNUM (35012)
Write SEQCOUNT (35013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 360ms
Index table: Masked residues: 61352470
Index table: fill
[=================================================================] 10.84M 1m 17s 49ms
Index statistics
Entries:          3849997806
DB size:          31795 MB
Avg k-mer size:   3.007811
Top 10 k-mers
    SGQQRIA	33159
    FLLLLLA	30256
    ATQAYAV	30113
    LAYGSGV	30002
    GSLGSSV	29975
    SVAYNPS	29966
    ACNSPVY	29962
    KHFCLLP	29940
    KLKLNKS	29934
    MLYKVMT	29933
Write ENTRIES (36009)
Write ENTRIESOFFSETS (36010)
Write SEQINDEXDATASIZE (36015)
Write SEQINDEXSEQOFFSET (36016)
Write SEQINDEXDATA (36014)
Write ENTRIESNUM (36012)
Write SEQCOUNT (36013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 494ms
Index table: Masked residues: 61207851
Index table: fill
[=================================================================] 10.84M 1m 17s 227ms
Index statistics
Entries:          3850216299
DB size:          31796 MB
Avg k-mer size:   3.007981
Top 10 k-mers
    SGQQRIA	33099
    FLLLLLA	29994
    ATQAYAV	29804
    VLCNGSG	29727
    LAYGSGV	29718
    CYGPSYQ	29709
    KLKLNKS	29704
    GSLGSSV	29701
    SVAYNPS	29697
    ISEQEGT	29678
Write ENTRIES (37009)
Write ENTRIESOFFSETS (37010)
Write SEQINDEXDATASIZE (37015)
Write SEQINDEXSEQOFFSET (37016)
Write SEQINDEXDATA (37014)
Write ENTRIESNUM (37012)
Write SEQCOUNT (37013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 815ms
Index table: Masked residues: 61187236
Index table: fill
[=================================================================] 10.84M 1m 16s 934ms
Index statistics
Entries:          3850225941
DB size:          31796 MB
Avg k-mer size:   3.007989
Top 10 k-mers
    SGQQRIA	33128
    FLLLLLA	30170
    ATQAYAV	29962
    VLCNGSG	29895
    LAYGSGV	29894
    KLKLNKS	29870
    GSLGSSV	29870
    TELKAKV	29857
    ACNSPVY	29843
    NEQILVS	29829
Write ENTRIES (38009)
Write ENTRIESOFFSETS (38010)
Write SEQINDEXDATASIZE (38015)
Write SEQINDEXSEQOFFSET (38016)
Write SEQINDEXDATA (38014)
Write ENTRIESNUM (38012)
Write SEQCOUNT (38013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 594ms
Index table: Masked residues: 61224305
Index table: fill
[=================================================================] 10.84M 1m 16s 989ms
Index statistics
Entries:          3850265437
DB size:          31797 MB
Avg k-mer size:   3.008020
Top 10 k-mers
    SGQQRIA	32988
    FLLLLLA	30232
    ATQAYAV	30073
    LAYGSGV	29988
    CYGPSYQ	29965
    SVAYNPS	29965
    ACNSPVY	29944
    HALLFPS	29941
    GSLGSSV	29937
    MLYKVMT	29929
Write ENTRIES (39009)
Write ENTRIESOFFSETS (39010)
Write SEQINDEXDATASIZE (39015)
Write SEQINDEXSEQOFFSET (39016)
Write SEQINDEXDATA (39014)
Write ENTRIESNUM (39012)
Write SEQCOUNT (39013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 7s 44ms
Index table: Masked residues: 61246094
Index table: fill
[=================================================================] 10.84M 1m 17s 250ms
Index statistics
Entries:          3850118943
DB size:          31796 MB
Avg k-mer size:   3.007905
Top 10 k-mers
    SGQQRIA	33367
    FLLLLLA	30314
    ATQAYAV	30098
    GLGTVAK	30043
    LAYGSGV	29998
    GSLGSSV	29988
    SVAYNPS	29980
    HALLFPS	29973
    TELKAKV	29968
    ACNSPVY	29959
Write ENTRIES (40009)
Write ENTRIESOFFSETS (40010)
Write SEQINDEXDATASIZE (40015)
Write SEQINDEXSEQOFFSET (40016)
Write SEQINDEXDATA (40014)
Write ENTRIESNUM (40012)
Write SEQCOUNT (40013)
Time for merging to NR.idx: 0h 0m 0s 603ms
Time for processing: 2h 25m 32s 642ms
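
The createindex log above prints the same "Index statistics" block once per index split, so it may be easier to read when condensed. The snippet below is a minimal sketch and not part of MMseqs2: the function name `summarize_index_log` and the two-block sample text are illustrative assumptions, and it relies only on the `Entries:` and `DB size:` lines having the form shown in the log.

```python
import re


def summarize_index_log(log_text):
    """Condense the per-split 'Index statistics' blocks of a createindex log.

    Assumes each split prints one 'Entries:' line and one 'DB size: <n> MB'
    line, as in the log pasted above.
    """
    entries = [int(v) for v in re.findall(r"^Entries:\s+(\d+)\s*$", log_text, flags=re.M)]
    sizes_mb = [int(v) for v in re.findall(r"^DB size:\s+(\d+) MB\s*$", log_text, flags=re.M)]
    return {
        "splits": len(sizes_mb),           # number of statistics blocks found
        "total_entries": sum(entries),     # k-mer entries summed over all splits
        "total_size_mb": sum(sizes_mb),    # index size summed over all splits
    }


# Two blocks copied from the log above, used here as sample input.
sample = """Index statistics
Entries:          3850121923
DB size:          31796 MB
Avg k-mer size:   3.007908
Index statistics
Entries:          3849611059
DB size:          31793 MB
Avg k-mer size:   3.007509
"""

summary = summarize_index_log(sample)
print(summary)
```

Running the same function on the full saved log (for example, the text read from a hypothetical `createindex.log` file) would report how many splits were written and their combined size. That is useful context when judging how much index data a search has to work through.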

Unfortunately, I don't have the output of createtaxdb, as I managed to run it in interactive mode (it took less than 10 minutes).

This is the output from the easy-taxonomy command:


easy-taxonomy contigs.fasta refDB/NR alnRes tmp --split-memory-limit 100G --threads 16 

MMseqs Version:                        	13.45111
ORF filter                             	0
ORF filter e-value                     	100
ORF filter sensitivity                 	2
LCA mode                               	3
Majority threshold                     	0.5
Vote mode                              	1
LCA ranks                              	
Column with taxonomic lineage          	0
Compressed                             	0
Threads                                	16
Verbosity                              	3
Taxon blacklist                        	12908:unclassified sequences,28384:other sequences
Substitution matrix                    	nucl:nucleotide.out,aa:blosum62.out
Add backtrace                          	false
Alignment mode                         	0
Alignment mode                         	0
Allow wrapped scoring                  	false
E-value threshold                      	0.001
Seq. id. threshold                     	0
Min alignment length                   	0
Seq. id. mode                          	0
Alternative alignments                 	0
Coverage threshold                     	0
Coverage mode                          	0
Max sequence length                    	65535
Compositional bias                     	1
Max reject                             	2147483647
Max accept                             	2147483647
Include identical seq. id.             	false
Preload mode                           	0
Pseudo count a                         	1
Pseudo count b                         	1.5
Score bias                             	0
Realign hits                           	false
Realign score bias                     	-0.2
Realign max seqs                       	2147483647
Gap open cost                          	nucl:5,aa:11
Gap extension cost                     	nucl:2,aa:1
Zdrop                                  	40
Seed substitution matrix               	nucl:nucleotide.out,aa:VTML80.out
Sensitivity                            	4
k-mer length                           	0
k-score                                	2147483647
Alphabet size                          	nucl:5,aa:21
Max results per query                  	300
Split database                         	0
Split mode                             	2
Split memory limit                     	100G
Diagonal scoring                       	true
Exact k-mer matching                   	0
Mask residues                          	1
Mask lower case residues               	0
Minimum diagonal score                 	15
Spaced k-mers                          	1
Spaced k-mer pattern                   	
Local temporary path                   	
Rescore mode                           	0
Remove hits by seq. id. and coverage   	false
Sort results                           	0
Mask profile                           	1
Profile E-value threshold              	0.001
Global sequence weighting              	false
Allow deletions                        	false
Filter MSA                             	1
Maximum seq. id. threshold             	0.9
Minimum seq. id.                       	0
Minimum score per column               	-20
Minimum coverage                       	0
Select N most diverse seqs             	1000
Min codons in orf                      	30
Max codons in length                   	32734
Max orf gaps                           	2147483647
Contig start mode                      	2
Contig end mode                        	2
Orf start mode                         	1
Forward frames                         	1,2,3
Reverse frames                         	1,2,3
Translation table                      	1
Translate orf                          	0
Use all table starts                   	false
Offset of numeric ids                  	0
Create lookup                          	0
Add orf stop                           	false
Overlap between sequences              	0
Sequence split mode                    	1
Header split mode                      	0
Chain overlapping alignments           	0
Merge query                            	1
Search type                            	0
Search iterations                      	1
Start sensitivity                      	4
Search steps                           	1
Exhaustive search mode                 	false
Filter results during exhaustive search	0
Strand selection                       	1
LCA search mode                        	false
Disk space limit                       	0
MPI runner                             	
Force restart with latest tmp          	false
Remove temporary files                 	true
Report mode                            	0
Alignment format                       	0
Format alignment output                	query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
Database output                        	false
First sequence as representative       	false
Target column                          	1
Add full header                        	false
Sequence source                        	0
Database type                          	0
Shuffle input database                 	true
Createdb mode                          	1
Write lookup file                      	0

createdb /contigs.fasta tmp/18031188072042168038/query --dbtype 0 --shuffle 1 --createdb-mode 1 --write-lookup 0 --id-offset 0 --compressed 0 -v 3 

Shuffle database cannot be combined with --createdb-mode 0
We recompute with --shuffle 0
Converting sequences
[Multiline fasta can not be combined with --createdb-mode 0
We recompute with --createdb-mode 1
Time for merging to query_h: 0h 0m 0s 2ms
Time for merging to query: 0h 0m 0s 1ms
[=================================================================================
Time for merging to query_h: 0h 0m 0s 2ms
Time for merging to query: 0h 0m 0s 2ms
Database type: Nucleotide
Time for processing: 0h 0m 8s 216ms
Create directory tmp/18031188072042168038/taxonomy_tmp
taxonomy tmp/18031188072042168038/query refDB/NR tmp/18031188072042168038/result tmp/18031188072042168038/taxonomy_tmp --tax-output-mode 2 --threads 16 --split-memory-limit 100G --remove-tmp-files 1 

extractorfs tmp/18031188072042168038/query tmp/18031188072042168038/taxonomy_tmp/2085806724977121770/orfs_aa --min-length 30 --max-length 32734 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 1 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --translate 1 --use-all-table-starts 0 --id-offset 0 --create-lookup 0 --threads 16 --compressed 0 -v 3 

[=================================================================] 810.40K 31s 522ms
Time for merging to orfs_aa_h: 0h 0m 16s 759ms
Time for merging to orfs_aa: 0h 0m 22s 22ms
Time for processing: 0h 1m 23s 421ms
prefilter tmp/18031188072042168038/taxonomy_tmp/2085806724977121770/orfs_aa refDB/NR.idx tmp/18031188072042168038/taxonomy_tmp/2085806724977121770/orfs_pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -s 2 -k 0 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 65535 --max-seqs 1 --split 0 --split-mode 2 --split-memory-limit 100G -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 0 --exact-kmer-matching 0 --mask 1 --mask-lower-case 0 --min-ungapped-score 3 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 16 --compressed 0 -v 3 

Index version: 16
Generated by:  13.45111
ScoreMatrix:  VTML80.out
Query database size: 47918555 type: Aminoacid
Target split mode. Searching through 41 splits
Estimated memory consumption: 64G
Target database size: 444603205 type: Aminoacid
Process prefiltering step 1 of 41

k-mer similarity threshold: 163
Starting prefiltering scores calculation (step 1 of 41)
Query db start 1 to 47918555
Target db start 1 to 10838348

mgabriell1, Dec 13 '21 10:12

I have also noticed that the NR database and its index files (NR.idx...) occupy around 1.9 TB of disk space. Is that normal?

mgabriell1, Dec 13 '21 18:12