Excessively long easy-taxonomy against NR
Hi,
I am trying to assign taxonomy to the contigs in a multi-FASTA file, but I'm having issues with the easy-taxonomy command: it has not finished assigning about 804K contigs after 24 h on 16 threads, using NR as the reference database.
Due to the limits of the machine I'm using (the partition with an external connection allows only a single core and a rather short time limit), I set up the database with a mix of the databases command and the other commands shown in the user guide. I changed the number of threads between steps because, for example, createdb seemed to work only with the same number of threads that databases was initially run with.
Is this something to be expected or have I done something wrong during the database setup?
Thanks in advance for your help and, also, for making this tool!
These are the commands that I've used:
mmseqs databases NR refDB/NR tmp --threads 1 -v 3 --force-reuse 1
mmseqs createdb tmp/11117391383852458210/nr.gz refDB/NR --compressed 0 -v 3
mmseqs createtaxdb refDB/NR tmp
mmseqs createindex refDB/NR tmp --split-memory-limit 100G
mmseqs easy-taxonomy contigs.fasta refDB/NR alnRes tmp --split-memory-limit 100G --threads 16
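In case it helps with diagnosing this, a shorter timing run on a subset of the contigs could be set up as below. This is only a sketch: subsample_fasta is a hypothetical helper (not part of MMseqs2), and contigs.sub.fasta and alnSub are placeholder names. The timing would give only a rough indication, since loading the index is presumably a fixed cost that does not shrink with the number of queries.

```shell
#!/bin/sh
# Sketch: keep the first N records of a multi-FASTA file so that a short
# timing run is possible before committing to the full set of contigs.
# subsample_fasta is a hypothetical helper name, not an MMseqs2 command.
subsample_fasta() {
    # $1 = input FASTA, $2 = number of records to keep
    # Count header lines; stop as soon as record N+1 starts.
    awk -v n="$2" '/^>/ { seen++ } seen > n { exit } { print }' "$1"
}

# Example usage (file names are placeholders):
#   subsample_fasta contigs.fasta 1000 > contigs.sub.fasta
#   time mmseqs easy-taxonomy contigs.sub.fasta refDB/NR alnSub tmp \
#       --split-memory-limit 100G --threads 16
```

The helper takes the first records rather than a random sample, so the subset may not be representative if the contigs are sorted by length.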
This is the output of createdb:
createdb tmp/11117391383852458210/nr.gz refDB/NR --compressed 0 -v 3
MMseqs Version: 13.45111
Database type 0
Shuffle input database true
Createdb mode 0
Write lookup file 1
Offset of numeric ids 0
Compressed 0
Verbosity 3
Converting sequences
[=================================================================================================== 1 Mio. sequences processed
[... progress lines for 2 to 252 Mio. sequences processed omitted ...]
=================================================================================================== 253 Mio. sequences processed
=================================================================================================== 254 Mio. sequences processed
=================================================================================================== 255 Mio. sequences processed
=================================================================================================== 256 Mio. sequences processed
=================================================================================================== 257 Mio. sequences processed
=================================================================================================== 258 Mio. sequences processed
=================================================================================================== 259 Mio. sequences processed
=================================================================================================== 260 Mio. sequences processed
=================================================================================================== 261 Mio. sequences processed
=================================================================================================== 262 Mio. sequences processed
=================================================================================================== 263 Mio. sequences processed
=================================================================================================== 264 Mio. sequences processed
=================================================================================================== 265 Mio. sequences processed
=================================================================================================== 266 Mio. sequences processed
=================================================================================================== 267 Mio. sequences processed
=================================================================================================== 268 Mio. sequences processed
=================================================================================================== 269 Mio. sequences processed
=================================================================================================== 270 Mio. sequences processed
=================================================================================================== 271 Mio. sequences processed
=================================================================================================== 272 Mio. sequences processed
=================================================================================================== 273 Mio. sequences processed
=================================================================================================== 274 Mio. sequences processed
=================================================================================================== 275 Mio. sequences processed
=================================================================================================== 276 Mio. sequences processed
=================================================================================================== 277 Mio. sequences processed
=================================================================================================== 278 Mio. sequences processed
=================================================================================================== 279 Mio. sequences processed
=================================================================================================== 280 Mio. sequences processed
=================================================================================================== 281 Mio. sequences processed
=================================================================================================== 282 Mio. sequences processed
=================================================================================================== 283 Mio. sequences processed
=================================================================================================== 284 Mio. sequences processed
=================================================================================================== 285 Mio. sequences processed
=================================================================================================== 286 Mio. sequences processed
=================================================================================================== 287 Mio. sequences processed
=================================================================================================== 288 Mio. sequences processed
=================================================================================================== 289 Mio. sequences processed
=================================================================================================== 290 Mio. sequences processed
=================================================================================================== 291 Mio. sequences processed
=================================================================================================== 292 Mio. sequences processed
=================================================================================================== 293 Mio. sequences processed
=================================================================================================== 294 Mio. sequences processed
=================================================================================================== 295 Mio. sequences processed
=================================================================================================== 296 Mio. sequences processed
=================================================================================================== 297 Mio. sequences processed
=================================================================================================== 298 Mio. sequences processed
=================================================================================================== 299 Mio. sequences processed
=================================================================================================== 300 Mio. sequences processed
=================================================================================================== 301 Mio. sequences processed
=================================================================================================== 302 Mio. sequences processed
=================================================================================================== 303 Mio. sequences processed
=================================================================================================== 304 Mio. sequences processed
=================================================================================================== 305 Mio. sequences processed
=================================================================================================== 306 Mio. sequences processed
=================================================================================================== 307 Mio. sequences processed
=================================================================================================== 308 Mio. sequences processed
=================================================================================================== 309 Mio. sequences processed
=================================================================================================== 310 Mio. sequences processed
=================================================================================================== 311 Mio. sequences processed
=================================================================================================== 312 Mio. sequences processed
=================================================================================================== 313 Mio. sequences processed
=================================================================================================== 314 Mio. sequences processed
=================================================================================================== 315 Mio. sequences processed
=================================================================================================== 316 Mio. sequences processed
=================================================================================================== 317 Mio. sequences processed
=================================================================================================== 318 Mio. sequences processed
=================================================================================================== 319 Mio. sequences processed
=================================================================================================== 320 Mio. sequences processed
=================================================================================================== 321 Mio. sequences processed
=================================================================================================== 322 Mio. sequences processed
=================================================================================================== 323 Mio. sequences processed
=================================================================================================== 324 Mio. sequences processed
=================================================================================================== 325 Mio. sequences processed
=================================================================================================== 326 Mio. sequences processed
=================================================================================================== 327 Mio. sequences processed
=================================================================================================== 328 Mio. sequences processed
=================================================================================================== 329 Mio. sequences processed
=================================================================================================== 330 Mio. sequences processed
=================================================================================================== 331 Mio. sequences processed
=================================================================================================== 332 Mio. sequences processed
=================================================================================================== 333 Mio. sequences processed
=================================================================================================== 334 Mio. sequences processed
=================================================================================================== 335 Mio. sequences processed
=================================================================================================== 336 Mio. sequences processed
=================================================================================================== 337 Mio. sequences processed
=================================================================================================== 338 Mio. sequences processed
=================================================================================================== 339 Mio. sequences processed
=================================================================================================== 340 Mio. sequences processed
=================================================================================================== 341 Mio. sequences processed
=================================================================================================== 342 Mio. sequences processed
=================================================================================================== 343 Mio. sequences processed
=================================================================================================== 344 Mio. sequences processed
=================================================================================================== 345 Mio. sequences processed
=================================================================================================== 346 Mio. sequences processed
=================================================================================================== 347 Mio. sequences processed
=================================================================================================== 348 Mio. sequences processed
=================================================================================================== 349 Mio. sequences processed
=================================================================================================== 350 Mio. sequences processed
=================================================================================================== 351 Mio. sequences processed
=================================================================================================== 352 Mio. sequences processed
=================================================================================================== 353 Mio. sequences processed
=================================================================================================== 354 Mio. sequences processed
=================================================================================================== 355 Mio. sequences processed
=================================================================================================== 356 Mio. sequences processed
=================================================================================================== 357 Mio. sequences processed
=================================================================================================== 358 Mio. sequences processed
=================================================================================================== 359 Mio. sequences processed
=================================================================================================== 360 Mio. sequences processed
=================================================================================================== 361 Mio. sequences processed
=================================================================================================== 362 Mio. sequences processed
=================================================================================================== 363 Mio. sequences processed
=================================================================================================== 364 Mio. sequences processed
=================================================================================================== 365 Mio. sequences processed
=================================================================================================== 366 Mio. sequences processed
=================================================================================================== 367 Mio. sequences processed
=================================================================================================== 368 Mio. sequences processed
=================================================================================================== 369 Mio. sequences processed
=================================================================================================== 370 Mio. sequences processed
=================================================================================================== 371 Mio. sequences processed
=================================================================================================== 372 Mio. sequences processed
=================================================================================================== 373 Mio. sequences processed
=================================================================================================== 374 Mio. sequences processed
=================================================================================================== 375 Mio. sequences processed
=================================================================================================== 376 Mio. sequences processed
=================================================================================================== 377 Mio. sequences processed
=================================================================================================== 378 Mio. sequences processed
=================================================================================================== 379 Mio. sequences processed
=================================================================================================== 380 Mio. sequences processed
=================================================================================================== 381 Mio. sequences processed
=================================================================================================== 382 Mio. sequences processed
=================================================================================================== 383 Mio. sequences processed
=================================================================================================== 384 Mio. sequences processed
=================================================================================================== 385 Mio. sequences processed
=================================================================================================== 386 Mio. sequences processed
=================================================================================================== 387 Mio. sequences processed
=================================================================================================== 388 Mio. sequences processed
=================================================================================================== 389 Mio. sequences processed
=================================================================================================== 390 Mio. sequences processed
=================================================================================================== 391 Mio. sequences processed
=================================================================================================== 392 Mio. sequences processed
=================================================================================================== 393 Mio. sequences processed
=================================================================================================== 394 Mio. sequences processed
=================================================================================================== 395 Mio. sequences processed
=================================================================================================== 396 Mio. sequences processed
=================================================================================================== 397 Mio. sequences processed
=================================================================================================== 398 Mio. sequences processed
=================================================================================================== 399 Mio. sequences processed
=================================================================================================== 400 Mio. sequences processed
=================================================================================================== 401 Mio. sequences processed
=================================================================================================== 402 Mio. sequences processed
=================================================================================================== 403 Mio. sequences processed
=================================================================================================== 404 Mio. sequences processed
=================================================================================================== 405 Mio. sequences processed
=================================================================================================== 406 Mio. sequences processed
=================================================================================================== 407 Mio. sequences processed
=================================================================================================== 408 Mio. sequences processed
=================================================================================================== 409 Mio. sequences processed
=================================================================================================== 410 Mio. sequences processed
=================================================================================================== 411 Mio. sequences processed
=================================================================================================== 412 Mio. sequences processed
=================================================================================================== 413 Mio. sequences processed
=================================================================================================== 414 Mio. sequences processed
=================================================================================================== 415 Mio. sequences processed
=================================================================================================== 416 Mio. sequences processed
=================================================================================================== 417 Mio. sequences processed
=================================================================================================== 418 Mio. sequences processed
=================================================================================================== 419 Mio. sequences processed
=================================================================================================== 420 Mio. sequences processed
=================================================================================================== 421 Mio. sequences processed
=================================================================================================== 422 Mio. sequences processed
=================================================================================================== 423 Mio. sequences processed
=================================================================================================== 424 Mio. sequences processed
=================================================================================================== 425 Mio. sequences processed
=================================================================================================== 426 Mio. sequences processed
=================================================================================================== 427 Mio. sequences processed
=================================================================================================== 428 Mio. sequences processed
=================================================================================================== 429 Mio. sequences processed
=================================================================================================== 430 Mio. sequences processed
=================================================================================================== 431 Mio. sequences processed
=================================================================================================== 432 Mio. sequences processed
=================================================================================================== 433 Mio. sequences processed
=================================================================================================== 434 Mio. sequences processed
=================================================================================================== 435 Mio. sequences processed
=================================================================================================== 436 Mio. sequences processed
=================================================================================================== 437 Mio. sequences processed
=================================================================================================== 438 Mio. sequences processed
=================================================================================================== 439 Mio. sequences processed
=================================================================================================== 440 Mio. sequences processed
=================================================================================================== 441 Mio. sequences processed
=================================================================================================== 442 Mio. sequences processed
=================================================================================================== 443 Mio. sequences processed
=================================================================================================== 444 Mio. sequences processed
============================================================
Time for merging to NR_h: 0h 3m 55s 886ms
Time for merging to NR: 0h 7m 40s 283ms
Database type: Aminoacid
Time for processing: 1h 17m 9s 618ms
This is the output for createindex:
createindex refDB/NR tmp --split-memory-limit 100G
MMseqs Version: 13.45111
Seed substitution matrix nucl:nucleotide.out,aa:VTML80.out
k-mer length 0
Alphabet size nucl:5,aa:21
Compositional bias 1
Max sequence length 65535
Max results per query 300
Mask residues 1
Mask lower case residues 0
Spaced k-mers 1
Spaced k-mer pattern
Sensitivity 7.5
k-score 0
Check compatible 0
Search type 0
Split database 0
Split memory limit 100G
Verbosity 3
Threads 48
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Compressed 0
Add orf stop false
Overlap between sequences 0
Sequence split mode 1
Header split mode 0
Strand selection 1
Remove temporary files false
indexdb refDB/NR refDB/NR --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -k 0 --alph-size nucl:5,aa:21 --comp-bias-corr 1 --max-seq-len 65535 --max-seqs 300 --mask 1 --mask-lower-case 0 --spaced-kmer-mode 1 -s 7.5 --k-score 0 --check-compatible 0 --search-type 0 --split 0 --split-memory-limit 100G -v 3 --threads 48
Target split mode. Searching through 41 splits
Estimated memory consumption: 79G
Write VERSION (0)
Write META (1)
Write SCOREMATRIX3MER (4)
Write SCOREMATRIX2MER (3)
Write SCOREMATRIXNAME (2)
Write SPACEDPATTERN (23)
Write GENERATOR (22)
Write DBR1INDEX (5)
Write DBR1DATA (6)
Write HDR1INDEX (18)
Write HDR1DATA (19)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 920ms
Index table: Masked residues: 61238522
Index table: fill
[=================================================================] 10.84M 1m 25s 193ms
Index statistics
Entries: 3850121923
DB size: 31796 MB
Avg k-mer size: 3.007908
Top 10 k-mers
SGQQRIA 33175
FLLLLLA 30439
ATQAYAV 30261
LAYGSGV 30200
CYGPSYQ 30190
SVAYNPS 30179
ACNSPVY 30160
GSLGSSV 30151
HALLFPS 30146
ISEQEGT 30145
Write ENTRIES (9)
Write ENTRIESOFFSETS (10)
Write SEQINDEXDATASIZE (15)
Write SEQINDEXSEQOFFSET (16)
Write SEQINDEXDATA (14)
Write ENTRIESNUM (12)
Write SEQCOUNT (13)
Index table: counting k-mers
[=================================================================] 10.85M 1m 3s 858ms
Index table: Masked residues: 61454634
Index table: fill
[=================================================================] 10.85M 1m 22s 65ms
Index statistics
Entries: 3849611059
DB size: 31793 MB
Avg k-mer size: 3.007509
Top 10 k-mers
SGQQRIA 33182
FLLLLLA 29650
ATQAYAV 29520
GLGTVAK 29423
KLKLNKS 29407
LAYGSGV 29406
GSLGSSV 29390
MLYKVMT 29388
ACNSPVY 29374
NEQILVS 29366
Write ENTRIES (1009)
Write ENTRIESOFFSETS (1010)
Write SEQINDEXDATASIZE (1015)
Write SEQINDEXSEQOFFSET (1016)
Write SEQINDEXDATA (1014)
Write ENTRIESNUM (1012)
Write SEQCOUNT (1013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 9s 665ms
Index table: Masked residues: 61188721
Index table: fill
[=================================================================] 10.84M 1m 30s 911ms
Index statistics
Entries: 3850232186
DB size: 31796 MB
Avg k-mer size: 3.007994
Top 10 k-mers
SGQQRIA 33408
FLLLLLA 30301
ATQAYAV 30153
AVNDSVL 30055
DNALQAS 30055
LAYGSGV 30055
SVAYNPS 30029
GSLGSSV 30023
ISEQEGT 30012
ACNSPVY 30011
Write ENTRIES (2009)
Write ENTRIESOFFSETS (2010)
Write SEQINDEXDATASIZE (2015)
Write SEQINDEXSEQOFFSET (2016)
Write SEQINDEXDATA (2014)
Write ENTRIESNUM (2012)
Write SEQCOUNT (2013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 736ms
Index table: Masked residues: 61279535
Index table: fill
[=================================================================] 10.84M 1m 21s 843ms
Index statistics
Entries: 3850105067
DB size: 31796 MB
Avg k-mer size: 3.007895
Top 10 k-mers
SGQQRIA 32981
FLLLLLA 30126
ATQAYAV 29941
GSLGSSV 29847
EKVLLLL 29841
KLKLNKS 29837
DNALQAS 29818
HALLFPS 29817
SVAYNPS 29814
MLYKVMT 29808
Write ENTRIES (3009)
Write ENTRIESOFFSETS (3010)
Write SEQINDEXDATASIZE (3015)
Write SEQINDEXSEQOFFSET (3016)
Write SEQINDEXDATA (3014)
Write ENTRIESNUM (3012)
Write SEQCOUNT (3013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 501ms
Index table: Masked residues: 61136706
Index table: fill
[=================================================================] 10.84M 1m 21s 674ms
Index statistics
Entries: 3850166774
DB size: 31796 MB
Avg k-mer size: 3.007943
Top 10 k-mers
SGQQRIA 33368
FLLLLLA 30128
ATQAYAV 29916
VLCNGSG 29834
LAYGSGV 29833
SVAYNPS 29819
GSLGSSV 29814
FSLCYSP 29805
ILSISKQ 29801
TELKAKV 29800
Write ENTRIES (4009)
Write ENTRIESOFFSETS (4010)
Write SEQINDEXDATASIZE (4015)
Write SEQINDEXSEQOFFSET (4016)
Write SEQINDEXDATA (4014)
Write ENTRIESNUM (4012)
Write SEQCOUNT (4013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 3s 676ms
Index table: Masked residues: 61264052
Index table: fill
[=================================================================] 10.85M 1m 22s 163ms
Index statistics
Entries: 3850288340
DB size: 31797 MB
Avg k-mer size: 3.008038
Top 10 k-mers
SGQQRIA 33315
FLLLLLA 29996
ATQAYAV 29786
LAYGSGV 29736
AVNDSVL 29728
GSLGSSV 29722
KLKLNKS 29704
SVAYNPS 29704
ACNSPVY 29692
GQFVLYN 29673
Write ENTRIES (5009)
Write ENTRIESOFFSETS (5010)
Write SEQINDEXDATASIZE (5015)
Write SEQINDEXSEQOFFSET (5016)
Write SEQINDEXDATA (5014)
Write ENTRIESNUM (5012)
Write SEQCOUNT (5013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 230ms
Index table: Masked residues: 61371917
Index table: fill
[=================================================================] 10.84M 1m 21s 243ms
Index statistics
Entries: 3850040390
DB size: 31795 MB
Avg k-mer size: 3.007844
Top 10 k-mers
SGQQRIA 33009
FLLLLLA 30239
ATQAYAV 30076
LAYGSGV 29994
GSLGSSV 29988
SVAYNPS 29975
MVVCGTL 29966
FSLCYSP 29963
KLKLNKS 29958
HALLFPS 29956
Write ENTRIES (6009)
Write ENTRIESOFFSETS (6010)
Write SEQINDEXDATASIZE (6015)
Write SEQINDEXSEQOFFSET (6016)
Write SEQINDEXDATA (6014)
Write ENTRIESNUM (6012)
Write SEQCOUNT (6013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 3s 405ms
Index table: Masked residues: 61034741
Index table: fill
[=================================================================] 10.85M 1m 21s 828ms
Index statistics
Entries: 3850317055
DB size: 31797 MB
Avg k-mer size: 3.008060
Top 10 k-mers
SGQQRIA 32887
FLLLLLA 30184
ATQAYAV 29964
LAYGSGV 29853
GSLGSSV 29847
KLKLNKS 29837
HALLFPS 29834
SVAYNPS 29827
ACNSPVY 29817
FLPLAAY 29796
Write ENTRIES (7009)
Write ENTRIESOFFSETS (7010)
Write SEQINDEXDATASIZE (7015)
Write SEQINDEXSEQOFFSET (7016)
Write SEQINDEXDATA (7014)
Write ENTRIESNUM (7012)
Write SEQCOUNT (7013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 797ms
Index table: Masked residues: 61311938
Index table: fill
[=================================================================] 10.84M 1m 21s 46ms
Index statistics
Entries: 3850086594
DB size: 31795 MB
Avg k-mer size: 3.007880
Top 10 k-mers
SGQQRIA 33346
FLLLLLA 30182
ATQAYAV 30024
KLKLNKS 29930
AVNDSVL 29924
LAYGSGV 29921
MLYKVMT 29906
GSLGSSV 29905
ACNSPVY 29878
LTNVETP 29872
Write ENTRIES (8009)
Write ENTRIESOFFSETS (8010)
Write SEQINDEXDATASIZE (8015)
Write SEQINDEXSEQOFFSET (8016)
Write SEQINDEXDATA (8014)
Write ENTRIESNUM (8012)
Write SEQCOUNT (8013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 400ms
Index table: Masked residues: 61287007
Index table: fill
[=================================================================] 10.84M 1m 21s 849ms
Index statistics
Entries: 3850445130
DB size: 31798 MB
Avg k-mer size: 3.008160
Top 10 k-mers
SGQQRIA 33244
FLLLLLA 30250
ATQAYAV 30105
GLGTVAK 30034
KLKLNKS 30017
LAYGSGV 30007
GSLGSSV 29989
ACNSPVY 29970
HALLFPS 29959
ISEQEGT 29956
Write ENTRIES (9009)
Write ENTRIESOFFSETS (9010)
Write SEQINDEXDATASIZE (9015)
Write SEQINDEXSEQOFFSET (9016)
Write SEQINDEXDATA (9014)
Write ENTRIESNUM (9012)
Write SEQCOUNT (9013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 9s 678ms
Index table: Masked residues: 61466528
Index table: fill
[=================================================================] 10.85M 1m 30s 622ms
Index statistics
Entries: 3849908410
DB size: 31794 MB
Avg k-mer size: 3.007741
Top 10 k-mers
SGQQRIA 33047
FLLLLLA 30087
ATQAYAV 29938
KLKLNKS 29845
LAYGSGV 29839
SVAYNPS 29821
GSLGSSV 29801
ACNSPVY 29799
KHFCLLP 29784
VVLVLLR 29783
Write ENTRIES (10009)
Write ENTRIESOFFSETS (10010)
Write SEQINDEXDATASIZE (10015)
Write SEQINDEXSEQOFFSET (10016)
Write SEQINDEXDATA (10014)
Write ENTRIESNUM (10012)
Write SEQCOUNT (10013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 3s 921ms
Index table: Masked residues: 61076649
Index table: fill
[=================================================================] 10.84M 1m 21s 691ms
Index statistics
Entries: 3850338479
DB size: 31797 MB
Avg k-mer size: 3.008077
Top 10 k-mers
SGQQRIA 32957
FLLLLLA 30300
ATQAYAV 30150
VLCNGSG 30032
LAYGSGV 30032
AVNDSVL 30028
CYGPSYQ 30023
TELKAKV 30017
SVAYNPS 30014
GSLGSSV 30004
Write ENTRIES (11009)
Write ENTRIESOFFSETS (11010)
Write SEQINDEXDATASIZE (11015)
Write SEQINDEXSEQOFFSET (11016)
Write SEQINDEXDATA (11014)
Write ENTRIESNUM (11012)
Write SEQCOUNT (11013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 9s 855ms
Index table: Masked residues: 61187843
Index table: fill
[=================================================================] 10.85M 1m 30s 773ms
Index statistics
Entries: 3850149201
DB size: 31796 MB
Avg k-mer size: 3.007929
Top 10 k-mers
SGQQRIA 33023
FLLLLLA 30135
ATQAYAV 29963
LAYGSGV 29880
SVAYNPS 29853
GSLGSSV 29842
HALLFPS 29838
ACNSPVY 29836
KLKLNKS 29820
ISEQEGT 29805
Write ENTRIES (12009)
Write ENTRIESOFFSETS (12010)
Write SEQINDEXDATASIZE (12015)
Write SEQINDEXSEQOFFSET (12016)
Write SEQINDEXDATA (12014)
Write ENTRIESNUM (12012)
Write SEQCOUNT (12013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 214ms
Index table: Masked residues: 61302946
Index table: fill
[=================================================================] 10.84M 1m 22s 633ms
Index statistics
Entries: 3850002684
DB size: 31795 MB
Avg k-mer size: 3.007815
Top 10 k-mers
SGQQRIA 33277
FLLLLLA 30092
ATQAYAV 29927
MVVCGTL 29836
KLKLNKS 29833
LAYGSGV 29827
GSLGSSV 29825
ILSISKQ 29800
LKTNVKN 29795
ACNSPVY 29795
Write ENTRIES (13009)
Write ENTRIESOFFSETS (13010)
Write SEQINDEXDATASIZE (13015)
Write SEQINDEXSEQOFFSET (13016)
Write SEQINDEXDATA (13014)
Write ENTRIESNUM (13012)
Write SEQCOUNT (13013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 6s 563ms
Index table: Masked residues: 61272135
Index table: fill
[=================================================================] 10.85M 1m 21s 448ms
Index statistics
Entries: 3850117980
DB size: 31796 MB
Avg k-mer size: 3.007905
Top 10 k-mers
SGQQRIA 33363
FLLLLLA 29998
ATQAYAV 29857
AVNDSVL 29755
LAYGSGV 29740
GSLGSSV 29722
MVVCGTL 29711
MLYKVMT 29710
HALLFPS 29694
ACNSPVY 29694
Write ENTRIES (14009)
Write ENTRIESOFFSETS (14010)
Write SEQINDEXDATASIZE (14015)
Write SEQINDEXSEQOFFSET (14016)
Write SEQINDEXDATA (14014)
Write ENTRIESNUM (14012)
Write SEQCOUNT (14013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 11s 175ms
Index table: Masked residues: 61180635
Index table: fill
[=================================================================] 10.84M 1m 31s 883ms
Index statistics
Entries: 3850138116
DB size: 31796 MB
Avg k-mer size: 3.007920
Top 10 k-mers
SGQQRIA 33160
FLLLLLA 30415
ATQAYAV 30219
LAYGSGV 30142
SVAYNPS 30130
GSLGSSV 30128
ACNSPVY 30105
MLYKVMT 30094
FLPLAAY 30091
KLKLNKS 30076
Write ENTRIES (15009)
Write ENTRIESOFFSETS (15010)
Write SEQINDEXDATASIZE (15015)
Write SEQINDEXSEQOFFSET (15016)
Write SEQINDEXDATA (15014)
Write ENTRIESNUM (15012)
Write SEQCOUNT (15013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 779ms
Index table: Masked residues: 61262358
Index table: fill
[=================================================================] 10.84M 1m 22s 983ms
Index statistics
Entries: 3849957767
DB size: 31795 MB
Avg k-mer size: 3.007780
Top 10 k-mers
SGQQRIA 33057
FLLLLLA 30065
ATQAYAV 29891
LAYGSGV 29796
VLCNGSG 29781
KLKLNKS 29780
SVAYNPS 29774
ACNSPVY 29763
GSLGSSV 29756
MLYKVMT 29752
Write ENTRIES (16009)
Write ENTRIESOFFSETS (16010)
Write SEQINDEXDATASIZE (16015)
Write SEQINDEXSEQOFFSET (16016)
Write SEQINDEXDATA (16014)
Write ENTRIESNUM (16012)
Write SEQCOUNT (16013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 149ms
Index table: Masked residues: 61004416
Index table: fill
[=================================================================] 10.84M 1m 21s 354ms
Index statistics
Entries: 3850452900
DB size: 31798 MB
Avg k-mer size: 3.008166
Top 10 k-mers
SGQQRIA 33588
FLLLLLA 30144
ATQAYAV 29993
LAYGSGV 29895
MVVCGTL 29874
AVNDSVL 29868
CYGPSYQ 29867
GSLGSSV 29864
ACNSPVY 29854
ISEQEGT 29838
Write ENTRIES (17009)
Write ENTRIESOFFSETS (17010)
Write SEQINDEXDATASIZE (17015)
Write SEQINDEXSEQOFFSET (17016)
Write SEQINDEXDATA (17014)
Write ENTRIESNUM (17012)
Write SEQCOUNT (17013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 9s 890ms
Index table: Masked residues: 61440134
Index table: fill
[=================================================================] 10.85M 1m 31s 477ms
Index statistics
Entries: 3849779316
DB size: 31794 MB
Avg k-mer size: 3.007640
Top 10 k-mers
SGQQRIA 33287
FLLLLLA 29845
ATQAYAV 29665
LAYGSGV 29575
KLKLNKS 29567
GSLGSSV 29566
FSLCYSP 29555
SVAYNPS 29551
MLYKVMT 29550
ACNSPVY 29542
Write ENTRIES (18009)
Write ENTRIESOFFSETS (18010)
Write SEQINDEXDATASIZE (18015)
Write SEQINDEXSEQOFFSET (18016)
Write SEQINDEXDATA (18014)
Write ENTRIESNUM (18012)
Write SEQCOUNT (18013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 12s 514ms
Index table: Masked residues: 61281590
Index table: fill
[=================================================================] 10.84M 1m 31s 295ms
Index statistics
Entries: 3850348785
DB size: 31797 MB
Avg k-mer size: 3.008085
Top 10 k-mers
SGQQRIA 33176
FLLLLLA 30272
ATQAYAV 30107
AVNDSVL 29995
KLKLNKS 29989
LAYGSGV 29986
MVVCGTL 29961
GSLGSSV 29957
ACNSPVY 29952
MLYKVMT 29936
Write ENTRIES (19009)
Write ENTRIESOFFSETS (19010)
Write SEQINDEXDATASIZE (19015)
Write SEQINDEXSEQOFFSET (19016)
Write SEQINDEXDATA (19014)
Write ENTRIESNUM (19012)
Write SEQCOUNT (19013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 347ms
Index table: Masked residues: 61054807
Index table: fill
[=================================================================] 10.84M 1m 21s 327ms
Index statistics
Entries: 3850437386
DB size: 31798 MB
Avg k-mer size: 3.008154
Top 10 k-mers
SGQQRIA 33395
FLLLLLA 30061
ATQAYAV 29933
LAYGSGV 29830
KLKLNKS 29820
SVAYNPS 29801
ACNSPVY 29795
MLYKVMT 29785
GSLGSSV 29781
GQFVLYN 29758
Write ENTRIES (20009)
Write ENTRIESOFFSETS (20010)
Write SEQINDEXDATASIZE (20015)
Write SEQINDEXSEQOFFSET (20016)
Write SEQINDEXDATA (20014)
Write ENTRIESNUM (20012)
Write SEQCOUNT (20013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 10s 948ms
Index table: Masked residues: 61358532
Index table: fill
[=================================================================] 10.85M 1m 29s 524ms
Index statistics
Entries: 3849836671
DB size: 31794 MB
Avg k-mer size: 3.007685
Top 10 k-mers
SGQQRIA 33178
FLLLLLA 29948
ATQAYAV 29740
LAYGSGV 29648
AVNDSVL 29635
CYGPSYQ 29631
SVAYNPS 29630
GSLGSSV 29623
ACNSPVY 29604
FLPLAAY 29581
Write ENTRIES (21009)
Write ENTRIESOFFSETS (21010)
Write SEQINDEXDATASIZE (21015)
Write SEQINDEXSEQOFFSET (21016)
Write SEQINDEXDATA (21014)
Write ENTRIESNUM (21012)
Write SEQCOUNT (21013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 273ms
Index table: Masked residues: 61202841
Index table: fill
[=================================================================] 10.84M 1m 19s 228ms
Index statistics
Entries: 3850254812
DB size: 31796 MB
Avg k-mer size: 3.008012
Top 10 k-mers
SGQQRIA 33182
FLLLLLA 30118
ATQAYAV 29943
VLCNGSG 29851
LAYGSGV 29851
SVAYNPS 29837
GSLGSSV 29834
HALLFPS 29812
ACNSPVY 29806
ISEQEGT 29802
Write ENTRIES (22009)
Write ENTRIESOFFSETS (22010)
Write SEQINDEXDATASIZE (22015)
Write SEQINDEXSEQOFFSET (22016)
Write SEQINDEXDATA (22014)
Write ENTRIESNUM (22012)
Write SEQCOUNT (22013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 11s 694ms
Index table: Masked residues: 61145173
Index table: fill
[=================================================================] 10.85M 1m 27s 632ms
Index statistics
Entries: 3850176462
DB size: 31796 MB
Avg k-mer size: 3.007950
Top 10 k-mers
SGQQRIA 33446
FLLLLLA 30080
ATQAYAV 29847
GSLGSSV 29771
AVNDSVL 29749
CYGPSYQ 29749
SVAYNPS 29744
HALLFPS 29718
ACNSPVY 29716
KHFCLLP 29702
Write ENTRIES (23009)
Write ENTRIESOFFSETS (23010)
Write SEQINDEXDATASIZE (23015)
Write SEQINDEXSEQOFFSET (23016)
Write SEQINDEXDATA (23014)
Write ENTRIESNUM (23012)
Write SEQCOUNT (23013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 8s 743ms
Index table: Masked residues: 61136999
Index table: fill
[=================================================================] 10.85M 1m 17s 938ms
Index statistics
Entries: 3850256482
DB size: 31796 MB
Avg k-mer size: 3.008013
Top 10 k-mers
SGQQRIA 33137
FLLLLLA 29781
ATQAYAV 29580
LAYGSGV 29521
CYGPSYQ 29506
SVAYNPS 29500
FSLCYSP 29491
GSLGSSV 29490
ACNSPVY 29486
ILSISKQ 29461
Write ENTRIES (24009)
Write ENTRIESOFFSETS (24010)
Write SEQINDEXDATASIZE (24015)
Write SEQINDEXSEQOFFSET (24016)
Write SEQINDEXDATA (24014)
Write ENTRIESNUM (24012)
Write SEQCOUNT (24013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 7s 705ms
Index table: Masked residues: 61196311
Index table: fill
[=================================================================] 10.85M 1m 18s 198ms
Index statistics
Entries: 3850220763
DB size: 31796 MB
Avg k-mer size: 3.007985
Top 10 k-mers
SGQQRIA 33140
FLLLLLA 29995
ATQAYAV 29827
LAYGSGV 29771
MVVCGTL 29759
CYGPSYQ 29753
KLKLNKS 29751
SVAYNPS 29748
ACNSPVY 29735
MLYKVMT 29712
Write ENTRIES (25009)
Write ENTRIESOFFSETS (25010)
Write SEQINDEXDATASIZE (25015)
Write SEQINDEXSEQOFFSET (25016)
Write SEQINDEXDATA (25014)
Write ENTRIESNUM (25012)
Write SEQCOUNT (25013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 11s 929ms
Index table: Masked residues: 61047096
Index table: fill
[=================================================================] 10.85M 1m 27s 703ms
Index statistics
Entries: 3850450523
DB size: 31798 MB
Avg k-mer size: 3.008164
Top 10 k-mers
SGQQRIA 33254
FLLLLLA 30111
ATQAYAV 29941
LAYGSGV 29869
CYGPSYQ 29850
SVAYNPS 29847
GSLGSSV 29830
ACNSPVY 29828
KLKLNKS 29823
HALLFPS 29811
Write ENTRIES (26009)
Write ENTRIESOFFSETS (26010)
Write SEQINDEXDATASIZE (26015)
Write SEQINDEXSEQOFFSET (26016)
Write SEQINDEXDATA (26014)
Write ENTRIESNUM (26012)
Write SEQCOUNT (26013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 57ms
Index table: Masked residues: 61463986
Index table: fill
[=================================================================] 10.84M 1m 17s 662ms
Index statistics
Entries: 3849969010
DB size: 31795 MB
Avg k-mer size: 3.007788
Top 10 k-mers
SGQQRIA 33231
FLLLLLA 30254
ATQAYAV 30083
MVVCGTL 29995
LAYGSGV 29994
KLKLNKS 29983
GSLGSSV 29978
ILSISKQ 29956
TELKAKV 29954
ACNSPVY 29953
Write ENTRIES (27009)
Write ENTRIESOFFSETS (27010)
Write SEQINDEXDATASIZE (27015)
Write SEQINDEXSEQOFFSET (27016)
Write SEQINDEXDATA (27014)
Write ENTRIESNUM (27012)
Write SEQCOUNT (27013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 421ms
Index table: Masked residues: 61447173
Index table: fill
[=================================================================] 10.84M 1m 17s 628ms
Index statistics
Entries: 3850043049
DB size: 31795 MB
Avg k-mer size: 3.007846
Top 10 k-mers
SGQQRIA 33530
FLLLLLA 29878
ATQAYAV 29693
VLCNGSG 29651
LAYGSGV 29644
CYGPSYQ 29636
GSLGSSV 29614
ACNSPVY 29613
KLKLNKS 29597
MVVCGTL 29592
Write ENTRIES (28009)
Write ENTRIESOFFSETS (28010)
Write SEQINDEXDATASIZE (28015)
Write SEQINDEXSEQOFFSET (28016)
Write SEQINDEXDATA (28014)
Write ENTRIESNUM (28012)
Write SEQCOUNT (28013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 6s 265ms
Index table: Masked residues: 61304785
Index table: fill
[=================================================================] 10.84M 1m 17s 421ms
Index statistics
Entries: 3849995941
DB size: 31795 MB
Avg k-mer size: 3.007809
Top 10 k-mers
SGQQRIA 33071
FLLLLLA 30126
ATQAYAV 29984
LAYGSGV 29870
GLGTVAK 29855
VVLVLLR 29854
DNALQAS 29854
SVAYNPS 29854
GSLGSSV 29851
ACNSPVY 29835
Write ENTRIES (29009)
Write ENTRIESOFFSETS (29010)
Write SEQINDEXDATASIZE (29015)
Write SEQINDEXSEQOFFSET (29016)
Write SEQINDEXDATA (29014)
Write ENTRIESNUM (29012)
Write SEQCOUNT (29013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 542ms
Index table: Masked residues: 61389881
Index table: fill
[=================================================================] 10.84M 1m 17s 448ms
Index statistics
Entries: 3849817877
DB size: 31794 MB
Avg k-mer size: 3.007670
Top 10 k-mers
SGQQRIA 33369
FLLLLLA 30183
ATQAYAV 30042
VLCNGSG 29941
MVVCGTL 29937
LAYGSGV 29936
GSLGSSV 29920
ACNSPVY 29901
TELKAKV 29890
TLGWLVV 29887
Write ENTRIES (30009)
Write ENTRIESOFFSETS (30010)
Write SEQINDEXDATASIZE (30015)
Write SEQINDEXSEQOFFSET (30016)
Write SEQINDEXDATA (30014)
Write ENTRIESNUM (30012)
Write SEQCOUNT (30013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 5s 69ms
Index table: Masked residues: 61266593
Index table: fill
[=================================================================] 10.85M 1m 17s 362ms
Index statistics
Entries: 3849915772
DB size: 31795 MB
Avg k-mer size: 3.007747
Top 10 k-mers
SGQQRIA 33329
FLLLLLA 30187
ATQAYAV 30023
GLGTVAK 29954
LAYGSGV 29930
CYGPSYQ 29910
HALLFPS 29907
SVAYNPS 29904
GSLGSSV 29900
ACNSPVY 29883
Write ENTRIES (31009)
Write ENTRIESOFFSETS (31010)
Write SEQINDEXDATASIZE (31015)
Write SEQINDEXSEQOFFSET (31016)
Write SEQINDEXDATA (31014)
Write ENTRIESNUM (31012)
Write SEQCOUNT (31013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 4s 938ms
Index table: Masked residues: 61324289
Index table: fill
[=================================================================] 10.85M 1m 17s 470ms
Index statistics
Entries: 3850149373
DB size: 31796 MB
Avg k-mer size: 3.007929
Top 10 k-mers
SGQQRIA 32987
FLLLLLA 29953
ATQAYAV 29771
LAYGSGV 29658
GSLGSSV 29657
KHHFLFL 29638
EKVLLLL 29637
CYGPSYQ 29636
HALLFPS 29633
SVAYNPS 29626
Write ENTRIES (32009)
Write ENTRIESOFFSETS (32010)
Write SEQINDEXDATASIZE (32015)
Write SEQINDEXSEQOFFSET (32016)
Write SEQINDEXDATA (32014)
Write ENTRIESNUM (32012)
Write SEQCOUNT (32013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 4s 962ms
Index table: Masked residues: 61229032
Index table: fill
[=================================================================] 10.85M 1m 17s 193ms
Index statistics
Entries: 3850133104
DB size: 31796 MB
Avg k-mer size: 3.007916
Top 10 k-mers
SGQQRIA 33206
FLLLLLA 29981
ATQAYAV 29773
VLCNGSG 29656
KLKLNKS 29654
LAYGSGV 29650
AVNDSVL 29630
GSLGSSV 29622
DNALQAS 29621
ACNSPVY 29612
Write ENTRIES (33009)
Write ENTRIESOFFSETS (33010)
Write SEQINDEXDATASIZE (33015)
Write SEQINDEXSEQOFFSET (33016)
Write SEQINDEXDATA (33014)
Write ENTRIESNUM (33012)
Write SEQCOUNT (33013)
Index table: counting k-mers
[=================================================================] 10.85M 1m 4s 860ms
Index table: Masked residues: 61307069
Index table: fill
[=================================================================] 10.85M 1m 17s 150ms
Index statistics
Entries: 3849878129
DB size: 31794 MB
Avg k-mer size: 3.007717
Top 10 k-mers
SGQQRIA 32843
FLLLLLA 30212
ATQAYAV 30033
VLCNGSG 29957
KLKLNKS 29939
LAYGSGV 29937
ILSISKQ 29921
ISEQEGT 29919
GSLGSSV 29913
ACNSPVY 29909
Write ENTRIES (34009)
Write ENTRIESOFFSETS (34010)
Write SEQINDEXDATASIZE (34015)
Write SEQINDEXSEQOFFSET (34016)
Write SEQINDEXDATA (34014)
Write ENTRIESNUM (34012)
Write SEQCOUNT (34013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 319ms
Index table: Masked residues: 61203280
Index table: fill
[=================================================================] 10.84M 1m 17s 218ms
Index statistics
Entries: 3850317290
DB size: 31797 MB
Avg k-mer size: 3.008060
Top 10 k-mers
SGQQRIA 33293
FLLLLLA 30047
ATQAYAV 29922
KLKLNKS 29793
LAYGSGV 29790
HALLFPS 29770
MVVCGTL 29767
SVAYNPS 29766
MLYKVMT 29766
GSLGSSV 29766
Write ENTRIES (35009)
Write ENTRIESOFFSETS (35010)
Write SEQINDEXDATASIZE (35015)
Write SEQINDEXSEQOFFSET (35016)
Write SEQINDEXDATA (35014)
Write ENTRIESNUM (35012)
Write SEQCOUNT (35013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 360ms
Index table: Masked residues: 61352470
Index table: fill
[=================================================================] 10.84M 1m 17s 49ms
Index statistics
Entries: 3849997806
DB size: 31795 MB
Avg k-mer size: 3.007811
Top 10 k-mers
SGQQRIA 33159
FLLLLLA 30256
ATQAYAV 30113
LAYGSGV 30002
GSLGSSV 29975
SVAYNPS 29966
ACNSPVY 29962
KHFCLLP 29940
KLKLNKS 29934
MLYKVMT 29933
Write ENTRIES (36009)
Write ENTRIESOFFSETS (36010)
Write SEQINDEXDATASIZE (36015)
Write SEQINDEXSEQOFFSET (36016)
Write SEQINDEXDATA (36014)
Write ENTRIESNUM (36012)
Write SEQCOUNT (36013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 494ms
Index table: Masked residues: 61207851
Index table: fill
[=================================================================] 10.84M 1m 17s 227ms
Index statistics
Entries: 3850216299
DB size: 31796 MB
Avg k-mer size: 3.007981
Top 10 k-mers
SGQQRIA 33099
FLLLLLA 29994
ATQAYAV 29804
VLCNGSG 29727
LAYGSGV 29718
CYGPSYQ 29709
KLKLNKS 29704
GSLGSSV 29701
SVAYNPS 29697
ISEQEGT 29678
Write ENTRIES (37009)
Write ENTRIESOFFSETS (37010)
Write SEQINDEXDATASIZE (37015)
Write SEQINDEXSEQOFFSET (37016)
Write SEQINDEXDATA (37014)
Write ENTRIESNUM (37012)
Write SEQCOUNT (37013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 4s 815ms
Index table: Masked residues: 61187236
Index table: fill
[=================================================================] 10.84M 1m 16s 934ms
Index statistics
Entries: 3850225941
DB size: 31796 MB
Avg k-mer size: 3.007989
Top 10 k-mers
SGQQRIA 33128
FLLLLLA 30170
ATQAYAV 29962
VLCNGSG 29895
LAYGSGV 29894
KLKLNKS 29870
GSLGSSV 29870
TELKAKV 29857
ACNSPVY 29843
NEQILVS 29829
Write ENTRIES (38009)
Write ENTRIESOFFSETS (38010)
Write SEQINDEXDATASIZE (38015)
Write SEQINDEXSEQOFFSET (38016)
Write SEQINDEXDATA (38014)
Write ENTRIESNUM (38012)
Write SEQCOUNT (38013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 5s 594ms
Index table: Masked residues: 61224305
Index table: fill
[=================================================================] 10.84M 1m 16s 989ms
Index statistics
Entries: 3850265437
DB size: 31797 MB
Avg k-mer size: 3.008020
Top 10 k-mers
SGQQRIA 32988
FLLLLLA 30232
ATQAYAV 30073
LAYGSGV 29988
CYGPSYQ 29965
SVAYNPS 29965
ACNSPVY 29944
HALLFPS 29941
GSLGSSV 29937
MLYKVMT 29929
Write ENTRIES (39009)
Write ENTRIESOFFSETS (39010)
Write SEQINDEXDATASIZE (39015)
Write SEQINDEXSEQOFFSET (39016)
Write SEQINDEXDATA (39014)
Write ENTRIESNUM (39012)
Write SEQCOUNT (39013)
Index table: counting k-mers
[=================================================================] 10.84M 1m 7s 44ms
Index table: Masked residues: 61246094
Index table: fill
[=================================================================] 10.84M 1m 17s 250ms
Index statistics
Entries: 3850118943
DB size: 31796 MB
Avg k-mer size: 3.007905
Top 10 k-mers
SGQQRIA 33367
FLLLLLA 30314
ATQAYAV 30098
GLGTVAK 30043
LAYGSGV 29998
GSLGSSV 29988
SVAYNPS 29980
HALLFPS 29973
TELKAKV 29968
ACNSPVY 29959
Write ENTRIES (40009)
Write ENTRIESOFFSETS (40010)
Write SEQINDEXDATASIZE (40015)
Write SEQINDEXSEQOFFSET (40016)
Write SEQINDEXDATA (40014)
Write ENTRIESNUM (40012)
Write SEQCOUNT (40013)
Time for merging to NR.idx: 0h 0m 0s 603ms
Time for processing: 2h 25m 32s 642ms
Unfortunately, I don't have the output of createtaxdb, as I managed to run it in interactive mode (it took less than 10 minutes).
This is the output from the easy-taxonomy command:
easy-taxonomy contigs.fasta refDB/NR alnRes tmp --split-memory-limit 100G --threads 16
MMseqs Version: 13.45111
ORF filter 0
ORF filter e-value 100
ORF filter sensitivity 2
LCA mode 3
Majority threshold 0.5
Vote mode 1
LCA ranks
Column with taxonomic lineage 0
Compressed 0
Threads 16
Verbosity 3
Taxon blacklist 12908:unclassified sequences,28384:other sequences
Substitution matrix nucl:nucleotide.out,aa:blosum62.out
Add backtrace false
Alignment mode 0
Alignment mode 0
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 65535
Compositional bias 1
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a 1
Pseudo count b 1.5
Score bias 0
Realign hits false
Realign score bias -0.2
Realign max seqs 2147483647
Gap open cost nucl:5,aa:11
Gap extension cost nucl:2,aa:1
Zdrop 40
Seed substitution matrix nucl:nucleotide.out,aa:VTML80.out
Sensitivity 4
k-mer length 0
k-score 2147483647
Alphabet size nucl:5,aa:21
Max results per query 300
Split database 0
Split mode 2
Split memory limit 100G
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask lower case residues 0
Minimum diagonal score 15
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Rescore mode 0
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile E-value threshold 0.001
Global sequence weighting false
Allow deletions false
Filter MSA 1
Maximum seq. id. threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Add orf stop false
Overlap between sequences 0
Sequence split mode 1
Header split mode 0
Chain overlapping alignments 0
Merge query 1
Search type 0
Search iterations 1
Start sensitivity 4
Search steps 1
Exhaustive search mode false
Filter results during exhaustive search 0
Strand selection 1
LCA search mode false
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files true
Report mode 0
Alignment format 0
Format alignment output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
Database output false
First sequence as representative false
Target column 1
Add full header false
Sequence source 0
Database type 0
Shuffle input database true
Createdb mode 1
Write lookup file 0
createdb /contigs.fasta tmp/18031188072042168038/query --dbtype 0 --shuffle 1 --createdb-mode 1 --write-lookup 0 --id-offset 0 --compressed 0 -v 3
Shuffle database cannot be combined with --createdb-mode 0
We recompute with --shuffle 0
Converting sequences
[Multiline fasta can not be combined with --createdb-mode 0
We recompute with --createdb-mode 1
Time for merging to query_h: 0h 0m 0s 2ms
Time for merging to query: 0h 0m 0s 1ms
[=================================================================================
Time for merging to query_h: 0h 0m 0s 2ms
Time for merging to query: 0h 0m 0s 2ms
Database type: Nucleotide
Time for processing: 0h 0m 8s 216ms
Create directory tmp/18031188072042168038/taxonomy_tmp
taxonomy tmp/18031188072042168038/query refDB/NR tmp/18031188072042168038/result tmp/18031188072042168038/taxonomy_tmp --tax-output-mode 2 --threads 16 --split-memory-limit 100G --remove-tmp-files 1
extractorfs tmp/18031188072042168038/query tmp/18031188072042168038/taxonomy_tmp/2085806724977121770/orfs_aa --min-length 30 --max-length 32734 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 1 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --translate 1 --use-all-table-starts 0 --id-offset 0 --create-lookup 0 --threads 16 --compressed 0 -v 3
[=================================================================] 810.40K 31s 522ms
Time for merging to orfs_aa_h: 0h 0m 16s 759ms
Time for merging to orfs_aa: 0h 0m 22s 22ms
Time for processing: 0h 1m 23s 421ms
prefilter tmp/18031188072042168038/taxonomy_tmp/2085806724977121770/orfs_aa refDB/NR.idx tmp/18031188072042168038/taxonomy_tmp/2085806724977121770/orfs_pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -s 2 -k 0 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 65535 --max-seqs 1 --split 0 --split-mode 2 --split-memory-limit 100G -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 0 --exact-kmer-matching 0 --mask 1 --mask-lower-case 0 --min-ungapped-score 3 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 16 --compressed 0 -v 3
Index version: 16
Generated by: 13.45111
ScoreMatrix: VTML80.out
Query database size: 47918555 type: Aminoacid
Target split mode. Searching through 41 splits
Estimated memory consumption: 64G
Target database size: 444603205 type: Aminoacid
Process prefiltering step 1 of 41
k-mer similarity threshold: 163
Starting prefiltering scores calculation (step 1 of 41)
Query db start 1 to 47918555
Target db start 1 to 10838348
I have also noticed that the NR database and its index files (NR.idx...) occupy around 1.9 TB of disk space. Is that normal?
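As a rough sanity check on that figure, here is a small Python sketch that adds up the index tables using only numbers from the logs above. It assumes that each of the 41 splits reported by prefilter holds an index table of about 31796 MB (the typical "DB size" value in the createindex output) and that 1 TB is counted as 10^6 MB. The function name and the unit convention are my own choices for illustration, and the sequence data, headers, and taxonomy files are not included in the estimate.

```python
# Figures copied from the createindex and prefilter logs above.
NUM_SPLITS = 41          # "Target split mode. Searching through 41 splits"
SPLIT_SIZE_MB = 31_796   # typical "DB size" value reported for one split


def estimated_index_size_tb(num_splits: int, split_size_mb: int) -> float:
    """Sum the per-split index tables and convert MB to TB.

    Assumption: 1 TB = 10**6 MB; the log does not say whether "MB"
    means 10**6 or 2**20 bytes, so treat the result as approximate.
    """
    return num_splits * split_size_mb / 1_000_000


if __name__ == "__main__":
    size_tb = estimated_index_size_tb(NUM_SPLITS, SPLIT_SIZE_MB)
    print(f"Index tables alone: about {size_tb:.2f} TB")
```

Under these assumptions the index tables alone come to roughly 1.3 TB, so a large part of the 1.9 TB would seem to come from the index itself rather than from the sequence database.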