Failing to load db index
Although the index for the database has been built and sits in the same directory as the database file, SINA throws the error 'Failed to load "path/to/index" - rebuilding'.
Odd. Can you elaborate?
- where did you get the build from?
- what parameters and data did you use?
Some conditions under which this would occur:
- The index got corrupted somehow, e.g. because the disk was full or your storage quota was exceeded
- You changed the size of k between runs
- You updated the source database (should give a different message though)
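The first and third conditions can be roughly checked from the shell. The sketch below assumes the index sits next to the database with the `.arb` extension replaced by `.sidx`, as the path in the error message suggests. `check_sidx` is a hypothetical helper, not part of SINA, and it cannot detect internal corruption or a changed k.

```shell
# Sketch: sanity-check the .sidx file that sits next to an ARB database.
# Assumption: index path = database path with ".arb" replaced by ".sidx".
# Note: "-nt" (newer than) is not strictly POSIX but works in bash and dash.
check_sidx() {
    db=$1
    idx="${db%.arb}.sidx"

    if [ ! -e "$idx" ]; then
        echo "missing: $idx"
    elif [ ! -s "$idx" ]; then
        # A zero-byte index hints at a full disk or an exceeded quota.
        echo "empty: $idx"
    elif [ "$db" -nt "$idx" ]; then
        # The database was modified after the index was written.
        echo "stale: $idx is older than $db"
    else
        echo "ok: $idx"
    fi
}
```

Call it as `check_sidx /path/to/database.arb`. An "ok" result only rules out the obvious file-level problems.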
I am using SINA in a pipeline, where it is run repeatedly with the arguments below:
sina-1.7 --in {} --search --meta-fmt csv --threads {} --lca-fields tax_slv --turn --search-max-result 100 --db {the absolute path to SILVA ARB SSURef99 } --out {The path to output file} --search-min-sim 0.9 --lca-quorum 0.7 --search-no-fast > /dev/null
Even after removing the built index and letting SINA create and store it again, it still throws the error:
"[Search (internal)] argument index 14:23:42 [Search (internal)] Failed to load "/path/to/db/SILVA-RefNR/SILVA138_SSURefNR99_120620.sidx" - rebuilding"
The data files are regular FASTA files.
Thank you very much.
I've experienced the same problem.
(1) I've downloaded "sina-1.7.2-linux.tar.gz" from this repository, decompressed it, and used the binary within "sina-1.7.2-linux/bin".
(2) My command was as below. I had used the same command for the previous run, except for input and output files.
sina -i input.fasta -r SILVA_138.1_SSURef_NR99_12_06_20_opt.arb -o output.fasta -o output.csv --turn --search --fs-full-len=1300 --lca-fields=tax_slv --show-conf --intype=fasta --preserve-order --fasta-write-dna --fasta-write-dots --csv-crlf --overhang=remove --lowercase=unaligned --insertion=forbid --pen-gapext=2 --calc-idty --fs-kmer-no-fast --search-iupac=pessimistic
(3) Free space on my disk is >20 TB, so I don't think the index was corrupted.
(4) I used the same source database and didn't change the size of k.
Anyway, SINA worked fine after rebuilding the index, which required 15-20 min on my machine (Ubuntu 18.04).
Below, I attach the stdout of the run.
-----
19:21:54 [SINA] This is SINA 1.7.2.
Effective parameters:
add-relatives = 0
auto-filter-field =
auto-filter-threshold = 0.8
calc-idty = 1
colors = 0
csv-crlf = 1
csv-id = name
csv-sep =
db = "SILVA_138.1_SSURef_NR99_12_06_20_opt.arb"
debug-graph = 0
fasta-block = 0
fasta-idx = 0
fasta-write-dna = 1
fasta-write-dots = 1
filter =
fs-cover-gene = 0
fs-engine = internal
fs-full-len = 1300
fs-kmer-len = 10
fs-kmer-mm = 0
fs-kmer-no-fast = 1
fs-kmer-norel = 0
fs-leave-query-out = 0
fs-max = 40
fs-min = 40
fs-min-len = 150
fs-msc = 0.7
fs-msc-max = 2
fs-no-graph = 0
fs-oldmatch = 0
fs-req = 1
fs-req-full = 1
fs-req-gaps = 10
fs-weight = 1
gene-end = 0
gene-start = 0
in = "input.fasta"
insertion = forbid
intype = FASTA
lca-fields = tax_slv
lca-quorum = 0.7
line-length = 0
lowercase = unaligned
markaligned = 0
markcopied = 0
match-score = 2
max-in-flight = 160
meta-fmt = none
min-idty = 0
mismatch-score = -1
no-align = 0
num-pts = 80
out = "output.fasta" "output.csv"
overhang = remove
pen-gap = 5
pen-gapext = 2
prealigned = 0
preserve-order = 1
prot-level = 4
ptport = :/tmp/sina_pt_81559
realign = 0
search = 1
search-all = 0
search-copy-fields =
search-correction = none
search-cover = query
search-filter-lowercase = 0
search-ignore-super = 0
search-iupac = pessimistic
search-kmer-candidates = 1000
search-kmer-len = 10
search-kmer-mm = 0
search-kmer-norel = 0
search-max-result = 10
search-min-sim = 0.7
search-no-fast = 0
search-port = :/tmp/sina_pt2_81559
select-file =
select-skip = 0
select-step = 1
show-conf =
show-diff = 0
show-dist = 0
threads = 4294967295
turn = revcomp
use-subst-matrix = 0
write-used-rels = 0
Processing: 0 [00:00:04]
19:21:58 [Search (internal)] Failed to load "SILVA_138.1_SSURef_NR99_12_06_20_opt.sidx" - rebuilding
19:31:20 [famfinder] Using internal engine for reference search
Processing: 0 [00:09:26]██████████████████████████████████████████████████████████████████████████████████████| 1048576/1048576 [00:09:18 / 00:00:00]
19:31:20 [Search (internal)] Failed to load "SILVA_138.1_SSURef_NR99_12_06_20_opt.sidx" - rebuilding
19:39:58 [SINA] Aligner ready. Processing sequences
19:39:59 [SINA] Took 0.721s to align 24 sequences (33.2428 sequences/s)
19:39:59 [SINA] SINA finished.
19:39:59 [ARB I/O] Closing ARB database '"SILVA_138.1_SSURef_NR99_12_06_20_opt.arb"' ...
-----
Hello,
I have a similar issue, which seems to be linked to the --search-no-fast and --fs-kmer-no-fast parameters. I am trying to replicate the default parameters of the online SILVA tool. My database (SSU Ref NR99) works with these parameters off, but when I use --search-no-fast or --fs-kmer-no-fast, SINA goes into rebuilding the index. After the rebuild, I believe it yields almost the same result as with the parameter off (I compared with the online SILVA tool's output). The index has to be rebuilt every time one of those parameters is switched on.
Below is my command: (my input is a fasta file)
sina -i $query -o $output -r SILVA_138.1_SSURef_NR99.arb --outtype=csv --search --search-db SILVA_138.1_SSURef_NR99.arb --lca-quorum=0.8 --min-idty=0.9 --lca-fields tax_slv --fields align_quality_slv,lca_tax_slv --preserve-order --show-conf --search-no-fast
Find below the stdout of the run
08:23:14 [SINA] This is SINA 1.7.2.
Effective parameters:
add-relatives = 0
auto-filter-field =
auto-filter-threshold = 0.8
calc-idty = 0
colors = 0
csv-crlf = 0
csv-id = name
csv-sep =
db = "/Raw_data/SILVA_138.1_SSURef_NR99.arb"
debug-graph = 0
fasta-block = 0
fasta-idx = 0
fasta-write-dna = 0
fasta-write-dots = 0
fields = align_quality_slv,lca_tax_slv
filter =
fs-cover-gene = 0
fs-engine = internal
fs-full-len = 1400
fs-kmer-len = 10
fs-kmer-mm = 0
fs-kmer-no-fast = 0
fs-kmer-norel = 0
fs-leave-query-out = 0
fs-max = 40
fs-min = 40
fs-min-len = 150
fs-msc = 0.7
fs-msc-max = 2
fs-no-graph = 0
fs-oldmatch = 0
fs-req = 1
fs-req-full = 1
fs-req-gaps = 10
fs-weight = 1
gene-end = 0
gene-start = 0
in = "/Raw_data/input.fasta"
insertion = shift
intype = AUTO
lca-fields = tax_slv
lca-quorum = 0.8
line-length = 0
lowercase = none
markaligned = 0
markcopied = 0
match-score = 2
max-in-flight = 20
meta-fmt = none
min-idty = 0.9
mismatch-score = -1
no-align = 0
num-pts = 10
out = "/Results/result.csv"
outtype = CSV
overhang = attach
pen-gap = 5
pen-gapext = 2
prealigned = 0
preserve-order = 1
prot-level = 4
ptport = :/tmp/sina_pt_188530
realign = 0
search = 1
search-all = 0
search-copy-fields =
search-correction = none
search-cover = query
search-db = "/Raw_data/SILVA_138.1_SSURef_NR99.arb"
search-filter-lowercase = 0
search-ignore-super = 0
search-iupac = optimistic
search-kmer-candidates = 1000
search-kmer-len = 10
search-kmer-mm = 0
search-kmer-norel = 0
search-max-result = 10
search-min-sim = 0.7
search-no-fast = 1
search-port = :/tmp/sina_pt2_188530
select-file =
select-skip = 0
select-step = 1
show-conf =
show-diff = 0
show-dist = 0
threads = 4294967295
turn = none
use-subst-matrix = 0
write-used-rels = 0
08:23:20 [famfinder] Using internal engine for reference search
Processing: 0 [00:00:05]███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 510508/510508 [00:00:00 / 00:00:00]
08:23:20 [Search (internal)] Failed to load "/Raw_data/SILVA_138.1_SSURef_NR99.sidx" - rebuilding
08:29:10 [SINA] Aligner ready. Processing sequences
08:29:44 [SINA] Took 34.455s to align 1258 sequences (36.5108 sequences/s)
08:29:44 [SINA] SINA finished.
08:29:44 [ARB I/O] Closing ARB database '"/Raw_data/SILVA_138.1_SSURef_NR99.arb"' ...
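For anyone comparing runs with and without the no-fast options, it may help to capture each run's output to a file (for example by appending `> run.log 2>&1` to the command) and count how often the rebuild message appears. This is only a sketch: `count_rebuilds` is a hypothetical helper, and the pattern matches the message wording seen in this thread with SINA 1.7.2, which other versions may phrase differently.

```shell
# Sketch: count index rebuilds reported in a captured SINA log file.
# Assumption: each log message is on its own line, as SINA prints it.
count_rebuilds() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status at 0 when there are no matches.
    grep -c 'Failed to load ".*\.sidx" - rebuilding' "$1" || true
}
```

Running `count_rebuilds run.log` for one run with the option and one without shows whether the flag alone triggers the rebuild.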