Error: libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: basic_string
- OS: OS X
- Command:
bcalm -in "hg38/hg38.fna" -out "hg38/hg38.bc31.fa" -kmer-size "31" -nb-cores "1" -abundance-min 1
- Dataset: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
BCALM 2, git commit c8ac60252fa0b2abf511f7363cff7c4342dac2ee
setting storage type to hdf5
[Approximating frequencies of minimizers ] 100 % elapsed: 0 min 10 sec remaining: 0 min 0 sec cpu: 99.8 % mem: [6721, 6721, 0] MB
[DSK: Collecting stats on hg38 ] 100 % elapsed: 0 min 14 sec remaining: 0 min 0 sec cpu: 99.9 % mem: [1049, 1107, 0] MB
[DSK: nb solid kmers found : 2503985560 ] 100 % elapsed: 9 min 8 sec remaining: 0 min 0 sec cpu: 94.2 % mem: [1584, 8561, 0] MB
bcalm_algo params, prefix:hg38/hg38.bc31.fa.unitigs.fa k:31 a:1 minsize:10 threads:1 mintype:1
DSK used 1 passes and 3 partitions
prior to queues allocation 15:44:10 memory [current, maxRSS]: [1593, 0] MB
Starting BCALM2 15:44:10 memory [current, maxRSS]: [1593, 0] MB
[Iterating DSK partitions ] 0 % elapsed: 0 min 0 sec remaining: 0 min 0 sec
Iterated 887340388 kmers, among them 143140847 were doubled
In this superbucket (containing 242671 active minimizers), sum of time spent in lambda's: 1475479.9 msecs
longest lambda: 610.8 msecs
tot time of best scheduling of lambdas: 1475479.9 msecs
best theoretical speedup: 2415.7x
Done with partition 0 16:18:30 memory [current, maxRSS]: [25559, 0] MB
Iterated 903788377 kmers, among them 123275495 were doubled
Loaded 40462618 doubled kmers for partition 1
In this superbucket (containing 59110 active minimizers), [0/1873]
sum of time spent in lambda's: 1558374.0 msecs
longest lambda: 1442.2 msecs
tot time of best scheduling of lambdas: 1558374.0 msecs
best theoretical speedup: 1080.6x
Done with partition 1 16:55:08 memory [current, maxRSS]: [15037, 0] MB
[Iterating DSK partitions ] 33.3 % elapsed: 70 min 58 sec remaining: 141 min 55 sec
Iterated 712856795 kmers, among them 126149961 were doubled
Loaded 70376914 doubled kmers for partition 2
In this superbucket (containing 198005 active minimizers),
sum of time spent in lambda's: 2675057.4 msecs
longest lambda: 1654.8 msecs
tot time of best scheduling of lambdas: 2675057.4 msecs
best theoretical speedup: 1616.5x
Done with partition 2 17:47:00 memory [current, maxRSS]: [21140, 0] MB
[Iterating DSK partitions ] 100 % elapsed: 122 min 50 sec remaining: 0 min 0 sec
Number of sequences in glue: 431789853
Number of pre-tips removed : 0
Buckets compaction and gluing : 7369.8 secs
Within that,
creating buckets from superbuckets: 1659.4 secs
bucket compaction (wall-clock during threads): 5710.3 secs
within all bucket compaction threads,
adding nodes to subgraphs: 1570.5 secs
subgraphs constructions and compactions: 1722.2 secs
compacted nodes redistribution: 2416.1 secs
Sum of CPU times for bucket compactions: 7368.2 secs
Discrepancy between sum of fine-grained timings and total wallclock of buckets compactions step: 1.5 secs
BCALM total wallclock (excl kmer counting): 7369.9 secs
Maximum number of kmers in a subgraph: 62422
Performance of compaction step:
Wallclock time spent in parallel section : 5710.3 secs
Best theoretical speedup in parallel section : 1676.8x
Best theor. speedup in parallel section using 1 threads : 1.0x
Sum of longest bucket compaction for each sb : 3.7 secs
Sum of best scheduling for each sb : 5708.9 secs
Done with all compactions 17:47:00 memory [current, maxRSS]: [21131, 0] MB
bglue_algo params, prefix:hg38/hg38.bc31.fa.unitigs.fa k:31 threads:1
Starting bglue with 1 threads 17:47:02 memory [current, maxRSS]: [ 88, 0] MB
number of sequences to be glued: 431789853 17:47:02 memory [current, maxRSS]: [ 88, 0] MB
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: basic_string
/bin/bash: line 1: 18662 Abort trap: 6 bcalm -in "hg38/hg38.fna" -out "hg38/hg38.bc31.fa" -kmer-size "31" -nb-cores "1" -abundance-min 1
Hi Karel, Thanks for the detailed bug report. I tried it on my machine and it worked, but it required around 40 GB of RAM. Did your machine have enough? If so, could you perhaps try on another machine (possibly Linux) to see whether the bug occurs there? Best, Rayan
Then I probably ran out of memory. Unfortunately, the exception (std::out_of_range: basic_string) is not very informative and intuitive in such a case.
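For readers who hit the same message: the thread does not establish where in BCALM the exception is thrown, so the following is only a minimal sketch of one common way a `std::out_of_range` escapes from `std::string`, namely indexing past the end of a string that is shorter than expected (for example, because earlier data was truncated). The helpers `kmer_at` and `kmer_at_checked` are hypothetical names for illustration, not BCALM functions. The second helper shows how a small wrapper could attach context to the otherwise terse error.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not BCALM code): returns the k-mer starting at `pos`.
// std::string::substr throws std::out_of_range when pos > seq.size().
std::string kmer_at(const std::string& seq, std::size_t pos, std::size_t k) {
    return seq.substr(pos, k);
}

// Hypothetical wrapper: rethrows with the position and sequence length,
// so the failure is easier to diagnose than the bare library message.
std::string kmer_at_checked(const std::string& seq, std::size_t pos, std::size_t k) {
    try {
        return kmer_at(seq, pos, k);
    } catch (const std::out_of_range& e) {
        throw std::runtime_error(
            "kmer_at: position " + std::to_string(pos) +
            " is past the end of a sequence of length " +
            std::to_string(seq.size()) +
            " (input may be truncated): " + e.what());
    }
}
```

The text returned by `what()` for the original exception is implementation-defined, so the exact wording differs between standard libraries. Whether a short or truncated string is actually what happened in this run is an assumption; the sketch only illustrates why the uncaught exception carries so little information.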
Yes, clearly not. Are you still stuck on this problem?
I'm fine, I can use a cluster.
Okay, just let me know if you ever hit this again. That said, 40 GB to compact a human genome seems a bit high; I might have to revisit that later.