failed to build index

Open mictadlo opened this issue 8 years ago • 6 comments

Hi, Graphmap failed to build index:

....
[08:33:21 BuildIndexes] Loading reference sequences.
[08:34:27 SetupIndex_] Building the index for shape: '11110111101111'.
[08:34:48 Create] Allocated memory for a list of 4901727095 seeds (128 bits each) (0.00001 sec, diff: 20.13761 sec).
[08:34:48 Create] Memory consumption: [currentRSS = 28082 MB, peakRSS = 28082 MB]
[08:34:48 Create] Collecting seeds.
[08:34:48 Create] Minimizer seeds will be used. Minimizer window is 5.
[08:42:20 Create] [currentRSS = 102863 MB, peakRSS = 102863 MB] Sequence: 13062/22198, len: 6841557, name: 'gi|291297538|ref|NC_013947.1|'terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
/var/spool/PBS/mom_priv/jobs/2121287.pbs.SC: line 11: 22750 Aborted                 graphmap align -I -r kraken-bacteria-and-viruses-combine.fasta

What did I miss?

Thank you in advance

Michal

mictadlo avatar Jun 12 '17 22:06 mictadlo

Hi Michal,

How much memory does your machine have? The index construction will consume a huge amount of space in your case (though the final index should be smaller if you are using the latest version).

Ivan

isovic avatar Jun 13 '17 04:06 isovic

I am also seeing a similar issue when trying to build an index for the latest human reference:

> graphmap align -I -t 12 -r GRCh38_full_analysis_set_plus_decoy_hla.fa
[07:23:42 BuildIndexes] Loading reference sequences.
[07:24:16 SetupIndex_] Building the index for shape: '11110111101111'.
[07:24:24 Create] Allocated memory for a list of 1608673459 seeds (128 bits each) (0.00002 sec, diff: 8.30890 sec).
[07:24:24 Create] Memory consumption: [currentRSS = 9216 MB, peakRSS = 9216 MB]
[07:24:24 Create] Collecting seeds.
[07:24:24 Create] Minimizer seeds will be used. Minimizer window is 5.
[07:28:37 Create] [currentRSS = 33499 MB, peakRSS = 33499 MB] Sequence: 3373/6732, len: 159345973, name: 'chr7'terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

Any ideas? I have v0.5.2

mbhall88 avatar Jun 21 '17 07:06 mbhall88

> graphmap align -I -r reference.fna
[10:42:58 BuildIndexes] Loading reference sequences.
[10:43:42 SetupIndex_] Building the index for shape: '11110111101111'.
[10:44:23 Create] Allocated memory for a list of 2138863289 seeds (128 bits each) (0.00002 sec, diff: 41.31062 sec).
[10:44:23 Create] Memory consumption: [currentRSS = 12262 MB, peakRSS = 12262 MB]
[10:44:23 Create] Collecting seeds.
[10:44:23 Create] Minimizer seeds will be used. Minimizer window is 5.
[10:50:00 Create] [currentRSS = 44887 MB, peakRSS = 44887 MB] Sequence: 3096/4558, len: 2603898, name: 'NZ_LT599049.1|kraken:taxid|1360'

I think I have the same issue, although for me it doesn't throw a 'std::bad_alloc' message; it just suddenly stops building the index. I am also using v0.5.2.

emilhaegglund avatar Jul 04 '17 08:07 emilhaegglund

Same here. It just stopped working on my Mac (with a Linux environment set up), and I got a bad_alloc on our Linux server, but on another server it seems to run through...

I installed version 0.5.2 via bioconda.

Let me know if you need any further information. Fritz

fritzsedlazeck avatar Jan 19 '18 00:01 fritzsedlazeck

I have the same problem (v0.5.2) while building the index for hg38. My machine has 64 GB of memory; what are the memory requirements?

Thanks, Vincenzo

VinceDi avatar Apr 19 '18 10:04 VinceDi
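
For a rough sense of scale on the memory question, the logs in this thread report the number of seeds and their size ("128 bits each"), so the seed list alone can be estimated with simple arithmetic. The sketch below assumes exactly 16 bytes per seed, as the log states, and ignores every other structure GraphMap allocates, so it is only a lower bound and not a measurement of peak usage. The function name `seed_list_mib` is made up for illustration.

```shell
# Estimate the size of the seed list in MiB from the seed count printed in
# the "Allocated memory for a list of N seeds (128 bits each)" log line.
# Assumption: 128 bits = 16 bytes per seed, nothing else counted.
seed_list_mib() {
  echo $(( $1 * 16 / 1024 / 1024 ))
}

seed_list_mib 1608673459   # GRCh38 log above       -> 24546
seed_list_mib 4901727095   # bacteria/viruses log   -> 74794
```

So the human reference needs roughly 24 GiB for the seed list by itself, before any other allocation made during index construction.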

For others who arrive here: it took at least 80 GB of RAM and swap space on my system to index the human genome (GRCh38 / hg38) using v0.5.2. If you are running Linux and have insufficient virtual memory, you can set up a swap file. This is not a good long-term solution for running out of virtual memory, but it can be a reasonable workaround in cases like this one. FWIW.

SCDealy avatar Nov 01 '18 17:11 SCDealy
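
A minimal sketch of the swap file approach mentioned above, for Linux only. The helper names `total_vm_mib` and `make_swapfile`, the path `/swapfile`, and the size are placeholders chosen for illustration, not anything GraphMap provides. The first function just adds up RAM and swap as reported in `/proc/meminfo`, so you can compare the total against the roughly 80 GB reported in this thread before starting a long indexing job. The second wraps the usual `fallocate`, `mkswap`, and `swapon` sequence and needs root.

```shell
# Sum physical RAM and swap, in MiB, from a meminfo-style file
# (defaults to /proc/meminfo, where the values are given in kB).
total_vm_mib() {
  awk '/^(MemTotal|SwapTotal):/ { kb += $2 } END { print int(kb / 1024) }' \
    "${1:-/proc/meminfo}"
}

# Create and enable a swap file. Requires root.
# $1 = path (placeholder, e.g. /swapfile), $2 = size (e.g. 64G)
make_swapfile() {
  sudo fallocate -l "$2" "$1" &&
  sudo chmod 600 "$1" &&      # swap files must not be readable by other users
  sudo mkswap "$1" &&
  sudo swapon "$1"
}
```

Usage would look like `total_vm_mib` to check the current total, then `make_swapfile /swapfile 64G` if it falls short. On some filesystems `swapon` may reject a file created with `fallocate`; writing the file with `dd` is the usual fallback. The swap file stays active until `sudo swapoff /swapfile` or a reboot.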