Ungapped prefilter died during GPU-accelerated search

Open TheChosenOneJG opened this issue 1 year ago • 8 comments

When I ran the script below to generate an MSA against uniref90 using MMseqs2 GPU-accelerated search, it reported the error shown below. However, when I replaced the uniref90 targetDB with a smaller one (consisting of a few thousand sequences), the error did not appear. My Linux system has:

  • 40 CPUs,
  • 400+ GB ram,
  • 4 GPUs,
  • enough storage.

Could you please help me figure this out? Thanks in advance for your expert help.

Error

ungappedprefilter /a100_nas/ai4s/MSA/queries/testDB /a100_nas/ai4s/MSA/uniref90DB_gpu.idx /a100_nas/ai4s/MSA/tmp/15602816422822286028/pref_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -c 0 -e 0.001 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --min-ungapped-score 15 --max-seqs 300 --db-load-mode 2 --gpu 1 --gpu-server 1 --prefilter-mode 1 --threads 42 --compressed 0 -v 3 

Index version: 16
Generated by:  16.747c6
ScoreMatrix:  VTML80.out
--gpu-server /dev/shm/8478586279687262130 does not exist
Error: Ungapped prefilter died

part of my script

mmseqs createdb $UNIREFFASTA $UNIREFDB
mmseqs makepaddedseqdb $UNIREFDB $UNIREFDB_GPU

QUERY_DB="${QUERY_DIR}/testDB"
mmseqs createdb $QUERY $QUERY_DB

# searching with GPU 
mmseqs createindex $UNIREFDB_GPU $TMP --index-subset 2
mmseqs gpuserver $UNIREFDB_GPU --gpu 1 &
PID=$!
mmseqs search $QUERY_DB $UNIREFDB_GPU $RESULTS/aln_test $TMP --gpu 1 --gpu-server 1 --db-load-mode 2 --remove-tmp-files 1 --max-seq 10000
kill $PID

TheChosenOneJG avatar Dec 11 '24 06:12 TheChosenOneJG

Hello, it seems I have the same problem as you. Did you run mmseqs makepaddedseqdb on the targetDB, and did it return a segmentation fault?

unknow1024 avatar Dec 11 '24 07:12 unknow1024

Hello, I did run mmseqs makepaddedseqdb, and it completed without any error. How large is your targetDB?

TheChosenOneJG avatar Dec 11 '24 08:12 TheChosenOneJG

Nearly 43 GB. It is uniclust30-hhsuite, and I used mmseqs createdb to build the targetDB.

unknow1024 avatar Dec 11 '24 08:12 unknow1024

Sorry, I missed this issue. I recommend adding a sleep command after starting the gpuserver to make sure it's actually ready.

This is definitely something we still need to improve in the future.
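
A minimal sketch of how the suggested sleep could be combined with a liveness check, so the search is not launched against a server that has already exited. The helper name `wait_for_server` and the 60-second value in the usage comment are assumptions chosen for illustration. A process that is still alive is not necessarily ready, so this only catches early crashes.

```shell
# wait_for_server PID SECONDS
# Sleeps for SECONDS in 1-second steps and returns non-zero as soon as
# the process with the given PID is no longer running.
# (Hypothetical helper, not part of MMseqs2.)
wait_for_server() {
    wfs_pid=$1
    wfs_left=$2
    while [ "$wfs_left" -gt 0 ]; do
        if ! kill -0 "$wfs_pid" 2>/dev/null; then
            echo "server process $wfs_pid exited during startup" >&2
            return 1
        fi
        sleep 1
        wfs_left=$((wfs_left - 1))
    done
    # Final check after the last sleep.
    kill -0 "$wfs_pid" 2>/dev/null
}

# Intended use in the script from the report (commented out here):
#   mmseqs gpuserver $UNIREFDB_GPU --gpu 1 &
#   PID=$!
#   wait_for_server "$PID" 60 || exit 1   # 60 s is an arbitrary choice
#   mmseqs search ...
#   kill $PID
```

The wait time that a large database such as uniref90 actually needs is not stated in this thread, so the value has to be found by trial.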

milot-mirdita avatar Jan 16 '25 18:01 milot-mirdita

Thank you for your reply! It helps a lot!

TheChosenOneJG avatar Jan 17 '25 09:01 TheChosenOneJG

Hi, I'm experiencing the same issue when trying to start a GPU server with a targetDB_pad. Adding a sleep command after starting the GPU server doesn't resolve the issue for me (I tried sleeping for up to 120 seconds).

I'm using the latest Docker container of mmseqs2-cuda12 and running it on an NVIDIA A40 (CUDA v12.2) with 900 GB RAM and 48 CPUs:

#!/bin/bash
#SBATCH -D ./
#SBATCH -J docker_mmseqs
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=48
#SBATCH --gres=gpu:A40:1
#SBATCH --mem=900000
#SBATCH --time=1:00:00
## other sbatch stuff...

CACHE="/home/path/to/singularity/cache"
SIF="/home/path/to/container/mmseqs2_master-cuda12.sif"

# make GPU DB from CPU DB
singularity run --nv \
    -B $CACHE:/cache \
    -B $(pwd):/work \
    $SIF makepaddedseqdb \
    /work/db/targetDB \
    /work/db/targetDB_pad

# start GPU server
singularity run --nv \
    -B $CACHE:/cache \
    -B $(pwd):/work \
    $SIF gpuserver \
    /work/db/targetDB_pad --max-seqs 10000 --db-load-mode 0 --prefilter-mode 1 &
PID1=$!

sleep 120

# run mmseqs
singularity run --nv \
    -B $CACHE:/cache \
    -B $(pwd):/work \
    $SIF easy-search \
    /work/input.fasta \
    /work/db/targetDB_pad \
    /work/output.m8 \
    /work/tmp \
    --gpu 1 \
    --gpu-server 1 \
    --remove-tmp-files 1

The stdout:

/work/db/targetDB_pad exists and will be overwritten
makepaddedseqdb /work/db/targetDB /work/db/targetDB_pad

MMseqs Version:                         eaecacf4ba24e9c8a0f2a1da115603ebc80710ad
Substitution matrix                     aa:blosum62.out,nucl:nucleotide.out
Score bias                              0
Mask residues                           1
Mask residues probability               0.9
Mask lower case residues                0
Mask lower letter repeating N times     0
Write lookup file                       1
Threads                                 64
Verbosity                               3

[=================================================================] 3.26K 0s 15ms
Time for merging to hydDB1_pad: 0h 0m 0s 190ms
Time for merging to hydDB1_pad_h: 0h 0m 0s 152ms
Time for processing: 0h 0m 1s 216ms
gpuserver /work/db/targetDB_pad --max-seqs 10000 --db-load-mode 0 --prefilter-mode 1

MMseqs Version:         eaecacf4ba24e9c8a0f2a1da115603ebc80710ad
Use GPU                 0
Max results per query   10000
Preload mode            1
Prefilter mode          1

374968733484103649
easy-search /work/input.fasta /work/db/targetDB_pad /work/output.m8 /work/tmp --gpu 1 --gpu-server 1 --remove-tmp-files 1

MMseqs Version:                         eaecacf4ba24e9c8a0f2a1da115603ebc80710ad
Substitution matrix                     aa:blosum62.out,nucl:nucleotide.out
Add backtrace                           false
Alignment mode                          3
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.001
Seq. id. threshold                      0
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Coverage threshold                      0
Coverage mode                           0
Max sequence length                     65535
Compositional bias                      1
Compositional bias scale                1
Max reject                              2147483647
Max accept                              2147483647
Include identical seq. id.              false
Preload mode                            0
Pseudo count a                          substitution:1.100,context:1.400
Pseudo count b                          substitution:4.100,context:5.800
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Correlation score weight                0
Gap open cost                           aa:11,nucl:5
Gap extension cost                      aa:1,nucl:2
Zdrop                                   40
Threads                                 64
Compressed                              0
Verbosity                               3
Seed substitution matrix                aa:VTML80.out,nucl:nucleotide.out
Sensitivity                             5.7
k-mer length                            0
Target search mode                      0
k-score                                 seq:2147483647,prof:2147483647
Alphabet size                           aa:21,nucl:5
Max results per query                   300
Split database                          0
Split mode                              2
Split memory limit                      0
Diagonal scoring                        true
Exact k-mer matching                    0
Mask residues                           1
Mask residues probability               0.9
Mask lower case residues                0
Mask lower letter repeating N times     0
Minimum diagonal score                  15
Selected taxa
Spaced k-mers                           1
Spaced k-mer pattern
Local temporary path
Use GPU                                 1
Use GPU server                          1
Wait for GPU server                     600
Prefilter mode                          0
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Mask profile                            1
Profile E-value threshold               0.001
Global sequence weighting               false
Allow deletions                         false
Filter MSA                              1
Use filter only at N seqs               0
Maximum seq. id. threshold              0.9
Minimum seq. id.                        0.0
Minimum score per column                -20
Minimum coverage                        0
Select N most diverse seqs              1000
Pseudo count mode                       0
Profile output mode                     0
Min codons in orf                       30
Max codons in length                    32734
Max orf gaps                            2147483647
Contig start mode                       2
Contig end mode                         2
Orf start mode                          1
Forward frames                          1,2,3
Reverse frames                          1,2,3
Translation table                       1
Translate orf                           0
Use all table starts                    false
Offset of numeric ids                   0
Create lookup                           0
Overlap between sequences               0
Sequence split mode                     1
Header split mode                       0
Chain overlapping alignments            0
Merge query                             1
Search type                             0
Search iterations                       1
Start sensitivity                       4
Search steps                            1
Exhaustive search mode                  false
Filter results during exhaustive search 0
Strand selection                        1
LCA search mode                         false
Disk space limit                        0
MPI runner
Force restart with latest tmp           false
Remove temporary files                  true
Translation mode                        0
Alignment format                        0
Format alignment output                 query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
Database output                         false
Overlap threshold                       0
Database type                           0
Shuffle input database                  true
Createdb mode                           0
Write lookup file                       0
Greedy best hits                        false

search /work/tmp/995132545111804393/query /work/db/targetDB_pad /work/tmp/995132545111804393/result /work/tmp/995132545111804393/search_tmp --alignment-mode 3 -s 5.7 --gpu 1 --gpu-server 1 --remove-tmp-files 1

Error: Ungapped prefilter died
Error: Search died

And the stderr:

INFO:    underlay of /usr/bin/nvidia-smi required more than 50 (259) bind mounts
INFO:    underlay of /usr/bin/nvidia-smi required more than 50 (259) bind mounts
INFO:    underlay of /usr/bin/nvidia-smi required more than 50 (259) bind mounts
malloc(): corrupted top size
Aborted (core dumped)

If I skip starting a GPU server for the targetDB_pad, the search works normally. So for now I'll just skip the gpuserver step, but I was wondering whether there is any way to resolve this issue. Thanks.
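
The stderr above does not show which process printed the abort message. A minimal sketch, assuming a POSIX shell on Linux, of sending the background server's output to its own log file and decoding its exit status. The `sh -c` line is a hypothetical stand-in for the `singularity run ... gpuserver ... &` command, and `describe_exit` is an illustrative helper, not an MMseqs2 feature.

```shell
# describe_exit STATUS: turn a shell exit status into a short description.
# A status above 128 means the process was killed by signal (STATUS - 128).
describe_exit() {
    if [ "$1" -eq 0 ]; then
        echo "exited normally"
    elif [ "$1" -gt 128 ]; then
        echo "killed by signal $(( $1 - 128 ))"
    else
        echo "exited with status $1"
    fi
}

# Keep the server's stdout and stderr apart from the search output.
server_log=$(mktemp)

# Stand-in for the background server: a child shell that aborts itself.
sh -c 'kill -ABRT $$' >"$server_log" 2>&1 &
server_pid=$!

# ... the search would run here ...

# Collect the server's exit status without aborting the script on failure.
server_status=0
wait "$server_pid" || server_status=$?
echo "server $(describe_exit "$server_status"), log in $server_log"
```

On Linux, SIGABRT is signal 6, so a server that aborted the way the stderr suggests would be reported with status 134.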

jlingford avatar Mar 06 '25 23:03 jlingford

What query/target set is this? The one in example/?

milot-mirdita avatar Mar 07 '25 07:03 milot-mirdita

No, this is one I built with mmseqs createdb; it contains 3261 sequences of roughly 400-900 aa in length. The query FASTA file is also my own:

>NuoD
MTEKYAPPIPETSDYAISVGPQHPTHKEPVRFIFQVKGETVQDVDLRIGFNHRGIEKAFENRTWLKNLYLVTRLCGICSVAHQLAYVHAAEKCMIIQDSVPERAHFIRLIIAELERVQSHILWYGVLAHDTGYDTLFHITWRDREIVNDILELISGNRVNYAMYTLGGVRRDISREQKEKIVPKLKDLRKKCEYHRAVMMKERSFIVRQKGVAILSKKDAKKYCAVGPTVRASGVNIDLRKVDPYSVYDKVSFDVPLYSEGDILGGLYNRLDETLISIDIILDALDAMPAGDIRLPWREVPRRPETSEGIQRVEAPRGEDIHYIRSNGTDKPDRHKIRAPTFQNFPSLVHRLKGVQVADIPPVIRVIDPCIGCCERVTFVKAGSRKKLTLNGHHLVSRANRFYRSGTKVLDF

jlingford avatar Mar 07 '25 08:03 jlingford