Ungapped prefilter died during GPU-accelerated search
When I ran the script below to generate an MSA against uniref90 with MMseqs2 GPU-accelerated search, it reported the error shown below. However, when I replaced the uniref90 targetDB with a smaller one (consisting of a few thousand sequences), the error did not appear. My Linux system has:
- 40 CPUs,
- 400+ GB RAM,
- 4 GPUs,
- enough storage.
Could you please help me figure this out? Thanks in advance for your expert help.
Error
ungappedprefilter /a100_nas/ai4s/MSA/queries/testDB /a100_nas/ai4s/MSA/uniref90DB_gpu.idx /a100_nas/ai4s/MSA/tmp/15602816422822286028/pref_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -c 0 -e 0.001 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --min-ungapped-score 15 --max-seqs 300 --db-load-mode 2 --gpu 1 --gpu-server 1 --prefilter-mode 1 --threads 42 --compressed 0 -v 3
Index version: 16
Generated by: 16.747c6
ScoreMatrix: VTML80.out
--gpu-server /dev/shm/8478586279687262130 does not exist
Error: Ungapped prefilter died
Part of my script
mmseqs createdb $UNIREFFASTA $UNIREFDB
mmseqs makepaddedseqdb $UNIREFDB $UNIREFDB_GPU
QUERY_DB="${QUERY_DIR}/testDB"
mmseqs createdb $QUERY $QUERY_DB
# searching with GPU
mmseqs createindex $UNIREFDB_GPU $TMP --index-subset 2
mmseqs gpuserver $UNIREFDB_GPU --gpu 1 &
PID=$!
mmseqs search $QUERY_DB $UNIREFDB_GPU $RESULTS/aln_test $TMP --gpu 1 --gpu-server 1 --db-load-mode 2 --remove-tmp-files 1 --max-seq 10000
kill $PID
Hello, it seems I have the same problem as you. Did you run mmseqs makepaddedseqdb on the targetDB, and did it return a segmentation fault?
Hello, I definitely ran the mmseqs makepaddedseqdb command, and it succeeded without any error. How large is your targetDB?
Nearly 43 GB. It is uniclust30-hhsuite, and I used mmseqs createdb to build the targetDB.
Sorry I missed this issue. I recommend adding a sleep command after starting the gpuserver to make sure it's actually ready.
This is definitely something we still need to improve in the future.
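As a rough sketch of what could replace a fixed sleep: the error above complains about a missing entry under /dev/shm, so one option is to poll that directory until its listing changes after the server is started. The helper name wait_for_new_entry is made up for this example, and treating a changed /dev/shm listing as "server ready" is an assumption based only on that error message, not documented MMseqs2 behavior. Other processes writing to /dev/shm could also trigger it early.

```shell
# Hypothetical helper: return 0 once the listing of DIR differs from the
# snapshot BEFORE, or return 1 after TIMEOUT seconds (default 120).
wait_for_new_entry() {
    dir=$1
    before=$2
    timeout=${3:-120}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # Compare the current listing with the snapshot taken earlier
        if [ "$(ls -A "$dir")" != "$before" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Possible usage with the script from the first post (left commented out,
# since it needs mmseqs and a GPU):
#   before=$(ls -A /dev/shm)
#   mmseqs gpuserver "$UNIREFDB_GPU" --gpu 1 &
#   PID=$!
#   if ! wait_for_new_entry /dev/shm "$before" 600; then
#       echo "gpuserver did not appear to start" >&2
#       kill "$PID"
#       exit 1
#   fi
```

Compared with a plain sleep, this waits only as long as needed and fails with a clear message when nothing shows up, but a short extra sleep afterwards may still be a reasonable safety margin.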
Thank you for your reply! It helps a lot!
Hi, I'm experiencing the same issue when trying to start a GPU server with a targetDB_pad. Adding a sleep command after starting the GPU server doesn't resolve the issue for me (I tried sleeps of up to 120 seconds).
I'm using the latest docker container of mmseqs2-cuda12 and running it on an NVIDIA A40 (CUDA v12.2) with 900GB RAM plus 48 CPUs:
#!/bin/bash
#SBATCH -D ./
#SBATCH -J docker_mmseqs
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=48
#SBATCH --gres=gpu:A40:1
#SBATCH --mem=900000
#SBATCH --time=1:00:00
## other sbatch stuff...
CACHE="/home/path/to/singularity/cache"
SIF="/home/path/to/container/mmseqs2_master-cuda12.sif"
# make GPU DB from CPU DB
singularity run --nv \
    -B $CACHE:/cache \
    -B $(pwd):/work \
    $SIF makepaddedseqdb \
    /work/db/targetDB \
    /work/db/targetDB_pad
# start GPU server
singularity run --nv \
    -B $CACHE:/cache \
    -B $(pwd):/work \
    $SIF gpuserver \
    /work/db/targetDB_pad --max-seqs 10000 --db-load-mode 0 --prefilter-mode 1 &
PID1=$!
sleep 120
# run mmseqs
singularity run --nv \
    -B $CACHE:/cache \
    -B $(pwd):/work \
    $SIF easy-search \
    /work/input.fasta \
    /work/db/targetDB_pad \
    /work/output.m8 \
    /work/tmp \
    --gpu 1 \
    --gpu-server 1 \
    --remove-tmp-files 1
The stdout:
/work/db/targetDB_pad exists and will be overwritten
makepaddedseqdb /work/db/targetDB /work/db/targetDB_pad
MMseqs Version: eaecacf4ba24e9c8a0f2a1da115603ebc80710ad
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Score bias 0
Mask residues 1
Mask residues probability 0.9
Mask lower case residues 0
Mask lower letter repeating N times 0
Write lookup file 1
Threads 64
Verbosity 3
[=================================================================] 3.26K 0s 15ms
Time for merging to hydDB1_pad: 0h 0m 0s 190ms
Time for merging to hydDB1_pad_h: 0h 0m 0s 152ms
Time for processing: 0h 0m 1s 216ms
gpuserver /work/db/targetDB_pad --max-seqs 10000 --db-load-mode 0 --prefilter-mode 1
MMseqs Version: eaecacf4ba24e9c8a0f2a1da115603ebc80710ad
Use GPU 0
Max results per query 10000
Preload mode 1
Prefilter mode 1
374968733484103649
easy-search /work/input.fasta /work/db/targetDB_pad /work/output.m8 /work/tmp --gpu 1 --gpu-server 1 --remove-tmp-files 1
MMseqs Version: eaecacf4ba24e9c8a0f2a1da115603ebc80710ad
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Add backtrace false
Alignment mode 3
Alignment mode 0
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 65535
Compositional bias 1
Compositional bias scale 1
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Score bias 0
Realign hits false
Realign score bias -0.2
Realign max seqs 2147483647
Correlation score weight 0
Gap open cost aa:11,nucl:5
Gap extension cost aa:1,nucl:2
Zdrop 40
Threads 64
Compressed 0
Verbosity 3
Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out
Sensitivity 5.7
k-mer length 0
Target search mode 0
k-score seq:2147483647,prof:2147483647
Alphabet size aa:21,nucl:5
Max results per query 300
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask residues probability 0.9
Mask lower case residues 0
Mask lower letter repeating N times 0
Minimum diagonal score 15
Selected taxa
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Use GPU 1
Use GPU server 1
Wait for GPU server 600
Prefilter mode 0
Rescore mode 0
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile E-value threshold 0.001
Global sequence weighting false
Allow deletions false
Filter MSA 1
Use filter only at N seqs 0
Maximum seq. id. threshold 0.9
Minimum seq. id. 0.0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Pseudo count mode 0
Profile output mode 0
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Overlap between sequences 0
Sequence split mode 1
Header split mode 0
Chain overlapping alignments 0
Merge query 1
Search type 0
Search iterations 1
Start sensitivity 4
Search steps 1
Exhaustive search mode false
Filter results during exhaustive search 0
Strand selection 1
LCA search mode false
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files true
Translation mode 0
Alignment format 0
Format alignment output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
Database output false
Overlap threshold 0
Database type 0
Shuffle input database true
Createdb mode 0
Write lookup file 0
Greedy best hits false
search /work/tmp/995132545111804393/query /work/db/targetDB_pad /work/tmp/995132545111804393/result /work/tmp/995132545111804393/search_tmp --alignment-mode 3 -s 5.7 --gpu 1 --gpu-server 1 --remove-tmp-files 1
Error: Ungapped prefilter died
Error: Search died
And the stderr:
INFO: underlay of /usr/bin/nvidia-smi required more than 50 (259) bind mounts
INFO: underlay of /usr/bin/nvidia-smi required more than 50 (259) bind mounts
INFO: underlay of /usr/bin/nvidia-smi required more than 50 (259) bind mounts
malloc(): corrupted top size
Aborted (core dumped)
If I skip starting a GPU server for the targetDB_pad, the search works normally. So for now I'll just skip the gpuserver step, but I was wondering whether there is any way to resolve this issue. Thanks.
What query/target set is this? The one in example/?
No, this is one I built with mmseqs createdb; it contains 3261 sequences of roughly 400-900 aa in length. The query FASTA file is also my own:
>NuoD
MTEKYAPPIPETSDYAISVGPQHPTHKEPVRFIFQVKGETVQDVDLRIGFNHRGIEKAFENRTWLKNLYLVTRLCGICSVAHQLAYVHAAEKCMIIQDSVPERAHFIRLIIAELERVQSHILWYGVLAHDTGYDTLFHITWRDREIVNDILELISGNRVNYAMYTLGGVRRDISREQKEKIVPKLKDLRKKCEYHRAVMMKERSFIVRQKGVAILSKKDAKKYCAVGPTVRASGVNIDLRKVDPYSVYDKVSFDVPLYSEGDILGGLYNRLDETLISIDIILDALDAMPAGDIRLPWREVPRRPETSEGIQRVEAPRGEDIHYIRSNGTDKPDRHKIRAPTFQNFPSLVHRLKGVQVADIPPVIRVIDPCIGCCERVTFVKAGSRKKLTLNGHHLVSRANRFYRSGTKVLDF