GPU easy-cluster and easy-linclust write out ASCII headers but binary sequences to _rep_seq.fasta

Open jecorn opened this issue 2 months ago • 0 comments

MMseqs2 version: 8cc5ce367b5638c4306c2d7cfc652dd099a4643f (release). Tested both on an RTX 4090 GPU and in CPU-only mode.

When run on the CPU, easy-cluster and easy-linclust write normal results to _rep_seq.fasta. Running the exact same command on the GPU writes the headers as ASCII but the sequences as binary data.

GPU command (same result with easy-cluster):

> mmseqs easy-linclust <input.fasta> <output> tmp --min-seq-id 0.9 --threads 72 --gpu 1

Result:

>tr|A0A8X7MI69|A0A8X7MI69_9BASI Uncharacterized protein (Fragment) OS=Tilletia controversa OX=13291 GN=A4X06_0g9490 PE=4 SV=1
����C�?�����ə?���

CPU command (same result with easy-cluster):

> mmseqs easy-linclust <input.fasta> <output> tmp --min-seq-id 0.9 --threads 72

Result:

>tr|A0A0X8HR55|A0A0X8HR55_9SACH HDL273Cp OS=Eremothecium sinecaudum OX=45286 GN=AW171_hschr42364 PE=4 SV=1
MSDDAGEIYLEKSVSDELFGRLNSNPENKICFDCGNKNPTWTSVPFGIMLCIQCSGEHRKLGVHITFVKSSNLDKWTLNNLRRFKVGGNHRARAFFLKNNGKQFLDYKTDKNVKYTSQVAKNYKAHLDRKAARDREQHPSEIVFSTEDEVESSDSGSSKNNSVDDFFSSWEKPAASPSNTKLLTPTSTSGSQKTGRSSILSAPSNRRRTPLASGNSSSGGRNHPILSSSRKPISRAGAKKVDADMFDQFEKEAQEERETAAIARSTNSISGEGFKPSQKPTYSAVQFHPTSSESSLNAKDYDVEENPYNDGIKFDQVRAGGVVPSVDDVQPKLAKLSFGMTKNDAKKLADDSKPAARAPTGPKYTGQIAAKYGSQKAISSDQVFGRGGYDEGTSRAAQERLKSNFGNATSISSASYFGEDSAEQAQTGRSVDQGNNLIEVTLGKDEDIELVKQALELGAEKLGSYLRDYLRK
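
To check how many records in a _rep_seq.fasta file show this symptom, something like the following rough Python sketch could be used. It is not part of MMseqs2. The function name `find_binary_records` and the set of accepted sequence characters (ASCII letters plus `*` and `-`) are assumptions made for illustration. It is also only a heuristic: raw binary data can itself contain newline or `>` bytes, so a corrupted sequence may occasionally be split or mistaken for a header.

```python
import sys

# Assumption: a valid protein sequence line contains only ASCII letters, '*' or '-'.
VALID = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz*-")


def find_binary_records(data: bytes):
    """Return the headers of records whose sequence lines contain unexpected bytes."""
    bad = []
    header = None
    flagged = False  # True once the current record has already been reported
    for line in data.split(b"\n"):
        line = line.rstrip(b"\r")
        if line.startswith(b">"):
            # New record: remember its header and reset the flag.
            header = line[1:].decode("ascii", errors="replace")
            flagged = False
        elif header is not None and not flagged:
            # Sequence line: report the record on the first unexpected byte.
            if any(byte not in VALID for byte in line):
                bad.append(header)
                flagged = True
    return bad


if __name__ == "__main__":
    # Usage: python check_fasta.py <path to _rep_seq.fasta>
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as fh:
            for h in find_binary_records(fh.read()):
                print(h)
```

The file is read in binary mode on purpose, so the corrupted bytes are inspected as-is rather than failing or being altered during text decoding.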

jecorn, Nov 26 '25 08:11