MMseqs2
update mmseqs database with non-mmseqs database
Hey, thank you for providing and maintaining MMseqs2. I have the following workflow question. Let's assume I have:
- A 75 GB nucleotide database (x.fna) clustered (95% threshold) with a different method.
mmseqs createdb x.fna x_db
- A 52 GB nucleotide database (y.fna) that was clustered with linclust (95% threshold).
mmseqs createdb y.fna y_db
mmseqs linclust y_db y_clust temp/ -c 0.95 --min-seq-id 0.95 --cov-mode 1
I want to combine databases X and Y without deletion:
mmseqs concatdbs y_db x_db mergedDB
mmseqs concatdbs y_db_h x_db_h mergedDB_h
mmseqs clusterupdate y_db mergedDB y_clust merged_seq merged_clust tmp --search-type 3 --min-seq-id 0.95 -c 0.8
Can I combine the databases even though the first one was not clustered with linclust? Is the proposed workflow correct, or would you recommend merging the FASTA files, generating a new MMseqs2 database, and re-clustering completely by applying linclust to the combined version? Many thanks!
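For comparison, the full re-clustering alternative could look roughly like the sketch below. It is only a sketch under assumptions: the output names merged_db, merged_clust and tmp are placeholders I made up, the linclust parameters are simply copied from the y_db run above, and the `run` helper with its `DRY_RUN` switch is a hypothetical wrapper so the commands can be inspected before anything is executed. `mmseqs createdb` accepts several FASTA files as input, so the two files do not need to be concatenated on disk first.

```shell
#!/bin/sh
# Sketch: build one database from both FASTA files and re-cluster from scratch.
# merged_db, merged_clust and tmp are placeholder names (assumptions).
set -eu

# Hypothetical switch: with DRY_RUN=1 (default) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Print a command and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# 1. Create a single sequence database from both FASTA files.
run mmseqs createdb x.fna y.fna merged_db

# 2. Cluster the combined database with the same settings used for y_db.
run mmseqs linclust merged_db merged_clust tmp --min-seq-id 0.95 -c 0.95 --cov-mode 1
```

Running the script as is prints the two commands; setting `DRY_RUN=0` executes them. Unlike the clusterupdate route, this discards the existing y_clust result and recomputes everything.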