
Any preconfigured databases?

Open • jolespin opened this issue 4 years ago • 1 comment

I'm looking for a method that can tell whether an assembly is prokaryotic or eukaryotic, and I was thinking this software might be helpful. Do you have any preconfigured databases that include both prokaryotes and eukaryotes?

jolespin, Oct 06 '21 18:10

Hi there!

We don't have any preconfigured databases currently, but that is something we plan to put together for Dashing2 in the near future.

You'd have to download the set of genomes from RefSeq. I have a script which you could use to download them, at which point you could compare your assembly against them.

You could do something like the following:

python3 download_genomes.py all
find ref -name '*fna.gz' > refs.txt
echo "$PATH_TO_ASSEMBLY" > query.txt
dashing dist -Q query.txt -F refs.txt -k11 -Orefseq.matches -o refseq.sizes -p24
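
The list-building steps above can be wrapped with a couple of sanity checks, so a missing assembly or an empty download directory is caught before the long-running comparison starts. This is only a sketch: the function name `make_path_lists` is hypothetical (it is not part of dashing), and it assumes the genomes sit under a directory such as `ref` and that the query is a single assembly file.

```shell
#!/bin/sh
# Hypothetical helper (not part of dashing): write refs.txt and query.txt
# in the current directory, failing early if the inputs are missing.
make_path_lists() {
    ref_dir=$1      # directory holding the downloaded *fna.gz genomes
    assembly=$2     # path to the query assembly

    if [ ! -f "$assembly" ]; then
        echo "query assembly not found: $assembly" >&2
        return 1
    fi

    # One reference path per line; sorted so the order is reproducible.
    find "$ref_dir" -name '*fna.gz' | sort > refs.txt
    if [ ! -s refs.txt ]; then
        echo "no *fna.gz files found under $ref_dir" >&2
        return 1
    fi

    # printf avoids echo's platform-dependent handling of backslashes.
    printf '%s\n' "$assembly" > query.txt
}
```

Once `make_path_lists ref "$PATH_TO_ASSEMBLY"` succeeds, the `dashing dist` command above can be run unchanged against the two files it wrote.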

But a pre-built database would be much easier to work with, and you wouldn't need the disk space. I'll let you know when that changes.

Thanks!

Daniel

dnbaker, Oct 06 '21 19:10