
Need Help -

Open taranglute opened this issue 7 years ago • 0 comments

Next-generation sequencing data are usually distributed as multiple compressed files.

I want to pass multiple input files (compressed or uncompressed FASTQ files) to two khmer scripts: (i) abundance-dist-single.py, which generates the full k-mer abundance histogram, and (ii) unique-kmers.py, which estimates the total number of distinct k-mers (F0) for large k.

The khmer help documentation states: ‘To count k-mers in multiple files use load-into-counting.py and abundance-dist.py’. Following that advice, I ran these commands on two input FASTQ files:

./load-into-counting.py -k 25 -x 5e7 -T 16 out123 SRR072005.fastq SRR072006.fastq

./abundance-dist.py out123 SRR072005.fastq SRR072006.fastq histo

usage: abundance-dist.py [--version] [--info] [-h] [-z] [-s] [-b] [-f] [-q] input_count_graph_filename input_sequence_filename output_histogram_filename

abundance-dist.py: error: unrecognized arguments: histogram
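
The usage line above lists a single input_sequence_filename, so passing two FASTQ files leaves one extra positional argument. One possible workaround, offered only as an untested sketch and not as something the khmer documentation confirms, is to merge the inputs into one uncompressed FASTQ file first. The helper names open_fastq and merge_fastq and the output name merged.fastq are hypothetical and are not part of khmer.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file for binary reading; gzip is detected by its magic bytes."""
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":  # standard gzip signature
        return gzip.open(path, "rb")
    return open(path, "rb")


def merge_fastq(inputs, output, chunk_size=1 << 20):
    """Concatenate several (optionally gzipped) FASTQ files into one plain file.

    Hypothetical helper, not a khmer API. It copies bytes without parsing
    records, so it assumes every input is already a well-formed FASTQ file.
    """
    with open(output, "wb") as out:
        for path in inputs:
            last = b"\n"
            with open_fastq(path) as src:
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
                    last = chunk[-1:]
            # Guard against an input that lacks a trailing newline, which
            # would otherwise glue its last record onto the next file's header.
            if last != b"\n":
                out.write(b"\n")


# Example call (file names taken from the commands above):
# merge_fastq(["SRR072005.fastq", "SRR072006.fastq"], "merged.fastq")
```

Under that assumption, merged.fastq would fill the single input_sequence_filename slot, for example ./abundance-dist.py out123 merged.fastq histo. Whether this is the recommended way to handle multiple files is exactly what I would like confirmed.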

What is the correct sequence of commands to run khmer so that it produces both the full k-mer abundance histogram and the total number of distinct k-mers (F0) when the input consists of multiple (compressed or uncompressed) FASTQ files?

Please help.

taranglute, Sep 18 '18 16:09