vdjer icon indicating copy to clipboard operation
vdjer copied to clipboard

Extracting useful reads

Open PoslavskySV opened this issue 9 years ago • 2 comments

Hi,

Is there way to extract reads that were successfully assembled by vdjer from the original bam file?

I mean the following. I supposed that vdjer.sam contains reads that were assembled by vdjer. But, when I try to extract fastq reads from vdjer.sam and re-analyse them, vdjer produces empty result.

Steps to reproduce:

./STAR --readFilesIn demo_R1.fastq demo_R2.fastq ... > demo.sort.bam
./samtools index demo.sort.bam
./vdjer --in demo.sort.bam ... > vdjer.sam

At this point all ok, vdjer.sam contains a lot of alignments and vdj_contigs.fa a lot of records.

Then I do:

./SamToFastq I=vdjer.sam F=demo2_R1.fastq F2=demo2_R2.fastq
./STAR --readFilesIn demo2_R1.fastq demo2_R2.fastq ... > demo2.sort.bam
./samtools index demo2.sort.bam
./vdjer --in demo2.sort.bam ... > vdjer2.sam

and now vdjer2.sam is empty and vdj_contigs.fa is empty too. Why?

Thanks!

PoslavskySV avatar Sep 30 '16 17:09 PoslavskySV

Hi,

We don't currently have a way to do that, but we may be able to add a debug option to write the extracted reads to disk in a future release.

The final output SAM file contains only those reads that map to the output 360 base contigs.

There is an internal representation of those contigs that is a fair bit longer than the 360 bases which allows us to measure coverage across the entirety of the final output contig (including partially overlapping reads). Those partially overlapping reads are not emitted in the final output. Using just the final set of reads will not result in sufficient contig length for the assembly.

mozack avatar Oct 05 '16 12:10 mozack

Thanks for the answer! Such debug option for writing extracted reads may be really useful!

PoslavskySV avatar Oct 10 '16 12:10 PoslavskySV