RNA-Bloom

reduce redundancy in direct RNA long-read only assembly?

Open • mjudd8 opened this issue 1 year ago • 1 comment

Hello,

I used rnabloom to construct a transcriptome from direct RNA data with the following:

rnabloom -long all_reads.fastq -stranded -t 25 -outdir dir/ -u true

The rnabloom.transcripts.fa assembly file seems to contain many redundant transcripts that differ only by very small variations. Is there a way to generate an rnabloom.transcripts.nr.fa file from the long-read data alone?

Thanks!

mjudd8, May 02 '24 19:05

Some settings can be changed to reduce the redundancy of the assembly, e.g.

-indel 100 -tip 100 -p 0.6

The defaults for these options with long reads are -indel 50 -tip 50 -p 0.7. The other source of redundancy was a bug in minimap2 that caused some overlaps not to be output correctly.
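As a rough sketch, appending those options to the command from the original post would look like the following. The input file, output directory, and thread count are copied from that post. The snippet only prints the command (a dry run) so it can be checked before launching. The values 100/100/0.6 are the example from above, not tuned recommendations.

```shell
# Dry run: assemble the command as a string and print it; nothing is executed.
# Base command is taken verbatim from the original post.
base="rnabloom -long all_reads.fastq -stranded -t 25 -outdir dir/ -u true"

# Example settings suggested above (long-read defaults: -indel 50 -tip 50 -p 0.7).
opts="-indel 100 -tip 100 -p 0.6"

cmd="$base $opts"
echo "$cmd"
```

To run it for real, paste the printed line into the shell on a machine where rnabloom is installed. Writing to a fresh -outdir keeps the earlier assembly available for comparison.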

kmnip, May 03 '24 01:05