Resolve Racon Conflict with Numeric Named Reads

Open ad3002 opened this issue 2 years ago • 3 comments

This pull request addresses an open issue in Racon (https://github.com/isovic/racon/issues/233), where Racon encounters an error if reads and contigs have identical names. In our project, the reads have numeric names generated by an upstream tool, which leads to a naming conflict in Racon.

To resolve this, I have added a 'unitig' prefix to the names of unitig FASTA records. This change prevents the name conflict in Racon, and subsequent tests confirm that RNA-Bloom now operates as expected. The update resolves the issue without affecting other functionality.
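For illustration only, the idea of prefixing record names can be sketched in Python. This is not the RNA-Bloom code (RNA-Bloom is written in Java); the function name `prefix_fasta_headers` and the sample records are hypothetical, and the sketch assumes plain FASTA text where every header line starts with `>`.

```python
def prefix_fasta_headers(lines, prefix="unitig"):
    """Yield FASTA lines with `prefix` prepended to every record name.

    Hypothetical helper for illustration; sequence lines pass through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            # Keep the '>' marker, then insert the prefix before the original name.
            yield ">" + prefix + line[1:]
        else:
            yield line


# Made-up records with numeric names, as an upstream tool might produce them.
records = [">1 l=100\n", "ACGT\n", ">2\n", "GGCC\n"]
renamed = list(prefix_fasta_headers(records))
print("".join(renamed), end="")
```

With the prefix in place, a unitig named `1` becomes `unitig1` and can no longer collide with a read that is also named `1`.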

ad3002 avatar Jan 22 '24 11:01 ad3002

Hi @ad3002, instead of modifying the code of RNA-Bloom, you can work around the issue by simply giving the read names a "proper" prefix (e.g. "seq"). You can do so easily with seqtk:

seqtk rename reads.fq seq > renamed_reads.fq

Ka Ming

kmnip avatar Jan 23 '24 02:01 kmnip

Yes, I did exactly that. Another possible fix is to add this caveat to the RNA-Bloom documentation, because it crashes without any error that can be linked to matching contig/read names. Without prior experience, it's impossible to find a solution.
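Since the crash gives no hint about the cause, a quick check for shared names before running the pipeline may help. The following is a minimal sketch, not part of RNA-Bloom or Racon. It assumes uncompressed input, four-line FASTQ records, and that a name is the first whitespace-delimited token of a header; all function names are hypothetical.

```python
def fasta_names(lines):
    """Collect record names (first token after '>') from FASTA lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def fastq_names(lines):
    """Collect record names from FASTQ lines, assuming 4 lines per record."""
    names = set()
    for i, line in enumerate(lines):
        # Only every fourth line is a header; quality lines may also start with '@'.
        if i % 4 == 0 and line.startswith("@"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def shared_names(read_lines, contig_lines):
    """Return the names that occur in both the reads and the contigs."""
    return fastq_names(read_lines) & fasta_names(contig_lines)


# Made-up example data: read "1" collides with contig "1".
reads = ["@1\n", "ACGT\n", "+\n", "IIII\n", "@2\n", "GGCC\n", "+\n", "@III\n"]
contigs = [">1\n", "ACGT\n", ">unitig2\n", "GG\n"]
collisions = shared_names(reads, contigs)
print(sorted(collisions))
```

A non-empty result means at least one read and one contig share a name, which is the situation described in the Racon issue linked above.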

ad3002 avatar Jan 26 '24 10:01 ad3002

Thanks for the suggestion. I have added a note about it in the readme.

kmnip avatar Jan 28 '24 00:01 kmnip