
Conversion to Fasta generates same read names

Open athulmenon opened this issue 9 years ago • 2 comments

Hi, I converted raw FAST5 reads to FASTA using poretools, but many reads in the output share the same read name. Is there an option to change this? The duplicate names are causing problems during assembly. Could you suggest a solution? My file contains around 18,000 duplicated read names.

I used:

poretools fasta fast5/

for the conversion.
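
To quantify the duplication described above, a minimal Python sketch like the following could be used. It is not part of poretools; the function name `duplicated_read_names` is hypothetical, and it assumes the read name is the first whitespace-delimited token on each `>` header line.

```python
from collections import Counter


def duplicated_read_names(fasta_lines):
    """Return {read_name: count} for names that occur more than once.

    Assumption: the read name is the first whitespace-delimited token
    after '>' on each header line.
    """
    names = Counter(
        line[1:].split()[0]
        for line in fasta_lines
        # Skip sequence lines and headers that carry no name at all.
        if line.startswith(">") and len(line.strip()) > 1
    )
    return {name: n for name, n in names.items() if n > 1}


# Small in-memory example; a real file could be passed as an open handle,
# e.g. open("reads.fasta") (hypothetical filename).
example = [
    ">read_1 first copy\n", "ACGT\n",
    ">read_1 second copy\n", "TTGA\n",
    ">read_2\n", "GGCC\n",
]
print(duplicated_read_names(example))  # {'read_1': 2}
```

Because the function only iterates over lines, it works on an open file handle without loading the whole FASTA into memory.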

Thanks in Advance!

Athul Menon K

athulmenon avatar Mar 15 '16 13:03 athulmenon

Would you be able to share a couple of the underlying FAST5 files that generate this behavior so that I can debug?

arq5x avatar Mar 17 '16 21:03 arq5x

Dear Arq5x,

Please find two FAST5 files and their respective FASTA files, converted using poretools, here: https://drive.google.com/folderview?id=0B6z33yOKO0MhdEtWRUM3ZVBCc3M&usp=sharing

Each FASTA read has the same header, taken from the FAST5 file. I think poretools is generating the forward, reverse, and 2D reads from a single FAST5 file under the same name; correct me if I am wrong. I noticed the problem when I tried an assembly with miniasm, where the duplicated read headers caused problems. If we add a suffix to the headers, the assembly runs, but when we come back to polish the assembly with Nanopolish, the suffix added to the read names causes problems.
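
For reference, the suffix workaround can be sketched roughly as follows. This is only an illustration under stated assumptions, not poretools or Nanopolish functionality: the function name `make_names_unique` and the `_1`, `_2`, ... suffix scheme are hypothetical. The sketch also records which original name each new name came from, so the renaming can be traced back later.

```python
from collections import defaultdict


def make_names_unique(fasta_lines):
    """Append a per-name counter to every FASTA read name.

    Returns (new_lines, renamed), where renamed maps each new name back
    to the original name. The '_<n>' suffix scheme is an assumption made
    for this sketch only.
    """
    seen = defaultdict(int)   # original name -> occurrences so far
    renamed = {}              # new unique name -> original name
    out = []
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            # The name is the first token; the rest is the description.
            name, _, rest = header.partition(" ")
            seen[name] += 1
            new_name = f"{name}_{seen[name]}"
            renamed[new_name] = name
            line = f">{new_name}{' ' + rest if rest else ''}\n"
        out.append(line)
    return out, renamed


reads = [
    ">read_1 first copy\n", "ACGT\n",
    ">read_1 second copy\n", "TTGA\n",
    ">read_2\n", "GGCC\n",
]
unique_lines, name_map = make_names_unique(reads)
print("".join(unique_lines), end="")
print(name_map)
```

Keeping `name_map` alongside the renamed file is the design choice that matters here: a later step that expects the original read names can look them up instead of guessing how the suffix was built.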

Please look into this! Anyway, we are loving the tool.

Thank you. Athul

athulmenon avatar Mar 18 '16 07:03 athulmenon