
VCF output for xtea_long?

Open eyalmpeer opened this issue 4 years ago • 6 comments

Hello and thank you for sharing this tool. I ran xtea_long, per the instructions on the xtea_long branch, on PacBio data. The final output was only txt files. Is it possible to generate the VCF files mentioned in the article, which can aid in determining the zygosity of the insertions? Alternatively, is there a way to extract from the xtea_long output how many reads at the insertion location support the insertion and how many do not? Thanks.

eyalmpeer avatar Dec 15 '21 07:12 eyalmpeer

Yeah, this is on my to-do list. I'll add export to VCF format. For the current output, the meaning of each column can be found here: https://github.com/parklab/xTea_paper/tree/main/run_tools/xTea/HG002. There is an intermediate file called candidate_list_from_clip.txt that has the number of clipped reads (third column), but I didn't count the mapped...

simoncchu avatar Dec 15 '21 13:12 simoncchu
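For anyone who wants to pull the clipped-read counts out of candidate_list_from_clip.txt in the meantime, here is a minimal Python sketch. Only "third column = number of clipped reads" comes from the reply above. That the file is tab-delimited, and that the first two columns are chromosome and position, are assumptions; the helper name `read_clip_counts` and the example rows are made up for illustration.

```python
import csv
import io


def read_clip_counts(handle):
    """Yield (chrom, pos, n_clipped) tuples from a candidate list.

    Assumptions (not confirmed by the xTea docs): the file is tab-delimited,
    column 1 is the chromosome and column 2 the position. Only the meaning
    of column 3 (number of clipped reads) is stated in the thread.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        try:
            yield row[0], int(row[1]), int(row[2])
        except ValueError:
            continue  # skip header-like rows with non-numeric fields


# Made-up rows for illustration only; this is not real xTea output.
example = io.StringIO("chr1\t100\t12\nchr2\t200\t7\n")
clip_counts = list(read_clip_counts(example))
print(clip_counts)
```

To use it on a real run, replace the `io.StringIO` example with `open("candidate_list_from_clip.txt")`, and check the column layout against your own file first.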

Thanks for the pipeline, it is a very useful tool.

A follow-up on the xtea_long output: why do the SVA insertion positions often start from a negative value? For example, the 2nd line here: https://github.com/parklab/xTea_paper/blob/main/run_tools/xTea/HG002/HG002_hg38_Nanopore_xTea_SVA.txt

chr6 138775846 SVA None -1411:1274:+ None

Thank you very much!

xzhuo avatar Jun 23 '22 16:06 xzhuo

A negative value indicates that the insertion starts or ends in the flanking region (likely a transduction). It is also possible that the reported annotation is incorrect.

simoncchu avatar Jun 23 '22 17:06 simoncchu

Thanks for your swift reply! Are there ways to infer the correct consensus position for SVA?

xzhuo avatar Jun 23 '22 17:06 xzhuo

It's not straightforward, as the reference SVA annotation is fragmented and inaccurate (because of tandem repeat expansion). As a simple workaround, just treat position 0 as the start position on the consensus, but this may be inaccurate.

simoncchu avatar Jun 23 '22 19:06 simoncchu
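The "treat position 0 as the start" workaround could be sketched in Python roughly as follows. This assumes the fifth whitespace-separated column has the form start:end:strand, which is inferred from the example line quoted earlier in the thread and is not a documented format. The function names are hypothetical, and the clamped coordinate is only an approximation, as noted above.

```python
def parse_consensus_span(field):
    """Split a field such as '-1411:1274:+' into (start, end, strand).

    Assumption: the field is 'start:end:strand'; this layout is inferred
    from the example line in the thread, not from xTea documentation.
    """
    start, end, strand = field.split(":")
    return int(start), int(end), strand


def clamp_start(start, end, strand):
    """Treat a negative start as position 0 on the consensus (approximate)."""
    return max(start, 0), end, strand


# The example line quoted earlier in this thread.
line = "chr6 138775846 SVA None -1411:1274:+ None"
cols = line.split()
span = parse_consensus_span(cols[4])
clamped = clamp_start(*span)
print(span, clamped)
```

Keeping the original (unclamped) span alongside the clamped one makes it easy to flag candidates that may be transductions.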

Thank you, this is a useful tool for analyzing TEs. Can xtea_long now generate a VCF file directly? My output is still classified_results*.txt. I don't know if I made a mistake.

evayfang2019 avatar Apr 09 '24 13:04 evayfang2019