merge error
Hi,
I tried to merge files from 206 samples and encountered this error. Do you have any idea where I should look? The merging job ends prematurely and leaves "merged.lsort.vcf" and "merged.sites.vcf.gz" behind.
"merged.lsort.vcf" is the file that contains all the variants from 206 samples, and "merged.sites.vcf.gz" is the sorted file. I could see where it stops.
###error message###
[smoove] 2020/02/23 12:14:38 starting with version 0.2.5
[smoove] 2020/02/23 12:14:38 merging 206 files
[smoove] 2020/02/23 12:14:38 finished sorting 206 files; merge starting.
[smoove] 2020/02/23 12:37:37 Traceback (most recent call last):
File "/home/u/f056598/miniconda2/bin/svtools", line 11, in
Thanks a lot for your help! :)
Hi @lee039, were you able to solve this? I am also getting the same error - IndexError: list index out of range.
Hi,
I repeated the merging step several times, but it aborted at the exact same position every time. You can see at which position it stopped by looking at the tail of the intermediate files ("merged.lsort.vcf" and "merged.sites.vcf.gz").
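A minimal sketch of how the tail of those files could be inspected. The function name `last_site` is hypothetical, and it assumes standard VCF layout (header lines start with `#`, CHROM and POS are the first two tab-separated columns):

```shell
# Hypothetical helper: print CHROM and POS of the last record in a VCF.
# Works on plain or gzip/bgzip-compressed files; header lines (#) are skipped.
last_site() {
  case "$1" in
    # A truncated .gz may make gzip complain; keep whatever could be decoded.
    *.gz) gzip -cd "$1" 2>/dev/null ;;
    *)    cat "$1" ;;
  esac | grep -v '^#' | tail -n 1 | cut -f1,2
}
```

For example, `last_site merged.sites.vcf.gz` would print the chromosome and position of the last site written before the job aborted.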
The position that comes right after the last one written in "merged.lsort.vcf" and "merged.sites.vcf.gz" was causing the issue. I think the problem is that during merging, smoove takes the probability of each breakpoint base pair into account, and if the breakpoint resolution is low (e.g. due to repeats) this error can occur. In my case, I inspected the position where the problem occurred (using IGV), and indeed the deletion was found in ~15 animals, but the breakpoint was only resolved to within +/- 300 bp (perhaps the resolution was too poor for smoove to process? I am not entirely sure).
Thus, I extracted all the positions that caused the problem and saved them as $1 (chr) and $2 (pos) in a text file. As I mentioned, I had ~15 samples in which the deletion was discovered; however, none of them had the same start position. This probably indicates that the deletion call is of low quality... Then I used a for loop and bcftools to remove these positions from all VCF files, using the code below.
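# $Problem_sites is the tab-delimited text file of chr and pos described above.
for vcf in ${PATH_TO_THE_VCF_FILES}/*.vcf.gz;
do
# Sample name from the file path; field 11 depends on how deep your directory is.
sname=$(echo "$vcf" | cut -d'/' -f11 | cut -d'-' -f1)
# The ^ prefix makes -T exclude the listed positions instead of keeping them.
bcftools view -T "^$Problem_sites" "$vcf" -O z -o results-smoove/$sname.subset.vcf.gz
bcftools index -c results-smoove/$sname.subset.vcf.gz
done
smoove merge --name merged -f $FASTA --outdir results-smoove/ results-smoove/*.subset.vcf.gz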
Afterwards, the merging went without problems. Hope this works for you! :)
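For anyone reproducing this, here is a minimal sketch of how the sites file could be built up one failing position at a time. It assumes the tab-delimited CHROM, POS (1-based) layout that bcftools accepts for `-T`; the function name `add_problem_site` and the file name in the usage line are hypothetical:

```shell
# Hypothetical helper: add one CHROM/POS pair to a tab-delimited sites file,
# keeping the file sorted by chromosome and position and free of duplicates.
add_problem_site() {
  # $1 = sites file, $2 = chromosome, $3 = 1-based position
  tab=$(printf '\t')
  printf '%s\t%s\n' "$2" "$3" >> "$1"
  sort -u -t "$tab" -k1,1 -k2,2n -o "$1" "$1"
}
```

A call such as `add_problem_site problem_sites.txt 1 123456` after each failed merge would extend the list before the filtering loop is rerun.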