Starting with comma in alt field
Hi, I was trying to filter a vcf created with the 4.0 beta version, when the filtering script crashed with this error:
Traceback (most recent call last):
File "/projects/cees/bin/lobstr/lobSTR-bin-Linux-x86_64-4.0.0/share/lobSTR/scripts/lobSTR_filter_vcf.py", line 132, in
The following entry is likely the culprit: LG15 1591293 . AAATAAAAATAAAATAAAA ,AAATAAAA,AAATAAAAATAAAATAAAAAAAATAAAAT 669.516 . END=1591311;MOTIF=AAATA;NS=10;REF=3.8;RL=19;RU=AAATA;VT=STR;RPA=0,1.6,5.8 GT:ALLREADS:AML:DISTENDS:DP:GB:PL:Q:SB:STITCH 0/3:0|1;10|1:0.980102/0.961652:10:2:0/10:16,22,191,22,125,122,0,26,24,20:0.941778:2:0 0/3:0|2;10|3:0.999776/0.999994:28:5:0/10:52,67,478,67,347,341,0,52,49,37:0.999769:1.44444:0 0/3:0|1;10|1:0.980102/0.961652:51:2:0/10:16,22,191,22,125,122,0,26,24,20:0.941778:4.25:0 3/3:10|2:0.999969/0.999969:67:2:10/10:44,50,197,50,197,197,5,6,6,0:0.499146:2:0 0/3:0|4;10|2:1/0.997848:41.6667:6:0/10:26,44,574,44,311,299,0,105,99,87:0.997848:26:0 2/3:-11|4;-1|1;10|2:0.999984/0.999784:8:7:-11/10:165,149,354,44,191,176,140,112,0,385:0.999784:26:0 1/0:-19|1;0|7;10|1:0.998671/0.999881:-3.875:9:-19/0:71,0,740,29,283,288,73,161,180,233:0.998671:4.25:0 0/0:0|1:0.993932/0.993932:-28:1:0/0:0,3,98,3,33,30,3,29,27,26:0.330906:5:0 3/3:10|4:1/1:1.5:4:10/10:89,101,395,101,395,395,11,12,12,0:0.798921:5:0 0/3:0|2;10|1:0.999907/0.940512:6.66667:3:0/10:13,22,287,22,155,149,0,52,49,43:0.940419:9.25:0
The ALT field starts with comma, which I don't think is according to specs.
allelotype was run on a set of BWA aligned bams, like this:
allelotype --command classify --bam bam1,bam2,bam3,bam4,bam5,bam6,bam7,bam8,bam9,bam10
--strinfo genome_strinfo.tab --noise_model /projects/cees/bin/lobstr/lobSTR-bin-Linux-x86_64-3.0.2/share/lobSTR/models/illumina_v2.0.3
--index-prefix genome_index/lobSTR_ --out genome_ncc
--filter-mapq0 --realign --max-repeats-in-ends 3 --min-read-end-match 10 >allelotype_ncc.out 2> allelotype_ncc.err