bcftools norm --multiallelics -any v1.12 does not split FORMAT/AD correctly

Open priesgo opened this issue 4 years ago • 1 comments

I hope this is not me misusing bcftools, but there seems to be an issue between bcftools norm v1.9 and v1.12 (I did not check intermediate versions).

chr1	13324	.	C	G,T	.	MULTIALLELIC		GT:AD	0:229,1,1	1/2:196,24,1

With bcftools v1.12, the output of bcftools norm --multiallelics -any --old-rec-tag OLD_VARIANT becomes:

chr1	13324	.	C	G	.	MULTIALLELIC	OLD_VARIANT=chr1|13324|C|G,|1	GT:AD	0:229,229	1/0:196,196
chr1	13324	.	C	T	.	MULTIALLELIC	OLD_VARIANT=chr1|13324|C|G,|2	GT:AD	0:229,229	0/1:196,196

Note that every AD value is replaced by the depth of the reference base. Also, the value stored in INFO/OLD_VARIANT refers to the G alternate in both cases.

With bcftools v1.9, the output of bcftools norm --multiallelics -any becomes:

chr1	13324	.	C	G	.	MULTIALLELIC	.	GT:AD	0:229,1	1/0:196,24
chr1	13324	.	C	T	.	MULTIALLELIC	.	GT:AD	0:229,1	0/1:196,1

priesgo avatar May 30 '21 04:05 priesgo

Can you please check whether the commit I just pushed fixes the issue? I could reproduce the problem only partially: the commit fixes the malformed INFO tag, but I was not able to reproduce the incorrect FORMAT/AD values.

pd3 avatar Jun 09 '21 14:06 pd3