
Ignore "chr" prefix when running stats using an intervals file

Open northwestwitch opened this issue 11 months ago • 0 comments

Hello! It's not a huge deal, but today it took us a while to understand why the stats always returned 0 for one of our samples. Basically, if the chromosome names in the sample's d4 file have the "chr" prefix, then the chromosomes in the intervals file must have the prefix too. Since we work with different pipelines, which apparently produce d4 files with different chromosome naming, it would be nice to have this handled directly in d4tools.

Thanks in advance!

(base) chiararasi@n159-p41 d4_data % cat  intervals.bed
1	11785723	11806455
11	72189558	72196323
17	28394642	28407197
18	6941742	7117797
5	80626226	80654983
(base) chiararasi@n159-p41 d4_data % d4tools stat --region intervals.bed mildlywittybat.per-base.d4 --stat mean
1	11785723	11806455	0
11	72189558	72196323	0
17	28394642	28407197	0
18	6941742	7117797	0
5	80626226	80654983	0
(base) chiararasi@n159-p41 d4_data % cat  intervals_with_chr.bed
chr1	11785723	11806455
chr11	72189558	72196323
chr17	28394642	28407197
chr18	6941742	7117797
chr5	80626226	80654983
(base) chiararasi@n159-p41 d4_data % d4tools stat --region intervals_with_chr.bed mildlywittybat.per-base.d4 --stat mean
chr1	11785723	11806455	29.785597144510902
chr11	72189558	72196323	31.807538802660755
chr17	28394642	28407197	28.50115491835922
chr18	6941742	7117797	31.84388969356167
chr5	80626226	80654983	25.601175365997843
(base) chiararasi@n159-p41 d4_data %
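
Until something like this is handled in d4tools, the behaviour requested here can be approximated by rewriting the intervals file before calling `d4tools stat`. The following is only a minimal sketch in Python, not part of d4tools: `match_chrom_style` is a hypothetical helper, and it assumes the chromosome names stored in the d4 file have already been obtained by some other means and are passed in as `d4_chroms`.

```python
def match_chrom_style(bed_lines, d4_chroms):
    """Rewrite BED chromosome names so they match the naming in d4_chroms.

    Hypothetical helper for illustration only.
    bed_lines: iterable of tab-separated BED lines.
    d4_chroms: chromosome names as they appear in the d4 file (assumed known).
    """
    known = set(d4_chroms)
    out = []
    for line in bed_lines:
        # Pass blank lines and BED header/comment lines through unchanged.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            out.append(line)
            continue
        chrom, sep, rest = line.partition("\t")
        if chrom not in known:
            # Try the other naming style: add or strip the "chr" prefix.
            alt = chrom[3:] if chrom.startswith("chr") else "chr" + chrom
            if alt in known:
                chrom = alt
        # Names that match neither style are left as they were.
        out.append(chrom + sep + rest)
    return out


if __name__ == "__main__":
    bed = ["1\t11785723\t11806455", "11\t72189558\t72196323"]
    for row in match_chrom_style(bed, ["chr1", "chr11", "chr17"]):
        print(row)
```

The sketch only toggles the "chr" prefix, so other naming differences (for example "MT" versus "chrM") would still need separate handling.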

northwestwitch, Mar 03 '25 14:03