genomecov -d and -3/-5 options are incompatible, but no error is given
Thank you for maintaining bedtools. It's really excellent software.
An old post on Biostars suggests using the following command to count coverage at the 5' ends of reads:
bedtools genomecov -d -5 -ibam input.bam -g genome.fa > output.txt
I have some stranded transcriptomics data, and I have been using a similar command (bedtools v2.29.2) to extract strand-specific 3' end depth profiles for my alignments:
bedtools genomecov -pc -3 -d -strand + -ibam sample.bam > sample_plus_3p.txt
bedtools genomecov -pc -3 -d -strand - -ibam sample.bam > sample_minus_3p.txt
My expectation when running this command is that I will get a per-base genome coverage report (-d) for the 3' ends (-3) of my paired-end fragments (-pc), with zeros given at all positions where the 3' end of a paired-end read does not fall. No error is given running the above commands, and I indeed get a report listing every position in my genome.
However, I noticed that the exclusion of the -3 option (or inclusion of -5) gives the same results.
tested commands
bedtools genomecov -pc -d -strand + -ibam sample.bam | md5sum
bedtools genomecov -pc -3 -d -strand + -ibam sample.bam | md5sum
bedtools genomecov -pc -5 -d -strand + -ibam sample.bam | md5sum
results
d1ca8002cfe34024374fe9f695c55c64 -
d1ca8002cfe34024374fe9f695c55c64 -
d1ca8002cfe34024374fe9f695c55c64 -
I noticed a similar problem when using the -3 and -5 options in conjunction with BedGraph output (-bga).
My questions:
- Are the
-3and-5flags incompatible with-dand-bga, and, if so, why is there no error thrown? (Or, at least, why isn't that mentioned in the documentation?) - Is there a correct combination of flags that will give an every-position report that excludes all but the ends of fragments, as I was attempting to implement?
Related issue: https://github.com/arq5x/bedtools2/issues/905
And a Biostars post with more context: https://www.biostars.org/p/9490546/
My workaround: https://www.biostars.org/p/9490546/#9493168