bamCoverage bin size is variable within a file and between files

Open mheskett opened this issue 3 years ago • 4 comments

I'm calling bamCoverage on several BAM files with the bin size set to 50,000. I expect every file to have identical coordinates (the first three columns of the BED file); otherwise I won't be able to compare them against each other. However, after just a few lines I get variable bin sizes (between 50 and 300 kb), and they differ between the two files that I paste side by side here:

```
chr1    50000   150000  1       chr1    50000   100000  6
chr1    150000  200000  0       chr1    100000  200000  0
chr1    200000  250000  1       chr1    200000  300000  1
chr1    250000  500000  0       chr1    300000  600000  0
chr1    500000  550000  1       chr1    600000  650000  3
```

**Welcome to deepTools GitHub repository! Before opening the issue please check
that the following requirements are met :**

 - [x] Search whether this issue (or a similar issue) has been solved before
 using the search tab above. Link the previous issue if appropriate below.

 - [x] Paste your deepTools version (`deeptools --version`) and your python
 version (`python --version`) below.
deeptools 3.5.1

 - [x] Paste the full deepTools command that produces the issue below
 (ignore if you simply spotted the issue in the code/documentation).
bamCoverage --numberOfProcessors 4 -b $input  -o $outdir$filename.coverage.bw -of bedgraph --binSize 50000

 - [x] Paste the output printed on screen from the command that produces the issue
 below (ignore if you simply spotted the issue in the code/documentation).
```
binLength: 50000
numberOfSamples: None
blackListFileName: None
skipZeroOverZero: False
bed_and_bin: False
genomeChunkSize: None
defaultFragmentLength: read length
numberOfProcessors: 4
verbose: False
region: None
bedFile: None
minMappingQuality: None
ignoreDuplicates: False
chrsToSkip: []
stepSize: 50000
center_read: False
samFlag_include: None
samFlag_exclude: None
minFragmentLength: 0
maxFragmentLength: 0
zerosToNans: False
smoothLength: None
save_data: False
out_file_for_raw_data: None
maxPairedFragmentLength: 1000
```

mheskett avatar Jun 16 '22 22:06 mheskett

If I am not wrong, the default behaviour is to merge bins with no coverage, which is why you got bins of different lengths. To skip those regions you could use `--skipNAs`.

LeilyR avatar Jun 17 '22 07:06 LeilyR
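If the variable-length rows are runs of adjacent equal-valued bins that were merged on output, as suggested above, they can be split back into fixed-width bins after the fact. Below is a minimal Python sketch of that idea. It is not part of deepTools; the function name `split_bedgraph_bins` is hypothetical, and it assumes the merged intervals were built from bins on a common grid of `bin_size`.

```python
def split_bedgraph_bins(lines, bin_size=50000):
    """Expand merged bedGraph intervals back into fixed-width bins.

    Assumption: each interval is a run of same-valued bins, so every
    bin inside it carries the interval's value.
    """
    out = []
    for line in lines:
        if line.startswith(("#", "track", "browser")):
            continue  # skip header lines
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, start, end, value = fields[0], int(fields[1]), int(fields[2]), fields[3]
        pos = start
        while pos < end:
            # Step to the next multiple of bin_size, but never past the
            # interval end (the last bin of a chromosome may be shorter).
            nxt = min((pos // bin_size + 1) * bin_size, end)
            out.append((chrom, pos, nxt, value))
            pos = nxt
    return out


# First two rows of the left-hand file from the original post.
merged = [
    "chr1\t50000\t150000\t1",
    "chr1\t150000\t200000\t0",
]
for row in split_bedgraph_bins(merged):
    print(*row, sep="\t")
```

After splitting, files produced with the same `--binSize` should share coordinates wherever both report a value, which makes a line-by-line comparison possible.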

Hey, thanks for the response @LeilyR. `--skipNAs` won't help, since I need all files across samples to have the same set of regions. Is there any way to force the bin size to stay the same, with no merging, and to report all regions with 0 coverage as well? If not, I can just use bedtools against a set of genomic windows.

mheskett avatar Jun 17 '22 16:06 mheskett

Hey, have you solved the problem? I have run into the same problem, and I would much appreciate it if you could give any suggestions!

Aannaw avatar Jul 09 '22 15:07 Aannaw

Yeah, you need to use `bedtools intersect` with sliding windows instead (see `bedtools makewindows`).

mheskett avatar Jul 10 '22 07:07 mheskett
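The window-based approach can be illustrated without bedtools. The sketch below is a pure-Python stand-in, not the bedtools implementation: it builds one fixed grid of windows from chromosome sizes and looks up a value for every window, so each sample ends up with the same rows and uncovered windows are reported as 0. The names `make_windows` and `fill_windows` and the chromosome size used in the example are hypothetical, and the lookup assumes every window lies inside at most one input interval.

```python
def make_windows(chrom_sizes, window=50000):
    """Yield (chrom, start, end) fixed windows covering each chromosome."""
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, window):
            yield chrom, start, min(start + window, size)


def fill_windows(windows, intervals, missing="0"):
    """Attach a value to every window from bedGraph-style intervals.

    Assumption: a window is either fully contained in one interval or
    not covered at all; uncovered windows get the `missing` value.
    The linear scan is fine for a sketch but slow for whole genomes.
    """
    by_chrom = {}
    for chrom, start, end, value in intervals:
        by_chrom.setdefault(chrom, []).append((start, end, value))
    rows = []
    for chrom, w_start, w_end in windows:
        value = missing
        for start, end, v in by_chrom.get(chrom, []):
            if start <= w_start and w_end <= end:
                value = v
                break
        rows.append((chrom, w_start, w_end, value))
    return rows


# Hypothetical 300 kb chromosome; intervals taken from the original post.
grid = list(make_windows({"chr1": 300000}))
sample = [("chr1", 50000, 150000, "1"), ("chr1", 200000, 250000, "1")]
for row in fill_windows(grid, sample):
    print(*row, sep="\t")
```

Running every sample against the same `grid` gives identical first three columns across files, which is the property the original question asks for.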