Python: when c2 <= c1 output is not flipped back

Open lldelisle opened this issue 3 years ago • 1 comments

Describe the bug

When fetching the contacts between 'chr1' and 'chrM', straw returns the same result as for 'chrM' and 'chr1', without any notice to the user.

To Reproduce

import hicstraw

hic_file = 'ENCFF080DPJ.hic'
chrom1 = 'chr1'
chrom2 = 'chr2'

# Query chrom1 vs chrom2 and print the first 10 records.
result = hicstraw.straw('observed', 'NONE', hic_file, chrom1, chrom2, 'BP', 1000000)
for record in result[:10]:
    print("{0}\t{1}\t{2}".format(record.binX, record.binY, record.counts))

# Same query with the chromosomes swapped; binX and binY should be swapped too.
result = hicstraw.straw('observed', 'NONE', hic_file, chrom2, chrom1, 'BP', 1000000)
for record in result[:10]:
    print("{0}\t{1}\t{2}".format(record.binX, record.binY, record.counts))

Expected behavior

I would expect the same result but with columns 1 and 2 swapped. Instead, I got exactly the same result.

The problem is that the chromosomes are flipped here: https://github.com/aidenlab/straw/blob/2525edc29bbb48463799cad94cbd6e5e810210a0/pybind11_python/src/straw.cpp#L1222-L1227 but this information is not stored. Therefore, the results are not flipped back in: https://github.com/aidenlab/straw/blob/2525edc29bbb48463799cad94cbd6e5e810210a0/pybind11_python/src/straw.cpp#L1368-L1374 and the matrix is not transposed in: https://github.com/aidenlab/straw/blob/2525edc29bbb48463799cad94cbd6e5e810210a0/pybind11_python/src/straw.cpp#L1430

lldelisle, Oct 31 '22 16:10
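Until the flip is undone inside straw, a caller-side workaround along these lines might help. This is only a sketch under stated assumptions: it assumes the library swaps the two chromosomes whenever the first one comes later in the file's chromosome order, so that binX then lies on the second requested chromosome. The helper flip_back, the Contact tuple, and the chrom_order list are hypothetical names introduced here and are not part of hicstraw; chrom_order has to be supplied by the caller.

```python
from collections import namedtuple

# Plain stand-in for straw's records, which expose binX, binY and counts.
Contact = namedtuple("Contact", ["binX", "binY", "counts"])


def flip_back(records, chrom1, chrom2, chrom_order):
    """Return contacts with binX on chrom1 and binY on chrom2.

    Assumption: if chrom1 comes after chrom2 in chrom_order, the records
    were produced for (chrom2, chrom1), so binX and binY must be swapped.
    chrom_order is a hypothetical caller-supplied list of chromosome names.
    """
    swapped = chrom_order.index(chrom1) > chrom_order.index(chrom2)
    if swapped:
        return [Contact(r.binY, r.binX, r.counts) for r in records]
    return [Contact(r.binX, r.binY, r.counts) for r in records]


if __name__ == "__main__":
    # Made-up records, only to show the call; not real Hi-C data.
    raw = [Contact(0, 3000000, 5.0), Contact(1000000, 2000000, 2.0)]
    for contact in flip_back(raw, "chr2", "chr1", ["chr1", "chr2"]):
        print(contact.binX, contact.binY, contact.counts, sep="\t")
```

The helper accepts any objects with binX, binY and counts attributes, so the list returned by hicstraw.straw could be passed in directly, provided the assumption about the internal ordering holds for the file in question.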

Bump. This is really nasty behavior that is guaranteed to lead to miserably confusing bug hunting for anyone unlucky enough to be handed non-"canonical" data to be fed into straw. Just throw an exception if the ordering is that important.

fredlas, Jun 29 '25 21:06
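The suggestion above, raising an exception instead of silently reordering, could look roughly like the following on the caller side. It is a sketch, not straw's behavior: require_canonical_order and chrom_order are hypothetical names, and the "canonical" order is assumed to be whatever order the caller passes in.

```python
def require_canonical_order(chrom1, chrom2, chrom_order):
    """Raise ValueError if chrom1 comes after chrom2 in chrom_order.

    chrom_order is a hypothetical caller-supplied list of chromosome names;
    it stands in for whatever ordering the .hic file actually uses.
    """
    if chrom_order.index(chrom1) > chrom_order.index(chrom2):
        raise ValueError(
            "chromosomes given as ({0}, {1}) but the file order is ({1}, {0}); "
            "binX and binY would be silently swapped".format(chrom1, chrom2)
        )
    return chrom1, chrom2
```

Calling this guard before a query turns the silent reordering into an explicit error, at the cost of making the caller pick the order.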