Python: when c2 <= c1 output is not flipped back
Describe the bug When fetching the contacts between 'chr1' and 'chrM', straw returns the same result as for 'chrM' and 'chr1', without any notice to the user.
To Reproduce
import numpy as np
import hicstraw
hic_file = 'ENCFF080DPJ.hic'
chrom1 = 'chr1'
chrom2 = 'chr2'
result = hicstraw.straw('observed', 'NONE', hic_file, chrom1, chrom2, 'BP', 1000000)
for i in range(10):
    print("{0}\t{1}\t{2}".format(result[i].binX, result[i].binY, result[i].counts))
result = hicstraw.straw('observed', 'NONE', hic_file, chrom2, chrom1, 'BP', 1000000)
for i in range(10):
    print("{0}\t{1}\t{2}".format(result[i].binX, result[i].binY, result[i].counts))
Expected behavior I would expect the same contacts but with columns 1 and 2 swapped. Instead, I got exactly the same result.
The problem is that the chromosomes are flipped here: https://github.com/aidenlab/straw/blob/2525edc29bbb48463799cad94cbd6e5e810210a0/pybind11_python/src/straw.cpp#L1222-L1227 but this information is not stored. Therefore, the results are not flipped back in: https://github.com/aidenlab/straw/blob/2525edc29bbb48463799cad94cbd6e5e810210a0/pybind11_python/src/straw.cpp#L1368-L1374 and the matrix is not transposed in: https://github.com/aidenlab/straw/blob/2525edc29bbb48463799cad94cbd6e5e810210a0/pybind11_python/src/straw.cpp#L1430
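Until this is fixed in the library, a caller-side workaround might look like the sketch below. It is not part of hicstraw: `reorient`, `Contact`, and `file_order` are hypothetical names, and it assumes the flip happens whenever chrom1 comes after chrom2 in the file's chromosome order, which the caller has to supply. It only covers the inter-chromosomal case.

```python
from collections import namedtuple

# Hypothetical stand-in for the records returned by hicstraw.straw,
# which expose binX, binY and counts attributes.
Contact = namedtuple("Contact", ["binX", "binY", "counts"])


def reorient(records, chrom1, chrom2, file_order, strict=False):
    """Return (bin_on_chrom1, bin_on_chrom2, counts) tuples in the requested order.

    file_order is the list of chromosome names in the order the .hic file
    indexes them (an assumption: the caller must provide it).
    With strict=True, a non-canonical request raises instead of being fixed up.
    """
    flipped = file_order.index(chrom1) > file_order.index(chrom2)
    if flipped and strict:
        raise ValueError(
            "non-canonical chromosome order: {0} comes after {1}".format(chrom1, chrom2)
        )
    if flipped:
        # straw returned the records as (chrom2, chrom1); swap the bins back.
        return [(r.binY, r.binX, r.counts) for r in records]
    return [(r.binX, r.binY, r.counts) for r in records]
```

In the reproduction above, this would be applied to `result` after each `hicstraw.straw` call, passing the same `chrom1` and `chrom2` that were given to straw.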
Bump. This is really nasty behavior that is guaranteed to lead to miserably confusing bug hunting for anyone unlucky enough to be handed non-"canonical" data to be fed into straw. Just throw an exception if the ordering is that important.