Cannot save asymmetric matrices
Saving asymmetric matrices with hicMatrix.save(...) does not work when pSymmetric is set to False. Instead, an exception is thrown.
For the following example, HiCExplorer 3.3.1 and the 1000 kb HMEC matrix from Rao et al. were used. The matrix was downloaded from ftp://cooler.csail.mit.edu/coolers/hg19/Rao2014-HMEC-MboI-allreps-filtered.1000kb.cool
from hicmatrix import HiCMatrix as hm
# load the downloaded cooler file
hmecMatrix = hm.hiCMatrix("Rao2014-HMEC-MboI-allreps-filtered.1000kb.cool")
# try to write the full (not only upper triangular) matrix
hmecMatrix.save("test.cool", pSymmetric=False)
This makes cooler throw the following exception:
WARNING:hicmatrix.lib.cool:Writing non-standard cooler matrix. Datatype of matrix['count'] is: float64
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/hicmatrix/HiCMatrix.py", line 111, in save
self.matrixFileHandler.save(pMatrixName, pSymmetric=pSymmetric, pApplyCorrection=pApplyCorrection)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/hicmatrix/lib/matrixFileHandler.py", line 51, in save
self.matrixFile.save(pName, pSymmetric, pApplyCorrection)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/hicmatrix/lib/cool.py", line 406, in save
temp_dir=local_temp_dir)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_create.py", line 925, in create_cooler
lock=lock)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_create.py", line 577, in create
file_path, target, meta.columns, iterable, h5opts, lock)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_create.py", line 213, in write_pixels
for i, chunk in enumerate(iterable):
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_ingest.py", line 298, in _validate_pixels
raise BadInputError("Found bin1_id greater than bin2_id")
cooler.create._ingest.BadInputError: Found bin1_id greater than bin2_id
Storing as h5 matrix doesn't work either:
hmecMatrix.save("test.h5", pSymmetric = False)
WARNING:hicmatrix.lib.cool:Writing non-standard cooler matrix. Datatype of matrix['count'] is: float64
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/hicmatrix/HiCMatrix.py", line 111, in save
self.matrixFileHandler.save(pMatrixName, pSymmetric=pSymmetric, pApplyCorrection=pApplyCorrection)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/hicmatrix/lib/matrixFileHandler.py", line 51, in save
self.matrixFile.save(pName, pSymmetric, pApplyCorrection)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/hicmatrix/lib/cool.py", line 406, in save
temp_dir=local_temp_dir)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_create.py", line 925, in create_cooler
lock=lock)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_create.py", line 577, in create
file_path, target, meta.columns, iterable, h5opts, lock)
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_create.py", line 213, in write_pixels
for i, chunk in enumerate(iterable):
File "/X/miniconda3/envs/hicSymm/lib/python3.6/site-packages/cooler/create/_ingest.py", line 298, in _validate_pixels
raise BadInputError("Found bin1_id greater than bin2_id")
cooler.create._ingest.BadInputError: Found bin1_id greater than bin2_id
Remarks: The matrix used here is probably symmetric, but saving also fails with other, genuinely asymmetric matrices. The cooler version was 0.8.5. I also tried hicexplorer 2.2, which throws the same exception (again with cooler 0.8.5).
Thanks for reporting it. I will have a look at it asap.
Hello everyone,
after some debugging, I'm quite sure I found the bug. It's not in HiCExplorer itself, but in hicmatrix/lib/cool.py, lines 399 to 406.
The problem is that saving asymmetric matrices via cooler.create_cooler(...) requires at least symmetric_upper=False to be passed, and the current call never passes it.
The exception reported above originates from cooler's triucheck. Since it has not been told otherwise, cooler checks whether the matrix to be stored is upper triangular, and Found bin1_id greater than bin2_id is just a very mathematical way of telling us that it is not.
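To illustrate what that check means, here is a minimal pure-Python sketch. It is not cooler's implementation; the helper name find_lower_triangle_pixels and the toy pixel tuples are made up for this example. Each pixel is a (bin1_id, bin2_id, count) triple, and in an upper triangular pixel table bin1_id never exceeds bin2_id.

```python
# Toy illustration of an upper-triangle check on a pixel table.
# NOTE: hypothetical helper, not part of cooler or hicmatrix.

def find_lower_triangle_pixels(pixels):
    """Return all pixels whose bin1_id is greater than their bin2_id."""
    return [p for p in pixels if p[0] > p[1]]


# (bin1_id, bin2_id, count) triples of a symmetric matrix stored in full
full_pixels = [(0, 0, 5.0), (0, 1, 2.0), (1, 0, 2.0), (1, 1, 7.0)]
# the same matrix with only the upper triangle stored
upper_pixels = [(0, 0, 5.0), (0, 1, 2.0), (1, 1, 7.0)]

print(find_lower_triangle_pixels(full_pixels))   # [(1, 0, 2.0)]
print(find_lower_triangle_pixels(upper_pixels))  # []
```

A matrix written out in full always contains pixels like (1, 0, 2.0), so a check of this kind has to be disabled before such a matrix can be stored.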
My suggestion to resolve this issue is to change the call to cooler.create_cooler(...) in hicmatrix/lib/cool.py, lines 399 to 406, as follows:
cooler.create_cooler(cool_uri=pFileName,
                     bins=bins_data_frame,
                     pixels=matrix_data_frame,
                     mode=self.appendData,
                     dtypes=dtype_pixel,
                     ordered=True,
                     metadata=self.hic_metadata,
                     temp_dir=local_temp_dir,      # up to here, same code as before
                     triucheck=pSymmetric,         # inserted
                     symmetric_upper=pSymmetric)   # inserted
Cooler would actually set triucheck=False by itself when symmetric_upper=False, but doing so triggers a (justified) warning, which is why the patch sets both parameters explicitly.
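The flag interaction can be mimicked with a small stand-alone sketch. This is only a model of the behaviour described above, not cooler's code: the function resolve_triucheck and the warning text are invented for illustration.

```python
import warnings


def resolve_triucheck(symmetric_upper, triucheck):
    """Hypothetical stand-in for the flag handling described above.

    If the matrix is not stored as upper triangular, a triangular
    check makes no sense, so it is switched off with a warning.
    """
    if not symmetric_upper and triucheck:
        # warning text is made up for this sketch
        warnings.warn("triucheck=True conflicts with symmetric_upper=False; disabling it")
        return False
    return triucheck


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # only symmetric_upper is changed: check gets disabled, one warning
    only_symmetric_upper = resolve_triucheck(symmetric_upper=False, triucheck=True)
    # both flags set as in the patch: no warning
    both_flags = resolve_triucheck(symmetric_upper=False, triucheck=False)

print(only_symmetric_upper, both_flags, len(caught))  # False False 1
```

Passing pSymmetric to both parameters keeps the two flags consistent in either case, so the warning path is never taken.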
I tested the suggested patch by adding the upper triangular matrix of an arbitrary cooler matrix to the lower triangular matrix of another one, saving the resulting matrix in cooler format and plotting it. This seems to work for the cooler format, but I haven't tried the h5 format.
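The way such a test matrix can be put together is sketched below on small nested lists instead of real cooler matrices. The function names and the toy values are made up, and taking the diagonal from the first matrix is an assumption of this sketch.

```python
# Build an asymmetric matrix from two symmetric ones (toy data).
# NOTE: hypothetical helpers; the diagonal is taken from upper_source
# by assumption.

def combine_triangles(upper_source, lower_source):
    """Diagonal and upper triangle from one square matrix,
    strict lower triangle from another of the same size."""
    n = len(upper_source)
    return [[upper_source[i][j] if i <= j else lower_source[i][j]
             for j in range(n)]
            for i in range(n)]


def is_symmetric(matrix):
    """True if matrix[i][j] == matrix[j][i] for all i, j."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(n))


matrix_a = [[1, 2, 3],
            [2, 4, 5],
            [3, 5, 6]]
matrix_b = [[9, 8, 7],
            [8, 6, 5],
            [7, 5, 4]]

combined = combine_triangles(matrix_a, matrix_b)
print(combined)                # [[1, 2, 3], [8, 4, 5], [7, 5, 6]]
print(is_symmetric(combined))  # False
```

A matrix built like this differs above and below the diagonal, so a plot of the saved file shows at a glance whether both triangles survived the round trip.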
Kind regards, Ralf