Inconsistency in Distance Calculation and Occurrences in `expected_interactions` and `obs_exp_matrix`
- [x] Search whether this issue (or a similar issue) has been solved before using the search tab above. Link the previous issue if appropriate below.
- [x] Paste your HiCExplorer version (`hicInfo --version`) and your python version (`python --version`) below.
  - hicInfo 3.7.6
  - Python 3.11.10
- [x] Have you checked our documentation on hicexplorer.readthedocs.io?
- [x] Do you use conda to install HiCExplorer?
- [x] Do you use the latest HiCExplorer release? If not, please install it via a conda environment: `conda create --name hicexplorer hicexplorer=3.6 python=3.8 -c bioconda -c conda-forge` and activate the environment: `conda activate hicexplorer`. Retry your command. You can exit a conda environment via `conda deactivate`. To learn more about conda and environments, please consider the following documentation.
  Retry your command, is it solved now? If not please continue with the following:
- [x] Paste the full HiCExplorer command that produces the issue below (ignore if you simply spotted the issue in the code/documentation).
- [x] Paste the output printed on screen from the command that produces the issue below (ignore if you simply spotted the issue in the code/documentation).
Description
There is an inconsistency in how distances and occurrences are handled by the `expected_interactions` and `obs_exp_matrix` functions in utilities.py. Specifically, `obs_exp_matrix` divides the distance by 2, while `expected_interactions` does not account for the symmetry of the Hi-C matrix in its occurrences calculation.
Expected Behavior
The distance calculation and occurrences should be consistent across both functions. The occurrences should account for the symmetry of the Hi-C matrix, except for the main diagonal.
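To make this expectation concrete, here is a minimal sketch in plain NumPy (not HiCExplorer code; the helper name `occurrences_per_distance` is hypothetical) of how many entries of an n x n matrix fall at each bin distance, with and without counting the symmetric lower triangle. A brute-force count over all index pairs is included as a cross-check.

```python
import numpy as np


def occurrences_per_distance(n, symmetric=True):
    """Number of matrix entries at each bin distance 0..n-1 in an n x n matrix.

    Hypothetical helper for illustration only; not part of HiCExplorer.
    """
    # Upper triangle including the main diagonal: n, n-1, ..., 1
    counts = np.arange(n, 0, -1)
    if symmetric:
        # Every off-diagonal entry has a mirror image below the main diagonal,
        # so distances >= 1 are counted twice; the main diagonal only once.
        counts[1:] *= 2
    return counts


# Brute-force cross-check: count |row - col| over every cell of a 4 x 4 matrix
n = 4
rows, cols = np.indices((n, n))
brute_force = np.bincount(np.abs(rows - cols).ravel(), minlength=n)

print(occurrences_per_distance(n, symmetric=False))  # [4 3 2 1]
print(occurrences_per_distance(n, symmetric=True))   # [4 6 4 2]
print(brute_force)                                   # [4 6 4 2]
```

Under this reading, the symmetric counts are simply twice the upper-triangle counts everywhere except at distance 0.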
Actual Behavior
- The `obs_exp_matrix` function divides the distance by 2. Line 578 in utilities.py: `distance = np.ceil(np.absolute(row - col) / 2).astype(np.int32)`
- The `expected_interactions` function does not account for the symmetry of the Hi-C matrix in its occurrences calculation. Line 364 in utilities.py: `occurrences = np.arange(pSubmatrix.shape[0] + 1, 1, -1)`
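For reference, this is what the two quoted expressions evaluate to in isolation on a toy 4-bin example. It is only a sketch: `n` stands in for `pSubmatrix.shape[0]`, and treating `row` and `col` as plain bin indices is my assumption. The surrounding HiCExplorer code is not reproduced here and may compensate for these values elsewhere.

```python
import numpy as np

n = 4  # toy size, standing in for pSubmatrix.shape[0] (assumption)

# Expression quoted from line 364 (expected_interactions)
occurrences = np.arange(n + 1, 1, -1)

# Expression quoted from line 578 (obs_exp_matrix), evaluated for the
# offsets 0..n-1 from the main diagonal (row/col as bin indices is assumed)
row = np.zeros(n, dtype=np.int64)
col = np.arange(n)
distance = np.ceil(np.absolute(row - col) / 2).astype(np.int32)

print(occurrences)  # [5 4 3 2]
print(distance)     # [0 1 1 2]
```

In this toy case the occurrences start at n + 1 rather than n, and offsets 1 and 2 are mapped to the same distance index, which is what prompted the question.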
I'm concerned about this implementation but unsure if my concerns are valid. I look forward to your response. Thank you!