Inconsistency in Distance Calculation and Occurrences in `expected_interactions` and `obs_exp_matrix`

Open mhjiang97 opened this issue 1 year ago • 0 comments

  • [x] Search whether this issue (or a similar issue) has been solved before using the search tab above. Link the previous issue if appropriate below.

  • [x] Paste your HiCExplorer version (hicInfo --version) and your python version (python --version) below.

    • hicInfo 3.7.6
    • Python 3.11.10
  • [x] Have you checked our documentation on hicexplorer.readthedocs.io?

  • [x] Do you use conda to install HiCExplorer?

  • [x] Do you use the latest HiCExplorer release? If not, please install it via a conda environment: conda create --name hicexplorer hicexplorer=3.6 python=3.8 -c bioconda -c conda-forge and activate the environment: conda activate hicexplorer. Retry your command. You can exit a conda environment via conda deactivate. To learn more about conda and environments, please consider the following documentation.

Retry your command, is it solved now? If not please continue with the following:

  • [x] Paste the full HiCExplorer command that produces the issue below (ignore if you simply spotted the issue in the code/documentation).

  • [x] Paste the output printed on screen from the command that produces the issue below (ignore if you simply spotted the issue in the code/documentation).


Description

There is an inconsistency in how distances and occurrences are handled by the expected_interactions and obs_exp_matrix functions in utilities.py. Specifically, obs_exp_matrix divides the distance by 2, while expected_interactions does not account for the symmetry of the Hi-C matrix when it calculates occurrences.

Expected Behavior

The distance calculation and occurrences should be consistent across both functions. The occurrences should account for the symmetry of the Hi-C matrix, except for the main diagonal.
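
To make the expectation concrete, here is a minimal sketch, independent of HiCExplorer's code, that counts how many matrix entries fall at each bin distance |row - col|. The helper name diagonal_counts and the 4 x 4 size are made up for illustration. It contrasts counting over the full symmetric matrix with counting over the upper triangle only.

```python
import numpy as np


def diagonal_counts(n):
    """Count entries at each bin distance d = |row - col| in an n x n matrix.

    Hypothetical helper for illustration; not part of HiCExplorer.
    """
    row, col = np.indices((n, n))
    distance = np.abs(row - col)
    # Full symmetric matrix: every off-diagonal distance is seen twice.
    full = np.bincount(distance.ravel(), minlength=n)
    # Upper triangle (including the main diagonal): each distance seen once.
    upper = np.bincount(distance[row <= col], minlength=n)
    return full, upper


full, upper = diagonal_counts(4)
print(full)   # [4 6 4 2]
print(upper)  # [4 3 2 1]
```

Only the main diagonal (distance 0) has the same count in both conventions; every other distance differs by a factor of 2.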

Actual Behavior

  • The obs_exp_matrix function divides the distance by 2. Line 578 in utilities.py:
    distance = np.ceil(np.absolute(row - col) / 2).astype(np.int32)
    
  • The expected_interactions function does not account for the symmetry of the Hi-C matrix in its occurrences calculation. Line 364 in utilities.py:
    occurrences = np.arange(pSubmatrix.shape[0] + 1, 1, -1)
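
The two quoted lines can be evaluated in isolation to see what they produce. This is a minimal sketch with made-up inputs: n stands in for pSubmatrix.shape[0], and the row and col arrays are hypothetical bin indices. How the surrounding code in utilities.py uses these values is not shown here.

```python
import numpy as np

# Hypothetical stand-in for pSubmatrix.shape[0].
n = 4

# Expression quoted from expected_interactions.
occurrences = np.arange(n + 1, 1, -1)
print(occurrences)  # [5 4 3 2]

# Hypothetical bin indices with |row - col| = 0, 1, 2, 3, 4.
row = np.array([0, 0, 0, 0, 0])
col = np.array([0, 1, 2, 3, 4])

# Expression quoted from obs_exp_matrix.
distance = np.ceil(np.absolute(row - col) / 2).astype(np.int32)
print(distance)  # [0 1 1 2 2]
```

With these inputs, bin offsets 1 and 2 map to the same distance index, as do offsets 3 and 4, while the occurrences array runs from n + 1 down to 2.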
    

I'm concerned about this implementation but unsure if my concerns are valid. I look forward to your response. Thank you!

mhjiang97, Dec 08 '24 14:12