3.7.2 hicPCA produces extreme eigenvalues
Hi,
I'm using the latest version of HiCExplorer, and I used version 3.6.* last year. The eigenvalues produced are completely different. From reading the code and the commit info, 3.7.2 follows the Lieberman-Aiden approach more closely and runs PCA on an obs/exp matrix, whereas 3.6 ran PCA on a Pearson correlation matrix.
Python version: 3.8.13
I used the same commands, shown below: one to produce the Pearson matrix and the eigenvalue bedgraph, and one to produce the eigenvalue bigwig.
hicPCA -m ${inp} --outputFileName ${out}.pca1.bedgraph -we 1 --format bedgraph \
--pearsonMatrix ${out}_pearson_all.h5 \
--extraTrack ../histonemark/ENCODE_H1_H3K27ac.bigwig
hicPCA -m ${inp} --outputFileName ${out}.pca1.bw -we 1 --format bigwig \
--extraTrack ../histonemark/ENCODE_H1_H3K27ac.bigwig
Here is the result produced by 3.6. I used np.histogram for a quick look:
np.histogram(bed[3])
(array([ 43, 486, 3351, 3882, 4228, 5106, 2034, 413, 109, 31]), array([-0.10961274, -0.08537601, -0.06113927, -0.03690253, -0.0126658 , 0.01157094, 0.03580768, 0.06004441, 0.08428115, 0.10851789, 0.13275462]))
So the range of PC1 is about -0.1 to 0.13.

And this is the result from 3.7.2:
np.histogram(bed2[3])
(array([ 1, 2, 5, 882, 20226, 117, 21, 7, 2, 431]), array([-0.73764337, -0.56387903, -0.3901147 , -0.21635036, -0.04258602, 0.13117831, 0.30494265, 0.47870699, 0.65247133, 0.82623566, 1. ]))
And now the range of PC1 is about -0.7 to 1, and most of the values are very close to 0.
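To put numbers on "most of the values are very close to 0", here is a minimal sketch that only reuses the two histograms posted above. `bin_fractions` is a helper name made up for illustration; it is not part of HiCExplorer or NumPy.

```python
import numpy as np

def bin_fractions(counts, edges):
    """Return (lower_edge, upper_edge, fraction_of_bins) for each histogram bin."""
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)
    frac = counts / counts.sum()
    return [(edges[i], edges[i + 1], frac[i]) for i in range(len(counts))]

# Histogram of PC1 from 3.6, copied from the np.histogram output above
hist_36 = (
    [43, 486, 3351, 3882, 4228, 5106, 2034, 413, 109, 31],
    [-0.10961274, -0.08537601, -0.06113927, -0.03690253, -0.0126658,
     0.01157094, 0.03580768, 0.06004441, 0.08428115, 0.10851789, 0.13275462],
)

# Histogram of PC1 from 3.7.2, copied from the np.histogram output above
hist_372 = (
    [1, 2, 5, 882, 20226, 117, 21, 7, 2, 431],
    [-0.73764337, -0.56387903, -0.3901147, -0.21635036, -0.04258602,
     0.13117831, 0.30494265, 0.47870699, 0.65247133, 0.82623566, 1.0],
)

for label, (counts, edges) in (("3.6", hist_36), ("3.7.2", hist_372)):
    print(label)
    for lo, hi, f in bin_fractions(counts, edges):
        print(f"  [{lo:+.3f}, {hi:+.3f}]: {f:7.2%}")
```

With the raw bedgraph values at hand, np.percentile(bed2[3], [1, 50, 99]) would give a similar picture without depending on the bin edges.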

Personally, I don't think the results from 3.7.2 look right.
In this paper they say PCA was done on the contact matrix, and the distribution of PC1 there is similar to the results from hicPCA 3.6.
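For anyone who wants to see how the two approaches mentioned above can differ (PCA on an obs/exp matrix versus PCA on a Pearson matrix), here is a toy sketch on a simulated contact matrix. It is not the HiCExplorer implementation of either version. The simulated matrix, the helper names `obs_exp` and `first_eigvec`, and the use of the covariance matrix before the eigendecomposition are all assumptions made for illustration.

```python
import numpy as np

def obs_exp(mat):
    """Divide every diagonal by its mean (a simple distance normalisation)."""
    n = mat.shape[0]
    out = np.zeros_like(mat, dtype=float)
    for d in range(n):
        diag = np.diagonal(mat, d)
        mean = diag.mean()
        if mean > 0:
            idx = np.arange(n - d)
            out[idx, idx + d] = diag / mean
            out[idx + d, idx] = diag / mean
    return out

def first_eigvec(sym):
    """Eigenvector of a symmetric matrix belonging to its largest eigenvalue."""
    vals, vecs = np.linalg.eigh(sym)
    return vecs[:, np.argmax(vals)]

# Toy data (assumption): alternating A/B blocks of 20 bins plus distance decay
rng = np.random.default_rng(0)
n = 200
comp = np.where((np.arange(n) // 20) % 2 == 0, 1.0, -1.0)
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
rate = (1 + 0.5 * np.outer(comp, comp)) * 100.0 / (1 + dist)
counts = rng.poisson(rate).astype(float)
counts = (counts + counts.T) / 2  # make the matrix symmetric

oe = obs_exp(counts)

# Variant A: PCA directly on the obs/exp matrix
pc1_oe = first_eigvec(np.cov(oe))

# Variant B: PCA on the Pearson correlation matrix of the obs/exp matrix
pc1_pearson = first_eigvec(np.cov(np.corrcoef(oe)))

print("obs/exp PC1 range:", pc1_oe.min(), pc1_oe.max())
print("Pearson PC1 range:", pc1_pearson.min(), pc1_pearson.max())
```

Both vectors come out with unit length here, so any rescaling done by a real tool after this step would change the absolute range of the values.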

Thank you.
Hi, same issue. The Pearson correlation matrix looks nice, but the bigwig/bedgraph values are extreme and don't match. I checked -we 1 and 2. Thanks
I also ran into the same issue and wonder if anyone has a better explanation. My guess is that with shorter bin lengths, some abnormally high observations become more likely, and these lead to extreme eigenvalues.
@xscapintime @ralfgilsbach I wonder how you finally dealt with this problem?
We moved back to homertools for eigenvector calculations. This should be fixed in HiCExplorer so that it works in a comparable manner.
@zhongzheng1999 Hi, I switched to cooltools for all the analysis.
@xscapintime @ralfgilsbach Thank you for your reply! I think it's more reliable to use some good old tools to do the work.