ACE sometimes gives negative Maximal Correlation values
Maximal Correlation (MC) values should always lie between 0 and 1. However, when I calculate the MC values of x1 and x2 with y for the following data, I get a negative MC between x2 and y:
x1 = [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
x2 = [ 2., 5., 9., 7., 4., 8., 1., 6., 3., 10.]
y = [ 3., 9., 11., 8., 4., 15., 14., 20., 30., 32.]
Running the same problem using the R library acepack yields an MC value within the proper range.
Python calculation:
from ace import ace
from scipy import stats
import numpy as np

def ACE(x, y):
    '''
    Output MCs: Maximal Correlations (MCs) for each variable x
    Input x: list of 1D numpy arrays, one for each input variable
    Input y: 1D numpy array of responses
    '''
    ace_solver = ace.ACESolver()
    ace_solver.specify_data_set(x, y)
    ace_solver.solve()
    MCs = []  # maximal correlations, one per input variable
    for i in range(len(x)):
        # Pearson correlation between the i-th x transform and the y transform
        (MC, Pval) = stats.pearsonr(ace_solver.x_transforms[i], ace_solver.y_transform)
        MCs.append(MC)
    return MCs

x = [np.array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]), np.array([ 2., 5., 9., 7., 4., 8., 1., 6., 3., 10.])]
y = np.array([ 3., 9., 11., 8., 4., 15., 14., 20., 30., 32.])
MCs = ACE(x, y)
print('MCs = ', MCs)
yields
MCs = [0.9523, -0.0577]
That is, the Maximal Correlation value between x2 and y comes out as -0.058.
R acepack calculation:
library(acepack)
x1 = 1:10
x2 = c(2., 5., 9., 7., 4., 8., 1., 6., 3., 10.)
x <- cbind(x1, x2)
y = c( 3., 9., 11., 8., 4., 15., 14., 20., 30., 32.)
ace_model = ace(x, y)
MC = cor(ace_model$tx, ace_model$ty)
yields MC values of
x1 0.9427068
x2 0.3442552
This gives a positive Maximal Correlation value between x2 and y of 0.344.
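A rough plausibility check that depends on neither library, offered only as a sketch: for a single pair of variables, the maximal correlation cannot be smaller than the absolute Pearson correlation of the raw data, because leaving both variables untransformed is one admissible choice of transforms. The helper `pearson_r` below is hypothetical (it is not part of ace or acepack) and uses only the standard library. One assumption to keep in mind: the bound holds strictly for the pairwise MC, whereas the numbers above come from a fit that includes both x1 and x2, so this is a reference point rather than a proof.

```python
import math

def pearson_r(a, b):
    """Plain Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sums of centered cross-products and squares; the 1/n factors cancel.
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Same data as in the report above.
x2 = [2., 5., 9., 7., 4., 8., 1., 6., 3., 10.]
y = [3., 9., 11., 8., 4., 15., 14., 20., 30., 32.]

# |r| of the untransformed data is a lower bound on the pairwise MC.
lower_bound = abs(pearson_r(x2, y))
print('|Pearson r| between x2 and y =', round(lower_bound, 3))
```

For this data the raw correlation is roughly 0.30, so the 0.344 reported by acepack is at least consistent with the bound, while -0.058 is not.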
Thanks, great report. That is indeed a defect. I'll look into it.