mahal in MATLAB vs. mahal in Python
I'm not sure this is the right place for this question, but here goes. I'm trying to code the maskedClusterQuality function in Python, and I'm having trouble replicating the output of MATLAB's mahal function. Here is a sample of what I did.
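PYTHON
```
import numpy
import scipy.io
import scipy.spatial.distance

x = numpy.random.normal(0, 1, [100, 12])
y = numpy.random.normal(17, 1, [1000, 12])

# Inverse covariance of x (rows are observations)
cov_x_inv = numpy.linalg.inv(numpy.cov(x, rowvar=False))
cdist = scipy.spatial.distance.cdist
# Pairwise distances between every row of y and every row of x, averaged over x
md = cdist(y, x, 'mahalanobis', VI=cov_x_inv)
md_py = numpy.mean(md, axis=1)

scipy.io.savemat(r'D:\sample_features.mat', dict(x=x, y=y, md=md, md_py=md_py))
```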
MATLAB
```
load('D:\sample_features.mat')
md_mat = mahal(y, x)
plot(md_py, sqrt(md_mat), 'ko')
```
Here's what I get...

What's going on here?
If it's still useful: mahal(y, x) in MATLAB computes the squared Mahalanobis distance between each row of y and the mean of x, using the covariance matrix of x:
d(k) = d(y(k,:), mu) = (y(k,:)-mu)*inv(sigma)*(y(k,:)-mu)',
where mu and sigma are the mean and the covariance matrix of x. Your Python code instead computes the distance from each row of y to every row of x and then averages those distances, which is a different quantity. To match MATLAB, compute the distance to the mean of x:
md_py = cdist(y, numpy.reshape(numpy.mean(x, axis=0), (1, -1)), 'mahalanobis', VI=cov_x_inv)
This gives the desired plot, indicating that Python computes values similar to MATLAB's.
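For anyone who wants to check this without MATLAB, here is a minimal sketch that compares the distance to the mean of x against the formula written out by hand, and against the averaged pairwise distances from the question. The variable names (md_to_mean, md_squared, md_pairwise_mean) and the random seed are just illustrative choices, not anything from the original code.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)       # arbitrary seed, only for repeatability
x = rng.normal(0, 1, (100, 12))      # reference sample
y = rng.normal(17, 1, (1000, 12))    # points to measure

mu = x.mean(axis=0)
cov_x_inv = np.linalg.inv(np.cov(x, rowvar=False))

# Distance from each row of y to the mean of x (not squared).
md_to_mean = cdist(y, mu[None, :], 'mahalanobis', VI=cov_x_inv).ravel()

# The same quantity from the formula (y - mu) * inv(sigma) * (y - mu)',
# which is the squared distance.
diff = y - mu
md_squared = np.einsum('ij,jk,ik->i', diff, cov_x_inv, diff)

# What the question computed: mean distance to every individual row of x.
md_pairwise_mean = cdist(y, x, 'mahalanobis', VI=cov_x_inv).mean(axis=1)

print(np.allclose(md_to_mean ** 2, md_squared))
print(np.max(md_pairwise_mean - md_to_mean))
```

The first print checks that squaring the cdist result reproduces the formula, which is why the MATLAB side needs sqrt(md_mat) for the comparison. The second shows how far the averaged pairwise distances sit above the distance to the mean; the average can never fall below it, so the two are not interchangeable.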
