
Be able to use minkowski metric with p < 1

Open ivan-marroquin opened this issue 4 years ago • 2 comments

Describe the bug

It seems that PHATE supports the minkowski metric for both the mds and knn computations, so I would like to run experiments using this metric with p= 0.3. However, phate.PHATE does not recognize 'p= 0.3' when it is passed as a keyword argument.

Thanks for your help,

Ivan

To Reproduce

embedding= phate.PHATE(n_components= intrinsic_dim, knn= 5, decay= None,
                       n_landmark= 2000, t= 'auto', gamma= 1.0,
                       n_pca= input_data.shape[1], mds_solver= 'smacof',
                       knn_dist= 'minkowski', mds_dist= 'minkowski',
                       mds= 'classic', random_state= 1969,
                       n_jobs= cpu_count, verbose= False, p= 0.3)

Expected behavior

phate.PHATE should accept 'p= 0.3' as one of the parameters used to initialize the phate object.

Actual behavior

Traceback (most recent call last):
  File "test_phenograph_clustering.py", line 94, in <module>
    projected_data= embedding.fit_transform(X= input_data)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\phate\phate.py", line 961, in fit_transform
    self.fit(X)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\phate\phate.py", line 853, in fit
    **(self.kwargs)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\graphtools\api.py", line 288, in Graph
    return Graph(**params)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\graphtools\graphs.py", line 132, in __init__
    super().__init__(data, n_pca=n_pca, **kwargs)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\graphtools\graphs.py", line 524, in __init__
    super().__init__(data, **kwargs)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\graphtools\base.py", line 1019, in __init__
    super().__init__(data, **kwargs)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\graphtools\base.py", line 135, in __init__
    super().__init__(**kwargs)
  File "C:\Temp\Python\Python3.6.5\lib\site-packages\graphtools\base.py", line 505, in __init__
    super().__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'p'
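The traceback suggests that the extra keyword is stored at construction time and only rejected later, once it has been forwarded through a chain of `super().__init__(**kwargs)` calls. The toy model below reproduces that pattern. The class names `BaseGraph`, `KNNGraph`, and `Embedder` are hypothetical and are not the actual PHATE or graphtools code; this is only a sketch of the mechanism.

```python
class BaseGraph:
    """Hypothetical base class with a fixed set of keyword arguments."""

    def __init__(self, knn=5, distance="euclidean"):
        self.knn = knn
        self.distance = distance


class KNNGraph(BaseGraph):
    """Hypothetical subclass that forwards unknown keywords upward."""

    def __init__(self, data, **kwargs):
        self.data = data
        # Anything the base class does not know about fails here.
        super().__init__(**kwargs)


class Embedder:
    """Hypothetical estimator that stores extra keywords without checking them."""

    def __init__(self, knn=5, knn_dist="euclidean", **kwargs):
        self.knn = knn
        self.knn_dist = knn_dist
        self.kwargs = kwargs  # stored as-is, validated only when the graph is built

    def fit(self, X):
        self.graph = KNNGraph(
            X, knn=self.knn, distance=self.knn_dist, **self.kwargs
        )
        return self


op = Embedder(knn=5, p=0.3)  # construction succeeds
try:
    op.fit([[0.0, 1.0], [1.0, 0.0]])
except TypeError as exc:
    print(exc)  # the unknown keyword 'p' is rejected only at fit time
```

In this sketch the error surfaces in `fit` rather than in the constructor, which matches the shape of the traceback above.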

System information:

Output of phate.__version__:

Please run phate.__version__ and paste the results here.

You can do this with `python -c 'import phate; print(phate.__version__)'`
phate-1.0.7

Output of pd.show_versions():

Please run pd.show_versions() and paste the results here.

You can do this with `python -c 'import pandas as pd; pd.show_versions()'`
INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.5.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
machine          : AMD64
processor        : Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : None.None

pandas           : 0.25.0
numpy            : 1.19.5
pytz             : 2018.5
dateutil         : 2.7.3
pip              : 9.0.3
setuptools       : 41.0.1
Cython           : 0.29.14
pytest           : 6.0.1
hypothesis       : None
sphinx           : 2.3.1
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : 0.9999999
pymysql          : None
psycopg2         : None
jinja2           : 2.11.0
IPython          : 7.11.1
pandas_datareader: None
bs4              : None
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : None
matplotlib       : 3.2.2
numexpr          : 2.7.3
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
s3fs             : None
scipy            : 1.5.4
sqlalchemy       : None
tables           : 3.6.1
xarray           : None
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : None

Additional context Python 3.6.5 with Deprecated-1.2.12 graphtools-1.5.2 phate-1.0.7 pygsp-0.5.1 s-gd2-1.8 scprep-1.1.0 tasklogger-1.1.0

ivan-marroquin avatar Jun 10 '21 14:06 ivan-marroquin

Hi @ivan-marroquin ,

PHATE doesn't currently offer this functionality, though the new maintainers may choose to add it. In the meantime, you should be able to achieve your desired outcome with a custom metric function as follows:

import phate
from functools import partial
from scipy.spatial.distance import minkowski

dist_fn = partial(minkowski, p=0.3)
phate_op = phate.PHATE(knn_dist=dist_fn, mds_dist=dist_fn)

I'm leaving this issue open as a feature request but please let me know if the proposed alternative doesn't work and we can open up a bug report separately.
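One caveat worth checking: some SciPy releases validate `p` in `scipy.spatial.distance.minkowski` and may reject values below 1, so the behaviour can depend on the installed version. A minimal sketch of a dependency-free alternative is below. The name `fractional_minkowski` is hypothetical, and the function simply evaluates (sum_i |u_i - v_i|^p)^(1/p) for any p > 0.

```python
from functools import partial


def fractional_minkowski(u, v, p=0.3):
    """Minkowski-style dissimilarity that also allows 0 < p < 1.

    Works on any pair of equal-length 1-D sequences (lists or NumPy arrays).
    """
    if p <= 0:
        raise ValueError("p must be positive")
    total = 0.0
    for a, b in zip(u, v):
        total += abs(a - b) ** p
    return total ** (1.0 / p)


# Bind p once so the callable has the two-argument (u, v) signature
# that a custom metric is expected to have.
dist_fn = partial(fractional_minkowski, p=0.3)

print(dist_fn([0.0, 0.0], [1.0, 0.0]))
```

The resulting `dist_fn` can be passed as `knn_dist` and `mds_dist` in the same way as in the snippet above. Note that a pure-Python callable is likely to be much slower than a built-in metric name.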

scottgigante avatar Jun 10 '21 15:06 scottgigante

Hi @scottgigante ,

Many thanks for the tip! Following your advice, I decided to use numba to define a minkowski metric. Here is the code:

import numba

@numba.njit(fastmath= True)
def fractional_dist(p_vec, q_vec, fraction= 0.1):
    # Minkowski-style distance with a fractional exponent (0 < fraction < 1)
    result= 0.0

    for isamp in range(0, p_vec.shape[0]):
        # accumulate |p - q| ** fraction for each coordinate
        result += abs(p_vec[isamp] - q_vec[isamp]) ** fraction

    # take the (1 / fraction)-th power of the accumulated sum
    dist= result ** (1 / fraction)

    return dist

Then, I will compare the result using your proposed approach.
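A minimal sketch of how such a comparison could be set up is below. It uses plain Python so that it runs without numba or SciPy. `fractional_dist_loop` mirrors the branch-based loop above, `fractional_dist_ref` is a compact reference, and both names, along with `max_relative_gap`, are hypothetical helpers introduced only for illustration.

```python
import random


def fractional_dist_loop(p_vec, q_vec, fraction=0.1):
    """Branch-based loop, a plain-Python port of the numba version."""
    result = 0.0
    for a, b in zip(p_vec, q_vec):
        if a > b:
            result += (a - b) ** fraction
        else:
            result += (b - a) ** fraction
    return result ** (1 / fraction)


def fractional_dist_ref(p_vec, q_vec, fraction=0.1):
    """Compact reference: (sum_i |p_i - q_i| ** fraction) ** (1 / fraction)."""
    total = sum(abs(a - b) ** fraction for a, b in zip(p_vec, q_vec))
    return total ** (1 / fraction)


def max_relative_gap(n_pairs=100, dim=5, fraction=0.3, seed=0):
    """Largest relative difference between the two versions on random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_pairs):
        u = [rng.random() for _ in range(dim)]
        v = [rng.random() for _ in range(dim)]
        d_loop = fractional_dist_loop(u, v, fraction)
        d_ref = fractional_dist_ref(u, v, fraction)
        worst = max(worst, abs(d_loop - d_ref) / max(abs(d_ref), 1e-300))
    return worst


print(max_relative_gap())
```

A gap near zero indicates the two implementations agree. The same harness could be pointed at the numba function and the `partial`-based metric, with a small tolerance to allow for `fastmath` rounding differences.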

Ivan

ivan-marroquin avatar Jun 10 '21 15:06 ivan-marroquin