statannotations icon indicating copy to clipboard operation
statannotations copied to clipboard

Feature request: permutation test

Open naureeng opened this issue 1 year ago • 2 comments

Can the built-in scipy.stats permutation_test function be an option for statistical tests? Many Thanks!

naureeng avatar May 24 '24 07:05 naureeng

Since I also spent a long time solving this problem, I'd like to say that this is already possible in the current version, you just need to set it in the anotator class, here's my code. anno = Annotator(axb1, pairs=pairs, data=paint_df, x='Gene_length', y='clade', orient='h', order = list(template_color.keys()), hue='Group',hue_order=['receive','donor','native'],permutations=1000)

Ne0tea avatar Nov 27 '24 02:11 Ne0tea

Hello @naureeng , Thanks for the question! Applogies for the delay in answering. hopefully it can still help you or the next readers.

It is indeed possible with statannotations already, using the extension API.

There is some help in our documentation and in this gist, but here is an example inspired by scipy's own documentation of the function:

import matplotlib.pyplot as plt

from scipy.stats import pearsonr, permutation_test
import seaborn as sns

from statannotations.Annotator import Annotator
from statannotations.stats.StatTest import StatTest

# See scipy example from https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.permutation_test.html
def statistic(x, y, axis=-1):
    return pearsonr(x, y, axis=axis).statistic

def pearson_permutation_test(group_data1, group_data2, *args, **kwargs):
    res = permutation_test((group_data1, group_data2), statistic, vectorized=True,
                            permutation_type='pairings',
                            alternative='greater')
    return res.statistic, res.pvalue


stat_test = StatTest(func=pearson_permutation_test,
                     test="pearson",
                     test_long_name="Pearson correlation permutation test",
                     test_short_name="Pearson",
                     stat_name="r")

data = [[1, 2, 4, 3], [2, 4, 6, 8]]
pairs = [(0, 1)]

ax = sns.boxplot(data=data)
annotator = Annotator(ax=ax, data=data, pairs=pairs)
annotator.configure(comparisons_correction=None, text_format="simple", test=stat_test)
annotator.apply_and_annotate()

result = annotator.annotations[0].data
print(f'r={result.stat_value}, p={result.pvalue}')

plt.show()

This will print

0 vs. 1: Pearson correlation permutation test, P_val:1.667e-01 r=8.000e-01
r=0.7999999999999999, p=0.16666666666666666

and show permutations

Note that to capture and examine the null_distributions, you'd have to do something like


# add import
import numpy as np

# add `distributions` parameter
def pearson_permutation_test(group_data1, group_data2, *args,  distributions=None, **kwargs):
    res = permutation_test((group_data1, group_data2), statistic, vectorized=True,
                            permutation_type='pairings',
                            alternative='greater')
    distributions.append((group_data1, group_data2, res.null_distribution))
    return res.statistic, res.pvalue
...
# change apply_and_annotate() with:
null_distributions = []
annotator.apply_test(distributions=null_distributions).annotate()

# leave the rest of the above, then add
...
r=result.stat_value
null = null_distributions[0][2]
unique = np.unique(null)
unique[np.isclose(r, unique)].tolist()

returning the expected

 [0.7999999999999998, 0.7999999999999999, 0.8]

trevismd avatar Nov 30 '24 08:11 trevismd