Question regarding the statscondCluster test.

Open emilia-30 opened this issue 1 year ago • 1 comments

Hello HyPyP community,

First of all, thank you so much for your work on hyperscanning analysis!

I have a question regarding the statscondCluster test.

I am trying to use it to compare the two conditions of my experiment. In a nutshell, participants are engaged in an interaction under either condition 1 or condition 2, and I want to see whether brain-to-brain connectivity differs between the two. Here is the code I use to loop over my dyads and build the connectivity matrix for each dyad, including the parts where I slice the matrix to obtain its inter-brain and intra-brain blocks.

import numpy as np
from hypyp import analyses

# data_all, sampling_rate, freq_bands and n_ch are defined earlier.
# The two members of dyad i sit at indices 2*i and 2*i + 1 of data_all.
for i in range(len(data_all) // 2):
    x = i * 2
    y = x + 1
    T2_animal_sp = data_all[x]['animal/clean_1_sp'].get_data()
    T2_animal_sp = np.pad(T2_animal_sp, ((0, 0), (0, 0), (634, 634)), 'constant')
    T2_animal_lis = data_all[y]['animal/clean_1_lis'].get_data()
    T2_animal_lis = np.pad(T2_animal_lis, ((0, 0), (0, 0), (634, 634)), 'constant')
    T2_animal = np.array([T2_animal_sp, T2_animal_lis])
    complex_signal_animal_T2 = analyses.compute_freq_bands(T2_animal, sampling_rate, freq_bands)
    result_animal_T2 = analyses.compute_sync(complex_signal_animal_T2, mode='ccorr')  # , epochs_average=False)
    # inter-brain block: rows are channels of participant 1, columns are channels of participant 2
    delta, theta, alpha, beta = result_animal_T2[:, 0:n_ch, n_ch: 2 * n_ch]
    values = alpha
    C_T2_animal = (values - np.mean(values[:])) / np.std(values[:])
    filename = 'Ccorr_T2_animal_' + str(i) + '.npy'
    # np.save creates the file itself, so no separate open() is needed
    np.save(filename, values)
    # collect the intra-brain results of both participants of this dyad
    result_intra = []
    for k in [0, 1]:
        # intra-brain block of participant k
        delta, theta, alpha, beta = result_animal_T2[:, k * n_ch: (k + 1) * n_ch, k * n_ch: (k + 1) * n_ch]
        # choosing alpha for further analyses, for example
        values_intra = alpha
        # remove the diagonal (each channel with itself)
        values_intra -= np.diag(np.diag(values_intra))
        # computing Cohen's D for further analyses, for example
        C_intra_T2_animal = (values_intra - np.mean(values_intra[:])) / np.std(values_intra[:])
        # can also sample CSD values directly for statistical analyses
        result_intra.append(C_intra_T2_animal)
        filename1 = 'Ccorr_T2_animal_intra' + str(i) + str(k) + '.npy'
        np.save(filename1, C_intra_T2_animal)

Now, when I try to move on to the stats tests, I don't exactly understand which values I need to pass to the function. (I know a similar question was raised here before, but unfortunately it didn't clarify this for me.)

If I want to compare condition 1 and condition 2, do I do something like this (or not?):


data = [np.array([values_dyad1_cond_1, values_dyad2_cond_1, etc]), np.array([values_dyad1_cond_2, values_dyad2_cond_2, etc])]

F_obs, clusters, cluster_pv, H0, F_obs_plot = stats.statscondCluster(data=data,
                                          freqs_mean=np.arange(8, 13),
                                          ch_con_freq=None,
                                          tail=0,
                                          n_permutations=10000,
                                          alpha=0.05)

However, in the tutorial, conditions seem to be compared against the intra-brain part of the matrix, which I don't understand for the following reasons:

(a) Per condition, there are number_dyads x values for the inter-brain part but 2 x number_dyads x values for the intra-brain part (one set per participant, right?). Will the test make sense if the number of values differs, i.e. twice as many in the intra-brain part?

(b) In the tutorial it seems that for the statscondCluster test (where two 'fake' groups are created) the inter-brain part uses the 'raw' values from the matrix, whereas the intra-brain values are transformed to Cohen's D. Or am I missing something?

(c) Lastly, I don't get it conceptually: why, to compare 2 conditions, are these conditions collapsed together and compared to the 'random signal' via the intra-brain values?

If none of these questions make sense, then I am completely lost, so any clarification is much appreciated.

Best, Emilia

Apr 24 '24 13:04 emilia-30

Dear Emilia,

Thank you so much for pointing out this confusion in the HyPyP tutorial! My name is Rémy, and along with my colleague Patrice, we’re two engineers recently hired to help maintain and develop the HyPyP package. Please accept our apologies for the late reply. Your question is absolutely relevant, and it highlights an area where our documentation could (and will) be much clearer.

If I understand correctly, you're trying to compare brain-to-brain connectivity between two experimental conditions, and you're unsure about how to structure the data for the statscondCluster test.

You’re right that the example in the tutorial doesn’t help much here. It doesn’t actually compare two experimental conditions; it’s more of an illustration of the general way to do a contrast, using intra- vs inter-brain connectivity.

In your case, where the goal is to compare Condition 1 with Condition 2, the structure you proposed is exactly the right approach. You’d define your data like this:

data = [
    np.array([values_dyad1_cond_1, values_dyad2_cond_1, ...]),
    np.array([values_dyad1_cond_2, values_dyad2_cond_2, ...])
]

Then you can call the function as:

F_obs, clusters, cluster_pv, H0, F_obs_plot = stats.statscondCluster(
    data=data,
    freqs_mean=np.arange(8, 13),  # alpha band for instance
    ch_con_freq=ch_con_freq,
    tail=0,
    n_permutations=10000,
    alpha=0.05
)

Behind the scenes, this function is a wrapper around the MNE function mne.stats.permutation_cluster_test, and the shape of the data you pass in should follow the same logic: each element in the list is one group (one condition), and inside each array you have the observations (dyads), with each observation shaped as your connectivity matrix or feature array.
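To make the group/observation layout concrete, here is a minimal NumPy-only sketch of the permutation idea behind this kind of test. It is not HyPyP's or MNE's implementation: it skips the clustering step, uses a plain difference of means instead of an F statistic, and the array sizes and the helper name `permutation_pvalue` are made up for illustration.

```python
import numpy as np

def permutation_pvalue(cond_1, cond_2, n_permutations=1000, seed=0):
    """Two-sided permutation p-value for a difference in mean connectivity.

    cond_1, cond_2: arrays of shape (n_observations, ...), one observation per dyad.
    Hypothetical helper for illustration only; no cluster correction is applied.
    """
    rng = np.random.default_rng(seed)
    # One summary value per dyad: the mean over all remaining dimensions.
    a = cond_1.reshape(len(cond_1), -1).mean(axis=1)
    b = cond_2.reshape(len(cond_2), -1).mean(axis=1)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        # Shuffle the condition labels by shuffling the pooled observations.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        count += diff >= observed
    # Add one to numerator and denominator so the p-value is never exactly 0.
    return (count + 1) / (n_permutations + 1)

# Synthetic stand-ins: 12 dyads, one 8 x 8 inter-brain matrix per dyad and condition.
rng = np.random.default_rng(42)
n_dyads, n_ch = 12, 8
cond_1 = rng.normal(0.0, 1.0, size=(n_dyads, n_ch, n_ch))
cond_2 = rng.normal(2.0, 1.0, size=(n_dyads, n_ch, n_ch))

data = [cond_1, cond_2]  # one array per condition, dyads along the first axis
p_different = permutation_pvalue(*data)
p_same = permutation_pvalue(cond_1, cond_1.copy())
print(p_different, p_same)
```

The point is only the layout: the first axis of each array indexes dyads, and it is those dyad-level observations that get reassigned between the two groups on each permutation.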

To quote the MNE documentation directly:

“Each array in X should contain the observations for one group. The first dimension of each array is the number of observations from that group; remaining dimensions comprise the size of a single observation.”

So if each dyad produces one matrix or vector of values (e.g., average alpha-band inter-brain connectivity), you can stack those directly, one array per condition.
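As a minimal sketch of that stacking step, assuming each dyad yields one n_ch x n_ch inter-brain matrix per condition: the matrices below are random stand-ins (in practice they might come from `np.load` on files saved per dyad), and the sizes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dyads, n_ch = 5, 4  # made-up sizes

# Stand-ins for the per-dyad inter-brain matrices of each condition.
cond_1_matrices = [rng.random((n_ch, n_ch)) for _ in range(n_dyads)]
cond_2_matrices = [rng.random((n_ch, n_ch)) for _ in range(n_dyads)]

# np.stack adds a new first axis, giving shape (n_dyads, n_ch, n_ch).
data = [np.stack(cond_1_matrices), np.stack(cond_2_matrices)]

for group in data:
    print(group.shape)
```

Each element of `data` then has the dyads along its first dimension, matching the layout described in the MNE quote above.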

Your questions about the intra-brain matrices and Cohen's d also make sense. In the tutorial, the intra-brain matrices were used not for comparison across experimental conditions, but more to illustrate how you could compute contrasts in general (for example, using Cohen’s d between participants or conditions). That section can be misleading if you're looking for a clean example of a condition-vs-condition test — so again, thank you for flagging this.
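For reference, the expression applied to the connectivity values in the snippet from the question, `(values - mean) / std`, rescales a single matrix to zero mean and unit standard deviation. The code comments label it Cohen's D, but arithmetically it is what is usually called a z-score. A minimal sketch on made-up values:

```python
import numpy as np

# Made-up symmetric connectivity values with a zeroed diagonal.
values = np.array([[0.0, 0.2, 0.4],
                   [0.2, 0.0, 0.6],
                   [0.4, 0.6, 0.0]])

# Same expression as in the question: subtract the overall mean, divide by the overall std.
standardized = (values - values.mean()) / values.std()

print(standardized.mean())  # close to 0
print(standardized.std())   # close to 1
```

Because the rescaling is applied within each matrix, it changes the scale of the values but not their relative ordering inside that matrix.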

If you've already moved forward and tried other approaches, we’d be really curious to hear what worked for you. And again, your message is helping us plan the next wave of improvements to HyPyP’s documentation and tutorials — so thank you!

Regards, Rémy

May 14 '25 19:05 Ramdam17