clusters= in cross_validated doesn't seem to work properly with more than 3 clusters

Open • vmorozov opened this issue 10 years ago • 1 comment

clusters= in cross_validated doesn't seem to work properly with more than 3 clusters:

import optunity
import optunity.cross_validation
import numpy as np

def f(x_train, y_train, x_test, y_test):
    if(bool(set(x_test)&set(x_train))):
        print("test and set clusters overlap:",set(x_test)&set(x_train))
        print("train data:\t" + str(x_train) + "\t train labels:\t" + str(y_train))
        print("test data:\t" + str(x_test) + "\t test labels:\t" + str(y_test))
    return 0.0

# function to create a list of clusters from a group/cluster variable
def ind_group(gr3):
    i1=[]
    for n1 in set(gr3):
        i2=np.in1d(gr3,n1).nonzero()[0].astype(int).tolist()
        i1.append(i2)
    return(i1)

# create data with group/cluster structure:
data = np.repeat(range(4), 2)
print("data:",data)
groups = ind_group(data)
print("groups",groups)
f_clustered = optunity.cross_validated(x=data, y=data, clusters=groups[1:], num_folds=3)(f)
f_clustered()

('data:', array([0, 0, 1, 1, 2, 2, 3, 3]))
('groups', [[0, 1], [2, 3], [4, 5], [6, 7]])

('test and set clusters overlap:', set([0]))
train data:    [2 2 1 1 0]     train labels:    [2 2 1 1 0]
test data:     [3 3 0]         test labels:     [3 3 0]

('test and set clusters overlap:', set([0]))
train data:    [2 2 3 3 0]     train labels:    [2 2 3 3 0]
test data:     [1 1 0]         test labels:     [1 1 0]

Process finished with exit code 0
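For comparison, here is a minimal sketch of the invariant one would expect `clusters=` to enforce: every listed cluster of indices ends up entirely inside a single fold. It does not use optunity at all. The helper names `cluster_folds` and `split_clusters` are hypothetical, and treating indices that appear in no cluster as singleton clusters is an assumption made for this sketch, not a statement about how optunity handles them.

```python
def cluster_folds(num_instances, clusters, num_folds):
    """Assign instance indices to folds without splitting any cluster.

    Assumption for this sketch: indices that appear in no cluster are
    treated as singleton clusters and may land in any fold.
    """
    clustered = {i for cluster in clusters for i in cluster}
    units = [list(cluster) for cluster in clusters]
    units += [[i] for i in range(num_instances) if i not in clustered]
    # Place the largest units first, each into the currently smallest fold.
    units.sort(key=len, reverse=True)
    folds = [[] for _ in range(num_folds)]
    for unit in units:
        min(folds, key=len).extend(unit)
    return folds


def split_clusters(folds, clusters):
    """Return the clusters whose indices are spread over more than one fold."""
    fold_of = {i: k for k, fold in enumerate(folds) for i in fold}
    return [c for c in clusters if len({fold_of[i] for i in c}) > 1]


data = [0, 0, 1, 1, 2, 2, 3, 3]
clusters = [[2, 3], [4, 5], [6, 7]]  # same indices as groups[1:] above
folds = cluster_folds(len(data), clusters, num_folds=3)
print(folds)                            # [[2, 3, 0], [4, 5, 1], [6, 7]]
print(split_clusters(folds, clusters))  # []
```

Under that assumption, indices 0 and 1 belong to no cluster in this example because `groups[1:]` drops the first group, so a check on data values (as in `f`) can still report value 0 on both sides of a split, while a check on cluster indices such as `split_clusters` reports nothing.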

vmorozov, Nov 19 '15 17:11

Thanks for reporting this bug and for the MWE. I will look into this.

claesenm, Feb 26 '16 08:02