min_samples_split == 1 raises ValueError in Decision Tree Classifier
Hi:
I ran the simplest call of ensemble_train.py and got a ValueError for the min_samples_split parameter:
Traceback (most recent call last):
  File "pyensemble/ensemble_train.py", line 202, in
    ens.fit(X_train, y_train)
  File "/home/mourao/income_prediction/pyensemble/ensemble.py", line 290, in fit
    self.fit_models(X, y)
  File "/home/mourao/income_prediction/pyensemble/ensemble.py", line 325, in fit_models
    model.fit(X[train_inds], y[train_inds])
  File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 790, in fit
    X_idx_sorted=X_idx_sorted)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 194, in fit
    % self.min_samples_split)
ValueError: min_samples_split must be an integer greater than 1 or a float in (0.0, 1.0]; got the integer 1
I solved the problem by removing 1 from the min_samples_split list in model_library.py:
def build_decisionTreeClassifiers(random_state=None):
    rs = check_random_state(random_state)

    param_grid = {
        'criterion': ['gini', 'entropy'],
        'max_features': [None, 'auto', 'sqrt', 'log2'],
        'max_depth': [None, 1, 2, 5, 10],
        'min_samples_split': [2, 5, 10],
        'random_state': [rs.random_integers(100000) for i in xrange(3)],
    }

    return build_models(DecisionTreeClassifier, param_grid)
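
As an alternative to editing each list by hand, here is a minimal sketch of a filter that drops invalid min_samples_split values from a parameter grid before any models are built. The two helpers, is_valid_min_samples_split and drop_invalid_min_samples_split, are hypothetical names and are not part of pyensemble or scikit-learn. The validity rule is taken only from the error message above, so it may not match other scikit-learn versions. The sample grid is illustrative.

```python
import numbers


def is_valid_min_samples_split(value):
    """Return True if value satisfies the rule quoted in the ValueError.

    Hypothetical helper: the rule is copied from the error message,
    not from the scikit-learn source.
    """
    # Integers must be greater than 1 (a split needs at least two samples).
    if isinstance(value, numbers.Integral):
        return value > 1
    # Floats are read as a fraction of the samples and must lie in (0.0, 1.0].
    if isinstance(value, float):
        return 0.0 < value <= 1.0
    return False


def drop_invalid_min_samples_split(param_grid):
    """Return a shallow copy of param_grid with invalid values removed."""
    cleaned = dict(param_grid)
    if 'min_samples_split' in cleaned:
        cleaned['min_samples_split'] = [
            v for v in cleaned['min_samples_split']
            if is_valid_min_samples_split(v)
        ]
    return cleaned


# Illustrative grid that still contains the offending value 1.
grid = {'max_depth': [None, 1, 2], 'min_samples_split': [1, 2, 5, 10]}
print(drop_invalid_min_samples_split(grid)['min_samples_split'])  # [2, 5, 10]
```

Such a filter could be applied to param_grid just before the build_models call, so the same guard would cover other tree-based builders in model_library.py if they use the same parameter.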