How to run for one hot encoding

Open infinite-Joy opened this issue 6 years ago • 0 comments

In the examples that use the iris dataset, y is a one-dimensional vector, which is essentially a label-encoded vector. Fitting the model on a one-hot encoded y instead is not working for me. Please help with this. Below is an example.


use rustlearn::prelude::*; // brings Array and the fit/predict traits into scope
use rustlearn::ensemble::random_forest::Hyperparameters;
use rustlearn::trees::decision_tree;

fn main() {
    let data = Array::from(&vec![vec![0.0, 1.0], vec![2.0, 3.0], vec![3.0, 4.0], vec![5.0, 6.0], vec![7.0, 8.0], vec![9.0, 10.0]]);
    let target = Array::from(&vec![vec![0.0, 1.0], vec![0.0, 1.0], vec![0.0, 1.0], vec![1.0, 0.0], vec![1.0, 0.0], vec![1.0, 0.0]]);
    let test = Array::from(&vec![vec![0.0, 1.0]]);

    println!("{:?}", data);
    println!("{:?}", target);

    let mut tree_params = decision_tree::Hyperparameters::new(data.cols());
    tree_params.min_samples_split(2)
        .max_features(2);

    let mut model = Hyperparameters::new(tree_params, 2)
        .one_vs_rest();

    model.fit(&data, &target).unwrap();

    let prediction = model.predict(&test).unwrap();
    print!("{:?}", prediction);
}

The output of this code is:

Array { rows: 6, cols: 2, order: RowMajor, data: [0.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0] }
Array { rows: 6, cols: 2, order: RowMajor, data: [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0] }
Array { rows: 1, cols: 1, order: RowMajor, data: [0.0] }

As you can see, the prediction has only one column, even though the target has two.
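One possible workaround, assuming the classifier expects class labels in a single column (as in the iris examples), is to convert the one-hot rows to label indices before fitting and to expand the predictions back afterwards. The helper names `one_hot_to_labels` and `labels_to_one_hot` below are hypothetical and not part of rustlearn; this is only a sketch in plain Rust.

```rust
/// Hypothetical helper (not part of rustlearn): convert one-hot rows into
/// class indices by taking the position of the largest value in each row.
/// Labels are returned as f32 because rustlearn arrays hold f32 values.
fn one_hot_to_labels(rows: &[Vec<f32>]) -> Vec<f32> {
    rows.iter()
        .map(|row| {
            let mut best = 0;
            for (i, &value) in row.iter().enumerate() {
                if value > row[best] {
                    best = i;
                }
            }
            best as f32
        })
        .collect()
}

/// Hypothetical helper (not part of rustlearn): expand class indices back
/// into one-hot rows. Panics if a label is not below `num_classes`.
fn labels_to_one_hot(labels: &[f32], num_classes: usize) -> Vec<Vec<f32>> {
    labels
        .iter()
        .map(|&label| {
            let mut row = vec![0.0; num_classes];
            row[label as usize] = 1.0;
            row
        })
        .collect()
}
```

With the target from the example above, `one_hot_to_labels` would give the labels 1, 1, 1, 0, 0, 0. The resulting `Vec<f32>` could then presumably be wrapped in an `Array` and passed to `fit`, and the single-column prediction passed through `labels_to_one_hot` to recover the two-column form.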

infinite-Joy, Mar 19 '19 11:03