tidypredict fails with randomForest regression
library(randomForest)
library(tidypredict)
library(dbplyr)
library(dplyr)
head(iris[, 1:4])
model <- randomForest(iris$Sepal.Length ~ ., data = iris[, 2:4], ntree = 1)
tidypredict_sql(model, dbplyr::simulate_mssql())
Partial listing of the output:
[[1]]
<SQL> CASE
WHEN (`Petal.Length` >= 5.7 AND `Sepal.Width` < 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
WHEN (`Petal.Width` < 0.65 AND `Sepal.Width` < 2.55 AND `Sepal.Width` < 3.05 AND `Petal.Length` < 4.6) THEN (NULL)
WHEN (`Petal.Length` >= 3.2 AND `Sepal.Width` < 3.9 AND `Sepal.Width` >= 3.05 AND `Petal.Length` < 4.6) THEN (NULL)
WHEN (`Petal.Width` < 0.3 AND `Sepal.Width` >= 3.9 AND `Sepal.Width` >= 3.05 AND `Petal.Length` < 4.6) THEN (NULL)
WHEN (`Petal.Width` >= 0.3 AND `Sepal.Width` >= 3.9 AND `Sepal.Width` >= 3.05 AND `Petal.Length` < 4.6) THEN (NULL)
WHEN (`Petal.Length` < 5.2 AND `Petal.Length` < 5.7 AND `Sepal.Width` < 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
WHEN (`Petal.Length` >= 5.2 AND `Petal.Length` < 5.7 AND `Sepal.Width` < 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
WHEN (`Petal.Length` < 3.65 AND `Petal.Width` >= 0.65 AND `Sepal.Width` < 2.55 AND `Sepal.Width` < 3.05 AND `Petal.Length` < 4.6) THEN (NULL)
WHEN (`Sepal.Width` < 2.65 AND `Sepal.Width` < 2.85 AND `Sepal.Width` >= 2.55 AND `Sepal.Width` < 3.05 AND `Petal.Length` < 4.6) THEN (NULL)
WHEN (`Petal.Length` < 2.75 AND `Sepal.Width` >= 2.85 AND `Sepal.Width` >= 2.55 AND `Sepal.Width` < 3.05 AND `Petal.Length` < 4.6) THEN (NULL)
WHEN (`Sepal.Width` < 2.9 AND `Petal.Width` < 1.75 AND `Petal.Width` < 1.85 AND `Sepal.Width` >= 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
WHEN (`Petal.Length` >= 5.8 AND `Petal.Width` >= 1.75 AND `Petal.Width` < 1.85 AND `Sepal.Width` >= 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
WHEN (`Petal.Length` >= 5.85 AND `Petal.Length` < 6.0 AND `Petal.Width` >= 1.85 AND `Sepal.Width` >= 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
WHEN (`Sepal.Width` < 3.4 AND `Petal.Length` >= 6.0 AND `Petal.Width` >= 1.85 AND `Sepal.Width` >= 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
WHEN (`Sepal.Width` >= 3.4 AND `Petal.Length` >= 6.0 AND `Petal.Width` >= 1.85 AND `Sepal.Width` >= 2.75 AND `Petal.Length` >= 4.6) THEN (NULL)
Hi tidypredict team
I realize the code for just one tree is low priority, but I like to look at the code and the tree diagram to get insight into how randomForest is slicing up the data.
I am a SAS programmer but find myself leaning on tidy and haven packages more and more as time goes on.
Thanks for providing these packages!
Roger
Hi @rogerjdeangelis, what is the issue that you are seeing? I'm not getting any errors.
If what you want to see is the structure of the tree, you could use parse_model() to get an object that reads the Random Forest model and breaks it down into a somewhat readable list. Here is an example:
library(randomForest)
library(tidypredict)
model <- randomForest(iris$Sepal.Length~ ., data = iris[,2:4], ntree = 1)
parsedmodel <- parse_model(model)
str(parsedmodel$trees)
#> List of 1
#> $ :List of 48
#> ..$ :List of 2
#> .. ..$ prediction: NULL
#> .. ..$ path :List of 3
#> .. .. ..$ :List of 4
#> .. .. .. ..$ type: chr "conditional"
#> .. .. .. ..$ col : chr "Petal.Width"
#> .. .. .. ..$ val : num 1.15
#> .. .. .. ..$ op : chr "less"
#> .. .. ..$ :List of 4
#> .. .. .. ..$ type: chr "conditional"
#> .. .. .. ..$ col : chr "Petal.Length"
#> .. .. .. ..$ val : num 3.4
#> .. .. .. ..$ op : chr "more-equal"
#> .. .. ..$ :List of 4
.... more
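To make the nested list above easier to scan, here is a minimal sketch that flattens one leaf into a single line of text. `describe_leaf()` is a hypothetical helper, not part of tidypredict. It assumes each path element carries the `type`, `col`, `val` and `op` fields shown in the str() output, and it only maps the two operators visible in that listing. The leaf is built by hand to mirror the first two conditions above, so the snippet runs without fitting a model.

```r
# Hypothetical helper (not part of tidypredict): describe one parsed leaf.
# Assumes the field names shown in the str() output: prediction, path,
# and for each path element type, col, val, op.
describe_leaf <- function(leaf) {
  # Only the operators seen in the listing are mapped; any other value
  # is printed unchanged.
  ops <- c("less" = "<", "more-equal" = ">=")
  conds <- vapply(leaf$path, function(p) {
    op <- if (p$op %in% names(ops)) ops[[p$op]] else p$op
    paste(p$col, op, p$val)
  }, character(1))
  # A missing prediction is shown as the text "NULL"
  pred <- if (is.null(leaf$prediction)) "NULL" else format(leaf$prediction)
  paste0(paste(conds, collapse = " AND "), " -> ", pred)
}

# Hand-built leaf mirroring the first two conditions in the listing
leaf <- list(
  prediction = NULL,
  path = list(
    list(type = "conditional", col = "Petal.Width", val = 1.15, op = "less"),
    list(type = "conditional", col = "Petal.Length", val = 3.4, op = "more-equal")
  )
)

describe_leaf(leaf)
#> [1] "Petal.Width < 1.15 AND Petal.Length >= 3.4 -> NULL"
```

If the parsed model has the shape shown above, something like lapply(parsedmodel$trees[[1]], describe_leaf) should list every leaf of the first tree, which makes it easy to spot leaves whose prediction is NULL.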