Integrate with formula
Hi, I just noticed your work on Stack Overflow.
Would you like to integrate formula-based expansion as a feature? I have a prototype package that does this:
https://github.com/wush978/FeatureHashing
In this prototype package, I implemented an API so that:
library(FeatureHashing)
data1 <- data.frame(a = c("1,2,3", "2,3,3", "1,3", "3"), type = c("a", "b", "a", "a"), stringsAsFactors = FALSE)
interpret.tag( ~ tag(a, split = ",", type = "existence") + tag(a, split = ",", type = "count"):type, data = data1)
will produce a data.frame with expanded columns and an expanded formula that can be passed to a model such as lm.
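To make the idea of "expanded columns" concrete, here is a rough base R sketch of the two expansion types. It is not the code in the prototype: expand_tags and the column prefix "a" are hypothetical names made up for illustration, and the real output of interpret.tag may be laid out differently.

```r
# Hypothetical helper, not part of FeatureHashing: expand a delimited
# tag column into one column per distinct tag.
expand_tags <- function(x, split = ",", type = c("existence", "count")) {
  type <- match.arg(type)
  tokens <- strsplit(x, split, fixed = TRUE)
  tag_levels <- sort(unique(unlist(tokens)))
  # One row per observation, one column per distinct tag
  counts <- matrix(0L, nrow = length(x), ncol = length(tag_levels),
                   dimnames = list(NULL, paste0("a", tag_levels)))
  for (i in seq_along(tokens)) {
    counts[i, ] <- tabulate(match(tokens[[i]], tag_levels),
                            nbins = length(tag_levels))
  }
  # "existence" keeps only whether a tag occurs, "count" keeps how often
  if (type == "existence") (counts > 0L) * 1L else counts
}

a <- c("1,2,3", "2,3,3", "1,3", "3")
count_cols <- expand_tags(a, type = "count")
exist_cols <- expand_tags(a, type = "existence")
```

For the second row, "2,3,3", the count version records tag 3 twice while the existence version records it once.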
Do you want to integrate this feature?
Moreover, I noticed that this approach consumes a lot of memory, so I am wondering whether there is a way to convert such a data.frame directly to a sparse matrix. I am still working on this.
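One possible direction, sketched here under the assumption that the Matrix package is acceptable as a dependency, is to skip the dense data.frame and build the sparse matrix from (row, column) index pairs. This is only an illustration for the "count" type on the tag column above, not what the prototype currently does, and the "a" column prefix is made up.

```r
library(Matrix)

a <- c("1,2,3", "2,3,3", "1,3", "3")
tokens <- strsplit(a, ",", fixed = TRUE)
tag_levels <- sort(unique(unlist(tokens)))

# One (row, column) pair per tag occurrence. sparseMatrix() adds up the
# x values of duplicated pairs by default, which yields the counts.
m <- sparseMatrix(
  i = rep(seq_along(tokens), lengths(tokens)),
  j = match(unlist(tokens), tag_levels),
  x = 1,
  dims = c(length(a), length(tag_levels)),
  dimnames = list(NULL, paste0("a", tag_levels))
)
```

Only the non-zero entries are stored, so memory grows with the number of tag occurrences rather than with rows times distinct tags.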
@wush978, that looks like great work. I'll try to look over your package in the next few days as I'm travelling at the moment.