splitstackshape

Integrate with formula

Open • wush978 opened this issue 11 years ago • 1 comment

Hi, I just noticed your work on Stack Overflow.

Would you like to integrate the expansion feature with R's formula interface? I have a prototype package that does this:

https://github.com/wush978/FeatureHashing

In this prototype package, I implemented an API so that:

library(FeatureHashing)
data1 <- data.frame(a = c("1,2,3", "2,3,3", "1,3", "3"), type = c("a", "b", "a", "a"), stringsAsFactors = FALSE)
interpret.tag( ~ tag(a, split = ",", type = "existence") + tag(a, split = ",", type = "count"):type, data = data1)

will produce a data.frame with expanded columns and an expanded formula that can be used to fit an advanced model such as lm.
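To make "expanded columns" concrete, here is a minimal base-R sketch of what the two `type` options might mean for the column `a`. The helper `expand_tags` is hypothetical and is not the implementation in FeatureHashing or splitstackshape; it only assumes that "existence" means a 0/1 indicator per tag and "count" means the number of occurrences per tag.

```r
# Hypothetical helper, for illustration only (not part of any package).
expand_tags <- function(x, split = ",", type = c("existence", "count")) {
  type <- match.arg(type)
  tokens <- strsplit(x, split, fixed = TRUE)   # one character vector per row
  lv <- sort(unique(unlist(tokens)))           # distinct tags become columns
  # Count how often each tag occurs in each row; rows are bound into a matrix.
  counts <- do.call(rbind, lapply(tokens, function(tk) {
    as.integer(table(factor(tk, levels = lv)))
  }))
  colnames(counts) <- lv
  if (type == "existence") {
    counts[] <- as.integer(counts > 0)         # collapse counts to 0/1
  }
  counts
}

data1 <- data.frame(a = c("1,2,3", "2,3,3", "1,3", "3"),
                    type = c("a", "b", "a", "a"),
                    stringsAsFactors = FALSE)
count_cols <- expand_tags(data1$a, type = "count")
exist_cols <- expand_tags(data1$a, type = "existence")
```

For the second row, "2,3,3", the count version records two occurrences of tag "3", while the existence version records only that it is present.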

Do you want to integrate this feature?


Moreover, I noticed that this approach consumes a lot of memory, so I am wondering whether there is a way to convert such a data.frame directly to a sparse matrix. I am still working on this.
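One possible direction, sketched here under the assumption that the Matrix package is acceptable as a dependency: build the (row, column) index pairs from the split tags and pass them to `Matrix::sparseMatrix`, so the dense expanded data.frame is never materialized. The function name `tags_to_sparse` is hypothetical; the sketch relies on `sparseMatrix` summing the values of duplicated (i, j) pairs, which yields per-tag counts.

```r
library(Matrix)

# Hypothetical helper, for illustration only (not part of any package).
tags_to_sparse <- function(x, split = ",") {
  tokens <- strsplit(x, split, fixed = TRUE)
  flat <- unlist(tokens)
  lv <- sort(unique(flat))                     # distinct tags become columns
  i <- rep(seq_along(tokens), lengths(tokens)) # row index of every token
  j <- match(flat, lv)                         # column index of every token
  # Duplicated (i, j) pairs are summed, so repeated tags become counts.
  sparseMatrix(i = i, j = j, x = rep(1, length(i)),
               dims = c(length(x), length(lv)),
               dimnames = list(NULL, lv))
}

a <- c("1,2,3", "2,3,3", "1,3", "3")
m <- tags_to_sparse(a)
```

Only the non-zero entries are stored, so memory use grows with the number of distinct tags per row rather than with the full number of tag columns.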

wush978 • Oct 25 '14 03:10

@wush978, that looks like great work. I'll try to look over your package in the next few days as I'm travelling at the moment.

mrdwab • Oct 25 '14 12:10