mlr3 backend
Thinking about how to plug the data into various models.
https://github.com/mlr-org/mlr3/issues/274#issuecomment-510618455
Given that there is an mlr3 dplyr backend (https://github.com/mlr-org/mlr3db/blob/master/R/DataBackendDplyr.R) and a data.table backend (https://github.com/mlr-org/mlr3/blob/master/R/DataBackend.R), it doesn't seem too hard to adapt one for disk.frame.
I'm not familiar with mlr3 at all. I might look into it. Does it accept chunk-by-chunk processing?
It's meant to be the next-gen of mlr, which is like a next-gen of caret.
Like caret, it wraps glmnet, xgboost, etc., so it doesn't come with chunk-by-chunk processing out of the box unless the underlying learners support it.
But what is interesting is the model ensembling in mlr. What I'm thinking is that you could potentially train chunk by chunk and ensemble the chunk-level models, or wrap online-trainable learners within that framework.
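To make the "online-trainable learner" idea concrete, here is a rough, language-neutral sketch in plain Python. It is not mlr3 or disk.frame code: `OnlineLinearModel`, `partial_fit`, and `iter_chunks` are hypothetical names invented for illustration, and an in-memory list stands in for data that would really live on disk. The point is only that the model is updated one chunk at a time and never sees the full data set at once.

```python
class OnlineLinearModel:
    """Hypothetical 1-D linear model trained by stochastic gradient descent."""

    def __init__(self, learning_rate=0.1):
        self.intercept = 0.0
        self.slope = 0.0
        self.learning_rate = learning_rate

    def partial_fit(self, chunk):
        # Update the parameters using only the rows in this chunk.
        for x, y in chunk:
            error = (self.intercept + self.slope * x) - y
            self.intercept -= self.learning_rate * error
            self.slope -= self.learning_rate * error * x

    def predict(self, x):
        return self.intercept + self.slope * x


def iter_chunks(rows, chunk_size):
    # Stand-in for reading chunks from disk (assumption for this sketch).
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


# Toy data following y = 2x + 1 with x in [0, 1].
rows = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(11)]

model = OnlineLinearModel()
for _ in range(500):  # several passes, because the toy data set is tiny
    for chunk in iter_chunks(rows, chunk_size=4):
        model.partial_fit(chunk)

estimate = model.predict(0.5)
```

A wrapper like this would only help if the underlying learner supports incremental updates, which is the caveat raised above for glmnet and xgboost.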
Just stopping by to say that you are right. Online learning is not a focus, but mlr3 was developed with it in mind. Building ensembles from chunks of out-of-memory data should, however, already be possible using mlr3pipelines.
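As a rough illustration of ensembling chunk-level models, here is a minimal sketch in plain Python. It does not use mlr3pipelines or disk.frame, and every name in it (`fit_line`, `iter_chunks`, `train_chunk_ensemble`, `predict_ensemble`) is hypothetical. It assumes each chunk fits in memory on its own, fits one small model per chunk, and averages the per-chunk predictions.

```python
from statistics import mean


def fit_line(chunk):
    """Least-squares fit of y = a + b*x on one chunk of (x, y) pairs."""
    xs = [x for x, _ in chunk]
    ys = [y for _, y in chunk]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in chunk)
    slope = sxy / sxx
    return (y_bar - slope * x_bar, slope)  # (intercept, slope)


def iter_chunks(rows, chunk_size):
    # Stand-in for reading chunks from disk (assumption for this sketch).
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def train_chunk_ensemble(chunks):
    # One independent model per chunk; only one chunk is held at a time.
    return [fit_line(chunk) for chunk in chunks]


def predict_ensemble(models, x):
    # Combine the chunk-level models by averaging their predictions.
    return mean(a + b * x for a, b in models)


# Toy data following y = 2x + 1, split into three chunks of four rows.
rows = [(float(x), 2.0 * x + 1.0) for x in range(12)]
models = train_chunk_ensemble(iter_chunks(rows, chunk_size=4))
prediction = predict_ensemble(models, 20.0)
```

Averaging is just the simplest way to combine the models. Whether it is a good choice depends on how the chunks were formed, for example whether rows were shuffled before chunking.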