Merge with StatsBase?
This package is not very discoverable on its own, and much of its methodology (e.g. cross-validation, classification) is relevant to "classical" statistics as well as to machine learning. Thoughts?
cc @nalimilan
FWIW, I'd rather have lots of small packages (e.g., Classification.jl, CrossValidation.jl, Bootstrap.jl, ModelTuning.jl) that remain outside StatsBase since they are somewhat specific techniques and problem spaces.
I agree these features sound broader than machine learning, but I'm not sure whether they should live in StatsBase or in separate packages. I guess it depends on whether each package offering a new kind of model will have to override some functions (and therefore depend on the package providing them) or not. Ideally a common interface would live in StatsBase and e.g. Bootstrap.jl would only use these functions to automatically support bootstrap for any model.
> Ideally a common interface would live in StatsBase and e.g. Bootstrap.jl would only use these functions to automatically support bootstrap for any model.
Yeah, that's what I was thinking. I figured StatsBase could have a simple Resample interface that could be supported for bootstrapping, cross-validation, jackknifing, etc.
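To make the idea concrete, here is a minimal sketch of what such a resampling interface could look like. All names (`ResampleScheme`, `Bootstrap`, `Jackknife`, `KFold`, `resample_indices`, `resample`) are hypothetical and are not part of StatsBase; the only assumption is that each scheme can be reduced to a list of index vectors into the data.

```julia
using Random
using Statistics: mean

# Hypothetical abstract type; NOT an existing StatsBase type.
abstract type ResampleScheme end

struct Bootstrap <: ResampleScheme
    nreplicates::Int   # number of bootstrap replicates to draw
end

struct Jackknife <: ResampleScheme end

struct KFold <: ResampleScheme
    k::Int             # number of folds
end

# Each scheme only has to say which observations go into each replicate.

# Bootstrap: draw n indices with replacement, once per replicate.
resample_indices(rng::AbstractRNG, s::Bootstrap, n::Integer) =
    [rand(rng, 1:n, n) for _ in 1:s.nreplicates]

# Jackknife: leave out one observation at a time.
resample_indices(::AbstractRNG, ::Jackknife, n::Integer) =
    [[j for j in 1:n if j != i] for i in 1:n]

# K-fold: shuffle, assign positions to folds round-robin,
# and return the training indices (everything outside fold j).
function resample_indices(rng::AbstractRNG, s::KFold, n::Integer)
    perm = randperm(rng, n)
    return [[perm[i] for i in 1:n if mod1(i, s.k) != j] for j in 1:s.k]
end

# Generic driver: apply `f` to every resampled subset of `x`.
function resample(f, rng::AbstractRNG, s::ResampleScheme, x::AbstractVector)
    return [f(x[idx]) for idx in resample_indices(rng, s, length(x))]
end

x = [2.0, 4.0, 6.0, 8.0]
rng = MersenneTwister(1)

jack = resample(mean, rng, Jackknife(), x)      # leave-one-out means
boot = resample(mean, rng, Bootstrap(100), x)   # 100 bootstrap means
cv   = resample(length, rng, KFold(2), x)       # training-set sizes per fold
```

With this shape, a package such as Bootstrap.jl would only need the generic `resample` driver, and a new scheme would only need to add a `resample_indices` method.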
Also, it might be worth contacting the JuliaML folks, as the features here have some overlap with their packages (e.g., MLDataPattern.jl).
There is indeed overlap with JuliaML/LearnBase.jl in purpose at least, if not in naming. @Evizero
I don't think I have anything insightful or useful to contribute to this conversation. Maybe a good course of action is to give whoever wants to dedicate time and effort to this package some flexibility to do so.