Add model selection split for automl search

Open angela97lin opened this issue 4 years ago • 0 comments

Rather than relying on the CV scores to rank the pipelines on the leaderboard, perhaps we should add a model selection split: hold out some data and rank the pipelines by how well they perform on that holdout data.
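To make the idea concrete, here is a minimal sketch in plain Python. It is not the automl search API: `split_holdout`, `rank_on_holdout`, the two toy "pipelines", and the 20% holdout fraction are all hypothetical names and values chosen for illustration. It only shows the shape of the proposal, which is to fit candidates on one part of the data and order the leaderboard by their score on the part that was held out.

```python
import random
from statistics import mean


def split_holdout(rows, holdout_fraction=0.2, seed=0):
    """Shuffle rows and reserve a fraction as the model selection holdout."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_holdout = max(1, int(len(rows) * holdout_fraction))
    return rows[:-n_holdout], rows[-n_holdout:]


def fit_mean(train):
    """Toy candidate: always predict the mean training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_linear(train):
    """Toy candidate: one-feature least-squares line."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    var = sum((x - x_bar) ** 2 for x, _ in train)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in train) / var
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mse(predict, rows):
    """Mean squared error of a fitted predictor on the given rows."""
    return mean((predict(x) - y) ** 2 for x, y in rows)


def rank_on_holdout(candidates, train, holdout):
    """Fit every candidate on train, then sort by holdout error (best first)."""
    scores = {name: mse(fit(train), holdout) for name, fit in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Synthetic data (an assumption for the demo): y = 3x + 1 plus Gaussian noise.
rng = random.Random(42)
data = [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(50)]

train, holdout = split_holdout(data)
leaderboard = rank_on_holdout(
    {"mean_baseline": fit_mean, "linear": fit_linear}, train, holdout
)
for name, score in leaderboard:
    print(f"{name}: holdout MSE = {score:.3f}")
```

In a real search, the CV folds would presumably be drawn only from the `train` portion, so the holdout rows never influence fitting or tuning and serve purely for ranking.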

I vaguely remember @dsherry mentioning a discussion about this with @kmax12, and @freddyaboulton raised a good point in #2260, so I figured it would be a good thing to track.

May 17 '21 20:05 angela97lin