miceforest

Multiple Imputation with LightGBM in Python

Results: 28 miceforest issues

Hi @AnotherSamWilson, the related issue is that for a large dataset (in my case at least) the imputed values for the same dataset differ before (trained kernel) and after (that...

Right now random state management is all over the place, and inelegant, especially inside the default mean matching functions. See if most random processes can be switched over to use...

enhancement
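As an illustration of the random state issue above, here is a minimal sketch of one way to centralize randomness: every function receives a single shared generator instead of seeding on its own. The issue text is truncated, so this is not necessarily what the author intended; it assumes NumPy's `Generator` API, and `make_rng` and `draw_candidates` are hypothetical helper names, not part of miceforest.

```python
import numpy as np


def make_rng(random_state=None):
    """Return a numpy Generator from None, an int seed, or an existing Generator.

    Hypothetical helper: np.random.default_rng passes an existing
    Generator through unchanged, so callers can share one object.
    """
    return np.random.default_rng(random_state)


def draw_candidates(n_candidates, k, rng):
    """Pick k distinct candidate indices using the shared generator."""
    return rng.choice(n_candidates, size=k, replace=False)


# Two generators built from the same seed produce the same draws,
# so a whole imputation run can be reproduced from a single seed.
rng_a = make_rng(42)
rng_b = make_rng(42)
first = draw_candidates(100, 5, rng_a)
second = draw_candidates(100, 5, rng_b)
```

Passing `rng` explicitly keeps all random draws traceable to one seed and avoids touching NumPy's global random state.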

lightgbm can handle the following, so there is no reason we can't add: H2O DataTable's Frame, scipy.sparse. Look into lightgbm.Sequence... might be no point.

enhancement

Actually storing the latest imputation values can take up a lot of memory. We have all the information we need to generate imputation values when complete_data() is called, why not...

enhancement

Getting this reliably. Chased it down to scipy.spatial.KDTree. Changing leafsize doesn't help. candidate_preds is float64, shape (294695, 1).

bug
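For context on the KDTree report above, the following is a rough sketch of a nearest-neighbor lookup over one-dimensional predictions using `scipy.spatial.KDTree`. It does not reproduce the reported error, and it is not miceforest's actual mean matching code. Only `candidate_preds` and `leafsize` come from the report; the array sizes, `candidate_values`, `query_preds`, and the random pick among neighbors are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

rng = np.random.default_rng(0)

# Assumed toy data: predictions for rows with observed values (candidates)
# and predictions for rows that need imputing (queries).
candidate_preds = rng.normal(size=(1000, 1))  # float64, shape (n_candidates, 1)
candidate_values = rng.normal(size=1000)      # observed values of the candidates
query_preds = rng.normal(size=(10, 1))        # hypothetical rows to impute

# Build the tree on candidate predictions and find the 5 nearest candidates
# for each query prediction.
tree = KDTree(candidate_preds, leafsize=16)
distances, neighbor_idx = tree.query(query_preds, k=5)  # both have shape (10, 5)

# Choose one of the 5 neighbors at random per row and take its observed value.
pick = rng.integers(0, neighbor_idx.shape[1], size=neighbor_idx.shape[0])
chosen = neighbor_idx[np.arange(neighbor_idx.shape[0]), pick]
imputed = candidate_values[chosen]
```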

For correlations, could be the % of matching imputed categories. For distributions, could be a bar/boxplot (depending on datasets?) of the histogram values.

enhancement
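The "% of matching imputed categories" idea above could be computed along these lines. This is a sketch in plain Python under the assumption that each imputed dataset yields a list of imputed categories for the same missing cells, in the same order; `matching_fraction` and the toy data are hypothetical and not part of miceforest.

```python
from itertools import combinations


def matching_fraction(a, b):
    """Fraction of positions where two sequences of imputed categories agree."""
    if len(a) != len(b):
        raise ValueError("both sequences must cover the same missing cells")
    if not a:
        raise ValueError("need at least one imputed value")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy example: imputed categories for the same four missing cells
# in three imputed datasets, keyed by dataset number.
imputed = {
    0: ["a", "b", "a", "c"],
    1: ["a", "b", "c", "c"],
    2: ["b", "b", "a", "c"],
}

# Agreement for every pair of datasets, analogous to a pairwise correlation.
pairwise = {
    (i, j): matching_fraction(imputed[i], imputed[j])
    for i, j in combinations(sorted(imputed), 2)
}
```

A value near 1.0 for a pair means the two datasets imputed nearly the same categories; lower values indicate more variation between imputations.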

Have had good experience with Bayesian optimization in the past. Lightweight implementation: https://github.com/fmfn/BayesianOptimization

enhancement

Hello, hope you are doing well! I was working with miceforest on a toy dataset to understand how it works, and along the way I came across "LinAlgError: singular matrix" while generating...

bug