gverbock

Results: 8 comments of gverbock

I like the idea of using the number of points to set a common threshold for all features. On the other hand, the number of bins is arbitrary, so it...
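A minimal sketch of what a sample-size-based threshold could look like, assuming the chi-square approximation sometimes used to derive PSI critical values (the threshold grows with the number of bins and shrinks with the sample sizes). The function name and default `alpha` are illustrative, not an existing library API.

```python
from scipy.stats import chi2

def psi_threshold(n_base, n_test, n_bins, alpha=0.001):
    """Critical PSI value above which a shift would be flagged as significant."""
    return chi2.ppf(1 - alpha, df=n_bins - 1) * (1.0 / n_base + 1.0 / n_test)

# One common threshold for all features, driven by the number of points and bins.
print(psi_threshold(n_base=5000, n_test=5000, n_bins=10))
```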

What I find interesting in the discussion is how to deal with unstable categories in a feature. The DropHighPSI approach is designed to work with numeric variables, and the topic...
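To illustrate the categorical case being discussed, here is a hedged sketch where each category plays the role of a bin, so no discretisation is needed. The helper name is illustrative and is not the DropHighPSI implementation.

```python
import numpy as np
import pandas as pd

def categorical_psi(base: pd.Series, test: pd.Series, eps: float = 1e-4) -> float:
    """PSI between two categorical samples, using the categories themselves as bins."""
    categories = sorted(set(base.unique()) | set(test.unique()))
    # Small constant eps avoids log(0) when a category is missing in one sample.
    p = base.value_counts(normalize=True).reindex(categories, fill_value=0) + eps
    q = test.value_counts(normalize=True).reindex(categories, fill_value=0) + eps
    return float(np.sum((p - q) * np.log(p / q)))
```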

It could be nice to consider the removal based on the expected impact. For example, two features correlated at 0.90 could have different correlations with the target. Then select the one...
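A minimal sketch of that idea: among a pair of highly correlated features, keep the one with the stronger (absolute) correlation with the target and drop the other. This is illustrative only, not an existing feature_engine class.

```python
import pandas as pd

def drop_weaker_of_pair(X: pd.DataFrame, y: pd.Series, feat_a: str, feat_b: str) -> str:
    """Return the name of the feature to drop from a correlated pair."""
    corr_a = abs(X[feat_a].corr(y))
    corr_b = abs(X[feat_b].corr(y))
    return feat_a if corr_a < corr_b else feat_b
```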

I would close this issue and refer to `feature_engine` for feature elimination based on correlation. I believe it makes more sense to suggest changes to `feature_engine`'s existing functionality than to build...

My thoughts were to start simple, having something like:

```
class PerformanceOverTimeEstimator(model, X, y, scorer_list, dates, frequency)

    def boosting_process(self, ...)
        X_proba = model.predict_proba(X)
        for boost in range(0, 1000)
            X_boost, Y_boost = ...
```
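A hedged, self-contained sketch of what such an estimator could look like, assuming scikit-learn-style models and string scorer names resolved with `sklearn.metrics.get_scorer`. The class and parameter names follow the pseudocode above and are illustrative, not an agreed API.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import get_scorer

class PerformanceOverTimeEstimator:
    def __init__(self, model, scorer_names, frequency="M", n_bootstrap=1000):
        self.model = model                  # already fitted estimator
        self.scorer_names = scorer_names    # e.g. ["roc_auc", "average_precision"]
        self.frequency = frequency          # pandas offset alias for time aggregation
        self.n_bootstrap = n_bootstrap

    def bootstrap_scores(self, X, y, dates):
        """Bootstrap each scorer within each time period."""
        periods = pd.PeriodIndex(dates, freq=self.frequency)
        rng = np.random.default_rng(0)
        rows = []
        for period in periods.unique():
            mask = periods == period
            X_p, y_p = X[mask], y[mask]
            for _ in range(self.n_bootstrap):
                idx = rng.integers(0, len(y_p), size=len(y_p))
                for name in self.scorer_names:
                    score = get_scorer(name)(self.model, X_p.iloc[idx], y_p.iloc[idx])
                    rows.append({"period": period, "scorer": name, "score": score})
        return pd.DataFrame(rows)
```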

Good points Mateusz.

* Frequency would provide the level of aggregation over time: monthly, quarterly, ...
* time_stratified_sampling would ensure the bootstrap is homogeneously distributed across time (see the sketch below).
* compute scores_over_time...
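A minimal sketch of the time-stratified sampling point, assuming a datetime column in a pandas DataFrame: rows are resampled with replacement within each period, so every period keeps its original weight in the bootstrap draw. Function and argument names are assumptions.

```python
import numpy as np
import pandas as pd

def time_stratified_bootstrap(df: pd.DataFrame, date_col: str, freq: str = "M",
                              random_state: int = 0) -> pd.DataFrame:
    """Resample rows with replacement, stratified by time period."""
    rng = np.random.default_rng(random_state)
    periods = df[date_col].dt.to_period(freq)
    parts = []
    for _, group in df.groupby(periods):
        idx = rng.integers(0, len(group), size=len(group))
        parts.append(group.iloc[idx])
    return pd.concat(parts).reset_index(drop=True)
```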

You understood it correctly. I am not sure the limitation you raise on the cross-section would have a large impact.

To compute the PSI you need to define bins (for example based on the deciles of the population). In PSI(a, b), a is selected to define the bins. Using the switch...
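A hedged sketch of the PSI(a, b) computation described above: the bins come from the quantiles of sample `a`, and both samples are mapped onto those bins. The helper name and defaults are illustrative.

```python
import numpy as np

def psi(a, b, n_bins=10, eps=1e-4):
    """PSI of b relative to a, with bins defined by the quantiles of a."""
    edges = np.quantile(a, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf          # cover values of b outside a's range
    # Small constant eps avoids log(0) for empty bins.
    p = np.histogram(a, bins=edges)[0] / len(a) + eps
    q = np.histogram(b, bins=edges)[0] / len(b) + eps
    return float(np.sum((p - q) * np.log(p / q)))
```

Swapping the arguments changes which sample defines the bins, which is why PSI(a, b) and PSI(b, a) generally differ.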