feature_engine icon indicating copy to clipboard operation
feature_engine copied to clipboard

Remove `switch` arg in `DropHighPSIFeatures`

Open sancholp opened this issue 9 months ago • 2 comments

Closes #858.

PSI is a symmetric metric and as such the switch arg doesn't add functionality.

sancholp avatar Apr 30 '25 09:04 sancholp

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 98.04%. Comparing base (ead2457) to head (202ae5c). Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #859   +/-   ##
=======================================
  Coverage   98.04%   98.04%           
=======================================
  Files         111      111           
  Lines        4643     4648    +5     
  Branches      739      740    +1     
=======================================
+ Hits         4552     4557    +5     
  Misses         56       56           
  Partials       35       35           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Apr 30 '25 09:04 codecov[bot]

@solegalli Whenever you find time please have a look

sancholp avatar Apr 30 '25 09:04 sancholp

two remarks from me (just browsing through PR):

  • I think it is indeed correct, PSI is symmetric. To see this, switch the two batches and observe that the difference term and the log term both switch signs.
    • however, it might not be true for the estimator, or if cutoffs are dynamically determined (in this case the switch will not result in equal value) - have you checked this?
  • removing an existing argument from an sklearn class is a breaking change. If this would be sklearn, it would have to go through a deprecation, i.e., a warning is raised that the argument will be removed, so users can adapt and their code does not break in case they use the arg.

fkiraly avatar Jul 02 '25 22:07 fkiraly

As per discussion in #858, we won't remove the argument, we'll make it clear in the docs that the PSI is symmetric, but the estimator calculates it is not.

solegalli avatar Jul 03 '25 13:07 solegalli

Waiting for this PR https://github.com/sancholp/feature_engine/pull/1 to merge.

solegalli avatar Jul 05 '25 18:07 solegalli

For some reason, this didn't work. I reverted the files to the original ones to only change the docstrings but something went wrong somewhere.

solegalli avatar Jul 06 '25 12:07 solegalli

Anything you need me to do?

sancholp avatar Jul 06 '25 16:07 sancholp