Add Noise Functionality to DataFrame Columns in Utils Package
Description:
This PR introduces a new function, add_noise_to_df_column, that adds noise to a specified column of a DataFrame. It addresses issue #63, which requested noise-addition functionality for a specific DataFrame column.
Changes Made:
- Added a new function named add_noise_to_df_column in the utils package. The function uses the numpy and pandas libraries for numerical operations and DataFrame manipulation.
- Implemented noise addition based on the type of data in the column:
  - For numerical columns (int or float), Gaussian noise with a mean of 0 and a standard deviation equal to the specified noise level is added.
  - For string columns (object), the characters of some strings are randomly permuted, with a probability determined by the noise level.

This addition enhances the utility of the utils package by providing a flexible method to introduce noise into DataFrame columns, facilitating various data processing and analysis tasks.
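For reviewers who want a quick mental model of the behavior described above, here is a minimal sketch. It is not the code in this PR: the parameter names (df, column, noise_level, seed), the choice to return a copy, and the error raised for other dtypes are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def add_noise_to_df_column(df, column, noise_level=0.1, seed=None):
    """Illustrative sketch only; signature and details are assumed, not taken from the PR."""
    rng = np.random.default_rng(seed)
    noisy = df.copy()  # assumption: leave the caller's DataFrame untouched
    values = noisy[column]

    is_numeric = pd.api.types.is_numeric_dtype(values)
    is_bool = pd.api.types.is_bool_dtype(values)
    is_text = pd.api.types.is_object_dtype(values) or pd.api.types.is_string_dtype(values)

    if is_numeric and not is_bool:
        # Gaussian noise with mean 0 and standard deviation equal to noise_level.
        noise = rng.normal(loc=0.0, scale=noise_level, size=len(values))
        noisy[column] = values.astype(float) + noise
    elif is_text:
        def maybe_permute(value):
            # Each string has its characters shuffled with probability noise_level.
            if isinstance(value, str) and rng.random() < noise_level:
                return "".join(rng.permutation(list(value)))
            return value

        noisy[column] = values.map(maybe_permute)
    else:
        raise TypeError(f"Unsupported dtype for column {column!r}: {values.dtype}")

    return noisy
```

Under these assumptions, a call such as add_noise_to_df_column(df, "price", noise_level=0.5, seed=0) would return a new DataFrame whose "price" column is float-valued, while noise_level=0.0 on a string column would leave every string unchanged.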
The commented code at the bottom of the file is left on purpose to allow testing of the function's behavior.
Please review at your earliest convenience.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 87.23%. Comparing base (705339c) to head (8d209f1). Report is 4 commits behind head on master.
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #72      +/-   ##
==========================================
+ Coverage   87.10%   87.23%   +0.12%
==========================================
  Files          40       40
  Lines        1760     1770      +10
==========================================
+ Hits         1533     1544      +11
+ Misses        227      226       -1