gafe
gafe copied to clipboard
Genetic Algorithm Feature Engineering
GAFE - Genetic Algorithm Feature Engineering
Simple algorithm for new features engineering.
- gafe tries different combination of features with operators:
+,-,* - add it to your dataset
- re-evaluate the classifier performance with new features
Example
- In binary classification problem, you have dataset with following 20 features:
feature1,feature2,feature3, ...,feature20and binary target column. - GAFE computes the base score for your dataset using Random Forest (32 trees), 5-fold CV and negative log loss.
- The algorithm is starting with random population of new feature sets. Each new feature set contains from
new_features_lower_cnttonew_features_upper_cntnew features. Each new feature is combination of original features with operators:+,-,*, for example new feature can look like:feature1-feature2-feature3. - Each new feature set is scored with the same classifier as in step 2. For scoring are used concatenated original and new features.
- The genetic algorithm is applied to mutate new feature sets to find better features.
- At the end, the best feature set is selected based on classifier performance.