Data type detection: treat integer columns with few distinct values as categorical
Is your feature request related to a problem? Please describe.
Currently, eda.plot detects a column's type from its pandas dataframe type, which is not always ideal. For example, the gender column of a dataset may contain only two values: 0 for male and 1 for female. Such a column is detected as numerical, although treating it as categorical makes more sense.
Describe the solution you'd like
As a starting point, we could handle some simple cases. For example, when a column's dataframe type is integer and its number of distinct values is smaller than a threshold (we can use the default number of displayed bars as the threshold), we detect it as a categorical column.
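
A minimal sketch of the proposed heuristic, to make the rule concrete. The function name `detect_column_type`, the constant `DISTINCT_THRESHOLD`, and its value of 10 are assumptions for illustration only. They are not taken from the existing code base, and the real threshold would come from the default number of displayed bars.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_numeric_dtype

# Hypothetical default; the proposal is to reuse the default number of displayed bars.
DISTINCT_THRESHOLD = 10


def detect_column_type(col: pd.Series, threshold: int = DISTINCT_THRESHOLD) -> str:
    """Return "categorical" or "numerical" for a single column.

    An integer column whose number of distinct values is below the threshold
    is treated as categorical; other numeric columns stay numerical, and
    everything else is treated as categorical.
    """
    if is_integer_dtype(col) and col.nunique() < threshold:
        return "categorical"
    if is_numeric_dtype(col):
        return "numerical"
    return "categorical"


# Example: a 0/1 gender column next to an integer column with many distinct values.
df = pd.DataFrame({"gender": [0, 1, 1, 0], "age": [23, 35, 41, 58]})
detected = {name: detect_column_type(df[name], threshold=3) for name in df.columns}
```

With `threshold=3`, `gender` (two distinct values) would be detected as categorical and `age` (four distinct values) as numerical. Whether the comparison should be strict or inclusive, and how missing values count toward the distinct values, is left open here.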
Related to #99
- [ ] Add code to detect ordinal types in dtypes.py
- [ ] Handle ordinal plotting through the EDA module.