plotly.py

ecdf with normed histogram

Open • luifire opened this issue 1 year ago • 1 comment

I really like plotly.express.ecdf and use it a lot in my daily work. When I show ECDF plots in meetings, I usually add marginal='histogram', since that is easier to understand for the non-data-scientists in the room. However, since the amount of data varies between plots, I would like the histogram to be normalized, i.e. to show percent values instead of raw counts.

I know this is possible with subplots, but that requires a lot of ugly adjustments. A solution could be to show the percentage in the hover label as well, or to offer something like histnorm from plotly.express.histogram.

Example for easy testing:

import plotly.express as px
import numpy as np
import pandas as pd

# Generate random data
np.random.seed(42)  # For reproducibility
data = np.random.normal(loc=0, scale=1, size=1000)  # Normal distribution data

# Create a pandas dataframe
df = pd.DataFrame({'Values': data})

# Create ECDF plot with marginal histogram (uses the dataframe built above)
fig = px.ecdf(df,
              x='Values',
              ecdfnorm='percent',
              marginal='histogram')

# Show the figure
fig.show()

luifire, Sep 30 '24 08:09

Feature Request: ECDF with Normed Histogram Support

Issue Description:

I'm a frequent user of the plotly.express.ecdf feature and find it very useful for showcasing empirical cumulative distribution functions (ECDFs) in meetings. To make the plots more understandable for non-data scientists, I often pair the ECDF with a marginal histogram (marginal='histogram'), which provides an intuitive visual aid.

However, as the amount of data varies across datasets, it would be beneficial to have the option to normalize the marginal histogram to display percentages instead of raw counts. Currently, plotly.express.histogram supports histnorm, but it's not available when using ECDF plots with a marginal histogram.

While it’s possible to achieve this using subplots, this requires a lot of manual and cumbersome adjustments. A simpler solution would be to incorporate normalization as an option within px.ecdf (similar to histnorm) or to display the percentage value in the tooltip for histograms.

Example:

Below is a simplified example that generates an ECDF with a marginal histogram. An option like histnorm would make it possible to normalize the histogram in this case.

import plotly.express as px
import numpy as np
import pandas as pd

# Generate random data
np.random.seed(42)  # For reproducibility
data = np.random.normal(loc=0, scale=1, size=1000)  # Normal distribution data

# Create a pandas dataframe
df = pd.DataFrame({'Values': data})

# Create ECDF plot with marginal histogram (currently without normalization support)
fig = px.ecdf(df,
              x='Values',
              ecdfnorm='percent',
              marginal='histogram')

# Show the figure
fig.show()

sOnU1002, Oct 02 '24 17:10