
Harmonic mean error

Open gskarp opened this issue 4 years ago • 1 comments

Hello, I am running the Jupyter Notebook with my own data. When I run the following part of the code:

from scipy.stats import hmean, norm

def normcdf(x):
    # CDF of a normal distribution fitted to the column's own mean and std
    return norm.cdf(x, x.mean(), x.std())
term_freq_df['eight_precision_normcdf'] = normcdf(term_freq_df['eight_precision'])
term_freq_df['eight_freq_pct_normcdf'] = normcdf(term_freq_df['eight_freq_pct'])
term_freq_df['eight_scaled_f_score'] = hmean([term_freq_df['eight_precision_normcdf'], term_freq_df['eight_freq_pct_normcdf']])
term_freq_df.sort_values(by='eight_scaled_f_score', ascending=False).iloc[:10]

I get the following error:

[image: screenshot of the error]

The column categories run from 'zero' to 'eight'. Any suggestions for overcoming this problem are welcome.

gskarp, Apr 16 '21 16:04

It's impossible to know what's going on without the data. I'd bet one of the columns has a very low value that normcdf is rounding to 0 because of floating-point precision.

JasonKessler, Apr 16 '21 19:04
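
One way to test the floating-point hypothesis above is to count how many scaled values come out as exactly 0 and, if any do, clip them to a tiny positive floor before taking the harmonic mean. The following is only a minimal sketch, not the notebook's own method: `safe_scaled_f_score` and `eps` are hypothetical names introduced here, the clipping floor is an arbitrary assumption, and how `hmean` treats zeros may differ between SciPy versions.

```python
import numpy as np
from scipy.stats import hmean, norm


def normcdf(x):
    # Same definition as in the question.
    return norm.cdf(x, x.mean(), x.std())


def safe_scaled_f_score(precision, freq_pct, eps=1e-12):
    """Hypothetical helper: harmonic mean of two normcdf-scaled columns.

    Returns the scores and the number of rows where either scaled
    value was <= 0 before clipping. NaN values are not handled here.
    """
    p = np.asarray(normcdf(precision), dtype=float)
    f = np.asarray(normcdf(freq_pct), dtype=float)
    # Rows where the CDF underflowed to 0 (or below) in either column.
    n_nonpositive = int(np.sum((p <= 0) | (f <= 0)))
    # Assumption: clip into [eps, 1] so every input is strictly positive.
    p = np.clip(p, eps, 1.0)
    f = np.clip(f, eps, 1.0)
    # axis=0 (the default) gives one harmonic mean per row.
    return hmean([p, f]), n_nonpositive
```

With the DataFrame from the question, this could be called as `scores, n = safe_scaled_f_score(term_freq_df['eight_precision'], term_freq_df['eight_freq_pct'])`. If `n` is 0, underflow is probably not the cause, and the columns are worth checking for NaN or negative values instead.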