How to use countplot() in plotly with VAEX data frame?
Could someone please give me the equivalent Plotly code for this?
sns.countplot(x='Census_ProcessorClass', hue='HasDetections', data=df_train)
plt.show()
Both columns are int64.
This is basically px.histogram.
df_train is a Vaex DataFrame. When I tried this:
import plotly.express as px
fig = px.histogram(df_train, x='Census_ProcessorClass', color='HasDetections', barmode='relative')
fig.show()
I get this ValueError:
ValueError                                Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/plotly/express/_chart_types.py in histogram(data_frame, x, y, color, facet_row, facet_col, facet_col_wrap, facet_row_spacing, facet_col_spacing, hover_name, hover_data, animation_frame, animation_group, category_orders, labels, color_discrete_sequence, color_discrete_map, marginal, opacity, orientation, barmode, barnorm, histnorm, log_x, log_y, range_x, range_y, histfunc, cumulative, nbins, title, template, width, height)
    454 histnorm=histnorm, histfunc=histfunc, cumulative=dict(enabled=cumulative),
    455 ),
--> 456 layout_patch=dict(barmode=barmode, barnorm=barnorm),
    457 )
    458

/opt/conda/lib/python3.7/site-packages/plotly/express/_core.py in make_figure(args, constructor, trace_patch, layout_patch)
   1859 apply_default_cascade(args)
   1860
-> 1861 args = build_dataframe(args, constructor)
   1862 if constructor in [go.Treemap, go.Sunburst] and args["path"] is not None:
   1863 args = process_dataframe_hierarchy(args)

/opt/conda/lib/python3.7/site-packages/plotly/express/_core.py in build_dataframe(args, constructor)
   1376
   1377 df_output, wide_id_vars = process_args_into_dataframe(
-> 1378 args, wide_mode, var_name, value_name
   1379 )
   1380

/opt/conda/lib/python3.7/site-packages/plotly/express/_core.py in process_args_into_dataframe(args, wide_mode, var_name, value_name)
   1181 if argument == "index":
   1182 err_msg += "\n To use the index, pass it in directly as df.index."
-> 1183 raise ValueError(err_msg)
   1184 elif length and len(df_input[argument]) != length:
   1185 raise ValueError(

ValueError: Value of 'x' is not the name of a column in 'data_frame'. Expected one of [0] but received: Census_ProcessorClass
Try converting your Vaex df to a Pandas one to see if that resolves things?
Yeah Nic, I am pretty sure that will resolve the issue, but converting my data into a pandas DataFrame will take a lot of time and memory. I think my system may crash.
I am looking for a more efficient way. Is there any method to make a Vaex DataFrame acceptable to Plotly?
PX doesn't natively accept Vaex data frames at the moment, no. Part of the reason for that is that for plots like these histograms, it doesn't do Python-side aggregation: all the data is sent to the browser for aggregation, so there's a bit of an upper bound on the dataset size that px.histogram can handle anyway.
See https://github.com/plotly/plotly.py/issues/2649 for more details
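To illustrate the point about aggregation, here is a minimal sketch of doing the counting on the Python side and sending only the counts to the browser with px.bar. The helper names count_pairs and plot_counts and the sample values are made up for illustration. With a large Vaex DataFrame the counting step would presumably be a Vaex groupby with a count aggregation rather than collections.Counter, which is used here only so the sketch runs without Vaex.

```python
from collections import Counter


def count_pairs(x_values, hue_values):
    """Count each (x, hue) combination and return sorted (x, hue, count) rows."""
    counts = Counter(zip(x_values, hue_values))
    return [(x, hue, n) for (x, hue), n in sorted(counts.items())]


def plot_counts(rows):
    """Draw pre-aggregated counts as grouped bars, similar to a countplot with hue."""
    import plotly.express as px  # imported lazily so count_pairs works without plotly

    return px.bar(
        x=[str(x) for x, _, _ in rows],
        y=[n for _, _, n in rows],
        color=[str(hue) for _, hue, _ in rows],  # strings give a discrete legend
        barmode="group",
        labels={"x": "Census_ProcessorClass", "y": "count", "color": "HasDetections"},
    )


# Made-up sample values standing in for the two int64 columns
sample_x = [1, 1, 2, 2, 2, 3]
sample_hue = [0, 1, 0, 0, 1, 1]
rows = count_pairs(sample_x, sample_hue)

# Only one row per (x, hue) combination is sent to the browser:
# plot_counts(rows).show()
```

The figure then carries a handful of bars instead of one value per row, so the browser-side size limit no longer depends on the number of rows in the dataset.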
Hey, after a lot of trial and error, I think I found a better way. This code worked:
fig = px.histogram(x=df_train['Census_ProcessorClass'].tolist(), color=df_train['HasDetections'].tolist())
fig.show()
I found a much better method:
# Note: 'Census_ProcessorClass' != 'None' compares two Python string literals, so it evaluates to True rather than filtering on the column.
df_train.select(df_train['Census_ProcessorClass'], 'Census_ProcessorClass' != 'None')
x_axis = df_train.evaluate(df_train['Census_ProcessorClass'], selection=True)
color_axis = df_train.evaluate(df_train['HasDetections'], selection=True)
%%time
fig = px.histogram(x=x_axis, color=color_axis, width=300, height=400)
fig.show()

CPU times: user 761 ms, sys: 33.1 ms, total: 794 ms Wall time: 811 ms