cleanlab
Sanitize label column when initializing a Datalab instance.
Check for NaN values in the label column.
This cannot be handled by the NullIssueManager, because the error occurs at initialization, in Datalab(data=df_with_nan_value_in_label_column, label_name="label_column").
For now, we need better error reporting.
@elisno @jwmueller I'm including this code block:
if np.isnan(labels).any():
    raise ValueError("Labels must not contain null values")
under cleanlab.datalab.internal.data.Multiclass.extract_labels. Does this look appropriate?
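One thing that may be worth considering: np.isnan raises a TypeError on string or object-dtype arrays, which label columns often are. Below is a minimal sketch of a check that also covers those cases, assuming pandas is available at that point in the code. The helper name check_labels_not_null is hypothetical and not part of cleanlab, and the added count in the message is only a suggestion.

```python
import numpy as np
import pandas as pd


def check_labels_not_null(labels):
    """Hypothetical helper: raise ValueError if any label is null.

    pd.isna handles float NaN, None, and pd.NA, and works on string or
    object-dtype arrays where np.isnan would raise a TypeError.
    """
    labels = np.asarray(labels)
    null_mask = pd.isna(labels)
    if null_mask.any():
        n_null = int(null_mask.sum())
        raise ValueError(
            f"Labels must not contain null values (found {n_null} null label(s))"
        )
    return labels


# Example: a string label column with a missing entry.
try:
    check_labels_not_null(np.array(["cat", None, "dog"], dtype=object))
except ValueError as err:
    print(err)
```

Reporting how many labels are null keeps the error actionable without changing when it is raised.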