
Add user friendly warning/error messages and helpers for log_plot()

Open mnrozhkov opened this issue 2 years ago • 3 comments

When people start using Live.log_plot(), they can struggle to get the expected visualization, for 2 reasons:

  1. log_plot() is very opinionated about the data format required for every template
  2. there are no user-friendly data checks or warning messages

Here are some ideas to help with DVCLive onboarding:

1. "Relax" requirements for data formats supported

For example, the bar_horizontal template expects something like this:

datapoints = [
    {"name": "petal_width", "importance": 0.4},
    {"name": "petal_length", "importance": 0.33},
    {"name": "sepal_width", "importance": 0.24},
    {"name": "sepal_length", "importance": 0.03}
]

It would be cool to support other formats like:

  1. Pandas DataFrame

  2. Dict, with keys automatically extracted as y and values as x:

{'petal_width': 0.4,
 'petal_length': 0.33,
 'sepal_width': 0.24,
 'sepal_length': 0.03}
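As a rough sketch of what such "relaxed" input handling could look like, the helper below normalizes the formats mentioned above into the list-of-dicts shape. The name to_datapoints and its x/y parameters are hypothetical and not part of DVCLive; the DataFrame branch assumes the object exposes pandas' to_dict(orient="records").

```python
def to_datapoints(data, x="x", y="y"):
    """Hypothetical helper: normalize supported inputs into a list of dicts."""
    if isinstance(data, dict):
        # Plain dict: keys become the y (category) field, values the x field.
        return [{y: key, x: value} for key, value in data.items()]
    if hasattr(data, "to_dict"):
        # Assumed to be a pandas DataFrame: one dict per row.
        return data.to_dict(orient="records")
    # Already an iterable of dicts: copy it into a list unchanged.
    return list(data)
```

With this, to_datapoints({"petal_width": 0.4}, x="importance", y="name") would yield the same structure that bar_horizontal currently expects.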

2. Provide minimal sanity checks for data/configs provided

For example, if I run this code snippet:

from dvclive import Live

datapoints = [
    {"name": "petal_width", "importance": 0.4},
    {"name": "petal_length", "importance": 0.33},
    {"name": "sepal_width", "importance": 0.24},
    {"name": "sepal_length", "importance": 0.03}
]

with Live() as live:
    live.log_plot(
        "iris_feature_importance",
        datapoints,
        x="name",
        y="importance",
        template="bar_horizontal",
        title="Iris Dataset: Feature Importance",
        y_label="Feature Name",
        x_label="Feature Importance"
    )

I won't get any error, but nothing shows up in VS Code after that.

Reason? The x and y arguments are swapped: the correct assignment is y="name", x="importance". But it's very easy to overlook this mistake and spend a lot of time trying to figure it out.

How can we help?

  • check that the data passed for x is numerical, as the bar_horizontal template expects
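A check like the one suggested above could be fairly small. The sketch below is only an illustration: non_numeric_values is a hypothetical function, not an existing DVCLive API, and it assumes datapoints is a list of dicts.

```python
from numbers import Real


def non_numeric_values(datapoints, field):
    """Return the values stored under `field` that are not real numbers."""
    return [
        row[field]
        for row in datapoints
        # bool is a subclass of int, so it is rejected explicitly.
        if field in row
        and (isinstance(row[field], bool) or not isinstance(row[field], Real))
    ]
```

For the snippet in this issue, calling it with field="name" (the value passed as x) would return all four feature names, while field="importance" would return an empty list.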

3. Provide good warning messages and hints if formats are incompatible

If we have data/args checks, we can report the problem in warning messages. It would help a lot to see something like:

Data provided for x has str type but numerical data type is expected
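A message of that shape could be produced with Python's standard warnings module. This is a minimal sketch under assumptions: warn_if_not_numeric is a hypothetical name, the wording is illustrative, and a missing field is reported as NoneType.

```python
import warnings
from numbers import Real


def warn_if_not_numeric(datapoints, axis, field):
    """Warn once and return False if `field` holds non-numeric data."""
    for row in datapoints:
        value = row.get(field)  # None when the field is missing
        if isinstance(value, bool) or not isinstance(value, Real):
            warnings.warn(
                f"Data provided for {axis} ('{field}') has "
                f"{type(value).__name__} type but numerical data type is expected",
                UserWarning,
                stacklevel=2,
            )
            return False
    return True
```

Emitting a warning rather than raising keeps existing scripts running while still pointing at the likely x/y mix-up.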

mnrozhkov avatar Dec 07 '23 16:12 mnrozhkov

Another thought on a lightweight way to help here: better docs in https://dvc.org/doc/dvclive/live/log_plot. Having an example of the input format for each template could go a long way. There are already examples of different templates in https://dvc.org/doc/command-reference/plots/show that we could use as a starting point.

dberenbaum avatar Dec 07 '23 18:12 dberenbaum

Background on the current implementation: https://github.com/iterative/dvclive/pull/543#pullrequestreview-1402602708

dberenbaum avatar Dec 07 '23 18:12 dberenbaum

Marking as p2 since I don't think log_plot() is frequently used, but still would be really nice to have these improvements

dberenbaum avatar Dec 07 '23 18:12 dberenbaum