Don't generate the assertion files in `piperider run`

Open popcornylu opened this issue 3 years ago • 3 comments

Summary

Some users use piperider only as a data profiling tool. However, in the current user journey, the first run always generates assertion files.

fetching metadata
[1/1] data ━━━━━━━━━━━  5/5 0:00:00
No assertion found
Do you want to auto generate recommended assertions for this datasource [Yes/no]?

The problems are:

  1. The user doesn't know what will happen when they answer yes or no.
  2. Even if the user says no, empty assertion files are still generated. Why not generate them only when the user wants to write tests?
  3. If the user says yes, assertion files are generated from the current profiling result. However, if the user does not intend to write assertions right away, the generated assertions will be confusing in future runs.
  4. Assertion files are generated for every table. It is not realistic to write all the tests at the same time.

Intended Outcome

  • Don't generate assertions in piperider run; use the generate-assertions command instead to generate a template or assertions.
  • In practice, tests are written table by table. It would be more reasonable to generate assertions -> edit the assertion file -> test on a per-table basis.

How will it work?

  1. piperider run will not generate assertions.

  2. In generate-assertions, the user has to specify the table to generate assertions for, rather than generating them for all tables. (e.g. piperider generate-assertions --table mytable)

  3. In generate-assertions, the user can choose between an empty template and suggested assertions.

    $ piperider generate-assertions --table mytable
    [?] Which type of strategy to generate assertions:
    * Empty assertions with column structure
      Suggest the assertions by the profiling result.
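
To make the two strategies concrete, here is a minimal sketch of per-table generation. It is not piperider's implementation: the function name, the strategy constants, the shape of the profiling result, the output structure, and the `in_range` test name are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical strategy names; the proposal only describes the two choices in prose.
EMPTY = "empty"
SUGGESTED = "suggested"


def generate_assertions(table: str, profile: Dict[str, dict], strategy: str = EMPTY) -> dict:
    """Build an assertion skeleton for ONE table.

    `profile` maps column name -> stats dict. This shape is an assumption,
    e.g. {"id": {"min": 1, "max": 5}}.
    """
    if strategy not in (EMPTY, SUGGESTED):
        raise ValueError(f"unknown strategy: {strategy}")

    columns = {}
    for name, stats in profile.items():
        tests: List[dict] = []
        if strategy == SUGGESTED and "min" in stats and "max" in stats:
            # Baseline only: the observed range, meant to be edited by the user.
            # "in_range" is an illustrative test name, not a confirmed piperider one.
            tests.append({"name": "in_range", "assert": {"range": [stats["min"], stats["max"]]}})
        # With the empty strategy every column is listed with no tests.
        columns[name] = {"tests": tests}
    return {table: {"columns": columns}}


if __name__ == "__main__":
    profile = {"id": {"min": 1, "max": 5}, "name": {}}
    print(generate_assertions("mytable", profile))
    print(generate_assertions("mytable", profile, strategy=SUGGESTED))
```

Either way the result covers a single table, which matches the generate -> edit -> test loop described above: the empty strategy only lays out the column structure, while the suggested strategy adds a starting point that the user is expected to adjust.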
    

Internal ticket sc-28737

popcornylu avatar Sep 21 '22 03:09 popcornylu

  1. Assertion files are generated for every table. It is not realistic to write all the tests at the same time.

But what if the user just wants to generate assertions from the profiling results instead of writing all the tests early on? Can the user run a single command to generate all suggested assertions?

  1. piperider run will not generate assertions.

If the user already has experience with piperider's profiling/testing and wants to start a new data project, can they run piperider run with generated assertions by passing an option? If so, it would reduce the effort of generating assertions in the new project.

ggosiang avatar Sep 21 '22 04:09 ggosiang

But what if the user just wants to generate assertions from the profiling results instead of writing all the tests early on? Can the user run a single command to generate all suggested assertions?

I don't think there is a perfect rule that generates ready-to-use suggested assertions. I prefer to treat the suggestions as a baseline of assertions rather than a perfect, ready-to-use result.

Can they run piperider run with generated assertions by passing an option? If so, it would reduce the effort of generating assertions in the new project.

Like piperider run --generate-assertions?

Sorry, but I prefer to keep the two journeys separate. The reasons are:

  1. Same as above: I don't think there is a perfect way to generate the suggested assertions.
  2. We can give the user a better experience by optimizing the generate-assertions output to show which assertions were generated and asking the user to edit them.
  3. generate-assertions will involve some interaction. I prefer to keep piperider run simple, so that it only runs the profiling/test pipeline rather than generating assertions as well.

popcornylu avatar Sep 23 '22 09:09 popcornylu

Okay, that makes sense.

I'd like users to have a better experience seeing which assertions are generated and how they can edit them.

ggosiang avatar Sep 26 '22 02:09 ggosiang

Available in v0.13.0

popcornylu avatar Nov 14 '22 09:11 popcornylu