rtables icon indicating copy to clipboard operation
rtables copied to clipboard

Include warning/stop/sanitation for degenerate tables

Open Melkiades opened this issue 2 years ago • 5 comments

As we have tried, and it seems the better way to proceed (degenerate tables are not supported, either sanitize or reconsider the table construction), I think we need to track down what to do next. Following up on my comment here is what I propose:

Here we have two options: a) We sanitize automatically if pruning/trimming or build_table produces degenerate tables / sanitize in input for exporters b) we rely on the user to apply sanitize (and add a better error handling, i.e. with assert_valid_table)

I am inclined to do the second. The involved functions are: (still to complete)

  • [ ] Exporters
  • [ ] Sorting
  • [ ] Pruning & trimming
  • [ ] Utilities
  • [ ] Pagination

Issues that this would close:

  • https://github.com/insightsengineering/rtables/issues/661
  • https://github.com/insightsengineering/rtables/issues/651
  • https://github.com/insightsengineering/rtables/issues/608
  • https://github.com/insightsengineering/rtables/issues/566
  • https://github.com/insightsengineering/rtables/issues/613

Melkiades avatar Jul 07 '23 14:07 Melkiades

By the way, this connects to many many tern issues regarding the various cases where there are only summarize_row_groups. For example:

lyt <- basic_table() %>% 
    split_cols_by("ARM") %>% 
    split_rows_by("SEX", split_fun = drop_split_levels) %>% 
    summarize_row_groups() %>% 
    split_rows_by("STRATA1") %>% 
    summarize_row_groups()
tbl <- lyt %>% build_table(DM)
assert_valid_table(tbl)
sanitize_table_struct(tbl)

This needs to be fixed across tern imo. @edelarua @ayogasekaram @BFalquet fyi

Melkiades avatar Jul 07 '23 14:07 Melkiades

This is a duplicate of #661, but we can put the conversation here.

I think we should warn on the construction of degenerate tables at build_table/subsetting/pruning time. That seems like a good balance of user agency and protection. We can have a silent argument which quashes the warning for build_table, if we want.

gmbecker avatar Jul 07 '23 17:07 gmbecker

Alternatively, for build_table only, we can have it optionally accept an "empty section" message and sanitize if it is given one. This would be in addition to the default warning behavior.

gmbecker avatar Jul 07 '23 17:07 gmbecker

Stupid question: could we even have the warning at the layout construction step? (I dont know if it makes more sense but i am in favour of catching error early)

BFalquet avatar Jul 09 '23 20:07 BFalquet

@BFalquet there are a couple of problems with doing things "at the layout construction step".

The first and largest of them is that there is no layout construction step, the way there is a table construction step (build_table). Layouts are built up via piped expressions. The issue there, nthen, would be determining when we should do the validation (as all layouts will start out "invalid", since they won't have any analysis instructions yet).

The second issue is that this would only detect some degenerate table situatoins. It will do "what it says on the tin", ie it will identify places where the layout is the issue causing degenerate tables to be created (ie no analysis instructions for a faceting hierarchy), but NOT places where the data is causing degenerate tables (unobserved levels, no data in at all, ...)

Taken together, this all means we could have a "validate_layout" or "assert_valid_layout" function, but

  1. it would need to be called explicitly, once the user/developer knows layout construction is complete,
  2. Passing these assertions would not be sufficient to guarantee a priori that a degenerate table will not be created, and
  3. There can't be any sort of corresponding sanitize_layout the way we can do with a constructed table.

gmbecker avatar Jul 11 '23 01:07 gmbecker