"ratio" argument in ggpiestats/ggbarstats seems dysfunctional
When I specify the "ratio" argument in "ggbarstats()" or "ggpiestats()", the one-sample test results seem wrong. However, manually applying the "contingency_table()" function from the "statsExpressions" package to a grouped tibble seems to give the right output.
I inspected the definition of "ggbarstats()" and I fail to understand why the results differ; see the example below.
library(ggstatsplot)
library(statsExpressions)
library(tidyverse)
library(reprex)
#> You can cite this package as:
#> Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
#> Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167
data <- data.frame(
  x = factor(c('low', 'high', 'low', 'low')),
  type = factor(c(0, 0, 1, 1))
)
data %>%
  group_by(type) %>%
  group_modify(~ contingency_table(.x, x, ratio = c(.001, .999))) %>%
  ungroup() %>%
  suppressWarnings()
#> # A tibble: 2 × 14
#> type statistic df p.value method effec…¹ estim…² conf.…³ conf.…⁴ conf.…⁵
#> <fct> <dbl> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 0 499. 1 2.01e-110 Chi-s… Pearso… 0.998 0.95 0.998 1
#> 2 1 0.00200 1 9.64e- 1 Chi-s… Pearso… 0.0316 0.95 0 1
#> # … with 4 more variables: conf.method <chr>, conf.distribution <chr>,
#> # n.obs <int>, expression <list>, and abbreviated variable names ¹effectsize,
#> # ²estimate, ³conf.level, ⁴conf.low, ⁵conf.high
extract_stats(
  ggpiestats(data, x = x, y = type, ratio = c(.001, .999))
)$one_sample_data
#> # A tibble: 2 × 10
#> type counts perc N statistic df p.value method .label .p.la…¹
#> <fct> <int> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#> 1 1 2 50 (n = 2) 2 1 0.157 Chi-squared… list(… list(~…
#> 2 0 2 50 (n = 2) 0 1 1 Chi-squared… list(… list(~…
#> # … with abbreviated variable name ¹.p.label
Created on 2022-11-28 with reprex v2.0.2
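As a cross-check, the per-group statistics can be reproduced in base R with chisq.test(). This is only a sketch for comparison, not a description of what ggpiestats() does internally. It assumes that the proportions in "ratio" are matched to the factor levels of x in their default alphabetical order (high, low); the object names gof_ratio, gof_equal, stat_ratio and stat_equal are made up for this example.

```r
# Same toy data as in the reprex above
data <- data.frame(
  x = factor(c("low", "high", "low", "low")),
  type = factor(c(0, 0, 1, 1))
)

# Expected proportions, assumed to follow the level order of x: high, low
ratio <- c(.001, .999)

# Split x by type; each piece keeps both factor levels, so table() reports zero counts too
x_by_type <- split(data$x, data$type)

# Goodness-of-fit test within each level of type, using the specified proportions
gof_ratio <- lapply(x_by_type, function(x) {
  suppressWarnings(chisq.test(table(x), p = ratio))
})

# The same test with the default null of equal proportions
gof_equal <- lapply(x_by_type, function(x) {
  suppressWarnings(chisq.test(table(x)))
})

# Chi-squared statistics per level of type
stat_ratio <- sapply(gof_ratio, function(res) unname(res$statistic))
stat_equal <- sapply(gof_equal, function(res) unname(res$statistic))

stat_ratio  # roughly 498.5 for type 0 and 0.002 for type 1
stat_equal  # 0 for type 0 and 2 for type 1
```

With the specified proportions, the statistics agree with the grouped contingency_table() output. With equal proportions, they coincide with the values in one_sample_data, which suggests that the one-sample tests there are computed against equal proportions regardless of "ratio".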
I am adding my comment to this post because I have found the same thing. The ratio argument doesn't seem to have any effect when I run ggbarstats or ggpiestats and also specify the y argument. Is this expected?
For example:
ggpiestats(
  data = Titanic_full,
  x = Survived,
  y = Sex,
  ratio = c(.73, .27)
)
ggpiestats(
  data = Titanic_full,
  x = Survived,
  y = Sex,
  ratio = c(.27, .73)
)
When I run these two calls, the results are identical. Is there a way to fix this?
Thanks