
This function returns a tibble of summary statistics computed from the y column produced by a tidy_ distribution function.

Usage

tidy_distribution_summary_tbl(.data, ...)

Arguments

.data

The data passed from a tidy_ distribution function.

...

The grouping variable that is passed to dplyr::group_by() and dplyr::select(). Leave it empty for an ungrouped summary.
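As a rough illustration of how a grouping variable supplied through ... can be forwarded to dplyr::select() and dplyr::group_by(), here is a minimal sketch. The function grouped_mean_sketch() and the toy data frame are hypothetical and are not part of TidyDensity; the real function computes many more statistics and may be implemented differently.

```r
# Hypothetical helper for illustration only; not the TidyDensity source.
# It forwards `...` to dplyr::select() and dplyr::group_by(), then
# summarises the `y` column.
grouped_mean_sketch <- function(.data, ...) {
  selected <- dplyr::select(.data, ..., y)   # keep grouping column(s) and y
  grouped <- dplyr::group_by(selected, ...)  # no-op when `...` is empty
  dplyr::summarise(grouped, mean_val = mean(y), .groups = "drop")
}

# Toy data with the same column names a tidy_ distribution table uses
toy <- data.frame(
  sim_number = factor(rep(1:2, each = 3)),
  y = c(1, 2, 3, 4, 5, 6)
)

ungrouped <- grouped_mean_sketch(toy)              # one summary row
grouped <- grouped_mean_sketch(toy, sim_number)    # one row per simulation
```

With no grouping variable the result has a single row; with sim_number it has one row per simulation, which mirrors the shape of the outputs shown in the Examples section.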

Value

A tibble of summary statistics.

Details

This function takes a tidy_ distribution table and returns a tibble with the following columns:

  • sim_number (when supplied as the grouping variable)

  • mean_val

  • median_val

  • std_val

  • min_val

  • max_val

  • skewness

  • kurtosis

  • range

  • iqr

  • variance

  • ci_low

  • ci_high

The kurtosis and skewness calculations come from the healthyR.ai package.
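To make the columns above concrete, the following base R sketch computes the same set of statistics for a single numeric vector. It is an approximation under stated assumptions, not the package's implementation: summary_sketch() is a hypothetical name, skewness and kurtosis are assumed to be the moment-based (non-excess) versions, and ci_low and ci_high are assumed to be the 2.5% and 97.5% sample quantiles.

```r
# Hypothetical helper for illustration only; not the TidyDensity source.
summary_sketch <- function(y) {
  m <- mean(y)
  m2 <- mean((y - m)^2)  # second central moment (divides by n, not n - 1)
  data.frame(
    mean_val   = m,
    median_val = stats::median(y),
    std_val    = stats::sd(y),
    min_val    = min(y),
    max_val    = max(y),
    # Assumption: moment-based skewness and non-excess kurtosis
    skewness   = mean((y - m)^3) / m2^(3 / 2),
    kurtosis   = mean((y - m)^4) / m2^2,
    range      = diff(range(y)),  # max minus min
    iqr        = stats::IQR(y),
    variance   = stats::var(y),
    # Assumption: interval bounds taken as the 2.5% and 97.5% quantiles
    ci_low     = unname(stats::quantile(y, 0.025)),
    ci_high    = unname(stats::quantile(y, 0.975))
  )
}

set.seed(123)
out <- summary_sketch(stats::rnorm(50))
```

Under these assumptions the kurtosis of a normal sample is close to 3 rather than 0, which is consistent with the values near 3 reported for tidy_normal() in the Examples section.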

Author

Steven P. Sanderson II, MPH

Examples

library(TidyDensity)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

tn <- tidy_normal(.num_sims = 5)
tb <- tidy_beta(.num_sims = 5)

tidy_distribution_summary_tbl(tn)
#> # A tibble: 1 × 12
#>   mean_val median_…¹ std_val min_val max_val skewn…² kurto…³ range   iqr varia…⁴
#>      <dbl>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl> <dbl>   <dbl>
#> 1   0.0124    0.0194   0.962   -3.04    2.59  0.0121    2.99  5.63  1.25   0.926
#> # … with 2 more variables: ci_low <dbl>, ci_high <dbl>, and abbreviated
#> #   variable names ¹​median_val, ²​skewness, ³​kurtosis, ⁴​variance
tidy_distribution_summary_tbl(tn, sim_number)
#> # A tibble: 5 × 13
#>   sim_num…¹ mean_…² median…³ std_val min_val max_val skewn…⁴ kurto…⁵ range   iqr
#>   <fct>       <dbl>    <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl> <dbl>
#> 1 1          0.0154 -0.0612    0.955   -2.38    2.59  0.343     3.39  4.97 1.10 
#> 2 2          0.118   0.0751    1.00    -1.97    2.13 -0.234     2.53  4.10 1.28 
#> 3 3          0.206   0.127     0.838   -1.54    2.02  0.0930    2.75  3.56 0.943
#> 4 4         -0.142  -0.316     0.981   -2.08    2.21  0.338     2.63  4.29 1.37 
#> 5 5         -0.135   0.00496   1.01    -3.04    2.29 -0.207     3.51  5.33 1.28 
#> # … with 3 more variables: variance <dbl>, ci_low <dbl>, ci_high <dbl>, and
#> #   abbreviated variable names ¹​sim_number, ²​mean_val, ³​median_val, ⁴​skewness,
#> #   ⁵​kurtosis

data_tbl <- tidy_combine_distributions(tn, tb)

tidy_distribution_summary_tbl(data_tbl)
#> # A tibble: 1 × 12
#>   mean_val median_…¹ std_val min_val max_val skewn…² kurto…³ range   iqr varia…⁴
#>      <dbl>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl> <dbl>   <dbl>
#> 1    0.239     0.305   0.748   -3.04    2.59  -0.660    4.60  5.63 0.688   0.560
#> # … with 2 more variables: ci_low <dbl>, ci_high <dbl>, and abbreviated
#> #   variable names ¹​median_val, ²​skewness, ³​kurtosis, ⁴​variance
tidy_distribution_summary_tbl(data_tbl, dist_type)
#> # A tibble: 2 × 13
#>   dist_type  mean_…¹ media…² std_val min_val max_val skewn…³ kurto…⁴ range   iqr
#>   <fct>        <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl> <dbl>
#> 1 Gaussian …  0.0124  0.0194   0.962 -3.04     2.59   0.0121    2.99 5.63  1.25 
#> 2 Beta c(1,…  0.466   0.457    0.304  0.0138   0.999  0.104     1.68 0.985 0.556
#> # … with 3 more variables: variance <dbl>, ci_low <dbl>, ci_high <dbl>, and
#> #   abbreviated variable names ¹​mean_val, ²​median_val, ³​skewness, ⁴​kurtosis