library(DFBA)

Introduction

An important aspect of research is planning a forthcoming empirical study. Time and financial resources are limited, so future studies need to be planned so that there is a good chance that the research question is answered. Too few samples compromises the chance that the research question can be answered, but too many samples sinks more time and money than needed into one experiment.

The DFBA package has two functions that are designed to assist researchers in their planning the design and sample size for a forthcoming study where there are two conditions and where there is a univariate continuous measure for each observation. The dfba_bayes_vs_t_power() function and dfba_power_curve() function compute frequentist and Bayesian statistical power. These two functions are discussed together in this vignette. These two functions rely heavily on simulated data from the dfba_sim_data() function. There is a separate vignette that deals with the dfba_sim_data() function. For each function call of the dfba_sim_data() function, data are randomly sampled from two different populations that correspond to generically a control condition and an experimental condition. The difference between the two distributions can be stipulated via an argument delta. The user can also stipulate the probability distribution for the scores from a list of nine different probability models.

The dfba_sim_data() function performs frequentist and Bayesian distribution-free analyses: t-tests and either the Bayesian Wilcoxon signed-rank test or the Bayesian Mann-Whitney test depending on whether a design is repeated-measures or independent-groups study. Given a user-defined criterion stipulated by the effect_crit() argument, the posterior distribution from the Bayesian analysis for a difference between the conditions either exceeds the value of effect_crit() or it does not. The dfba_bayes_vs_t_power() and the dfba_power_curve() functions conduct Monte Carlo experiments by repeatedly calling the dfba_sim_data() function for a given sample size and the value of the delta argument. The proportion of the dfba_sim_data() function calls that obtains a posterior probability greater than the value of the effect_crit argument is an estimate of Bayesian power. The same Monte Carlo collection of data samples is also examined to find the proportion of the samples that resulted in a frequentist p-value from a t-test that is less than 1 minus the effect_crit value. This proportion is an estimate of the frequentist power. For a user-stipulated experimental design, stochastic model for the variates, and a fixed value delta for separation between the conditions, the dfba_bayes_vs_t_power() function computes frequentist and Bayesian power estimates for 11 sample sizes, which vary upward from a user-defined minimum in steps of 5. This function is designed to help the researcher ascertain a good sample size for a forthcoming study given a stipulated separation between the conditions. For a user stipulated experimental design, stochastic model for the variates, and fixed value for the sample size, the dfba_power_curve() function computes power for 21 cases that vary the separation between the conditions from 0 in even increasing steps of a magnitude stipulated in the delta.step argument. This function is designed to help the researcher discover what the influence of the separation between the conditions has on the power. The details of the dfba_bayes_vs_t_power() are discussed in Using the dfba_bayes_vs_t_power() Function; the corresponding discussion of the dfba_power_curve() function is in Using the dfba_power_curve() Function.

As a general rule, the power estimates for the frequentist t-test are somewhat higher than the corresponding Bayesian power estimates for cases where the data originate from a normal distribution. This follows from the assumption of the frequentist t-test – a parametric statistical procedure – that the data are sampled from a normal distribution. However, the Bayesian power is based on rank-order information, and as such does not require the assumption of normally distributed data. Consequently, the user will see that as a general rule the power estimates are higher for the Bayesian distribution-free procedures than the corresponding frequentist power when the data originate from non-normal probability models.

Using the `dfba_bayes_vs_t_power()` Function

The dfba_bayes_vs_t_power() function has three required arguments and nine optional arguments. The three required arguments are: delta, model, and design. The delta argument is the difference between the centrality of the two conditions – e.g., experimental and control at the population level. The model argument is a string indicating one of nine probability models: "normal", "weibull", "cauchy", "lognormal", "chisquare", "logistic", "exponential", "gumbel", or "pareto" (see the vignette for the dfba_sim_data() function for a brief tutorial about any of these distributions). The design argument is a string that takes either "independent" or "paired". When design = "paired", the power estimates are for the case where the scores for the experimental and control conditions are within a block such as scores from the same participant. When design = "independent", the power estimates are for the case where the two conditions are independent (i.e., a repeated-measures design).

The nine optional arguments (along with their default values) are: n_min = 20, effect_crit = .95, shape1 = 1, shape2 = 1, samples = 1000, a0 = 1, b0 = 1, block.max = 0, and hide_progress = FALSE. It is likely that for a particular value of the delta argument the user might vary the value for n_min to see the power estimates for a different range of 11 sample sizes for the two groups. Values greater than 20 can be entered for the n_min argument, but the function will not allow for values less than 20 in order to assure that the sample sizes are large enough to justify the large-n approximation method used for the Bayesian distribution-free analyses.

The shape1 and shape2 inputs allow the user to change features of the population model. The meanings of these two arguments vary with values of the model input. For example, when model = "normal", the shape parameters correspond to the standard deviations for the two variates in the population; for the case of model = "weibull", the shape parameters are the Weibull distribution shape parameters (See the vignette about the dfba_sim_data() function for more details about each distribution).

The samples argument can be adjusted to vary the number of Monte Carlo data sets drawn by the function. The default value of 1000 means that there are 1000 function calls to the dfba_sim_data() function for each of the 11 sample sizes run. If the user wishes to vary that number to obtain either more or less precision in the power estimates, then the samples argument can be adjusted.

The a0 and b0 arguments represent the two shape parameters for the prior beta distribution for the Bayesian distribution-free analyses; the default values a0 = 1 and b0 = 1 correspond to a non-informative prior distribution. Other values of a0 and b0 may be specified to use a different prior distribution, for example, to use an informative prior.

The block.max argument allows the user to examine the effect of block differences on power estimates when design = "independent".

Finally, because Monte Carlo sampling takes a substantial amount of processing time, the function provides a progress counter indicating the percentage of the sampling that has completed; the counter is shown when hide_progress takes the default value of TRUE and is hidden when hide_progress = FALSE (see the example section for an example case where hide_progress = TRUE is used to prevent printing the progress steps to this vignette).

Example 1

The following example shows a power analysis for testing the difference between two independent samples, each drawn from normal distributions where the stipulated difference between the means of the distributions is delta = 0.3. The argument n_min is omitted from the example code, so the default value of n_min = 20 is used. Prospective frequentist and Bayesian power estimates are therefore given for n = {20, 25, …, 70}.

set.seed(1)
A <- dfba_bayes_vs_t_power(delta = 0.3,
                           model = "normal",
                           design = "independent",
                           hide_progress = TRUE)

A

#> Power results for the proportion of samples detecting effects   
#>   where the variates are distributed as a normal random variable 
#>   and where the design is independent 
#>   The number of Monte Carlo samples are:   
#>   1000   
#>   Criterion for detecting an effect is   
#>   0.95   
#>   The delta offset parameter:   
#>   0.3   
#> Output Results: 
#>  sample_size Bayes_power t_power
#>           20       0.219   0.230
#>           25       0.248   0.268
#>           30       0.268   0.290
#>           35       0.336   0.366
#>           40       0.318   0.337
#>           45       0.400   0.422
#>           50       0.418   0.459
#>           55       0.451   0.470
#>           60       0.477   0.507
#>           65       0.474   0.479
#>           70       0.523   0.554

The plot() method produces a visualization of the data in the outputdf object from the dfba_bayes_vs_t_power() values: frequentist and Bayesian power estimates as a function of prospective sample size.

plot(A)

Using the `dfba_power_curve()` Function

The dfba_power_curve() function has two required arguments and ten optional arguments. The required arguments are model and design. As in the dfba_bayes_vs_t_power() function, the model argument is the string name for one of the nine probability models. This argument requires one from the following list: "normal", "weibull", "cauchy", "lognormal", "chisquare", "logistic", "exponential", "gumbel", "pareto" (see the vignette for the dfba_sim_data() function for a brief tutorial about any of these distributions). The design argument is a string variable that takes either "independent" or "paired" as possible values. When design = "paired", the power estimates are for the case where the experimental and the control condition scores are within a block such as in the case of repeated measures. For an independent input, the power estimates are for the case where the two conditions are independent (i.e., a between-groups design). Eight of the optional arguments also belong to the dfba_bayes_vs_t_power() function; listed again here with their default values: a0 = 1, b0 = 1, effect_crit = .95, shape1 = 1, shape2 = 1, block.max = 0, samples = 1000, and hide_progress = FALSE. The other two optional arguments – with their default values – are n = 20 and delta.step = .05. The n argument is an integer value for the sample size in each of the two conditions; it must take values greater than or equal to 20 in order for the Bayesian analyses to use a large-n approximation. Unlike the dfba_bayes_vs_t_power() function, all the power estimates calculated by the dfba_power_curve() function use the same sample size for each condition; the separation between the two condition varies instead. The step-size increments for the delta offset between the conditions is the value for the optional delta.step argument.

Examples

Example 2: Paired Design with Differences Sampled from a Normal Distribution

In the following example, power is estimated for frequentist t-test analysis and Bayesian Wilcoxon test analysis of paired data where the differences are stipulated to be drawn from a normal distribution. In this example code, the argument delta_step is omitted and the default value delta_step = .05 is used; thus: the power analysis evaluates frequentist and Bayesian power at stipulated mean differences of δ = {0, 0.05, 0.10, …, 1}.

B <- dfba_power_curve(n = 40,
                      model = "normal",
                      design = "paired",
                      hide_progress = TRUE)

B

#> Power results for the proportion of samples detecting effects   
#>   where the variates are distributed as a normal random variable 
#>   and where the design is paired 
#>   with a blocking max of  0 
#>   The number of Monte Carlo samples are:   
#>   1000   
#>   Criterion for detecting an effect is   
#>   0.95   
#> The n value per condition:   
#> 40    
#> Output Results: 
#>    delta_value Bayes_power t_power
#> 1         0.00       0.047   0.042
#> 2         0.05       0.077   0.080
#> 3         0.10       0.109   0.111
#> 4         0.15       0.140   0.144
#> 5         0.20       0.230   0.246
#> 6         0.25       0.272   0.294
#> 7         0.30       0.358   0.367
#> 8         0.35       0.443   0.453
#> 9         0.40       0.489   0.527
#> 10        0.45       0.614   0.644
#> 11        0.50       0.666   0.692
#> 12        0.55       0.752   0.780
#> 13        0.60       0.822   0.826
#> 14        0.65       0.865   0.878
#> 15        0.70       0.930   0.935
#> 16        0.75       0.938   0.944
#> 17        0.80       0.967   0.978
#> 18        0.85       0.971   0.979
#> 19        0.90       0.987   0.990
#> 20        0.95       0.992   0.993
#> 21        1.00       0.994   0.998

Note that the differences in power estimates in this example are small but indicate greater for the parametric t-test than for the Bayesian Wilcoxon test at nearly all stipulated mean difference from 0 to 1.

plot(B)

Example 3: Paired Design with Differences Sampled from a Weibull Distribution

In the following example, power is estimated for frequentist t-test analysis and Bayesian Wilcoxon test analysis of paired data where the differences are stipulated to be drawn from a Weibull distribution. As in the example 2 above, the argument delta_step is omitted and the default value delta_step = .05 is used; thus: the power analysis evaluates frequentist and Bayesian power at stipulated mean differences of δ = {0, 0.05, 0.10, …, 1}.

C<- dfba_power_curve(n = 40,
                     model = "weibull",
                     design = "paired",
                     shape1=.8,
                     shape2=.8,
                     hide_progress = TRUE)

C

#> Power results for the proportion of samples detecting effects   
#>   where the variates are distributed as a weibull random variable 
#>   and where the design is paired 
#>   with a blocking max of  0 
#>   The number of Monte Carlo samples are:   
#>   1000   
#>   Criterion for detecting an effect is   
#>   0.95   
#> The n value per condition:   
#> 40    
#> Output Results: 
#>    delta_value Bayes_power t_power
#> 1         0.00       0.052   0.051
#> 2         0.05       0.076   0.062
#> 3         0.10       0.113   0.108
#> 4         0.15       0.159   0.126
#> 5         0.20       0.224   0.141
#> 6         0.25       0.306   0.206
#> 7         0.30       0.379   0.269
#> 8         0.35       0.457   0.302
#> 9         0.40       0.557   0.384
#> 10        0.45       0.617   0.426
#> 11        0.50       0.680   0.509
#> 12        0.55       0.725   0.545
#> 13        0.60       0.800   0.607
#> 14        0.65       0.858   0.665
#> 15        0.70       0.884   0.707
#> 16        0.75       0.904   0.744
#> 17        0.80       0.924   0.787
#> 18        0.85       0.949   0.847
#> 19        0.90       0.957   0.872
#> 20        0.95       0.969   0.877
#> 21        1.00       0.982   0.909

Please note that with the stipulation of differences drawn not from a normal distribution (rather, in this case, from a Weibull distribution), that the power estimates are consistently greater in this example for the Bayesian distribution-free analysis than for the corresponding parametric frequentist analysis. The pattern can be visualized using the plot() method:

plot(C)

dfba_power_functions