This vignette provides guidance for designing an optimal two-period
K+M-experimental arm platform trial, in which M delayed experimental arms
are added during the trial, using the package PlatformDesign while
controlling the family-wise type-I error rate (FWER) or the pair-wise type-I
error rate (PWER). The K+M-experimental arm trial has K experimental
arms and one control arm during the first period, and M further
experimental arms are added at the start of the second period. The one
common control arm is shared among all experimental arms across the
trial. The design method suits any K+M-experimental arm trial; the
examples here show how to design a 2+2-experimental arm trial (see Fig
1 for the scheme). The goal of the proposed method is to find the
optimal design with the minimum total sample size while keeping the
marginal and disjunctive power no less than their counterparts in the
K-experimental arm trial it is based on, controlling the FWER or
PWER.
The following steps fall into two parts: 1) Steps 1 to 5 derive the design parameters of the K-experimental arm trial that the K+M-experimental arm trial is based on. The K+M-experimental arm trial is based on the K-experimental arm trial in the sense that we keep the same FWER and marginal power for the K+M-experimental arm trial as in the K-experimental arm trial, even though M new experimental arms are added during the second period of the K+M-experimental arm trial. 2) Steps 6 to 14 illustrate how the design parameters are calculated for the K+M-experimental arm trial.
Users who are less interested in the theoretical details can skip the other steps and focus on Steps 1, 7, 13, and 14.
Four design parameters for the K-experimental arm trial should be pre-specified: the number of experimental arms (K), the family-wise error rate (FWER1), the marginal type II error (β1), and the allocation ratio (control to each experimental arm, denoted as A1). In our method, we use $A_1=\sqrt{K}$, according to the K-root optimal allocation rule. In the following code, we assume K = 2, FWER1 = 0.025, β1 = 0.2, and $A_1=\sqrt{2}$. In addition, zβ1 (z_beta1 in the function output) is the critical value corresponding to a power of 1 − β1.
We use Z1 and Z2 to denote the two test statistics for comparing the two experimental arms to the control in the original design. Given A1, the correlation between Z1 and Z2 (denoted as ρ0) and the correlation matrix (denoted as Σ1) can be calculated (see below).
First, we have $$\rho_{0} = \frac{n_{0kk^{'}}}{\frac{(n_{01})^{2}}{n_{1}}+n_{01}}$$ where n1 is the number of patients on each of the experimental arms, n01 is the number of patients on the control in the K-experimental arm trial, and n0kk′ is the number of control patients shared between two experimental arms k and k′.
Since the two experimental arms share a common control arm, we have n0kk′ = n01, and the correlation of Z1 and Z2 is computed as $$\rho_0 = \frac{1}{(n_{01}/n_{1}+1)}=\frac{1}{(A_1+1)}=0.4142$$
We can see that in our “2+2”-experimental arm example, where K=2, we have $$\Sigma_{1} =\begin{bmatrix} 1& \rho_0\\ \rho_0& 1 \end{bmatrix}=\begin{bmatrix} 1& 0.4142\\ 0.4142& 1 \end{bmatrix}$$
Based on the above description, the function one_stage_multiarm can be used to find ρ0 and the correlation matrix Σ1 of Z1 and Z2 as shown below.
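The quantities above can also be checked by hand. The following base-R sketch does not call the package; it only evaluates the formula ρ0 = 1/(A1 + 1) for K = 2, and the variable names are chosen for illustration.

```r
# Hand check of rho_0 and Sigma_1 for the 2-experimental arm trial (base R only).
K  <- 2
A1 <- sqrt(K)            # K-root allocation ratio (control : each experimental arm)
rho0 <- 1 / (A1 + 1)     # correlation between any two test statistics

# Exchangeable K x K correlation matrix: 1 on the diagonal, rho0 elsewhere
Sigma1 <- matrix(rho0, nrow = K, ncol = K)
diag(Sigma1) <- 1

round(rho0, 4)
Sigma1
```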
Given K, Σ1 and FWER1, we can find the associated critical value (denoted as z1 − α1) for the marginal type I error rate in the 2-experimental arm trial (denoted as α1) based on the following equation.
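$$FWER_1 = 1-\int_{-\infty}^{z_{1-\alpha_1}}\int_{-\infty}^{z_{1-\alpha_1}}\cdots\int_{-\infty}^{z_{1-\alpha_1}}\pi_z\left(Z(z_1, z_2, \ldots, z_K), 0, \Sigma_1\right)dz_1dz_2\cdots dz_K$$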
This calculation can also be achieved using the function one_stage_multiarm.
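As a rough cross-check of this step, the critical value can be approximated without the package. The sketch below is an assumption-laden stand-in, not the package's algorithm: it uses the fact that an exchangeable correlation ρ0 lets the K-dimensional integral collapse to a one-dimensional one, and solves for the critical value with `uniroot`. The helper `p_all_below` is hypothetical.

```r
# Sketch: Dunnett-type critical value for K arms with exchangeable correlation.
K     <- 2
fwer1 <- 0.025
rho0  <- 1 / (sqrt(K) + 1)

# P(Z_1 < crit, ..., Z_K < crit) when Z_k = sqrt(rho) * Y + sqrt(1 - rho) * X_k,
# with Y and X_k independent standard normal (hypothetical helper).
p_all_below <- function(crit, K, rho) {
  integrand <- function(y) {
    dnorm(y) * pnorm((crit - sqrt(rho) * y) / sqrt(1 - rho))^K
  }
  # The normal density is negligible outside [-8, 8]
  integrate(integrand, lower = -8, upper = 8, rel.tol = 1e-8)$value
}

# Solve FWER1 = 1 - P(all Z_k < crit) for crit
z_alpha1 <- uniroot(function(crit) 1 - p_all_below(crit, K, rho0) - fwer1,
                    interval = c(1, 4), tol = 1e-8)$root
alpha1 <- 1 - pnorm(z_alpha1)   # marginal type I error rate

z_alpha1
alpha1
```

The result should agree closely with the `$z_alpha1` value printed later in this vignette.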
Given β1, $A_1=\sqrt{K}$, an effective standardized effect size Δ (assumed to be 0.4 for all experimental arms in our paper), and z1 − α1 derived above, we can derive sample sizes for the experimental (n1) and control arms ($n_{0_1}=A_1n_1=\sqrt{K}n_1$), respectively, as shown below.
Since we have
$$z_{1-\alpha_1}+z_{1-\beta_1}=\frac{\mu_i-\mu_0}{\sigma\sqrt{\frac{1}{n_{1}}+\frac{1}{\sqrt{K}n_{1}}}}=\frac{\Delta }{\sqrt{\frac{1}{n_{1}}+\frac{1}{\sqrt{K}n_{1}}}}$$ it follows that
$$n_{1}= \frac{(z_{1-\alpha_1}+z_{1-\beta_1})^2}{\Delta^2}\left(1+\frac{\sqrt{K}}{K}\right)$$
and $$n_{0_{1}}= \frac{(z_{1-\alpha_1}+z_{1-\beta_1})^2}{\Delta^2}(1+\sqrt{K})$$
In sum, the total sample size of the K-experimental arm trial is N1 = Kn1 + n01.
With our “2+2” example, we again use the function one_stage_multiarm to derive the sample sizes for the first period.
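The sample size formulas can be evaluated directly. The sketch below plugs in the critical value reported in this vignette's output (2.220604) and rounds up to whole patients; rounding n1 first and then taking the control size as the rounded-up value of √K·n1 is an assumption made here because it reproduces the printed values, not a documented rule of the package.

```r
# Sketch: first-period sample sizes from the closed-form expressions.
K        <- 2
delta    <- 0.4
z_alpha1 <- 2.220604     # critical value taken from the output in this vignette
z_beta1  <- qnorm(0.8)   # critical value for a marginal power of 0.8

base <- (z_alpha1 + z_beta1)^2 / delta^2
n1   <- ceiling(base * (1 + sqrt(K) / K))  # per experimental arm
n0_1 <- ceiling(sqrt(K) * n1)              # control arm (assumed rounding)
N1   <- K * n1 + n0_1                      # total for the K-arm trial

c(n1 = n1, n0_1 = n0_1, N1 = N1)
```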
With β1 and Σ1, we can also derive the overall (disjunctive) power Ω1 = 0.922 based on the equation below.
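$$\Omega_1 = 1-\int_{-\infty}^{z_{\beta_1}}\int_{-\infty}^{z_{\beta_1}}\cdots\int_{-\infty}^{z_{\beta_1}}\pi_z\left(Z(z_1, z_2, \ldots, z_K), 0, \Sigma_1\right)dz_1dz_2\cdots dz_K$$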
This result is also included as part of the output from the function one_stage_multiarm, i.e., $Power1.
From the above outputs, we can see that the computed disjunctive power is 0.922.
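The disjunctive power can be approximated in the same spirit as the critical value. This is a sketch under the assumption of an exchangeable correlation ρ0, using a one-dimensional integral instead of the package's multivariate normal routine; the upper integration limit is taken as the β1 quantile of the standard normal, qnorm(0.2).

```r
# Sketch: disjunctive power for K arms with exchangeable correlation.
K     <- 2
beta1 <- 0.2
rho0  <- 1 / (sqrt(K) + 1)
z_low <- qnorm(beta1)    # beta_1 quantile (negative), upper limit of the integral

# P(no arm is declared effective) under the alternative, as a 1-D integral
integrand <- function(y) {
  dnorm(y) * pnorm((z_low - sqrt(rho0) * y) / sqrt(1 - rho0))^K
}
p_none <- integrate(integrand, lower = -8, upper = 8, rel.tol = 1e-8)$value

Omega1 <- 1 - p_none
round(Omega1, 3)
```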
In Steps 1 to 5, we demonstrated how, given K, FWER1, the marginal power 1 − β1 and the standardized effect size Δ, to derive the marginal type I error rate α1 and its corresponding critical value, the sample size n1 for each of the experimental arms and n01 for the control arm, and the disjunctive power Ω1 for the K-experimental arm trial.
In sum, the function one_stage_multiarm in the R package PlatformDesign can complete Steps 1 to 5 all at once. Below are the complete outputs generated by applying this function.
multi
#> $K
#> [1] 2
#>
#> $n1
#> [1] 101
#>
#> $n0_1
#> [1] 143
#>
#> $N1
#> [1] 345
#>
#> $z_alpha1
#> [1] 2.220604
#>
#> $FWER1
#> [1] 0.025
#>
#> $z_beta1
#> [1] 0.8416212
#>
#> $Power1
#> [1] 0.9222971
#>
#> $corMat1
#>           [,1]      [,2]
#> [1,] 1.0000000 0.4142136
#> [2,] 0.4142136 1.0000000
#>
#> $delta
#> [1] 0.4
Building on these results, Steps 6 to 14 introduce our proposed method for adding new arms.
Timing is the first component to consider when planning to add new experimental arms during a study. In practice, the timing can be expressed as a fraction. In this paper, we define the timing by the number of patients already enrolled in each experimental arm (denoted as nt) when the new arms are added. By this definition, at the time of adding new arms, the number of patients enrolled in the control arm is n0t = [A1nt].
For example, the following code describes a scenario in which new arms are added when 30 patients have been enrolled in each of the experimental arms.
Then we need to decide the family-wise error rate when adding new arms, denoted as FWER2. In this example, we control FWER2 at the same level as FWER1. With FWER2, we can find the marginal type I error rate (denoted as α2). With α2, we can find the updated marginal power (denoted as ω2 = 1 − β2), and then the disjunctive power (denoted as Ω2). We describe these updates in detail in the following steps.
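A minimal sketch of this timing scenario in base R follows. Reading the bracket in n0t = [A1nt] as rounding up is an assumption made here because it matches the value n0t = 43 used later in this vignette; the fraction is simply nt relative to the first-period arm size n1 = 101.

```r
# Sketch: timing of adding new arms, expressed in enrolled patients.
K   <- 2
A1  <- sqrt(K)            # first-period allocation ratio
n1  <- 101                # per-arm sample size of the K-arm trial (from the output above)
nt  <- 30                 # patients per experimental arm when new arms are added
n0t <- ceiling(A1 * nt)   # control patients enrolled at that time (assumed rounding up)

timing_fraction <- nt / n1   # timing expressed as a fraction of n1
c(nt = nt, n0t = n0t)
round(timing_fraction, 3)
```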
The goal of a two-period K+M experimental arm platform design is to
minimize the sample size (denoted as N2) while keeping the
marginal power (ω2) and disjunctive power (Ω2) no less than their
counterparts in the first period (ω1 and Ω1). That is, for our “2+2”
example, we set the lower limit of ω2 (denoted as ω2min) to be 0.8 and
the lower limit of Ω2 (denoted as Ω2min) to be 0.922 (which is obtained
by using the function one_stage_multiarm in the previous steps). These
two limits will be used to select the recommended optimal
design(s) (details shown in Step 13).
Because we need to keep FWER2 equal to FWER1 when adding new arms, we then have to update n1 to n2 and n01 to n02 for each experimental arm (See Fig 1). Here n2 and n02 are the sample sizes for each of the experimental arms and for the concurrent control in the 2+2-experimental arm trial.
We define an admissible set for pairs of (n2, n02) based on the following three constraints. The first two constraints for (n2, n02) are related to the allocation ratio after adding the new arms, denoted as A2. In the first period with two experimental arms (before adding the new arms), the control allocation ratio is A1. Once the two new experimental arms are added, we need to use an allocation ratio A2 to achieve the desired design properties (e.g., control the FWER and achieve the marginal power). After the two initially opened experimental arms are completed, the trial will again have only two experimental arms left. Therefore, the allocation ratio will go back from A2 to A1.
Here we have,
A2 = (n02 − n0t)/(n2 − nt) > 0
where nt and n0t are the numbers of enrolled patients for each of the experimental arms and the control at the time of adding new arms. That is, the value of A2 needs to be a finite positive number. For example, the first two constraints in our “2+2” example are
n02 > n0t = 43 and n2 > nt = 30
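To make the role of A2 concrete, the sketch below evaluates the formula for one pair (n2, n02) = (107, 198), the design discussed later in this vignette, and checks the first two constraints. It is plain arithmetic in base R and does not call the package.

```r
# Sketch: second-period allocation ratio A2 for one candidate pair.
nt   <- 30;  n0t  <- 43    # enrolled per experimental arm / control when arms are added
n2   <- 107; n0_2 <- 198   # candidate per-arm and concurrent-control sample sizes

# First two constraints: both differences must be strictly positive
constraints_ok <- (n0_2 > n0t) && (n2 > nt)

A2 <- (n0_2 - n0t) / (n2 - nt)   # control : experimental allocation after adding arms
constraints_ok
round(A2, 2)
```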
We also need to set an upper limit for the total sample size N2. A reasonable upper limit is that N2 should not exceed the required sample size (denoted as S) of separately conducting two multiarm trials, i.e., a K-experimental arm trial and an M-experimental arm trial. To recap, K and M refer to the numbers of initially and newly added experimental arms.
Therefore, S can be calculated as $$ S= \frac{(z_{1-\alpha_1}+z_{1-\beta_1})^2}{\Delta^2}(1+2\sqrt{K}+K) + \frac{(z_{1-\alpha_1^*}+z_{1-\beta_1})^2}{\Delta^2}(1+2\sqrt{M}+M). $$
where z1 − α1 and z1 − α1* are the critical values for the K- and M-experimental arm trials, respectively. In our “2+2” example, z1 − α1 = z1 − α1* as K = M = 2, and S = 2N1 = 690.
Therefore, the third constraint for (n2, n02) is N2 = (K + M)n2 + n02 + n0t < S = 690.
Under the above three constraints, the admissible set of (n2, n02)
can be identified using the function admiss (integer points
in the triangular area in Fig. 2). The data set pair3 contains all
(n2, n02) pairs satisfying the 3 constraints introduced above.
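The admissible set can also be enumerated by brute force. The sketch below is not the package's admiss function; it lists all integer pairs in base R. It uses a non-strict upper bound N2 ≤ S, which is an assumption made here because it reproduces the 29040 pairs quoted in Step 13.

```r
# Sketch: brute-force enumeration of the admissible (n2, n0_2) pairs.
K <- 2; M <- 2
nt  <- 30
n0t <- 43
S   <- 690   # total size of two separate 2-arm trials (2 * N1)

# Candidate grid already satisfying n2 > nt and n0_2 > n0t
grid <- expand.grid(n2 = (nt + 1):S, n0_2 = (n0t + 1):S)

# Total sample size of the K+M-arm trial for each pair
grid$N2 <- (K + M) * grid$n2 + grid$n0_2 + n0t

# Third constraint (assumed non-strict): total must not exceed S
admissible <- grid[grid$N2 <= S, ]

nrow(admissible)
range(admissible$n2)
```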
For each pair of (n2, n02) in the admissible set, the correlation matrix Σ2 of the four test statistics (Z1, Z2, Z3, and Z4) can be derived from n2 and n02. The correlation between two experimental arms is
$$\rho_{k, k^{'}} = \frac{n_{0kk^{'}}}{\frac{(n_{0_{2}})^{2}}{n_{2}}+n_{0_{2}}}$$
Specifically, between the initially opened two experimental arms (and between the two newly added arms), the shared control now is n0kk′ = n02. Therefore, the correlation of Z statistics between these two initially opened experimental arms (and between the two newly added arms) is
$$\rho_1 = \frac{1}{(n_{0_2}/n_2+1)}$$.
The number of shared controls between one initially opened experimental arm and one delayed experimental arm is n0kk′ = n02 − n0t. Therefore, the correlation of the Z test statistics between these two experimental arms is
$$\rho_2 = \frac{n_{0_2}-n_{0t}}{(n_{0_2}^2/n_2+n_{0_2})}$$.
To be specific, in our 2+2-experimental arm example, we have Σ2 as, $$\Sigma_2 =\begin{bmatrix} 1 & \rho_1 & \rho_2 & \rho_2\\ \rho_1 & 1& \rho_2& \rho_2\\ \rho_2 & \rho_2& 1& \rho_1\\ \rho_2 & \rho_2& \rho_1 & 1 \end{bmatrix}$$
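As an illustration, Σ2 can be assembled by hand for a single pair. The sketch below uses (n2, n02) = (107, 198) with n0t = 43, the design discussed later in this vignette; the variable names are chosen for illustration only.

```r
# Sketch: correlation matrix Sigma_2 for one (n2, n0_2) pair in the 2+2 example.
n2   <- 107
n0_2 <- 198
n0t  <- 43

rho1 <- 1 / (n0_2 / n2 + 1)                    # arms opened in the same period
rho2 <- (n0_2 - n0t) / (n0_2^2 / n2 + n0_2)    # one initial arm vs. one delayed arm

Sigma2 <- matrix(rho2, nrow = 4, ncol = 4)     # start with the cross-period value
Sigma2[1:2, 1:2] <- rho1                       # block for the two initial arms
Sigma2[3:4, 3:4] <- rho1                       # block for the two added arms
diag(Sigma2) <- 1

round(Sigma2, 4)
```

Because fewer controls are shared across periods, rho2 is smaller than rho1.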
Now we can use FWER2 and Σ2 to calculate the marginal type I error α2 and the corresponding critical value z1 − α2 for each pair of (n2, n02) in the admissible set found above.
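$$FWER_2 = 1-\int_{-\infty}^{z_{1-\alpha_2}}\int_{-\infty}^{z_{1-\alpha_2}}\cdots\int_{-\infty}^{z_{1-\alpha_2}}\pi_z\left(Z(z_1, z_2, \ldots, z_{K+M}), 0, \Sigma_2\right)dz_1dz_2\cdots dz_{K+M}$$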
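One way to get a feel for this equation is simulation. The sketch below is a Monte Carlo stand-in, not the package's numerical integration: it draws four correlated standard normal statistics under the null for the pair (n2, n02) = (107, 198) and takes the 0.975 quantile of their maximum as an approximate critical value. The seed and the number of draws are arbitrary choices.

```r
# Sketch: Monte Carlo approximation of the critical value z_{1-alpha_2}.
set.seed(2022)
n2 <- 107; n0_2 <- 198; n0t <- 43
fwer2 <- 0.025

rho1 <- 1 / (n0_2 / n2 + 1)
rho2 <- (n0_2 - n0t) / (n0_2^2 / n2 + n0_2)
Sigma2 <- matrix(rho2, 4, 4)
Sigma2[1:2, 1:2] <- rho1
Sigma2[3:4, 3:4] <- rho1
diag(Sigma2) <- 1

# Correlated null statistics: rows of Z have correlation matrix Sigma2
n_sim <- 5e5
Z <- matrix(rnorm(n_sim * 4), ncol = 4) %*% chol(Sigma2)

# FWER2 = P(max_k Z_k > crit), so crit is the (1 - FWER2) quantile of the maximum
z_max    <- pmax(Z[, 1], Z[, 2], Z[, 3], Z[, 4])
z_alpha2 <- unname(quantile(z_max, probs = 1 - fwer2))
alpha2   <- 1 - pnorm(z_alpha2)

round(z_alpha2, 2)
```

Up to simulation error, the result should be close to the critical value of 2.475 quoted for this design in Step 13.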
With n1, n01, α1, β1 and α2, we can use the following equation to calculate the marginal power ω2 = 1 − β2 for each pair of (n2, n02) from z1 − β2.
$$z_{1-\beta_2}=\sqrt{\frac{\frac{1}{n_1}+\frac{1}{n_{0_1}}}{\frac{1}{n_2}+\frac{1}{n_{0_2}}}}(z_{1-\alpha_1} + z_{1-\beta_1})- z_{1-\alpha_2}$$
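The marginal power formula is simple enough to evaluate by hand. The sketch below does so for the pair (n2, n02) = (107, 198), plugging in the critical values reported in this vignette (2.220604 for the 2-arm trial and the rounded value 2.475 for this design); because of that rounding the result is approximate.

```r
# Sketch: updated marginal power omega_2 for one (n2, n0_2) pair.
n1 <- 101; n0_1 <- 143     # K-arm trial sample sizes
n2 <- 107; n0_2 <- 198     # candidate K+M-arm sample sizes
z_alpha1 <- 2.220604       # critical value of the K-arm trial (from the output above)
z_beta1  <- qnorm(0.8)     # critical value for the original marginal power
z_alpha2 <- 2.475          # critical value for this design (rounded, from Step 13)

# Ratio of the variances of the effect estimates in the two designs
var_ratio <- (1 / n1 + 1 / n0_1) / (1 / n2 + 1 / n0_2)

z_1mbeta2 <- sqrt(var_ratio) * (z_alpha1 + z_beta1) - z_alpha2
omega2    <- pnorm(z_1mbeta2)   # marginal power 1 - beta_2

round(omega2, 3)
```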
With β2 and Σ2, we can derive the new disjunctive power Ω2 for each pair of (n2, n02) using the following equation.
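$$\Omega_2 = 1-\int_{-\infty}^{z_{\beta_2}}\int_{-\infty}^{z_{\beta_2}}\cdots\int_{-\infty}^{z_{\beta_2}}\pi_z\left(Z(z_1, z_2, \ldots, z_{K+M}), 0, \Sigma_2\right)dz_1dz_2\cdots dz_{K+M}$$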
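The disjunctive power can likewise be approximated by simulation. This is a Monte Carlo sketch for the pair (n2, n02) = (107, 198), not the package's computation; it takes the upper limit of the integral to be the β2 quantile, i.e. the negative of z1 − β2 from the marginal power formula, and uses the rounded critical value 2.475.

```r
# Sketch: Monte Carlo approximation of the disjunctive power Omega_2.
set.seed(2022)
n1 <- 101; n0_1 <- 143
n2 <- 107; n0_2 <- 198; n0t <- 43

rho1 <- 1 / (n0_2 / n2 + 1)
rho2 <- (n0_2 - n0t) / (n0_2^2 / n2 + n0_2)
Sigma2 <- matrix(rho2, 4, 4)
Sigma2[1:2, 1:2] <- rho1
Sigma2[3:4, 3:4] <- rho1
diag(Sigma2) <- 1

# z_{1-beta_2} from the marginal power formula; the integral limit is its negative
z_1mbeta2 <- sqrt((1 / n1 + 1 / n0_1) / (1 / n2 + 1 / n0_2)) *
  (2.220604 + qnorm(0.8)) - 2.475
z_low <- -z_1mbeta2

# Centered test statistics with correlation matrix Sigma2
n_sim <- 5e5
Z <- matrix(rnorm(n_sim * 4), ncol = 4) %*% chol(Sigma2)

# No arm is declared effective when every centered statistic falls below z_low
p_none <- mean(pmax(Z[, 1], Z[, 2], Z[, 3], Z[, 4]) < z_low)
Omega2 <- 1 - p_none

round(Omega2, 2)
```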
In our “2+2” example, we calculate ω2 and Ω2 from all of the 29040 pairs of (n2, n02) in the entire admissible set. We then perform a 2-step selection procedure to obtain the recommended optimal design(s): 1) first we only keep the designs with ω2 and Ω2 above or equal to ω1 and Ω1, respectively (i.e., lower limits decided in Step 7); 2) next, among those selected ones we choose the designs with the smallest total sample size N2.
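The two-step selection can be sketched on a small table. The candidate values below are made up purely for illustration (they are not output of the package); only the filtering logic mirrors the procedure described above.

```r
# Sketch: two-step selection of optimal designs on hypothetical candidates.
omega1 <- 0.8     # lower limit for the marginal power
Omega1 <- 0.922   # lower limit for the disjunctive power

# Hypothetical candidate designs (illustrative numbers only)
cand <- data.frame(
  design            = c("a", "b", "c", "d"),
  N2                = c(660, 669, 669, 675),
  marginal_power    = c(0.79, 0.80, 0.81, 0.83),
  disjunctive_power = c(0.980, 0.985, 0.930, 0.990),
  stringsAsFactors  = FALSE
)

# Step 1: keep designs meeting both power limits
keep <- cand[cand$marginal_power >= omega1 & cand$disjunctive_power >= Omega1, ]

# Step 2: among those, keep the designs with the smallest total sample size
optimal <- keep[keep$N2 == min(keep$N2), ]
optimal
```

Design "a" has the smallest N2 but fails the marginal power limit, so the ties at the next smallest N2 are returned, which is why several designs can be recommended at once.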
Given nt, K, M, FWER1, ω1, and Δ, the function platform_design
can provide the optimal designs with the
minimum total sample size among those having ω2 and Ω2 no less than their
counterparts in the K-experimental arm trial.
design <- platform_design(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
design$designs
The first part of the outputs ($design_Karm) contains
the parameters for the K-experimental arm trial. The second part
($designs) contains the parameters for the K+M experimental
arm trial designed based on the former. From the above
($designs), we can see that it is possible to have multiple
recommended designs which all have the same total sample size N2. We
provide a full list of useful parameters for each of the recommended
optimal designs.
For example, if we choose design # 15669 for this 2+2-experimental arm trial, the corresponding critical value for controlling the FWER at 0.025 is 2.475. The marginal power is 0.8, and the disjunctive power is 0.985, both no less than their counterparts in the 2-experimental arm trial. The required total sample size is N2 = 669. Among the 669 patients, in the first period the sample sizes for each experimental arm and the control are nt = 30 and n0t = 43, with an allocation ratio of A1 = 1.414. Once the two additional experimental arms are added, the optimal allocation ratio changes to A2 = 2.01 for the overlapping stage of the second period. The allocation ratio will change back to A1 once the two initial experimental arms are closed to accrual. Through the entire 2+2-experimental arm trial, the sample size for each experimental arm is n2 = 107. The sample size for the concurrent control of each experimental arm is n02 = 198. Note that the sample size for the entire control arm, concurrent and non-concurrent combined, is nc = 241. The reduction in the total sample size compared with two separate 2-experimental arm trials is 21.
As we can see from Step 13, although the total sample size is the same for all the 4 recommended designs, the other parameters can be different. Therefore, we can choose a final design based on our needs according to the other parameters. For example, if we would like to choose a design with the largest disjunctive power Ω2, our final choice goes to the design in the last row of the result above (design #16632).
If ω2min and Ω2min in Step 7 cannot be met at the same time, the algorithm in the function platform_design will return the designs with the smallest N2 that meet only one of the two limits. If we are not satisfied with the result, or if neither of the two limits can be met, we can choose from the three options below:
1. Go back to Step 7 and decrease the value of ω2min, then repeat Steps 8 to 14. This can be done only if a marginal power lower than ω1 is acceptable, which partially compromises the goal of the design.
2. Go back to Step 6 and set a smaller nt (and therefore a smaller n0t), that is, increase the overlap between the initial and the later added experimental arms, then repeat Steps 8 to 14. The rationale is that the later the new arms are added, the less likely we are to find designs satisfying both limits defined in Step 7. This is feasible only if the situation allows us to change the timing of adding the new arms.
3. Consider controlling the PWER instead of the FWER, as illustrated in the next example.
If users do not wish to control the FWER, or if controlling the FWER
cannot be achieved given the required power levels, then we recommend using the
function platform_design() with the argument fwer replaced
by pwer to plan the K+M-experimental arm trial. If we
plan to add 2 new experimental arms when 30 patients have already been
enrolled in each of the 2 initial experimental arms, given the pair-wise
type-I error controlled at 0.025 and the marginal power equal to 0.8, we
can use the following code to get the design parameters. Here, five
optimal designs are recommended and each row is an individual design.
Notably, we can save 87 patients with this design compared to 2 separate
multiarm trials.
design2 <- platform_design(nt=30, K=2, M=2, pwer=0.025, marginal.power=0.8, delta=0.4,seed=123)
design2$designs
The main difference between using pwer instead of fwer in
platform_design() is that it does not use the
Dunnett method to derive critical values; instead, it calculates them
directly from the user-defined pair-wise type I error. Notably, the
upper limit S for the total sample size N2 that is
used to find the admissible set when controlling the PWER is
constructed from multiarm trials (one K- and one M-experimental arm
trial) that also control the PWER in the function
platform_design(). The sample sizes for the multiarm trials (controlling the
PWER) can also be calculated with the function one_stage_multiarm().
Other aspects of the algorithms are similar between the two applications
of the platform_design() function.
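To illustrate the difference, the sketch below computes the PWER critical value directly from the normal quantile and evaluates the earlier sample size formulas with it. It is base R only; the rounding convention (round n1 up, then round √K·n1 up) is an assumption carried over from the FWER example, so the resulting upper limit S is indicative rather than guaranteed to match the package.

```r
# Sketch: critical value and reference sample sizes when controlling the PWER.
K     <- 2
pwer  <- 0.025
delta <- 0.4

z_pwer  <- qnorm(1 - pwer)   # no multiplicity adjustment: plain normal quantile
z_beta1 <- qnorm(0.8)        # critical value for a marginal power of 0.8

n1   <- ceiling((z_pwer + z_beta1)^2 / delta^2 * (1 + sqrt(K) / K))
n0_1 <- ceiling(sqrt(K) * n1)   # assumed rounding convention
N1   <- K * n1 + n0_1
S    <- 2 * N1                  # two separate 2-arm trials, since K = M = 2

c(z_pwer = round(z_pwer, 3), n1 = n1, n0_1 = n0_1, N1 = N1, S = S)
```

The critical value is smaller than the Dunnett-adjusted 2.2206, which is why the PWER-controlled trials need fewer patients.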
The function platform_design2() is a faster version of
platform_design(). It adopts a more efficient algorithm to
find optimal design(s) for a two-period K+M experimental arm platform
trial by searching for the optimal design starting from the maximum
possible N_2 value. The usage of this function is exactly
the same as that of platform_design(), except that
platform_design2() returns NULL for $designs
when no optimal design exists and does not provide the sub-optimal
designs that satisfy only the minimum power level requirement for the
disjunctive power. When an optimal design exists, the results from the two
functions are the same.
The two code chunks below can be used to compare the time needed to find
the optimal design using platform_design2() and platform_design().
start_time <- Sys.time()
test <- platform_design2(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
end_time <- Sys.time()
end_time - start_time
# Time difference of 41.85487 secs
start_time <- Sys.time()
test2 <- platform_design(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
end_time <- Sys.time()
end_time - start_time
# Time difference of 8.188013 mins
Notably, the time saving is greatest when no optimal design exists under the conditions specified by the user.
start_time <- Sys.time()
platform_design2(nt = 50, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
#> The lower limits of marginal and disjunctive power cannot be met at the same time.
#> $design_Karm
#>   K  n1 n0_1  N1 z_alpha1 FWER1   z_beta1    Power1      cor0 delta
#> 1 2 101  143 345 2.220604 0.025 0.8416212 0.9222971 0.4142136   0.4
#>
#> $designs
#> NULL
end_time <- Sys.time()
# Time difference
end_time - start_time
#> Time difference of 0.709923 secs
1. Pan, H., Yuan, X. and Ye, J. (2022). An optimal two-period multiarm platform design with new experimental arms added during the trial. Submitted.
2. Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a control. Journal of the American Statistical Association, 50(272), 1096-1121.