PlatformDesign

library(PlatformDesign)
library(ggplot2)

This vignette provides guidance for designing an optimal two-period K+M-experimental arm platform trial, in which M delayed experimental arms are added during the trial, using the package PlatformDesign while controlling the family-wise type I error rate (FWER) or the pair-wise type I error rate (PWER). The K+M-experimental arm trial has K experimental arms and one control arm during the first period; M experimental arms are then added at the start of the second period. The single common control arm is shared among all experimental arms across the trial. The design method suits any K+M-experimental arm trial; the examples here show how to design a 2+2-experimental arm trial (see Fig 1 for the schema). The goal of the proposed method is to find the optimal design with the minimum total sample size while keeping the marginal and disjunctive power no less than their counterparts in the K-experimental arm trial it is based on, controlling the FWER or PWER.

Fig 1. Schema of a two-period 2+2-experimental arm platform trial. The left part of the figure shows a traditional three-arm trial; in the context of this paper, we refer to it as a 2-experimental arm trial because it has 2 experimental arms and 1 common control. The right part of the figure shows the 2+2-experimental arm trial. During the first period, the 2+2-experimental arm trial has two experimental arms, Trt 1 and Trt 2 (light blue segments), and a control arm (the dark blue segment). The first vertical line (solid) separates the first and second periods of the 2+2-experimental arm trial and indicates the opening of the two new experimental arms, Trt 3 and Trt 4. The second vertical line (dashed) separates the two parts of the second period and indicates the closing of Trt 1 and Trt 2. In the first part of the second period, the control arm (the dark purple segment) is shared among the four experimental arms (light purple segments). During the second part of the second period, the control (dark green), Trt 3, and Trt 4 (light green) continue to accrue patients until reaching the planned sample sizes. The nt and n0t (blue brackets) indicate the numbers of patients enrolled in each of Trt 1 and Trt 2, and in the control, respectively, when Trt 3 and Trt 4 are added. The n1 and n01 (orange brackets) indicate the sample sizes for each of the two experimental arms and the control, respectively, in the 2-experimental arm trial. The n2 and n02 (green brackets) indicate the sample sizes for each of the four experimental arms and their corresponding concurrent control in the two-period 2+2-experimental arm trial. A1 denotes the allocation ratio (control to experimental arm) during the first period of the 2+2-experimental arm trial. A2 denotes the allocation ratio during the first part of the second period, when all four experimental arms are open. A3 is equal to A1.

Example 1: the optimal two-period multiarm trial design with delayed arms, controlling for FWER

The following steps contain two parts: 1) Steps 1 to 5 derive the design parameters of the K-experimental arm trial that the K+M-experimental arm trial is based on. The K+M-experimental arm trial is based on the K-experimental arm trial in the sense that we keep the same FWER and marginal power for the K+M-experimental arm trial as in the K-experimental arm trial, even though M new experimental arms are added during the second period of the K+M-experimental arm trial. 2) Steps 6 to 14 illustrate how the design parameters are calculated for the K+M-experimental arm trial.

Users who are less interested in the theoretical details can skip the other steps and focus on Steps 1, 7, 13, and 14.

Step 1: initial setup

Four design parameters for the K-experimental arm trial should be pre-specified: the number of experimental arms (K), the family-wise error rate (FWER1), the marginal type II error (β1), and the allocation ratio (control to each experimental arm, denoted as A1). In our method, we use $A_1=\sqrt{K}$, according to the K-root optimal allocation rule. In the following code, we assume K = 2, FWER1 = 0.025, β1 = 0.2, and $A_1=\sqrt{2}$. In addition, z1 − β1 (z_beta1 in the following code) is the standard normal quantile corresponding to the power 1 − β1.

K <- 2
FWER_1 <- 0.025
beta1 <- 0.2
z_beta1 <- qnorm(1-beta1) #z_(1-beta1)
A1 <- sqrt(K)

Step 2: Correlation Matrix 1

We use Z1 and Z2 to denote the two test statistics for comparing the two experimental arms to the control in the original design. Given A1, the correlation between Z1 and Z2 (denoted as ρ0) and the correlation matrix (denoted as Σ1) can be calculated (see below).

First, we have $$\rho_{0} = \frac{n_{0kk^{'}}}{\frac{(n_{01})^{2}}{n_{1}}+n_{01}}$$ where n1 is the number of patients on each experimental arm, n01 is the number of patients on the control arm in the K-experimental arm trial, and $n_{0kk^{'}}$ is the number of control patients shared between two experimental arms $k$ and $k^{'}$.

Since the two experimental arms share a common control arm, $n_{0kk^{'}} = n_{01}$, and the correlation of Z1 and Z2 is $$\rho_0 = \frac{1}{(n_{01}/n_{1}+1)}=\frac{1}{(A_1+1)}=0.4142$$

We can see that in our “2+2”-experimental arm example, where K=2, we have $$\Sigma_{1} =\begin{bmatrix} 1& \rho_0\\ \rho_0& 1 \end{bmatrix}=\begin{bmatrix} 1& 0.4142\\ 0.4142& 1 \end{bmatrix}$$

Based on the above description, the function one_stage_multiarm can be used to find ρ0 and the correlation matrix Σ1 of Z1 and Z2 as shown below.

multi <- one_stage_multiarm(K = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
corMat1 <- multi$corMat1
corMat1
#>           [,1]      [,2]
#> [1,] 1.0000000 0.4142136
#> [2,] 0.4142136 1.0000000
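As a cross-check, the correlation can be reproduced directly from the closed form above using base R. This is a minimal sketch and not the package's internal code; the names rho0_manual and corMat1_manual are chosen here for illustration only.

```r
# Closed-form correlation for K arms sharing one control, with A1 = sqrt(K)
K <- 2
A1 <- sqrt(K)
rho0_manual <- 1 / (A1 + 1)

# K x K matrix with 1 on the diagonal and rho0 everywhere else
corMat1_manual <- matrix(rho0_manual, nrow = K, ncol = K)
diag(corMat1_manual) <- 1
round(corMat1_manual, 4)
```

The off-diagonal entry should agree with multi$corMat1 shown above.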

Step 3: Critical Value 1

Given K, Σ1 and FWER1, we can find the associated critical value (denoted as z1 − α1) for the marginal type I error rate in the 2-experimental arm trial (denoted as α1) based on the following equation.

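$$FWER_1 = 1-\int_{-\infty}^{z_{1-\alpha_1}}\int_{-\infty}^{z_{1-\alpha_1}}\cdots\int_{-\infty}^{z_{1-\alpha_1}}\pi_z\left(Z(z_1, z_2, \ldots, z_K), 0, \Sigma_1\right)dz_1dz_2\cdots dz_K$$

This calculation can also be achieved using the function one_stage_multiarm.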

multi$z_alpha1
#> [1] 2.220604
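To make the calculation less of a black box, the sketch below solves the same equation with base R only. It relies on a standard representation for equally correlated normal statistics with non-negative correlation: each statistic is written as a shared component plus independent noise, which reduces the K-dimensional integral to a one-dimensional one. This is an illustration under that assumption, not necessarily how one_stage_multiarm computes the value; fwer_at and z_alpha1_manual are hypothetical names.

```r
K <- 2
fwer <- 0.025
rho0 <- 1 / (sqrt(K) + 1)

# FWER for a common cutoff: 1 - P(all K statistics <= cutoff) under the null.
# Conditional on the shared component u, the K statistics are independent.
fwer_at <- function(cutoff, K, rho) {
  integrand <- function(u) {
    dnorm(u) * pnorm((cutoff - sqrt(rho) * u) / sqrt(1 - rho))^K
  }
  1 - integrate(integrand, lower = -Inf, upper = Inf, rel.tol = 1e-8)$value
}

# Find the cutoff at which the FWER equals the target level
z_alpha1_manual <- uniroot(function(z) fwer_at(z, K, rho0) - fwer,
                           interval = c(1.5, 4), tol = 1e-8)$root
z_alpha1_manual
```

The result should be close to the value of multi$z_alpha1 reported above, up to numerical error.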

Step 4: Sample Sizes 1

Given β1, $A_1=\sqrt{K}$, an effective standardized effect size Δ (assumed to be 0.4 for all experimental arms in our paper), and z1 − α1 derived above, we can derive sample sizes for the experimental (n1) and control arms ($n_{0_1}=A_1n_1=\sqrt{K}n_1$), respectively, as shown below.

Since we have

$$z_{1-\alpha_1}+z_{1-\beta_1}=\frac{\mu_i-\mu_0}{\sigma\sqrt{\frac{1}{n_{1}}+\frac{1}{\sqrt{K}n_{1}}}}=\frac{\Delta }{\sqrt{\frac{1}{n_{1}}+\frac{1}{\sqrt{K}n_{1}}}}$$ Therefore,

$$n_{1}= \frac{(z_{1-\alpha_1}+z_{1-\beta_1})^2}{\Delta^2}\left(1+\frac{\sqrt{K}}{K}\right)$$

and $$n_{0_{1}}= \frac{(z_{1-\alpha_1}+z_{1-\beta_1})^2}{\Delta^2}(1+\sqrt{K})$$

In sum, the total sample size of the K-experimental arm trial is N1 = Kn1 + n01.

With our “2+2” example, we again use the function one_stage_multiarm to derive the sample sizes for the first period.

multi$n1
#> [1] 101
multi$n0_1
#> [1] 143
multi$N1
#> [1] 345
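The arithmetic behind these numbers can be sketched by hand. The rounding convention used here (round n1 up, then round A1 times n1 up for the control) is an assumption that happens to reproduce the output above; it is not taken from the package source. The variable names ending in _manual are illustrative.

```r
K <- 2
delta <- 0.4
A1 <- sqrt(K)
z_alpha1 <- 2.220604      # value of multi$z_alpha1 shown in Step 3
z_beta1 <- qnorm(1 - 0.2)

# Per-arm size from the formula for n1, rounded up to a whole patient
n1_manual <- ceiling((z_alpha1 + z_beta1)^2 / delta^2 * (1 + sqrt(K) / K))

# Assumed convention: control size is A1 * n1, rounded up
n0_1_manual <- ceiling(A1 * n1_manual)

# Total sample size of the K-experimental arm trial
N1_manual <- K * n1_manual + n0_1_manual
c(n1 = n1_manual, n0_1 = n0_1_manual, N1 = N1_manual)
```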

Step 5: Disjunctive Power 1

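With β1 and Σ1, we can also derive the overall (disjunctive) power Ω1 = 0.922 based on the equation below.

$$\Omega_1 = 1-\int_{-\infty}^{z_{\beta_1}}\int_{-\infty}^{z_{\beta_1}}\cdots\int_{-\infty}^{z_{\beta_1}}\pi_z\left(Z(z_1, z_2, \ldots, z_K), 0, \Sigma_1\right)dz_1dz_2\cdots dz_K$$

Here $z_{\beta_1} = -z_{1-\beta_1}$ is the lower β1 quantile of the standard normal distribution. This result is also included in the output of the function one_stage_multiarm, as $Power1.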

multi$Power1
#> [1] 0.9222971

From the above outputs, we can see that the computed disjunctive power is 0.922.
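A quick simulation gives some intuition for this number. The sketch below draws the two correlated test statistics under the alternative, assuming each has mean z1 − α1 + z1 − β1 (that is, the design attains exactly the target marginal power), and counts how often at least one of them exceeds the critical value. The seed, the number of draws, and all variable names are arbitrary choices made for this illustration.

```r
set.seed(1)               # arbitrary seed, for reproducibility only
K <- 2
rho0 <- 1 / (sqrt(K) + 1)
z_alpha1 <- 2.220604      # critical value from Step 3
z_beta1 <- qnorm(1 - 0.2)
n_sim <- 2e5

# Shared-control component plus arm-specific noise gives correlation rho0
shared <- rnorm(n_sim)
noise <- matrix(rnorm(n_sim * K), ncol = K)
Z <- sqrt(rho0) * shared + sqrt(1 - rho0) * noise + (z_alpha1 + z_beta1)

# Disjunctive power: at least one arm is declared effective
power_sim <- mean(rowSums(Z > z_alpha1) > 0)
power_sim
```

Up to Monte Carlo error, the estimate should land near the value of multi$Power1.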

In Steps 1 to 5, we demonstrated how, given K, FWER1, the marginal power 1 − β1, and the standardized effect size Δ, to derive the marginal type I error rate α1 and its corresponding critical value, the sample sizes n1 for each experimental arm and n01 for the control arm, and the disjunctive power Ω1 for the K-experimental arm trial.

In sum, the function one_stage_multiarm in the R package PlatformDesign completes Steps 1 to 5 all at once. Below is the complete output generated by this function.

multi
#> $K
#> [1] 2
#> 
#> $n1
#> [1] 101
#> 
#> $n0_1
#> [1] 143
#> 
#> $N1
#> [1] 345
#> 
#> $z_alpha1
#> [1] 2.220604
#> 
#> $FWER1
#> [1] 0.025
#> 
#> $z_beta1
#> [1] 0.8416212
#> 
#> $Power1
#> [1] 0.9222971
#> 
#> $corMat1
#>           [,1]      [,2]
#> [1,] 1.0000000 0.4142136
#> [2,] 0.4142136 1.0000000
#> 
#> $delta
#> [1] 0.4

Based on this knowledge, we can introduce our proposed method for adding new arms in Steps 6 to 14.

Step 6: Timing of Adding New Arms

Timing is the first component to consider when planning to add new experimental arms during a study. In practice, the timing can be expressed as a fraction. In this paper, we define the timing by the number of patients already enrolled in each experimental arm (denoted as nt) when the new arms are added. By this definition, the number of patients enrolled in the control arm at the time of adding new arms is n0t = ⌈A1nt⌉, that is, A1nt rounded up to the next integer.

For example, the following code describes a scenario in which new arms are added when 30 patients have been enrolled in each of the experimental arms.

nt <- 30
nt
#> [1] 30

n_0t <- ceiling(nt*A1)
n_0t
#> [1] 43

Step 7: Initial Setup 2

Next we need to decide the family-wise error rate when adding new arms, denoted as FWER2. In this example, we control FWER2 at the same level as FWER1. With FWER2, we can find the marginal type I error rate (denoted as α2). With α2, we can find the updated marginal power (denoted as ω2 = 1 − β2), and then the disjunctive power (denoted as Ω2). We describe these updates in detail in the following steps.

The goal of a two-period K+M-experimental arm platform design is to minimize the sample size (denoted as N2) while keeping the marginal power (ω2) and disjunctive power (Ω2) no less than their counterparts in the K-experimental arm trial (ω1 and Ω1). That is, for our “2+2” example we set the lower limit of ω2 (denoted as ω2min) to 0.8 and the lower limit of Ω2 (denoted as Ω2min) to 0.922 (obtained with the function one_stage_multiarm in the previous steps). These two limits will be used to select the recommended optimal design(s) (details shown in Step 13).

FWER_2 <- FWER_1
FWER_2
#> [1] 0.025
omega2_min <- 1-beta1
omega2_min
#> [1] 0.8
Omega2_min <- multi$Power1
Omega2_min
#> [1] 0.9222971

Step 8: Admissible Set

Because we need to keep FWER2 equal to FWER1 when adding new arms, we then have to update n1 to n2 and n01 to n02 for each experimental arm (See Fig 1). Here n2 and n02 are the sample sizes for each of the experimental arms and for the concurrent control in the 2+2-experimental arm trial.

We define an admissible set for pairs of (n2, n02) based on the following three constraints. The first two constraints for (n2, n02) are related to the allocation ratio after adding the new arms, denoted as A2. In the first period with two experimental arms (before adding the new arms), the control allocation ratio is A1. Once the two new experimental arms are added, we need to use an allocation ratio A2 to achieve the desired design properties (e.g., control the FWER and achieve the marginal power). After the two initially opened experimental arms are completed, the trial will again have only two experimental arms left. Therefore, the allocation ratio will go back from A2 to A1.

Here we have,

A2 = (n02 − n0t)/(n2 − nt) > 0

where nt and n0t are the numbers of patients enrolled in each experimental arm and in the control arm at the time of adding new arms. That is, the value of A2 needs to be a finite positive number. For example, the first two constraints in our “2+2” example are

n02 > n0t = 43 and n2 > nt = 30

We also need to set an upper limit for the total sample size N2. A reasonable upper limit is that N2 should not exceed the sample size (denoted as S) required to conduct two separate multiarm trials, i.e., a K-experimental arm trial and an M-experimental arm trial. To recap, K and M refer to the numbers of initially opened and newly added experimental arms.

Therefore, S can be calculated as $$ S= \frac{(z_{1-\alpha_1}+z_{1-\beta_1})^2}{\Delta^2}(1+2\sqrt{K}+K) + \frac{(z_{1-\alpha_1^*}+z_{1-\beta_1})^2}{\Delta^2}(1+2\sqrt{M}+M), $$

where z1 − α1 and z1 − α1* are the critical values for the K- and M-experimental arm trials, respectively. In our “2+2” example, z1 − α1 = z1 − α1* because K = M = 2, and S = 2N1 = 690.

Therefore, the third constraint for (n2, n02) is N2 = (K + M)n2 + n02 + n0t < S = 690.

Under the above three constraints, the admissible set of (n2, n02) can be identified using the function admiss (the integer points inside the triangular region of the plot below). The data set pair3 contains all (n2, n02) pairs satisfying the three constraints introduced above.

pair3 <- admiss(n1=101, n0_1=143, nt=30, ntrt=4, S=690)

ggplot(data=pair3, aes(x=Var1, y=Var2)) +
  geom_point() +
  geom_abline(intercept = 647, slope=-4, color="red") +
  geom_hline(yintercept=43, color="red")+
  geom_vline(xintercept=30, color="red") +
  xlim(0, 500)+
  ylim(0,1000)+
  xlab("n2")+
  ylab("n02")
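For readers who want to see the constraints at work without the package, the admissible set can be enumerated by brute force. This is only a sketch: it reads "should not exceed" as N2 ≤ S, which yields the 29040 pairs quoted in Step 13 (a strict inequality would give 28920). The column names n2 and n0_2 are chosen here and differ from the Var1 and Var2 columns of pair3.

```r
nt <- 30       # patients per experimental arm when new arms are added
n_0t <- 43     # control patients enrolled at that time
n_arms <- 4    # K + M experimental arms
S <- 690       # upper limit on the total sample size

# All integer candidates above the lower bounds (constraints 1 and 2)
grid <- expand.grid(n2 = (nt + 1):S, n0_2 = (n_0t + 1):S)

# Constraint 3, read here as N2 <= S
N2 <- n_arms * grid$n2 + grid$n0_2 + n_0t
admissible <- grid[N2 <= S, ]
nrow(admissible)
#> [1] 29040
```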

Step 9: Correlation Matrix 2

For each pair of (n2, n02) in the admissible set, the correlation matrix Σ2 of the four test statistics (Z1, Z2, Z3, and Z4) can be derived from n2 and n02. The correlation between two experimental arms is

$$\rho_{k, k^{'}} = \frac{n_{0kk^{'}}}{\frac{(n_{0_{2}})^{2}}{n_{2}}+n_{0_{2}}}$$

Specifically, between the two initially opened experimental arms (and between the two newly added arms), the number of shared controls is now $n_{0kk^{'}} = n_{0_2}$. Therefore, the correlation of the Z statistics between the two initially opened experimental arms (and between the two newly added arms) is

$$\rho_1 = \frac{1}{(n_{0_2}/n_2+1)}$$

The number of shared controls between one initially opened experimental arm and one delayed experimental arm is $n_{0kk^{'}} = n_{0_2} - n_{0t}$. Therefore, the correlation of the Z test statistics between these two experimental arms is

$$\rho_2 = \frac{n_{0_2}-n_{0t}}{(n_{0_2}^2/n_2+n_{0_2})}$$

To be specific, in our 2+2-experimental arm example, we have Σ2 as, $$\Sigma_2 =\begin{bmatrix} 1 & \rho_1 & \rho_2 & \rho_2\\ \rho_1 & 1& \rho_2& \rho_2\\ \rho_2 & \rho_2& 1& \rho_1\\ \rho_2 & \rho_2& \rho_1 & 1 \end{bmatrix}$$
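As an illustration, the matrix can be assembled by hand for a single pair. The sketch below uses (n2, n02) = (107, 198), the pair discussed in Step 13, together with n0t = 43; the variable names are illustrative and this is not the package's internal code.

```r
n2 <- 107
n0_2 <- 198
n_0t <- 43

# Arms opened at the same time share the full concurrent control
rho1 <- 1 / (n0_2 / n2 + 1)

# An initial arm and a delayed arm share only n0_2 - n_0t controls
rho2 <- (n0_2 - n_0t) / (n0_2^2 / n2 + n0_2)

# Block structure: rho1 within each pair of arms, rho2 across the two pairs
corMat2 <- matrix(rho2, nrow = 4, ncol = 4)
corMat2[1:2, 1:2] <- rho1
corMat2[3:4, 3:4] <- rho1
diag(corMat2) <- 1
round(corMat2, 3)
```

Because ρ2 equals ρ1 multiplied by (n02 − n0t)/n02, the cross-period correlation is always the smaller of the two.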

Step 10: Critical Value 2

Now we can use FWER2 and Σ2 to calculate the marginal type I error α2 and the corresponding critical value z1 − α2 for each pair of (n2, n02) in the admissible set found in Step 8.

$$FWER_2 = 1-\int_{-\infty}^{z_{1-\alpha_2}}\int_{-\infty}^{z_{1-\alpha_2}}\cdots\int_{-\infty}^{z_{1-\alpha_2}}\pi_z\left(Z(z_1, z_2, \ldots, z_{K+M}), 0, \Sigma_2\right)dz_1dz_2\cdots dz_{K+M}$$

Step 11: Marginal Power 2

With n1, n01, α1, β1 and α2, we can use the following equation to obtain z1 − β2, and from it the marginal power ω2 = 1 − β2, for each pair of (n2, n02).

$$z_{1-\beta_2}=\sqrt{\frac{\frac{1}{n_1}+\frac{1}{n_{0_1}}}{\frac{1}{n_2}+\frac{1}{n_{0_2}}}}(z_{1-\alpha_1} + z_{1-\beta_1})- z_{1-\alpha_2}$$
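The formula is easy to check numerically for one pair. The sketch below plugs in (n2, n02) = (107, 198) and the rounded critical value 2.475 quoted for that design in Step 13, so the result is approximate; the variable names are illustrative.

```r
n1 <- 101; n0_1 <- 143    # K-experimental arm trial (Step 4)
n2 <- 107; n0_2 <- 198    # one admissible pair, discussed in Step 13
z_alpha1 <- 2.220604      # critical value from Step 3
z_beta1 <- qnorm(1 - 0.2)
z_alpha2 <- 2.475         # rounded critical value quoted in Step 13

# Ratio of the two variance factors rescales the original noncentrality
scale <- sqrt((1 / n1 + 1 / n0_1) / (1 / n2 + 1 / n0_2))
z_beta2 <- scale * (z_alpha1 + z_beta1) - z_alpha2

# Marginal power of each comparison in the 2+2 trial
omega2 <- pnorm(z_beta2)
omega2
```

The result sits just above 0.8, in line with the marginal power reported for this design.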

Step 12: Disjunctive Power 2

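With β2 and Σ2, we can derive the new disjunctive power Ω2 for each pair of (n2, n02) using the following equation, where $z_{\beta_2} = -z_{1-\beta_2}$.

$$\Omega_2 = 1-\int_{-\infty}^{z_{\beta_2}}\int_{-\infty}^{z_{\beta_2}}\cdots\int_{-\infty}^{z_{\beta_2}}\pi_z\left(Z(z_1, z_2, \ldots, z_{K+M}), 0, \Sigma_2\right)dz_1dz_2\cdots dz_{K+M}$$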
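Steps 10 and 12 evaluate the same four-dimensional normal probability at different cutoffs. For the block-structured Σ2 of the 2+2 example, a base R sketch can replace it with two nested one-dimensional integrals by writing each statistic as a trial-wide component (variance ρ2), a period-specific component (variance ρ1 − ρ2), and independent noise. This decomposition is an assumption of the illustration, valid when ρ1 ≥ ρ2 ≥ 0, and is not claimed to be the package's algorithm; p_all_below and the other names are hypothetical.

```r
n1 <- 101; n0_1 <- 143; n2 <- 107; n0_2 <- 198; n_0t <- 43
z_alpha1 <- 2.220604
z_beta1 <- qnorm(1 - 0.2)
rho1 <- 1 / (n0_2 / n2 + 1)
rho2 <- (n0_2 - n_0t) / (n0_2^2 / n2 + n0_2)

# P(all four statistics <= cutoff) for two blocks of two arms each
p_all_below <- function(cutoff, rho1, rho2) {
  # Probability for one block, given the trial-wide component u
  one_block <- function(u) {
    integrate(function(v) {
      dnorm(v) * pnorm((cutoff - sqrt(rho2) * u - sqrt(rho1 - rho2) * v) /
                         sqrt(1 - rho1))^2
    }, lower = -Inf, upper = Inf)$value
  }
  # The two blocks are independent given u and have the same probability
  integrate(function(u) dnorm(u) * sapply(u, one_block)^2,
            lower = -Inf, upper = Inf)$value
}

# Step 10: critical value that keeps the FWER at 0.025
z_alpha2 <- uniroot(function(z) 1 - p_all_below(z, rho1, rho2) - 0.025,
                    interval = c(1.5, 4), tol = 1e-8)$root

# Step 11: z_{1 - beta2}; Step 12: disjunctive power at cutoff -z_{1 - beta2}
z_beta2 <- sqrt((1 / n1 + 1 / n0_1) / (1 / n2 + 1 / n0_2)) *
  (z_alpha1 + z_beta1) - z_alpha2
Omega2 <- 1 - p_all_below(-z_beta2, rho1, rho2)
c(z_alpha2 = z_alpha2, Omega2 = Omega2)
```

For this pair the two values should be close to the critical value and disjunctive power quoted for the corresponding design in Step 13, up to numerical error.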

Step 13: Design Selection

In our “2+2” example, we calculate ω2 and Ω2 from all of the 29040 pairs of (n2, n02) in the entire admissible set. We then perform a 2-step selection procedure to obtain the recommended optimal design(s): 1) first we only keep the designs with ω2 and Ω2 above or equal to ω1 and Ω1, respectively (i.e., lower limits decided in Step 7); 2) next, among those selected ones we choose the designs with the smallest total sample size N2.

Given nt, K, M, FWER1, ω1, and Δ, the function platform_design provides the optimal designs with the minimum total sample size among those having ω2 and Ω2 no less than their counterparts in the K-experimental arm trial.

design <- platform_design(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
design$designs
Fig 2. Recommended optimal designs when nt=30 in a 2+2-trial

The first part of the output ($design_Karm) contains the parameters for the K-experimental arm trial. The second part ($designs) contains the parameters for the K+M-experimental arm trial designed based on the former. From $designs above, we can see that several recommended designs can share the same total sample size N2. We provide a full list of useful parameters for each of the recommended optimal designs.

For example, if we choose design #15669 for this 2+2-experimental arm trial, the corresponding critical value for controlling the FWER at 0.025 is 2.475. The marginal power is 0.8, and the disjunctive power is 0.985, both no less than their counterparts in the 2-experimental arm trial. The required total sample size is N2 = 669. Among the 669 patients, the numbers enrolled during the first period in each experimental arm and in the control are nt = 30 and n0t = 43, with an allocation ratio of A1 = 1.414. Once the two additional experimental arms are added, the optimal allocation ratio changes to A2 = 2.01 for the overlapping stage of the second period. The allocation ratio changes back to A1 once the two initial experimental arms are closed to accrual. Over the entire 2+2-experimental arm trial, the sample size for each experimental arm is n2 = 107. The sample size for the concurrent control of each experimental arm is n02 = 198. Note that the sample size for the entire control arm, concurrent and non-concurrent combined, is nc = 241. The reduction in the total sample size compared with two separate 2-experimental arm trials is 21.
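The derived quantities in this paragraph follow from n2, n02, nt, and n0t by simple arithmetic, which the short sketch below reproduces. The variable names are chosen here for illustration.

```r
nt <- 30; n_0t <- 43      # enrolment when Trt 3 and Trt 4 open
n2 <- 107; n0_2 <- 198    # per-arm and concurrent-control sizes of design #15669
n_arms <- 4               # K + M
S <- 690                  # two separate 2-experimental arm trials

A2 <- (n0_2 - n_0t) / (n2 - nt)   # allocation ratio while all four arms are open
nc <- n0_2 + n_0t                 # whole control arm, concurrent and non-concurrent
N2 <- n_arms * n2 + nc            # total sample size of the 2+2 trial
saving <- S - N2                  # reduction versus two separate trials

c(A2 = round(A2, 2), nc = nc, N2 = N2, saving = saving)
```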

Step 14: Final Decision

As we can see from Step 13, although the total sample size is the same for all the 4 recommended designs, the other parameters can be different. Therefore, we can choose a final design based on our needs according to the other parameters. For example, if we would like to choose a design with the largest disjunctive power Ω2, our final choice goes to the design in the last row of the result above (design #16632).

Notes

If ω2min and Ω2min in Step 7 cannot be met at the same time, the algorithm in the function platform_design will return the designs with the smallest N2 that meet only one of the two limits. If we are not satisfied with the result, or if neither of the two limits can be met, we can choose from the three options below:

1. Go back to Step 7 and decrease the value of ω2min, then repeat Steps 8 to 14. This can be done only if a marginal power lower than ω1 is acceptable, which partially compromises the goal of the design.

2. Go back to Step 6 and set a smaller nt (and therefore a smaller n0t), that is, increase the overlap between the initial and the later added experimental arms, then repeat Steps 8 to 14. The rationale is that the later the new arms are added, the less likely we are to find designs satisfying both limits defined in Step 7. This is feasible only if the situation allows us to change the timing of adding new arms.

3. Consider controlling the PWER instead of the FWER, as illustrated in the next example.

Example 2: the optimal two-period multiarm trial design with delayed arms, controlling for PWER

If users do not wish to control the FWER, or if controlling the FWER cannot be achieved given the required power levels, we recommend using the function platform_design() with the argument fwer replaced by pwer to plan the K+M-experimental arm trial. If we plan to add 2 new experimental arms when 30 patients have already been enrolled in each of the 2 initial experimental arms, with the pair-wise type I error controlled at 0.025 and the marginal power equal to 0.8, we can use the following code to get the design parameters. Here, five optimal designs are recommended, and each row is an individual design. Notably, we can save 87 patients with this design compared with 2 separate multiarm trials.

design2 <- platform_design(nt=30, K=2, M=2, pwer=0.025, marginal.power=0.8, delta=0.4,seed=123)
design2$designs 
Fig 3. Recommended optimal designs when nt=30 in a 2+2-trial

The main difference when using pwer instead of fwer in platform_design() is that the function does not use the Dunnett method to derive critical values; instead, it calculates them directly from the user-defined pair-wise type I error. Notably, the upper limit S for the total sample size N2, which is used to find the admissible set when controlling the PWER, is constructed from multiarm trials (one K- and one M-experimental arm trial) that also control the PWER in the function platform_design(). The sample sizes for the multiarm trials (controlling the PWER) can also be calculated with the function one_stage_multiarm(). Other aspects of the algorithm are similar between the two applications of the platform_design() function.
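To make the difference concrete, the sketch below computes the unadjusted critical value and the per-arm sample sizes of a 2-experimental arm trial under pair-wise control. It reuses the sample size formula from Step 4 and assumes the same rounding convention as in the FWER example (round up n1, then round up A1 times n1); the package may round differently, so treat the numbers as indicative. All variable names are illustrative.

```r
pwer <- 0.025
beta1 <- 0.2
K <- 2
delta <- 0.4
A1 <- sqrt(K)

# No multiplicity adjustment: the critical value comes straight from pwer
z_alpha_pw <- qnorm(1 - pwer)
z_beta1 <- qnorm(1 - beta1)

# Same formula as in Step 4, with the unadjusted critical value
n1_pw <- ceiling((z_alpha_pw + z_beta1)^2 / delta^2 * (1 + sqrt(K) / K))
n0_1_pw <- ceiling(A1 * n1_pw)   # assumed rounding convention
c(z_alpha = z_alpha_pw, n1 = n1_pw, n0_1 = n0_1_pw)
```

Because the critical value is smaller than the Dunnett-adjusted 2.2206 from Example 1, each arm needs fewer patients for the same marginal power.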

Example 3: using platform_design2(), a faster version of platform_design().

The function platform_design2() is a faster version of platform_design(). It adopts a more efficient algorithm to find the optimal design(s) for a two-period K+M-experimental arm platform trial by searching for the optimal design starting from the maximum possible N2 value. The usage of this function is exactly the same as that of platform_design(), except that platform_design2() returns NULL for $designs when no optimal design exists and does not provide the sub-optimal designs that satisfy only the minimum power requirement for the disjunctive power. When an optimal design exists, the results from the two functions are the same.

The two code chunks below can be used to compare time used to find the optimal design using platform_design2() and platform_design().

start_time <- Sys.time()
test <- platform_design2(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
end_time <- Sys.time()
end_time - start_time
# Time difference of 41.85487 secs
start_time <- Sys.time()
test2 <- platform_design(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
end_time <- Sys.time()
end_time - start_time
# Time difference of 8.188013 mins

Notably, the time saving is greatest when no optimal design exists under the conditions specified by the user.

start_time <- Sys.time()
platform_design2(nt = 50, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
#> The lower limits of marginal and disjunctive power cannot be met at the same time.
#> $design_Karm
#>   K  n1 n0_1  N1 z_alpha1 FWER1   z_beta1    Power1      cor0 delta
#> 1 2 101  143 345 2.220604 0.025 0.8416212 0.9222971 0.4142136   0.4
#> 
#> $designs
#> NULL
end_time <- Sys.time()

# Time difference
end_time - start_time
#> Time difference of 0.709923 secs

Author(s)

Xiaomeng Yuan, Haitao Pan

References

1. Pan, H., Yuan, X. and Ye, J. (2022). An optimal two-period multiarm platform design with new experimental arms added during the trial. Submitted.

2. Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a control. Journal of the American Statistical Association, 50(272), 1096-1121.