PlatformDesign

library(PlatformDesign)
library(ggplot2)

This vignette provides a guidance for designing an optimal two-period K+M-experimental arm platform trial with M delayed experimental arms added during the trial using the package PlatformDesign, controlling for family-wise type-I error rate (FWER) or pair-wise type-I error rate (PWER). The K+M-experimental arm trial has K experimental arms and one control arm during the first period, and later M experimental arms are added on the start of the second period. The one common control arm is shared among all experimental arms across the trial. The design method suits any K+M-experimental arm trial, with the examples here shows how to design a 2+2-experimental arm trial (see Fig 1 for the scheme). The goal of the proposed method is to find the optimal design with a minimum total sample size while making the marginal and disjunctive power no less than their counterparts in the K-experimental arm trial it is based on, controlling for FWER or PWER.

Fig 1.Schema of a two-period 2+2-experimental arm platform trial. Left part of the figure shows a traditional three-arm trial, in the context of this paper, we refer it as a 2-experimental arm trial as it has 2 experimental arms and 1 common control. In the right part of this figure, it is the 2+2-experimental arm trial. During the first period, the 2+2-experimental arm trial has two experimental arms, Trt 1 and Trt 2 (light blue segments), and a control arm (the dark blue segment). The first vertical line (solid) separates the first and second period of the 2+2-experimental arm trial, and indicates the opening of the two new experimental arms, Trt 3 and Trt 4. The second vertical line (dashed) separates two parts of the second period, and indicates the closing of Trt 1 and Trt 2. In the first part of the second period, the control arm (the dark purple segment) is shared among the four experimental arms (light purple segments). During the second part of the second period, the control (dark green), Trt 3, and Trt 4 (light green) continue to accrue patients until reaching the planned sample sizes. The nt and n0t (blue brackets) indicate the numbers of patients enrolled in each of Trt 1 and Trt 2, and the control, respectively, when Trt 3 and Trt 4 are added. The n1 and n01 (orange brackets) indicate the sample sizes for each of the two experimental arms and the control, respectively, in the 2-experimental arm trial. The n2 and n02 (green brackets) indicate the sample sizes for each of the four experimental arms and their corresponding concurrent control in the two-period 2+2-experimental arm trial. A1 denotes the allocation ratio (control to experimental arm) during the first period of the 2+2-experimental arm trial. A2 denotes the allocation ratio during the the first part of the second period, when all four experiments arms are open. A3 is equal to A1.

Example 1: the optimal two-period multiarm trial design with delayed arms, controlling for FWER

The following steps contain two parts: 1) Steps 1 to 5 derive the design parameters in the K-experimental arm trial that the K+M-experimental arm trial is based on. The K+M-experimental arm trial is based on the K-experimental arm trial in the sense that we will keep the same FWER and marginal power for the K+M-experimental arm trial as in the K-experimental arm trial, despite M new experimental arms are added during the second period of the K+M experimental arm trial. 2) Steps 6 to 14 illustrate how the design parameters are calculated for the K+M-experimental arm trial.

For users who are less interested in the theoretical details, you can skip other steps and focus on Steps 1, 7, 13, and 14.

Step 1: initial setup

Four design parameters for the K-experimental arm trial should be pre-specified: the number of experimental arms (K), the family-wise error rate (FWER₁), the marginal type II error (β₁), and the allocation ratio (control-to-each experimental arm, denoted as A₁). In our method, we use $A_1=\sqrt{K}$, according to the K-root optimal allocation rule. In the following code, we assume K = 2, FWER₁ = 0.025, β₁ = 0.2, and $A_1=\sqrt{2}$. In addition, z_β₁ ( in the following code) is the corresponding critical value for the power of 1 − β₁.

K <- 2
FWER_1 <- 0.025
beta1 <- 0.2
z_beta1 <- qnorm(1-beta1) #z_(1-beta1)
A1 <- sqrt(K)

Step 2: Correlation Matrix 1

We use Z₁ and Z₂ to denote the two test statistics for comparing the two experimental arm to the control in the original design. Given A₁, the correlation between Z₁ and Z₂ (denoted as ρ₀) and the correlation matrix (denoted as Σ₁) can be calculated (see below).

First, we have, $$\rho_{0} = \frac{n_{0kk^{'}}}{\frac{(n_{01})^{2}}{n_{1}}+n_{01}}$$ where n₁ is the number of patients on each of the experimental arm, n₀₁ is the number of patients on the control in the K-experimental arm trial, and n_0kk^′ is number of control patients shared between 2 experimental arms k and k′.

Since the two experimental arms share a common control arm, that is, n_0kk^′ = n₀₁. The correlation of Z₁ and Z₂ is computed as $$\rho_0 = \frac{1}{(n_{01}/n_{1}+1)}=\frac{1}{(A_1+1)}=0.4142$$

We can see that in our “2+2”-experimental arm example, where K=2, we have $$\Sigma_{1} =\begin{bmatrix} 1& \rho_0\\ \rho_0& 1 \end{bmatrix}=\begin{bmatrix} 1& 0.4142\\ 0.4142& 1 \end{bmatrix}$$

Based on the above description, the function one_stage_multiarm can be used to find ρ₀ and the correlation matrix Σ₁ of Z₁ and Z₂ as shown below.

multi <- one_stage_multiarm(K = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
corMat1 <- multi$corMat1
corMat1
#>           [,1]      [,2]
#> [1,] 1.0000000 0.4142136
#> [2,] 0.4142136 1.0000000

Step 3: Critical Value 1

Given K, Σ₁ and FWER₁, we can find the associated critical value (denoted as z_{1 − α₁}) for the marginal type I error rate in the 2-experimental arm trial (denoted as α₁) based on the following equation.

FWER₁ = 1 − ∫_−∞^{z_{1 − α₁}}∫_−∞^{z_{1 − α₁}}...∫_−∞^{z_{1 − α₁}}π_z(Z(z₁, z₂, ...z_K), 0, Σ₁)dz₁dz₂...dz_K This calculation can also be achieved using the function one_stage_multiarm.

multi$z_alpha1
#> [1] 2.220604

Step 4: Sample Sizes 1

Given β₁, $A_1=\sqrt{K}$, an effective standardized effect size Δ (assumed to be 0.4 for all experimental arms in our paper), and z_{1 − α₁} derived above, we can derive sample sizes for the experimental (n₁) and control arms ($n_{0_1}=A_1n_1=\sqrt{K}n_1$), respectively, as shown below.

Since we have

$$z_{1-\alpha_1}+z_{1-\beta_1}=\frac{\mu_i-\mu_0}{\sigma\sqrt{\frac{1}{n_{1}}+\frac{1}{\sqrt{K}n_{1}}}}=\frac{\Delta }{\sqrt{\frac{1}{n_{1}}+\frac{1}{\sqrt{K}n_{1}}}}$$ Therefore,

$$n_{1}= \frac{(z_{\alpha_1}+z_{\beta_1})^2}{\Delta^2}(1+\frac{\sqrt{K})}{K})$$

and $$n_{0_{1}}= \frac{(z_{\alpha_1}+z_{\beta_1})^2}{\Delta^2}(1+\sqrt{K})$$

In sum, the total sample size of the K-experimental arm trial is N₁ = Kn₁ + n_0₁

With our “2+2” example, we again use the function one_stage_multiarm to derive the sample sizes for the first period.

multi$n1
#> [1] 101
multi$n0_1
#> [1] 143
multi$N1
#> [1] 345

Step 5: Disjunctive Power 1

With β₁ and Σ₁, we can also derive the overall (disjunctive) power Ω1 = 0.922 based on equation below.

Ω₁ = 1 − ∫_−∞^z_β1∫_−∞^z_β1...∫_−∞^z_β1π_z(Z(z₁, z₂, ...z_K), 0, ∑)dz₁dz₂...dz_K This result is also included as part of the output from the function one_stage_multiarm, i.e., $Power1.

multi$Power1
#> [1] 0.9222971

From the above outputs, we can see that the computed disjunctive power is 0.922.

Based Steps 1 to 5, we demonstrated given K, FWER₁, marginal power 1 − β₁ and the standardized effect size Δ, how to derive the marginal type I error rate α₁ and its corresponding critical value, sample size n₁ for each of the experimental arm and n_0₁ for the control arm, and disjunctive power Ω₁ for the K-experimental arm trial.

In sum, the function one_stage_multiarm in R package PlatformDesign can complete step 1 to 5 all at once. Below are the complete outputs generated by applying this function.

multi
#> $K
#> [1] 2
#> 
#> $n1
#> [1] 101
#> 
#> $n0_1
#> [1] 143
#> 
#> $N1
#> [1] 345
#> 
#> $z_alpha1
#> [1] 2.220604
#> 
#> $FWER1
#> [1] 0.025
#> 
#> $z_beta1
#> [1] 0.8416212
#> 
#> $Power1
#> [1] 0.9222971
#> 
#> $corMat1
#>           [,1]      [,2]
#> [1,] 1.0000000 0.4142136
#> [2,] 0.4142136 1.0000000
#> 
#> $delta
#> [1] 0.4

Based on these knowledge, we can introduce our proposed methods when adding new arms in the following from steps 6 to 14.

Step 6: Timing of Adding New Arms

Timing is the first component to consider if planning to add new experimental arms during a study. In practice, we can use a fraction to denote the timing. In this paper, we use the number of patients already being enrolled in the experimental arm (denoted as n_t) when new arms added to define the timing of adding. By this definition, at the time of adding new arms, number of patients enrolled in the control arm is n_0t = [A₁n_t].

For example, the following codes describe a scenario if new arms are added when there have 30 patients enrolled for each of the experimental arms.

nt <- 30
nt
#> [1] 30

n_0t <- ceiling(nt*A1)
n_0t
#> [1] 43

Step 7: Initial Setup 2

Then we need to decide the family-wise error rate when adding new arms, denoted as FWER₂. In this example, we control the FWER₂ at the same level to the FWER₁. With FWER₂, we can find the marginal type I error rate (denoted as α₁). With α₁, we can find the updated marginal power (denoted as ω₂ = 1 − β₂), and then the disjunctive power (denoted as Ω₂). We will describe these updates with details in the following steps.

The goal of a two-period K+M experimental arm platform design is to minimize the sample size (denoted as N₂) while keeping the marginal power (ω₂) and disjunctive power (Ω₂) no less than their counterparts in the first period (ω₁ and Ω₁). That is, we set the lower limit of ω₂ (denoted as ω_2min) to be 0.8, the lower limit of Ω₂ (denoted as Ω_2min) to be 0.922 (which is obtained by using function one_stage_multiarm in the previous steps) as for our “2+2” example. These two limits will be used to select the recommended optimal design(s) (details shown in Step 13).

FWER_2 <- FWER_1
FWER_2
#> [1] 0.025
omega2_min <- 1-beta1
omega2_min
#> [1] 0.8
Omega2_min <- multi$Power1
Omega2_min
#> [1] 0.9222971

Step 8: Admissible Set

Because we need to keep FWER₂ equal to FWER₁ when adding new arms, we then have to update n₁ to n₂ and n₀₁ to n₀₂ for each experimental arm (See Fig 1). Here n₂ and n₀₂ are the sample sizes for each of the experimental arms and for the concurrent control in the 2+2-experimental arm trial.

We define an admissible set for pairs of (n₂, n₀₂) based on the following three constraints. The first two constraints for (n₂, n₀₂) is related to the allocation ratio after adding the new arms, denoted as A₂. In the first period with two experimental arms (before adding the new arms), the control allocation ratio is A₁. Once the two new experimental arms are added, we need to use an allocation ratio A₂ to achieve desired design properties (e.g., control the FWER and achieve the marginal power). After the initially opened two experimental arms are completed, the trial will again have only two experimental arms left. Therefore, the allocation ratio will go back from A₂ to A₁.

Here we have,

A₂ = (n_0₂ − n_0t)/(n₂ − n_t) > 0

where n_t and n_0t are number of enrolled patients for each of the experimental arms and the control at the time of adding new arms. That is, the value of A₂ needs to be a non-infinite positive number. For example, the first two constraints in our “2+2” example are

n_0₂ > n_0t = 43 and n₂ > n_t = 30

We also need to set an upper limit for the total sample size N₂. A reasonable upper limit is that N₂ should not exceed the required sample sizes (denoted as S) of separately conducting two multiarm trials, i.e., a K-experimental arm trial and a M-experimental arm trial. To recap, K and M refer to the numbers of initially and newly added experimental arms.

Therefore, S can be calculated as $$ S= \frac{(z_{1-\alpha_1}+z_{1-\beta_1})^2}{\Delta^2}(1+2\sqrt{K}+K) + \frac{(z_{1-\alpha_1^*}+z_{1-\beta_1})^2}{\Delta^2}(1+2\sqrt{M}+M). $$

, where z_{1 − α₁} and z_{1 − α₁^*} are the critical values for the K- and M-experimental arm trial, separately. In our “2+2” example, z_{1 − α₁} = z_{1 − α₁^*} as K = M = 2 and S = 2N₁ = 690.

Therefore, the third constraint for (n₂, n_0₂) is N₂ = (K + M)n₂ + n_0₂ + n_0t < S = 690.

Under the above three constraints, the admissible set of (n₂, n_0₂) can be identified using the function admiss (integer points in triangle area in Fig.2). The data set pair3 contains all (n₂, n₀₂) pairs satisfying the 3 constraints introduced above.

pair3 <- admiss(n1=101, n0_1=143, nt=30, ntrt=4, S=690)

ggplot(data=pair3, aes(x=Var1, y=Var2)) +
  geom_point() +
  geom_abline(intercept = 647, slope=-4, color="red") +
  geom_hline(yintercept=43, color="red")+
  geom_vline(xintercept=30, color="red") +
  xlim(0, 500)+
  ylim(0,1000)+
  xlab("n2")+
  ylab("n02")

Step 9: Correlation Matrix 2

For each pair of (n₂, n_0₂) in the admissible set the correlation matrix of z statistics, ∑₂, of four test statistics (Z₁, Z₂, Z₃, andZ₄) can be derived using in n₂ and n_0₂. The correlation between two experimental arms is,

$$\rho_{k, k^{'}} = \frac{n_{0kk^{'}}}{\frac{(n_{0_{2}})^{2}}{n_{2}}+n_{0_{2}}}$$

Specifically, between the initially opened two experimental arms (and between the two newly added arms), the shared control now is n_0kk^′ = n₀₂. Therefore, the correlation of Z statistics between these two initially opened experimental arms (and between the two newly added arms) is

$$\rho_1 = \frac{1}{(n_{0_2}/n_2+1)}$$.

The number of shared controls between one initially opened experimental arm and one delayed experimental arm is n_0kk^′ = n_0₂ − n_0t Therefore, the correlation of the Z test statistics between these two experimental arms is

$$\rho_2 = \frac{n_{0_2}-n_{0t}}{(n_{0_2}^2/n_2+n_{0_2})}$$.

To be specific, in our 2+2-experimental arm example, we have Σ₂ as, $$\Sigma_2 =\begin{bmatrix} 1 & \rho_1 & \rho_2 & \rho_2\\ \rho_1 & 1& \rho_2& \rho_2\\ \rho_2 & \rho_2& 1& \rho_1\\ \rho_2 & \rho_2& \rho_1 & 1 \end{bmatrix}$$

Step 10: Critical Value 2

Now we can use FWER₂ and Σ₂ to calculate the marginal type I error α₂ and the corresponding critical value z_{1 − α₂} for each pair of (n₂, n₀₂) in the admissible set found in ).

FWER₂ = 1 − ∫_−∞^{z_{1 − α₂}}∫_−∞^{z_{1 − α₂}}...∫_−∞^{z_{1 − α₂}}π_z(Z(z₁, z₂, ...z_K + M), 0, Σ₂)dz₁dz₂...dz_K + M

Step 11: Marginal Power 2

With n₁, n₀₁, α₁, β₁ and α₂, we can use the following equation to calculate the marginal power ω₂ = 1 − β₂ for each pair of (n₂, n₀₂) from z_{1 − β₂}.

$$z_{1-\beta2 }=\sqrt{\frac{\frac{1} {n_1}+\frac{1}{n_{0_1}}}{\frac{1}{n_2}+\frac{1}{n_{0_2}}}}(z_{1-\alpha1} + z_{1-\beta1})- z_{1-\alpha2}$$

Step 12: Disjunctive Power 2

With β₂ and Σ₂, we can derive the new disjunctive power Ω₂ for each pair of (n₂, n_0₂) using the following equation.

Ω₂ = 1 − ∫_−∞^z_β2∫_−∞^z_β2...∫_−∞^z_β2π_z(Z(z₁, z₂, ...z_K + M), 0, Σ₂)dz₁dz₂...dz_K + M

Step 13: Design Selection

In our “2+2” example, we calculate ω₂ and Ω₂ from all of the 29040 pairs of (n₂, n₀₂) in the entire admissible set. We then perform a 2-step selection procedure to obtain the recommended optimal design(s): 1) first we only keep the designs with ω₂ and Ω₂ above or equal to ω₁ and Ω₁, respectively (i.e., lower limits decided in Step 7); 2) next, among those selected ones we choose the designs with the smallest total sample size N₂.

Given n_t, K, M, FWER₁, ω₁, and Δ, the function platform_Design can provide the optimal designs with the minimum total sample size among those having ω₂ and Ω₂ no less than their counterparts in the K-experimental arm trial.

design <- platform_design(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
design$designs

Fig 2. Recommended optimal designs when nt=30 in a 2+2-trial

The first part of the outputs ($design Karm) contains the parameters for the K-experimental arm trial. The second part ($designs) contains the parameters for the K+M experimental arm trial designed based on the former. From above ($designs), we can see it is possible to have multiple recommended designs which all have the same total sample size N2. We provide a full list of useful parameters for each of the recommended optimal designs.

For example, if we choose design # 15669 for this 2+2-experimental arm trial, the corresponding critical value for controlling the FWER at 0.025 is 2.475. The marginal power is 0.8, and the disjunctive power is 0.985, both no less than their counterparts in the 2-experimental arm trial. The required total sample size is N₂ = 669. Among the 669 patients, in the first period the sample sizes for each experimental arm and the control are n_t = 30 and n_0t = 43, with an allocation ratio of A₁ = 1.414. Once the two additional experimental arms are added, the optimal allocation ratio changes to A2 = 2.01 for the overlapping stage of the second period. The allocation ratio will change back to A₁ once the two initial experimental arms closed to accrual. Through the entire 2+2-experimental arm trial, the sample size for each experimental arm is n₂ = 107. The sample size for the concurrent control of each experimental arm is n₀₂ = 198. To be noted, the sample size for the entire control arm, concurrent and non-concurrent combined, is n_c = 241. The reduction in the total sample size comparing to two separate 2-experimental arm trials is 21.

Step 14: Final Decision

As we can see from Step 13, although the total sample size is the same for all the 4 recommended designs, the other parameters can be different. Therefore, we can choose a final design based on our needs according to the other parameters. For example, if we would like to choose a design with the largest disjunctive power Ω₂, our final choice goes to the design in the last row of the result above (design #16632).

Notes

If ω_{2_min} and Ω_{2_min} in Step 7 can not be met at the same time. The algorithm in function platform_Design will return the designs with smallest N₂ but only meeting one of the two limits. In case we are not satisfied with the result or if neither of the two limits can be met, we can choose from the three options below: 1. Go back to Step 7 to decrease the values of ω_{2_min}. After that, repeat Steps 8 to 14 again. This can be done only if a marginal power lower than ω₁ is acceptable, which partially compromises the goal of the design. 2. Go back to Step 6 to set up a smaller n_t (and therefore smaller n_0t) - that is, increasing overlapping among initially and later added experimental arms. The rationale is the later the new arms are added, the less likely we can find designs satisfying both limits defined in Step 7. After that, repeat Steps 8 to 14 again. This is only feasible if the situation allows us to change the timing of adding new arm. 3. Consider controlling for PWER instead of FWER as illustrated in the next example.

Example 2: the optimal two-period multiarm trial design with delayed arms, controlling for PWER

If users do not wish to control FWER, or if controlling for FWER can not be achieved given required power levels, then we recommend using the function platform Design(.) with the argument fwer replaced by pwer to plan for the K+M-experimental arm trial. If we plan to add 2 new experimental arms when 30 patients have already been enrolled in each of the 2 initial experimental arms, given the pair-wise type-I error controlled at 0.025 and the marginal power equal to 0.8, we can use the following code to get the design parameters. Here, five optimal designs are recommended and each row is a individual design. Notably, we can save 87 patients with this design compared to 2 separate multiarm trials.

design2 <- platform_design(nt=30, K=2, M=2, pwer=0.025, marginal.power=0.8, delta=0.4,seed=123)
design2$designs

Fig 3. Recommended optimal designs when nt=30 in a 2+2-trial

The main difference between using pwer instead of fwer in platform Design(.) is that it does not use the Dunnett method to derive critical values, instead, it calculates that directly from the user-defined pair-wise type I error. Notably, the upper limit S for the total sample size N₂ that used to find the admissible set when controlling for PWER, is constructed using the multiarm trials (one K- and one M-experimental arm trial) which are also controlling for PWER in the function platform Design(.). The sample sizes for the multiarm trials (controlling for PWER) can also be calculated with the function one stage multiarm(.). Other aspects of the algorithms are similar between the two applications of the platform Design(.) function.

Example 3: using platform_design2(), a faster version of platform_design().

The function platform_design2() is a faster version of platform_design(). It adopts a more efficient algorithm to find optimal design(s) for a two-period K+M experimental arm platform trial by searching for the optimal design starting from the maximum possible N_2 value. The usage of this function is exactly the same as with platform_design(), except platform_design2() returns NULL for $design when optimal design does not exist and will not provide the sub-optimal design which only satisfying the minimum power level requirement for the disjunctive power. When optimal design exists, the results from the two functions are the same.

The two code chunks below can be used to compare time used to find the optimal design using platform_design2() and platform_design().

start_time <- Sys.time()
test <- platform_design2(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
end_time <- Sys.time()
end_time - start_time
# Time difference of 41.85487 secs

start_time <- Sys.time()
test2 <- platform_design(nt = 30, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
end_time <- Sys.time()
end_time - start_time
# Time difference of 8.188013 mins

Notably, the maximum time-saving is with the situations when the optimal design does not exist given the conditions specified by the user.

start_time <- Sys.time()
platform_design2(nt = 50, K = 2, M = 2, fwer = 0.025, marginal.power = 0.8, delta = 0.4)
#> The lower limits of marginal and disjunctive power cannot be met at the same time.
#> $design_Karm
#>   K  n1 n0_1  N1 z_alpha1 FWER1   z_beta1    Power1      cor0 delta
#> 1 2 101  143 345 2.220604 0.025 0.8416212 0.9222971 0.4142136   0.4
#> 
#> $designs
#> NULL
end_time <- Sys.time()

# Time difference
end_time - start_time
#> Time difference of 0.7059834 secs

Author(s)

Xiaomeng Yuan, Haitao Pan

References

1. Pan, H., Yuan, X. and Ye, J. (2022). An optimal two-period multiarm platform design with new experimental arms added during the trial. Submitted.

2. Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a control. Journal of the American Statistical Association, 50(272), 1096-1121.