--- title: "**iDOVE -- $\\text{\\underline{D}}$urability $\\text{\\underline{O}}$f $\\text{\\underline{V}}$accine $\\text{\\underline{E}}$fficacy Against SARS-CoV-2 $\\text{\\underline{I}}$nfection**" author: "Yu Gu, Shannon T. Holloway, and Dan-Yu Lin" date: September 9, 2021 output: rmarkdown::pdf_document vignette: > %\VignetteIndexEntry{iDOVE-vignette} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) RcppArmadillo::armadillo_throttle_cores(2) opt <- options() options(continue=" ", width=70, prompt=" ") on.exit(options(opt)) library(iDOVE, quietly=TRUE) ``` ## Introduction \textbf{iDOVE} is an R package for assessing potentially time-varying vaccine efficacy (VE) against SARS-CoV-2 infection under staggered enrollment of participants and time-varying community transmission, allowing crossover of placebo volunteers to the vaccine arm before the end of the study. Infection is not directly observed, but is rather known to occur between two examinations. \textbf{iDOVE} implements the methodology of Lin et al. (2021) for estimating time-varying VE against such interval-censored events, representing the log hazard ratio for the vaccine effect by a piece-wise linear function of time since vaccination. The special case of right-censored events is implemented by dove2() in the DOVE package. \textbf{iDOVE} inputs a rectangular data set with the following information: \begin{itemize} \item \textbf{Entry time}: The time when the participant enters the trial. \item \textbf{Left interval time}: The last examination time when the test is negative. \item \textbf{Right interval time}: The first examination time when the test is positive. \item \textbf{Vaccination time}: The time when vaccination takes place. \item \textbf{Covariates}: Baseline covariates (e.g., priority group, age, sex, ethnicity). \end{itemize} \noindent Of note, an arbitrary number of baseline covariates can be included, and all of the time variables are measured from the start of the trial and are specified in units of whole days. \vspace{.15in} The primary analysis tool of the package is \textit{idove()}, which returns the estimated hazard ratio for each baseline covariate, the estimated VE in reducing the attack rate (cumulative incidence), the estimated VE in reducing the hazard rate (instantaneous risk), and the estimated VE in reducing the attack rates over successive time periods. The standard errors and 95\% confidence intervals are also provided. \vspace{.15in} In addition, the package includes three convenience functions: \textit{intCens()}, which is used to wrap all of the input time variables together and is part of the model statement of \textit{idove()}; \textit{print()}, which displays the primary results of the analysis; and \textit{plot()}, which generates plots of the estimated VE curves. Finally, a simulated dataset is provided to illustrate the use of the software. ## Functions ### \textit{intCens()} This convenience function is used as the left-hand side of a formula object for the sole purpose of simplifying the specification of required input variables: entry time, left interval time, right interval time, and vaccination time. This function is not intended to be used as a stand-alone feature. For completeness, the function ensures that the input data obey basic constraints and returns the data in a predictable format for use in internal functions. \vspace{.15in} The usage is ```{r eval=FALSE} intCens(entry_time, left_time, right_time, vaccination_time) ``` where \texttt{entry\_time} is the time when the participant enters the trial; \texttt{left\_time} is the last examination time when the test is negative; \texttt{right\_time} is the first examination time when the test is positive (NA or Inf if the participant is never tested positive during the clinical trial); \texttt{vaccination\_time} is the time when vaccination takes place (NA or Inf if the participant is not vaccinated during the trial). Note that all times must be provided in units of whole days. ### \textit{idove()} This function is the primary tool of \textbf{iDOVE}. The value object returned is an S3 object of class iDOVE that contains the estimated hazard ratio for each baseline covariate, the estimated VE in reducing the attack rate, $VE_a(t)$, and in reducing the hazard rate, $VE_h(t)$, where $t$ is time elapsed since vaccination, as well as the estimated VE in reducing the attack rates over $m$ successive time periods, $VE_a(0,t_1), VE_a(t_1,t_2), \ldots, VE_a(t_{m-1},t_m)$. By definition, $VE_a(0,t)=VE_a(t)$. \vspace{.15in} The function call takes the following form: ```{r usage, eval=FALSE} idove(formula, data, constantVE = FALSE, plots = TRUE, changePts = NULL, timePts = NULL, tol = 0.0001, maxit = 2000) ``` where \begin{itemize} \item \texttt{formula} is a model statement. See below for further details. \item \texttt{data} is a data.frame object containing all required data as previously described. \item \texttt{constantVE} is a logical object specifying the VE trend after the last change point. If TRUE, VE is assumed to be constant after the last change point; otherwise, VE is allowed to vary after the last change point. \item \texttt{plots} is a logical object indicating whether graphical forms of the estimated $VE_a(t)$ and $VE_h(t)$ curves are to be generated. \item \texttt{changePts} is an optional integer vector to specify the change points for the vaccine effect on the hazard ratio. If no change points are provided, one change point will automatically be selected among Weeks 4, 5, 6, 7, 8 by the Akaike information criterion (AIC) to capture the ramping vaccine effect after the initial shot. \item \texttt{timePts} is an optional integer vector to specify the time points $(t_1, t_2, \ldots, t_m)$ for partitioning the study period in the estimation of VE on the attack rates over successive time periods. If not provided, a default sequence $t_1, 2t_1, 3t_1, \dots $ will be used, where $t_1$ is the first change point. The sequence ends at the maximum of the finite left and right interval times from all participants. \item \texttt{tol} is the convergence threshold for the EM algorithm. \item \texttt{maxit} is the maximum number of iterations for the EM algorithm. \end{itemize} \vspace{.15in} The model statement is a formula object. The left side is an object returned by the \textit{intCens()} function and specifies all time variables. The right side contains all baseline covariates; a model without baseline covariates is allowed. Categorical baseline covariates can be specified, and all other categories are compared to the first category. The \texttt{formula} input takes the following general structure ```{r intCens-usage, eval=FALSE} intCens(entry_time, left_time, right_time, vaccination_time) ~ covariates ``` where 'event\_time', 'left\_time', 'right\_time', 'vaccination\_time', and 'covariates' are place holders indicating the data that are to be provided; they should be replaced by the appropriate variable names in the header of the input data. \vspace{.15in} The two VE measures, $VE_a(t)$ and $VE_h(t)$, are estimated up to the maximum of all finite left and right ends of the time intervals. To ensure stable estimates, we suggest placing change points at times (since vaccination) when there are relatively large numbers of events and not placing change points at the right tail. \vspace{.15in} ### \textit{plot()} When provided the value object returned by \textit{idove()}, this convenience function creates/recreates plots of the estimated VE curves in reducing the attack rate, $VE_a(t)$, and in reducing the hazard rate, $VE_h(t)$. ### \textit{print()} When provided the value object returned by \textit{idove()}, the tabular results are displayed. ## Examples To illustrate the call structure and results of \textit{idove()}, we use the dataset provided with the package, idoveData. This dataset was simulated under a blinded, priority-tier dependent crossover design with a ramping vaccine effect between dose 1 and dose 2 and contains the following observations for each of the 40,000 participants: \begin{itemize} \item \textbf{entry.time}: The entry time in days. \item \textbf{left.time}: The left end of the time interval in days. \item \textbf{right.time}: The right end of the time interval in days. \item \textbf{vaccine.time}: The time of vaccination in days. \item \textbf{priority}: A composite baseline risk score taking values 1-5. \item \textbf{sex}: A binary indicator of sex (male/female). \end{itemize} \vspace{.15in} The data can be loaded in the usual way ```{r data-load} data(idoveData) ``` ```{r data-head} head(idoveData) ``` \vspace{.15in} Consider the summary statistics ```{r data-summary} summary(idoveData) summary(idoveData$right.time[is.finite(idoveData$right.time)]) ``` We can see that participants were enrolled in the study over a 4-month period (0 $\le$ entry.time $\le 120$ days), the follow-up time ended on day 315 (left.time and finite right.time $\le$ 315 days), and more than $75\%$ of the participants were never tested positive during the follow-up (right.time = Inf indicates that a participant did not test positive during the course of the trial). In addition, the priority (risk) score is evenly distributed across participants, who are equally distributed between the two sex groups. In this analysis, we will include in our model statement both baseline covariates, priority and sex. \vspace{.25in} In the first example, we set Week 4 as the change point and assume a potentially waning VE after 4 weeks. We want to estimate $VE_a$ over 0-4, 4-16, 16-28, 28-40 weeks. Note that all times must be provided in the unit of integer days. The function call takes the following form ```{r idove-noaic, fig.show='hide', echo=TRUE, eval=FALSE} model <- intCens(entry.time, left.time, right.time, vaccine.time) ~ priority + sex result1 <- idove(formula = model, data = idoveData, changePts = 4*7, timePts = c(4, 16, 28, 40)*7) ``` ```{r idove-noaic-read, echo = FALSE, eval = TRUE} result1 <- readRDS(file="result1.rds") ``` \begin{verbatim} ## changePts: {28} ## performing nonparametric maximum likelihood ## Iteration 100 : difference = 0.000140 ## EM algorithm converged after 118 iterations ## Number of subjects: 40000 ## Number of unique time points: 313 ## Log-likelihood at final estimates: -10880.11 ## PL converged after 2 iterations ## PL converged after 2 iterations ## PL converged after 2 iterations ## PL converged after 2 iterations \end{verbatim} \vspace{.15in} The function returns an S3 object of class iDOVE, which contains a list object with the following information. \vspace{.1in} \noindent \textbf{call}: The unevaluated call. ```{r idove-noaic-return-call} result1$call ``` \vspace{.15in} \noindent \textbf{changePts}: The changePts of the analysis. ```{r idove-noaic-return-changePts} result1$changePts ``` \vspace{.15in} \noindent \textbf{Covariate Effects}: The estimated (log) hazard ratio of each covariate, together with the estimated standard error, the $95\%$ confidence interval, and the two-sided p-value for testing no covariate effect. ```{r idove-noaic-return-covariates} result1$covariates ``` When no baseline covariates are provided, this element will be NA. \vspace{.15in} \noindent \textbf{Vaccine Efficacy}: Element \textbf{\$VE\_a} contains the daily VE estimate in reducing the attack rate, together with its standard error and the $95\%$ confidence interval. Element \textbf{\$VE\_h} contains the daily VE estimate in reducing the hazard rate, together with its standard error and the $95\%$ confidence interval. ```{r idove-noaic-return-VE} head(result1$vaccine$VE_a) tail(result1$vaccine$VE_a) head(result1$vaccine$VE_h) tail(result1$vaccine$VE_h) ``` Element \textbf{\$VE\_period} contains the estimated VE in reducing the attack rate over each time period, its standard error, and the $95\%$ confidence interval. ```{r idove-naic-return-interval} result1$vaccine$VE_period ``` \vspace{.15in} The graphical depictions of $VE_a$ and $VE_h$ estimates are generated by default by \textit{idove()} and are shown in Figure \ref{fig:idoveFigs}. This figure can be regenerated using \textit{plot()} as follows: ```{r idove-noaic-plot, eval=FALSE} plot(x = result1) ``` ```{r idove-noaic-show-plot, echo=FALSE, out.width='49%',fig.cap='\\label{fig:idoveFigs}Plots auto-generated by \\textit{idove()}. On the left, the estimated $VE_a(t)$ curve (black) and its $95\\%$ confidence intervals (green) are shown as a function of the time since vaccination. On the right, the estimated $VE_h(t)$ curve (black) and its $95\\%$ confidence intervals (green) are shown as a function of the time since vaccination.' , fig.show='hold', fig.align='center'} knitr::include_graphics( path=c("idove1a.pdf","idove1b.pdf"), auto_pdf = getOption("knitr.graphics.auto_pdf", FALSE), dpi = NULL, error = getOption("knitr.graphics.error", TRUE) ) ``` \vspace{.15in} In the second example, we have the software use AIC to choose a change point among Weeks 4, 5, 6, 7, 8. We assume a constant VE after the change point, and thus only the constant VE is estimated. The function call takes the following form ```{r idove-constantVE, echo=TRUE, eval=FALSE} result2 <- idove(formula = model, data = idoveData, constantVE = TRUE) ``` ```{r idove-constantVE-read, echo = FALSE, eval = TRUE} result2 <- readRDS(file="result2.rds") ``` \begin{verbatim} ## constantVE selected ## changePts not given; using AIC to select from {28,35,42,49,56} ## performing nonparametric maximum likelihood ## evaluating change point: 28 ## EM algorithm converged after 28 iterations ## evaluating change point: 35 ## EM algorithm converged after 29 iterations ## evaluating change point: 42 ## EM algorithm converged after 29 iterations ## evaluating change point: 49 ## EM algorithm converged after 30 iterations ## evaluating change point: 56 ## EM algorithm converged after 29 iterations ## Day 28 (week 4) was selected as the change point by AIC ## Number of subjects: 40000 ## Number of unique time points: 313 ## Partial log-likelihood at final estimates: -10909.76 ## PL converged after 2 iterations ## PL converged after 2 iterations ## PL converged after 1 iterations \end{verbatim} \vspace{.15in} The function returns a list object containing the following items. \vspace{.1in} \noindent \textbf{Covariate Effects}: The estimated (log) hazard ratio of each covariate, together with the estimated standard error, the $95\%$ confidence interval, and the two-sided p-value for testing no covariate effect. ```{r idove-constantVE-return-covariates} result2$covariates ``` \vspace{.15in} \noindent \textbf{Vaccine Efficacy}: Element \textbf{\$VE} contains the estimated constant VE, together with its standard error and the $95\%$ confidence interval. ```{r idove-aic-return-VE} result2$vaccine$VE ``` \noindent \textbf{References} Lin, D-Y, Gu, Y., Zeng, D., Janes, H. E., and Gilbert, P. B. (2021). Evaluating vaccine efficacy against SARS-CoV-2 infection. Clinical Infectious Diseases, ciab630, https://doi.org/10.1093/cid/ciab630