Package 'squant'

Title: Subgroup Identification Based on Quantitative Objectives
Description: A subgroup identification method for precision medicine based on quantitative objectives. The method handles continuous, binary and survival endpoints in both the prognostic and the predictive case. In the predictive case, it aims to identify a subgroup in which treatment is better than control by at least a pre-specified or auto-selected constant. In the prognostic case, it aims to identify a subgroup whose response is at least as good as a pre-specified or auto-selected constant. The derived signature is a linear combination of predictors, and the selected subgroup consists of the subjects with signature > 0. The false discovery rate when no true subgroup exists is controlled at a user-specified level.
Authors: YAN SUN [aut, cre, cph], LING CHENG [aut], A.S. HEDAYAT [aut]
Maintainer: YAN SUN <[email protected]>
License: GPL-3
Version: 1.1.7
Built: 2024-12-19 06:54:16 UTC
Source: CRAN

SQUANT performance evaluation

Description

eval_squant evaluates the subgroup identification performance.

Usage

eval_squant(
  yvar,
  censorvar,
  trtvar,
  trtcd = 1,
  dir,
  type,
  data,
  squant.out,
  brief = FALSE
)

Arguments

yvar

A character. The response variable name in the data. The corresponding column in the data should be numeric.

censorvar

A character or NULL. The event indicator variable name in the data. The corresponding column in the data should be coded 0 (censored) or 1 (event). Use NULL when the response is not time-to-event.

trtvar

A character or NULL. The treatment variable name in the data for the predictive case. The corresponding column in the data should contain the treatment assignments, and can be either numeric or character. Use NULL for the prognostic case.

trtcd

The code for the treatment arm for the predictive case, e.g. trtcd="treatment" or trtcd=1, etc.

dir

A character, "larger" or "smaller". When dir == "larger", larger response is preferred for the target subgroup. In the predictive case, it means the derived signature from squant selects patients satisfying E(Y|X,TRT)-E(Y|X,CTRL)>=quant. In the prognostic case, it means the derived signature from squant selects patients satisfying E(Y|X)>=quant. When dir == "smaller", smaller response is preferred for the target subgroup. In the predictive case, it means the derived signature from squant selects patients satisfying E(Y|X,CTRL)-E(Y|X,TRT)>=quant. In the prognostic case, it means the derived signature from squant selects patients satisfying E(Y|X)<=quant.

type

The response type. Use "s" for survival, "b" for binary, and "c" for continuous.

data

The data frame for performance evaluation of the derived signature.

squant.out

The squant object (the output of the squant function) whose signature will be applied to the specified data.

brief

A logical value, TRUE or FALSE. When TRUE, only the most important p value will be reported.
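
The column requirements listed above can be checked before calling eval_squant. The following is a minimal sketch in base R; the helper check_inputs and the toy data frame surv are hypothetical and are not part of the squant package.

```r
# Hypothetical helper (not part of squant): basic checks that mirror
# the argument descriptions for yvar, censorvar, trtvar and trtcd.
check_inputs <- function(data, yvar, censorvar = NULL, trtvar = NULL, trtcd = 1) {
  # The response column must be numeric.
  stopifnot(is.data.frame(data), is.numeric(data[[yvar]]))
  if (!is.null(censorvar)) {
    # The event indicator must be coded 0 (censored) or 1 (event).
    stopifnot(all(data[[censorvar]] %in% c(0, 1)))
  }
  if (!is.null(trtvar)) {
    # The treatment code must occur in the treatment column.
    stopifnot(trtcd %in% data[[trtvar]])
  }
  invisible(TRUE)
}

# Toy survival data with a character status column (made up for illustration).
surv <- data.frame(time = c(5.2, 3.1, 8.4),
                   status = c("event", "censored", "event"))

# Recode the status to the required 0/1 event indicator.
surv$event <- as.integer(surv$status == "event")

check_inputs(surv, yvar = "time", censorvar = "event")
```

A call such as check_inputs(surv, yvar = "status") stops with an error, because the status column is not numeric.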

Details

This function evaluates the subgroup identification performance through applying the derived signature (the squant object) to a specified dataset. Note that when the specified dataset is the same as the training set, the performance is usually over-optimistic and is subject to over-fitting. Ideally, use an independent testing set to have an honest evaluation of the performance.
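
As an illustration of the point above, the sketch below holds out part of a simulated dataset and evaluates the signature on both the training and the held-out subjects. The 70/30 split, the sample size and the number of covariates are arbitrary choices made for this sketch. The check on a NULL squant.fit is an assumption about how a fit without a derived signature is represented, not documented behaviour.

```r
set.seed(888)
n <- 400
xnames <- paste0("x", 1:20)
x <- as.data.frame(matrix(rnorm(n * 20), n, 20))
names(x) <- xnames
trt <- sample(0:1, size = n, replace = TRUE)
y <- x[, 1] + x[, 2] * trt + rnorm(n)
data <- cbind(y = y, trt = trt, x)

# Hold out 30% of the subjects for an honest evaluation.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- data[train_idx, ]
test  <- data[-train_idx, ]

if (requireNamespace("squant", quietly = TRUE)) {
  fit <- squant::squant(yvar = "y", censorvar = NULL, xvars = xnames,
                        trtvar = "trt", trtcd = 1, data = train, type = "c",
                        dir = "larger", FDR = 0.1, progress = FALSE)
  # Assumption: squant.fit is NULL when no signature could be derived.
  if (!is.null(fit$squant.fit)) {
    # Evaluation on the training data is usually over-optimistic.
    eval_train <- squant::eval_squant(yvar = "y", censorvar = NULL,
                                      trtvar = "trt", trtcd = 1, dir = "larger",
                                      type = "c", data = train, squant.out = fit)
    # Evaluation on the held-out data gives a more honest picture.
    eval_test <- squant::eval_squant(yvar = "y", censorvar = NULL,
                                     trtvar = "trt", trtcd = 1, dir = "larger",
                                     type = "c", data = test, squant.out = fit)
    print(eval_train$inter.pval)
    print(eval_test$inter.pval)
  }
}
```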

Value

An object of class "eval_squant". A list containing the following elements.

inter.pval

Treatment*subgroup Interaction p value (predictive case only, adjusted for prognostic markers if any).

pos.group.pval

The p value of the trt difference in the selected positive group (predictive case only, adjusted for prognostic markers if any).

neg.group.pval

The p value of the trt difference in the negative group (predictive case only, adjusted for prognostic markers if any).

pval

The p value of group comparison (prognostic case only).

group.stats

The performance of each arm by group (predictive case) or the performance of each group (prognostic case).

data.pred

The data with the predicted subgroup in the last column.

Examples

#toy example#
set.seed(888)
x=as.data.frame(matrix(rnorm(200),100,2))
names(x) = c("x1", "x2")
trt = sample(0:1, size=100, replace=TRUE)
y= 2*x[,2]*trt+rnorm(100)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=c("x1", "x2"),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=2, n.cv = 10, FDR = 0.1, progress=FALSE)



#predictive case with continuous response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
trt = sample(0:1, size=200, replace=TRUE)
y=x[,1]+x[,2]*trt+rnorm(200)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=paste("x", 1:100,sep=""),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res
#fitted signature#
res$squant.fit
#performance of the identified subgroup#
#including:
#  interaction p value,
#  p value of trt difference in positive group,
#  p value of trt difference in negative group,
#  and stats for each arm in each group.
res$performance
#interpretation#
res$interpretation1
res$interpretation2

#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar=NULL, trtvar="trt", trtcd=1, dir="larger",
                       type="c", data=data, squant.out=res, brief=FALSE)
#plot the subgroups#
plot(res, trt.name="Trt", ctrl.name="Ctrl")
plot(eval.res, trt.name="Trt", ctrl.name="Ctrl")



#prognostic case with survival response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
y=10*(10+x[,1]+rnorm(200))
data = cbind(y=y, x)
data$event = sample(c(rep(1,150),rep(0,50)))
res = squant(yvar="y", censorvar="event", xvars=paste("x", 1:100,sep=""),
             trtvar=NULL, trtcd=NULL, data=data, type="s", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res

#fitted signature#
res$squant.fit
#performance of the identified subgroup#
res$performance
#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar="event", trtvar=NULL, trtcd=NULL, dir="larger",
                       type="s", data=data, squant.out=res, brief=FALSE)

#plot the subgroups#
plot(res, trt.name=NULL, ctrl.name=NULL)
plot(eval.res, trt.name=NULL, ctrl.name=NULL)

Plot eval_squant result

Description

plot plots the subgroup identification performance.

Usage

## S3 method for class 'eval_squant'
plot(x, trt.name = "Trt", ctrl.name = "Ctrl", ...)

Arguments

x

An eval_squant object. The output of eval_squant function.

trt.name

The name used on plot for the treatment arm.

ctrl.name

The name used on plot for the control arm.

...

Ignored.

Details

An interaction plot is plotted for the predictive case and a group plot is plotted for the prognostic case.

Value

A ggplot.

Examples

#toy example#
set.seed(888)
x=as.data.frame(matrix(rnorm(200),100,2))
names(x) = c("x1", "x2")
trt = sample(0:1, size=100, replace=TRUE)
y= 2*x[,2]*trt+rnorm(100)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=c("x1", "x2"),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=2, n.cv = 10, FDR = 0.1, progress=FALSE)



#predictive case with continuous response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
trt = sample(0:1, size=200, replace=TRUE)
y=x[,1]+x[,2]*trt+rnorm(200)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=paste("x", 1:100,sep=""),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res
#fitted signature#
res$squant.fit
#performance of the identified subgroup#
#including:
#  interaction p value,
#  p value of trt difference in positive group,
#  p value of trt difference in negative group,
#  and stats for each arm in each group.
res$performance
#interpretation#
res$interpretation1
res$interpretation2

#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar=NULL, trtvar="trt", trtcd=1, dir="larger",
                       type="c", data=data, squant.out=res, brief=FALSE)
#plot the subgroups#
plot(res, trt.name="Trt", ctrl.name="Ctrl")
plot(eval.res, trt.name="Trt", ctrl.name="Ctrl")



#prognostic case with survival response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
y=10*(10+x[,1]+rnorm(200))
data = cbind(y=y, x)
data$event = sample(c(rep(1,150),rep(0,50)))
res = squant(yvar="y", censorvar="event", xvars=paste("x", 1:100,sep=""),
             trtvar=NULL, trtcd=NULL, data=data, type="s", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res

#fitted signature#
res$squant.fit
#performance of the identified subgroup#
res$performance
#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar="event", trtvar=NULL, trtcd=NULL, dir="larger",
                       type="s", data=data, squant.out=res, brief=FALSE)

#plot the subgroups#
plot(res, trt.name=NULL, ctrl.name=NULL)
plot(eval.res, trt.name=NULL, ctrl.name=NULL)

Plot SQUANT result

Description

plot plots the subgroup identification performance.

Usage

## S3 method for class 'squant'
plot(x, trt.name = "Trt", ctrl.name = "Ctrl", ...)

Arguments

x

A squant object. The output of squant function.

trt.name

The name used on plot for the treatment arm.

ctrl.name

The name used on plot for the control arm.

...

Ignored.

Details

An interaction plot is plotted for the predictive case and a group plot is plotted for the prognostic case.

Value

A ggplot.

Examples

#toy example#
set.seed(888)
x=as.data.frame(matrix(rnorm(200),100,2))
names(x) = c("x1", "x2")
trt = sample(0:1, size=100, replace=TRUE)
y= 2*x[,2]*trt+rnorm(100)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=c("x1", "x2"),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=2, n.cv = 10, FDR = 0.1, progress=FALSE)



#predictive case with continuous response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
trt = sample(0:1, size=200, replace=TRUE)
y=x[,1]+x[,2]*trt+rnorm(200)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=paste("x", 1:100,sep=""),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res
#fitted signature#
res$squant.fit
#performance of the identified subgroup#
#including:
#  interaction p value,
#  p value of trt difference in positive group,
#  p value of trt difference in negative group,
#  and stats for each arm in each group.
res$performance
#interpretation#
res$interpretation1
res$interpretation2

#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar=NULL, trtvar="trt", trtcd=1, dir="larger",
                       type="c", data=data, squant.out=res, brief=FALSE)
#plot the subgroups#
plot(res, trt.name="Trt", ctrl.name="Ctrl")
plot(eval.res, trt.name="Trt", ctrl.name="Ctrl")



#prognostic case with survival response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
y=10*(10+x[,1]+rnorm(200))
data = cbind(y=y, x)
data$event = sample(c(rep(1,150),rep(0,50)))
res = squant(yvar="y", censorvar="event", xvars=paste("x", 1:100,sep=""),
             trtvar=NULL, trtcd=NULL, data=data, type="s", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res

#fitted signature#
res$squant.fit
#performance of the identified subgroup#
res$performance
#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar="event", trtvar=NULL, trtcd=NULL, dir="larger",
                       type="s", data=data, squant.out=res, brief=FALSE)

#plot the subgroups#
plot(res, trt.name=NULL, ctrl.name=NULL)
plot(eval.res, trt.name=NULL, ctrl.name=NULL)

SQUANT prediction

Description

predict assigns a subgroup to each individual in a new dataset.

Usage

## S3 method for class 'squant'
predict(object, data, ...)

Arguments

object

The squant object (the output of the squant function) whose signature will be applied to the specified data.

data

The data frame for prediction.

...

Ignored.

Details

This function assigns a subgroup to each individual in a new dataset based on the derived signature contained in the squant object.

Value

A data frame with the predicted subgroup in the last column.
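
The assignment rule can be sketched in base R. The package states that the signature is a linear combination of predictors and that the selected subgroup has signature > 0. The coefficient values, their names, and the 0/1 coding of the subgroup column below are assumptions made for illustration; a real fit stores its own coefficients in squant.fit.

```r
# Hypothetical signature: an intercept plus coefficients for two predictors.
# These numbers are made up; they do not come from a fitted squant object.
signature_coef <- c("(Intercept)" = -0.5, x1 = 0, x2 = 1)

newdata <- data.frame(x1 = c(0.3, -1.2, 2.0),
                      x2 = c(1.1,  0.2, -0.4))

# Linear combination of the predictors plus the intercept.
score <- drop(signature_coef["(Intercept)"] +
                as.matrix(newdata[, c("x1", "x2")]) %*% signature_coef[c("x1", "x2")])

# Subjects with signature > 0 form the selected subgroup (coded 1 here).
newdata$subgroup <- as.integer(score > 0)
newdata
```

With a fitted object res returned by squant, the corresponding call is predict(res, data = newdata).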


The SQUANT method

Description

squant conducts subgroup identification based on quantitative criteria.

Usage

squant(
  yvar,
  censorvar = NULL,
  xvars,
  trtvar = NULL,
  trtcd = 1,
  data,
  type = "c",
  weight = NULL,
  dir = "larger",
  quant = NULL,
  xvars.keep = NULL,
  alpha = 1,
  fold = 5,
  n.cv = 50,
  FDR = 0.15,
  progress = TRUE
)

Arguments

yvar

A character. The response variable name in the data. The corresponding column in the data should be numeric.

censorvar

A character or NULL. The event indicator variable name in the data. The corresponding column in the data should be coded 0 (censored) or 1 (event). Use NULL when the response is not time-to-event.

xvars

A vector of characters. The covariates (predictors) variable names in the data. The corresponding columns in the data should be numeric.

trtvar

A character or NULL. The treatment variable name in the data for the predictive case. The corresponding column in the data should contain the treatment assignments, and can be either numeric or character. Use NULL for the prognostic case.

trtcd

The code for the treatment arm for the predictive case, e.g. trtcd="treatment" or trtcd=1, etc.

data

The data frame for training.

type

The response type. Use "s" for survival, "b" for binary, and "c" for continuous.

weight

The weight of each observation. Must be a numeric vector of positive values, or NULL (equivalent to a weight of 1 for every observation).

dir

A character, "larger" or "smaller". When dir == "larger", larger response is preferred for the target subgroup. In the predictive case, it means selecting patients satisfying E(Y|X,TRT)-E(Y|X,CTRL)>=quant. In the prognostic case, it means selecting patients satisfying E(Y|X)>=quant. When dir == "smaller", smaller response is preferred for the target subgroup. In the predictive case, it means selecting patients satisfying E(Y|X,CTRL)-E(Y|X,TRT)>=quant. In the prognostic case, it means selecting patients satisfying E(Y|X)<=quant.

quant

A numeric value or NULL. The quantitative subgroup selection criterion. Please see dir. When NULL, the program will automatically select the best quant based on cross validation.

xvars.keep

A character vector. The names of variables that we want to keep in the final model.

alpha

The same alpha as in glmnet. alpha=1 is the lasso penalty.

fold

A numeric value. The number of folds for internal cross validation for variable selection.

n.cv

A numeric value. The number of different values of quant used for cross validation. It is also the number of cross-validation runs used for variable selection.

FDR

A numeric value. The level of FDR control for variable selection and the entire training process.

progress

A logical value (TRUE/FALSE). Whether to display the program progress.

Details

This is the main function of SQUANT for training subgroup signatures. The method handles continuous, binary and survival endpoints in both the prognostic and the predictive case. In the predictive case, it aims to identify a subgroup in which treatment is better than control by at least a pre-specified or auto-selected constant. In the prognostic case, it aims to identify a subgroup whose response is at least as good as a pre-specified or auto-selected constant. The derived signature is a linear combination of predictors, and the selected subgroup consists of the subjects with signature > 0. The false discovery rate when no true subgroup exists is strictly controlled at a user-specified level.
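
To make the predictive objective concrete, the sketch below computes the target subgroup in a simulation where the true treatment effect is known. The model and the threshold quant = 1 are hypothetical choices for illustration; the code does not call squant and only shows which subjects satisfy the criterion described under dir.

```r
set.seed(888)
n <- 1000
x2 <- rnorm(n)

# Under the model y = 2 * x2 * trt + noise, the conditional treatment effect
# is E(Y | X, TRT) - E(Y | X, CTRL) = 2 * x2.
trt_effect <- 2 * x2

# Hypothetical threshold chosen for illustration.
quant <- 1

# dir = "larger": treatment beats control by at least quant.
target_larger <- trt_effect >= quant

# dir = "smaller": control exceeds treatment by at least quant.
target_smaller <- -trt_effect >= quant

# Proportion of subjects in each target subgroup.
mean(target_larger)
mean(target_smaller)
```

In this model the dir = "larger" target reduces to x2 >= 0.5 and the dir = "smaller" target to x2 <= -0.5, so a good signature should put most of its weight on x2.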

Value

An object of class "squant". A list containing the following elements.

squant.fit

The fitted signature from training, which is the coefficients of the linear combination of predictors plus an intercept.

data.pred

The training data with the predicted subgroup in the last column.

performance

The output of eval_squant (excluding the data.pred). The performance of subgroup identification. In the predictive case, the performance includes the interaction p value, the p value of the trt difference in the selected positive group, the p value of the trt difference in the unselected negative group (all adjusted for prognostic markers if any) and the stats for each arm in each group. In the prognostic case, the performance includes p value of group comparison and the stats of each group.

d.sel

Closely related to quant. Please see the elements interpretation1 and interpretation2.

min.diff, threshold

Please see the elements interpretation1 and interpretation2.

xvars.top

The ordered variable importance list.

FDR.min

The minimum achievable FDR threshold so that a signature can be derived. This is useful when a pre-specified FDR does not lead to a signature, in which case the FDR.min can be used instead.

prog.adj

Prognostic effect contributed by xvars.adj for each subject (predictive case only).

xvars.adj

Important prognostic markers to adjust in the model (predictive case only).

interpretation1

Interpretation of the result.

interpretation2

Interpretation of the result.

References

Yan Sun, Samad Hedayat. Subgroup Identification based on Quantitative Objectives. (submitted)

Examples

#toy example#
set.seed(888)
x=as.data.frame(matrix(rnorm(200),100,2))
names(x) = c("x1", "x2")
trt = sample(0:1, size=100, replace=TRUE)
y= 2*x[,2]*trt+rnorm(100)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=c("x1", "x2"),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=2, n.cv = 10, FDR = 0.1, progress=FALSE)



#predictive case with continuous response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
trt = sample(0:1, size=200, replace=TRUE)
y=x[,1]+x[,2]*trt+rnorm(200)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=paste("x", 1:100,sep=""),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res
#fitted signature#
res$squant.fit
#performance of the identified subgroup#
#including:
#  interaction p value,
#  p value of trt difference in positive group,
#  p value of trt difference in negative group,
#  and stats for each arm in each group.
res$performance
#interpretation#
res$interpretation1
res$interpretation2

#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar=NULL, trtvar="trt", trtcd=1, dir="larger",
                       type="c", data=data, squant.out=res, brief=FALSE)
#plot the subgroups#
plot(res, trt.name="Trt", ctrl.name="Ctrl")
plot(eval.res, trt.name="Trt", ctrl.name="Ctrl")



#prognostic case with survival response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
y=10*(10+x[,1]+rnorm(200))
data = cbind(y=y, x)
data$event = sample(c(rep(1,150),rep(0,50)))
res = squant(yvar="y", censorvar="event", xvars=paste("x", 1:100,sep=""),
             trtvar=NULL, trtcd=NULL, data=data, type="s", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res

#fitted signature#
res$squant.fit
#performance of the identified subgroup#
res$performance
#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar="event", trtvar=NULL, trtcd=NULL, dir="larger",
                       type="s", data=data, squant.out=res, brief=FALSE)

#plot the subgroups#
plot(res, trt.name=NULL, ctrl.name=NULL)
plot(eval.res, trt.name=NULL, ctrl.name=NULL)