Package 'targeted'

Title: Targeted Inference
Description: Various methods for targeted and semiparametric inference including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), assumption lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>).
Authors: Klaus K. Holst [aut, cre], Andreas Nordland [aut]
Maintainer: Klaus K. Holst <[email protected]>
License: Apache License (== 2.0)
Version: 0.5
Built: 2024-11-24 06:58:57 UTC
Source: CRAN

Help Index


AIPW estimator

Description

AIPW for the mean (and linear projections of the EIF) with missing observations

Usage

aipw(response_model, data, formula = ~1, missing_model, ...)

Arguments

response_model

Model for the response given covariates (ml_model or formula)

data

data.frame

formula

design specifying the OLS estimator with outcome given by the EIF

missing_model

Optional model for the missingness mechanism (ml_model or formula). By default the same design as the response_model is used.

...

arguments to cate

Examples

m <- lvm(y ~ x+z, r ~ x)
distribution(m, ~r) <- binomial.lvm()
## y0 equals y when observed (r=1) and is missing (NA) when r=0
transform(m, y0 ~ r+y) <- function(x) { x[x[,1]==0, 2] <- NA; x[,2] }
d <- sim(m, 1e3, seed=1)

aipw(y0 ~ x, data=d)

Assumption Lean inference for generalized linear model parameters

Description

Assumption lean inference via cross-fitting (Double ML). See <doi:10.1111/rssb.12504>.

Usage

alean(
  response_model,
  exposure_model,
  data,
  link = "identity",
  g_model,
  nfolds = 1,
  silent = FALSE,
  mc.cores,
  ...
)

Arguments

response_model

formula or ml_model object (formula => glm)

exposure_model

model for the exposure

data

data.frame

link

Link function (g)

g_model

Model for E[g(Y|A,W)|W]

nfolds

Number of folds

silent

suppress all messages and progress bars

mc.cores

Optional number of cores. If given, parallel::mcmapply is used instead of future.apply

...

additional arguments to future.apply::future_mapply

Details

Let Y be the response variable, A the exposure, and W the covariates. The target parameter is:

\Psi(P) = \frac{E(\mathrm{Cov}[A, g\{E(Y\mid A,W)\}\mid W])}{E\{\mathrm{Var}(A\mid W)\}}

The response_model is the model for E(Y|A,W), and the exposure_model is the model for E(A|W). The link argument specifies g.

Value

alean.targeted object

Author(s)

Klaus Kähler Holst

Examples

sim1 <- function(n, family=gaussian(), ...) {
   m <- lvm() |>
     distribution(~ y, binomial.lvm()) |>
     regression('a', value=function(l) l) |>
     regression('y', value=function(a,l) a + l)
     if (family$family=="binomial")
        distribution(m, ~a) <- binomial.lvm()
   sim(m, n)
}

library(splines)
f <- binomial()
d <- sim1(1e4, family=f)
e <- alean(response_model=ML(y ~ a + bs(l, df=3), family=binomial),
           exposure_model=ML(a ~ bs(l, df=3), family=f),
           data=d,
           link = "logit", mc.cores=1, nfolds=1)
e

e <- alean(response_model=ML(y ~ a + l, family=binomial),
           exposure_model=ML(a ~ l),
           data=d,
           link = "logit", mc.cores=1, nfolds=1)
e

AIPW (doubly robust) estimator for the Average Treatment Effect

Description

Augmented Inverse Probability Weighting estimator for the Average (Causal) Treatment Effect. All nuisance models are here parametric (glm). For a more general approach see the cate implementation. In this implementation the standard errors are correct even when the nuisance models are misspecified (the influence curve is calculated including the term coming from the parametric nuisance models). The estimate is consistent if either the propensity model or the outcome model / Q-model is correctly specified.

Usage

ate(
  formula,
  data = parent.frame(),
  weights,
  offset,
  family = stats::gaussian(identity),
  nuisance = NULL,
  propensity = nuisance,
  all,
  labels = NULL,
  ...
)

Arguments

formula

Formula (see details below)

data

data.frame

weights

optional frequency weights

offset

Optional offset (character or vector). Can also be specified in the formula.

family

Exponential family argument for outcome model

nuisance

outcome regression formula (Q-model)

propensity

propensity model formula

all

If TRUE all standard errors are calculated (default TRUE when exposure only has two levels)

labels

Optional treatment labels

...

Additional arguments to lower level functions

Details

The formula may either be specified as: response ~ treatment | nuisance-formula | propensity-formula

For example: ate(y~a | x+z+a | x*z, data=...)

Alternatively, as a list: ate(list(y~a, ~x+z, ~x*z), data=...)

Or using the nuisance (and propensity argument): ate(y~a, nuisance=~x+z, ...)

Value

An object of class 'ate.targeted' is returned. See targeted-class for more details about this class and its generic functions.

Author(s)

Klaus K. Holst

See Also

cate

Examples

m <- lvm(y ~ a+x, a~x)
distribution(m, ~y) <- binomial.lvm()
m <- ordinal(m, K=4, ~a)
transform(m, ~a) <- factor
d <- sim(m, 1e3, seed=1)
(a <- ate(y~a|a*x|x, data=d))
## ate(y~a, nuisance=~a*x, propensity=~x, ...)

# Comparison with randomized experiment
m0 <- cancel(m, a~x)
lm(y~a-1, sim(m0,2e4))

# Choosing a different contrast for the association measures
summary(a, contrast=c(2,4))

Calibration (training)

Description

Calibration of multi-class classification methods

Usage

calibration(
  pr,
  cl,
  weights = NULL,
  threshold = 10,
  method = "bin",
  breaks = nclass.Sturges,
  df = 3,
  ...
)

Arguments

pr

matrix with probabilities for each class

cl

class variable

weights

counts

threshold

do not calibrate if there are fewer than 'threshold' events

method

either 'isotonic' (pava), 'logistic', 'mspline' (monotone spline), 'bin' (local constant)

breaks

optional number of bins (only for method 'bin')

df

degrees of freedom (only for spline methods)

...

additional arguments to lower level functions

Details

...

Value

An object of class 'calibration' is returned. See calibration-class for more details about this class and its generic functions.

Author(s)

Klaus K. Holst

Examples

sim1 <- function(n, beta=c(-3, rep(.5,10)), rho=.5) {
 p <- length(beta)-1
 xx <- lava::rmvn0(n,sigma=diag(nrow=p)*(1-rho)+rho)
 y <- rbinom(n, 1, lava::expit(cbind(1,xx)%*%beta))
 d <- data.frame(y=y, xx)
 names(d) <- c("y",paste0("x",1:p))
 return(d)
}

set.seed(1)
beta <- c(-2,rep(1,10))
d <- sim1(1e4, beta=beta)
a1 <- NB(y ~ ., data=d)
a2 <- glm(y ~ ., data=d, family=binomial)
## a3 <- randomForest(factor(y) ~ ., data=d, family=binomial)

d0 <- sim1(1e4, beta=beta)
p1 <- predict(a1, newdata=d0)
p2 <- predict(a2, newdata=d0, type="response")
## p3 <- predict(a3, newdata=d0, type="prob")

c2 <- calibration(p2, d0$y, method="isotonic")
c1 <- calibration(p1, d0$y, breaks=100)
if (interactive()) {
  plot(c1)
  plot(c2,col="red",add=TRUE)
  abline(a=0, b=1)
  with(c1$xy[[1]], points(pred,freq,type="b", col="red"))
}

set.seed(1)
beta <- c(-2,rep(1,10))
dd <- lava::csplit(sim1(1e4, beta=beta), k=3)
mod <- NB(y ~ ., data=dd[[1]])
p1 <- predict(mod, newdata=dd[[2]])
cal <- calibration(p1, dd[[2]]$y)
p2 <- predict(mod, newdata=dd[[3]])
pp <- predict(cal, p2)
cc <- calibration(pp, dd[[3]]$y)
if (interactive()) {
  plot(cal)
  plot(cc, add=TRUE, col="blue")
}

calibration class object

Description

The function calibration returns an object of class calibration.

An object of class 'calibration' is a list with at least the following components:

stepfun

estimated step-functions (see stepfun) for each class

classes

the unique classes

model

model/method type (string)

xy

list of data.frame's with predictions (pr) and estimated probabilities of success (only for 'bin' method)

Value

objects of the S3 class 'calibration'

S3 generics

The following S3 generic functions are available for an object of class calibration:

predict

Apply calibration to new data.

plot

Plot the calibration curves (reliability plot).

print

Basic print method.

See Also

calibration, calibrate

Examples

## See example(calibration) for examples

Conditional Average Treatment Effect estimation

Description

Conditional Average Treatment Effect estimation via Double Machine Learning

Usage

cate(
  treatment,
  response_model,
  propensity_model,
  contrast = c(1, 0),
  data,
  nfolds = 5,
  type = "dml2",
  silent = FALSE,
  stratify = FALSE,
  mc.cores,
  ...
)

Arguments

treatment

formula specifying treatment and variables to condition on

response_model

formula or ml_model object (formula => glm)

propensity_model

formula or ml_model object (formula => glm)

contrast

treatment contrast (default 1 vs 0)

data

data.frame

nfolds

Number of folds

type

'dml1' or 'dml2'

silent

suppress all messages and progress bars

stratify

If TRUE the response_model will be stratified by treatment

mc.cores

Optional number of cores. If given, parallel::mcmapply is used instead of future.apply

...

additional arguments to future.apply::future_mapply

Value

cate.targeted object

Author(s)

Klaus Kähler Holst

Examples

sim1 <- function(n=1e4,
                 seed=NULL,
                 return_model=FALSE, ...) {
suppressPackageStartupMessages(require("lava"))
if (!is.null(seed)) set.seed(seed)
m <- lava::lvm()
regression(m, ~a) <- function(z1,z2,z3,z4,z5)
         cos(z1)+sin(z1*z2)+z3+z4+z5^2
regression(m, ~u) <- function(a,z1,z2,z3,z4,z5)
        (z1+z2+z3)*a + z1+z2+z3 + a
distribution(m, ~a) <- binomial.lvm()
if (return_model) return(m)
lava::sim(m, n, p=par)
}

d <- sim1(200)
e <- cate(a ~ z1+z2+z3, response=u~., data=d)
e

cross_validated class object

Description

The function cv returns an object of class cross_validated.

An object of class 'cross_validated' is a list with at least the following components:

cv

An array with the model score(s) evaluated for each fold, repetition, and model estimates (see estimate.default)

names

Names (character vector) of the models

rep

number of repetitions of the CV

folds

Number of folds of the CV

Value

objects of the S3 class 'cross_validated'

S3 generics

The following S3 generic functions are available for an object of class cross_validated:

coef

Extract average model scores from the cross-validation procedure.

print

Basic print method.

summary

Summary of the cross-validation procedure.


See Also

cv

Examples

## See example(cv) for examples

Conditional Relative Risk estimation

Description

Conditional Relative Risk estimation via Double Machine Learning

Usage

crr(
  treatment,
  response_model,
  propensity_model,
  importance_model,
  contrast = c(1, 0),
  data,
  nfolds = 5,
  type = "dml1",
  ...
)

Arguments

treatment

formula specifying treatment and variables to condition on

response_model

SL object

propensity_model

SL object

importance_model

SL object

contrast

treatment contrast (default 1 vs 0)

data

data.frame

nfolds

Number of folds

type

'dml1' or 'dml2'

...

additional arguments to SuperLearner

Value

cate.targeted object

Author(s)

Klaus Kähler Holst & Andreas Nordland

Examples

sim1 <- function(n=1e4,
                 seed=NULL,
                 return_model=FALSE, ...){
suppressPackageStartupMessages(require("lava"))
if (!is.null(seed)) set.seed(seed)
m <- lava::lvm()
distribution(m, ~x) <- gaussian.lvm()
distribution(m, ~v) <- gaussian.lvm(mean = 10)
distribution(m, ~a) <- binomial.lvm("logit")
regression(m, "a") <- function(v, x){.1*v + x}
distribution(m, "y") <- gaussian.lvm()
regression(m, "y") <- function(a, v, x){v+x+a*x+a*v*v}
if (return_model) return(m)
lava::sim(m, n = n)
}

d <- sim1(n = 2e3, seed = 1)
if (require("SuperLearner",quietly=TRUE)) {
  e <- crr(data=d,
           type = "dml2",
           treatment = a ~ v,
           response_model = ML(y~ a*(x + v + I(v^2))),
           importance_model = ML(D_ ~ v + I(v^2)),
           propensity_model = ML(a ~ x + v + I(v^2), family=binomial),
           nfolds = 2)
  summary(e) # the true parameters are c(1,1)
}

Cross-validation

Description

Generic cross-validation function

Usage

cv(
  models,
  data,
  response = NULL,
  nfolds = 5,
  rep = 1,
  weights = NULL,
  modelscore,
  seed = NULL,
  shared = NULL,
  args.pred = NULL,
  args.future = list(),
  mc.cores,
  ...
)

Arguments

models

List of fitting functions

data

data.frame or matrix

response

Response variable (vector or name of column in data).

nfolds

Number of folds (default 5). With nfolds=0 the data are split into the two halves 1:(n/2) and (n/2+1):n, and the last half is used for testing.

rep

Number of repetitions (default 1)

weights

Optional frequency weights

modelscore

Model scoring metric (default: MSE / Brier score). Must be a function with arguments: response, prediction, weights, ...

seed

Random seed (argument passed to future.apply::future_lapply)

shared

Function applied to each fold with the results sent to each model

args.pred

Optional arguments to prediction function (see details below)

args.future

Arguments to future.apply::future_mapply

mc.cores

Optional number of cores. parallel::mcmapply used instead of future

...

Additional arguments passed to the fitting functions in models

Details

models should be list of objects of class ml_model. Alternatively, each element of models should be a list with a fitting function and a prediction function.

The response argument can optionally be a named list where the name is then used as the name of the response argument in models. Similarly, if data is a named list with a single data.frame/matrix then this name will be used as the name of the data/design matrix argument in models.

Value

An object of class 'cross_validated' is returned. See cross_validated-class for more details about this class and its generic functions.

Author(s)

Klaus K. Holst

Examples

f0 <- function(data,...) lm(...,data=data)
f1 <- function(data,...) lm(Sepal.Length~Species,data=data)
f2 <- function(data,...) lm(Sepal.Length~Species+Petal.Length,data=data)
x <- cv(list(m0=f0,m1=f1,m2=f2),rep=10, data=iris, formula=Sepal.Length~.)
x

Extract design matrix

Description

Extract design matrix from data.frame and formula

Usage

design(formula, data, intercept = FALSE, rm_envir = FALSE, ...)

Arguments

formula

formula

data

data.frame

intercept

If FALSE (default) an intercept is not included

rm_envir

Remove environment

...

additional arguments (e.g., specials such as weights, offsets, subset)

Value

An object of class 'design'

Author(s)

Klaus Kähler Holst


Create a list from all combinations of input variables

Description

Similar to the expand.grid function, this function creates all combinations of the input arguments but returns the result as a list.

Usage

expand.list(...)

Arguments

...

input variables

Value

list

Author(s)

Klaus Kähler Holst

Examples

expand.list(x=2:4, z=c("a","b"))

ML model

Description

Wrapper for ml_model

Usage

ML(formula, model = "glm", ...)

Arguments

formula

formula

model

model (sl, rf, pf, glm, ...)

...

additional arguments to model object

Details

model 'sl' (SuperLearner::SuperLearner) args: SL.library, cvControl, family, method

model 'grf' (grf::regression_forest) args: num.trees, mtry, sample.weights, sample.fraction, min.node.size, ...

model 'grf.binary' (grf::probability_forest) args: num.trees, mtry, sample.weights, ...

model 'glm' args: family, weights, offset, ...
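
Examples

A minimal sketch using the default 'glm' model; the estimate and predict methods of the returned ml_model object are documented below. Models such as 'sl' and 'grf' additionally require the SuperLearner and grf packages.

## default model is glm (here an ordinary gaussian linear model)
m <- ML(Sepal.Length ~ Petal.Length + Species)
m$estimate(iris)
predict(m, newdata=head(iris))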


R6 class for prediction models

Description

Provides standardized estimation and prediction methods

Public fields

info

Optional information/name of the model

formals

List with formal arguments of estimation and prediction functions

formula

Formula specifying response and design matrix

args

additional arguments specified during initialization

Active bindings

fit

Active binding returning estimated model object

Methods

Public methods


Method new()

Create a new prediction model object

Usage
ml_model$new(
  formula = NULL,
  estimate,
  predict = stats::predict,
  predict.args = NULL,
  info = NULL,
  specials,
  response.arg = "y",
  x.arg = "x",
  ...
)
Arguments
formula

formula specifying outcome and design matrix

estimate

function for fitting the model (must be a function of the response, 'y', and the design matrix, 'x'; alternatively, a function with a single 'formula' argument)

predict

prediction function (must be a function of model object, 'object', and new design matrix, 'newdata')

predict.args

optional arguments to prediction function

info

optional description of the model

specials

optional additional terms (weights, offset, id, subset, ...) passed to 'estimate'

response.arg

name of response argument

x.arg

name of design matrix argument

...

optional arguments to fitting function


Method estimate()

Estimation method

Usage
ml_model$estimate(data, ..., store = TRUE)
Arguments
data

data.frame

...

Additional arguments to estimation method

store

Logical determining if estimated model should be stored inside the class.


Method predict()

Prediction method

Usage
ml_model$predict(newdata, ..., object = NULL)
Arguments
newdata

data.frame

...

Additional arguments to prediction method

object

Optional model fit object


Method update()

Update formula

Usage
ml_model$update(formula, ...)
Arguments
formula

formula or character which defines the new response

...

Additional arguments to lower level functions


Method print()

Print method

Usage
ml_model$print(...)
Arguments
...

Additional arguments to lower level functions


Method response()

Extract response from data

Usage
ml_model$response(data, ...)
Arguments
data

data.frame

...

additional arguments to 'design'


Method design()

Extract design matrix (features) from data

Usage
ml_model$design(data, ...)
Arguments
data

data.frame

...

additional arguments to 'design'


Method opt()

Get options

Usage
ml_model$opt(arg, ...)
Arguments
arg

name of option to get value of

...

additional arguments to lower level functions


Method clone()

The objects of this class are cloneable with this method.

Usage
ml_model$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Klaus Kähler Holst

Examples

data(iris)
rf <- function(formula, ...)
  ml_model$new(formula, info="grf::probability_forest",
    estimate=function(x, y, ...) grf::probability_forest(X=x, Y=y, ...),
    predict=function(object, newdata)
      predict(object, newdata)$predictions, ...)

args <- expand.list(num.trees=c(100,200), mtry=1:3,
          formula=c(Species ~ ., Species ~ Sepal.Length + Sepal.Width))
models <- lapply(args, function(par) do.call(rf, par))

x <- models[[1]]$clone()
x$estimate(iris)
predict(x, newdata=head(iris))

a <- targeted::cv(models, data=iris)
cbind(coef(a), attr(args, "table"))


ff <- ml_model$new(estimate=function(y,x) lm.fit(x=x, y=y),
        predict=function(object, newdata) newdata%*%object$coefficients)
## tmp <- ff$estimate(y, x=x)
## ff$predict(x)

Naive Bayes

Description

Naive Bayes Classifier

Usage

NB(
  formula,
  data,
  weights = NULL,
  kernel = FALSE,
  laplace.smooth = 0,
  prior = NULL,
  ...
)

Arguments

formula

Formula with syntax: response ~ predictors | weights

data

data.frame

weights

optional frequency weights

kernel

If TRUE a kernel estimator is used for numeric predictors (otherwise a gaussian model is used)

laplace.smooth

Laplace smoothing

prior

optional prior probabilities (default estimated from data)

...

additional arguments to lower level functions

Value

An object of class 'NB' is returned. See NB-class for more details about this class and its generic functions.

Author(s)

Klaus K. Holst

Examples

data(iris)
m2 <- NB(Species ~ Sepal.Width + Petal.Length, data=iris)
pr2 <- predict(m2, newdata=iris)

NB class object

Description

The function NB returns an object of class NB.

An object of class 'NB' is a list with at least the following components:

prior

Matrix with prior probabilities, i.e. marginal class probabilities Pr(class)

pcond

list of matrices with conditional probabilities of the features given the classes (one list element per class), Pr(x|class)

classes

Names (character vector) of the classes

xvar

the predictor (feature) variables

xmodel

the model type used for each predictor

model

model/method type (string)

Value

objects of the S3 class 'NB'

S3 generics

The following S3 generic functions are available for an object of class NB:

predict

Predict class probabilities for new features data.

print

Basic print method.

See Also

NB, NB2

Examples

## See example(NB) for examples

Find non-dominated points of a set

Description

Find the non-dominated points of a set (the minima of a point set).

Usage

nondom(x, ...)

Arguments

x

matrix

...

additional arguments to lower level functions

Details

A point x dominates y if it is never worse and in at least one case strictly better. Formally, let f_i denote the i-th coordinate of the objective function; then x dominates y if f_i(x) <= f_i(y) for all i and f_j(x) < f_j(y) for some j.

Based on the algorithm of Kung et al. 1975.

Value

matrix

Author(s)

Klaus Kähler Holst

Examples

rbind(
  c(1.0, 0.5),
  c(0.0, 1.0),
  c(1.0, 0.0),
  c(0.5, 1.0),
  c(1.0, 1.0),
  c(0.8, 0.8)) |> nondom()

Pooled Adjacent Violators Algorithm

Description

Pooled Adjacent Violators Algorithm

Usage

pava(y, x = numeric(0), weights = numeric(0))

Arguments

y

response variable

x

(optional) predictor vector (otherwise y is assumed to be sorted a priori according to the relevant predictor)

weights

(optional) weights

Value

List with the indices (index) of the jump points and the values (value) at each jump point.

Author(s)

Klaus K. Holst

Examples

x <- runif(5e3, -5, 5)
pr <- lava::expit(-1 + x)
y <- rbinom(length(pr), 1, pr)
pv <- pava(y, x)
plot(pr ~ x, cex=0.3)
with(pv, lines(sort(x)[index], value, col="red", type="s"))

Prediction for kernel density estimates

Description

Kernel density estimator predictions

Usage

## S3 method for class 'density'
predict(object, xnew, ...)

Arguments

object

density object

xnew

New data points at which to evaluate the density estimate

...

additional arguments to lower level functions

Author(s)

Klaus K. Holst
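
Examples

A minimal sketch: evaluate a kernel density estimate at new points.

d <- density(rnorm(1e3))
## interpolated density values at the new points
predict(d, xnew=c(-1, 0, 1))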


Predictions for Naive Bayes Classifier

Description

Naive Bayes Classifier predictions

Usage

## S3 method for class 'NB'
predict(object, newdata, expectation = NULL, threshold = c(0.001, 0.001), ...)

Arguments

object

NB object (fitted Naive Bayes classifier)

newdata

new data on which to make predictions

expectation

Variable for which to calculate the conditional expectation with respect to the class probabilities from the NB classifier

threshold

Threshold parameters. First element defines the threshold on the probabilities and the second element the value to set those truncated probabilities to.

...

Additional arguments to lower level functions

Author(s)

Klaus K. Holst
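
Examples

A minimal sketch using a classifier fitted with NB (see also example(NB)):

m <- NB(Species ~ Sepal.Width + Petal.Length, data=iris)
## class probabilities for new observations
head(predict(m, newdata=iris))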


Responder Average Treatment Effect

Description

Estimation of the Average Treatment Effect among Responders

Usage

RATE(
  response,
  post.treatment,
  treatment,
  data,
  family = gaussian(),
  M = 5,
  pr.treatment,
  treatment.level,
  SL.args.response = list(family = gaussian(), SL.library = c("SL.mean", "SL.glm")),
  SL.args.post.treatment = list(family = binomial(), SL.library = c("SL.mean", "SL.glm")),
  preprocess = NULL,
  efficient = TRUE,
  ...
)

Arguments

response

Response formula (e.g., Y~D*A)

post.treatment

Post treatment marker formula (e.g., D~W)

treatment

Treatment formula (e.g., A~1)

data

data.frame

family

Exponential family for response (default gaussian)

M

Number of folds in cross-fitting (M=1 is no cross-fitting)

pr.treatment

(optional) Randomization probability of treatment.

treatment.level

Treatment level in binary treatment (default 1)

SL.args.response

Arguments to SuperLearner for the response model

SL.args.post.treatment

Arguments to SuperLearner for the post treatment indicator

preprocess

(optional) Data preprocessing function

efficient

If TRUE, the estimate will be efficient. If FALSE, the estimate will be a simple plug-in estimate.

...

Additional arguments to lower level functions

Value

estimate object

Author(s)

Andreas Nordland, Klaus K. Holst
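
Examples

A hypothetical sketch with simulated data; the formulas mirror the argument descriptions above, the SuperLearner package is assumed to be installed, and the simulated variables (y, a, d, w) are illustrative only.

if (requireNamespace("SuperLearner", quietly=TRUE)) {
  ## illustrative simulation: randomized treatment a, post-treatment
  ## responder indicator d, covariate w, response y
  n <- 500
  w <- rnorm(n)
  a <- rbinom(n, 1, 0.5)
  d <- rbinom(n, 1, lava::expit(w))
  y <- d * a + w + rnorm(n)
  dat <- data.frame(y, a, d, w)
  RATE(response = y ~ d * a,
       post.treatment = d ~ w,
       treatment = a ~ 1,
       data = dat, M = 1)
}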


Responder Average Treatment Effect

Description

Estimation of the Average Treatment Effect among Responders for Survival Outcomes

Usage

RATE.surv(
  response,
  post.treatment,
  treatment,
  censoring,
  tau,
  data,
  M = 5,
  pr.treatment,
  call.response,
  args.response = list(),
  SL.args.post.treatment = list(family = binomial(), SL.library = c("SL.mean", "SL.glm")),
  call.censoring,
  args.censoring = list(),
  preprocess = NULL,
  ...
)

Arguments

response

Response formula (e.g., Surv(time, event) ~ D + W).

post.treatment

Post treatment marker formula (e.g., D ~ W).

treatment

Treatment formula (e.g., A ~ 1).

censoring

Censoring formula (e.g., Surv(time, event == 0) ~ D + A + W)).

tau

Time-point of interest, see Details.

data

data.frame.

M

Number of folds in cross-fitting (M=1 is no cross-fitting).

pr.treatment

(optional) Randomization probability of treatment.

call.response

Model call for the response model (e.g. "mets::phreg").

args.response

Additional arguments to the response model.

SL.args.post.treatment

Additional arguments to SuperLearner for the post treatment indicator model.

call.censoring

Similar to call.response.

args.censoring

Similar to args.response.

preprocess

(optional) Data pre-processing function.

...

Additional arguments to lower level data pre-processing functions.

Details

Estimation of

\frac{P(T \leq \tau|A=1) - P(T \leq \tau|A=0)}{E[D|A=1]}

under right censoring based on plug-in estimates of P(T \leq \tau|A=a) and E[D|A=1].

An efficient one-step estimator of P(T \leq \tau|A=a) is constructed using the efficient influence function

\frac{I\{A=a\}}{P(A = a)} \Big(\frac{\Delta}{S^c_{0}(\tilde T|X)} I\{\tilde T \leq \tau\} + \int_0^\tau \frac{S_0(u|X)-S_0(\tau|X)}{S_0(u|X)S^c_0(u|X)} \, dM^c_0(u|X)\Big) + \Big(1 - \frac{I\{A=a\}}{P(A = a)}\Big)F_0(\tau|A=a, W) - P(T \leq \tau|A=a).

An efficient one-step estimator of E[D|A=1] is constructed using the efficient influence function

\frac{A}{P(A = 1)}\left(D-E[D|A=1, W]\right) + E[D|A=1, W] - E[D|A=1].

Value

estimate object

Author(s)

Andreas Nordland, Klaus K. Holst


Risk regression

Description

Risk regression with binary exposure and nuisance model for the odds-product.

Let A be the binary exposure, V the set of covariates, and Y the binary response variable, and define p_a(v) = P(Y=1 \mid A=a, V=v), a \in \{0,1\}.

The target parameter is either the relative risk

\mathrm{RR}(v) = \frac{p_1(v)}{p_0(v)}

or the risk difference

\mathrm{RD}(v) = p_1(v) - p_0(v)

We assume a target parameter model given by either

\log\{\mathrm{RR}(v)\} = \alpha^t v

or

\mathrm{arctanh}\{\mathrm{RD}(v)\} = \alpha^t v

and similarly a working linear nuisance model for the odds-product

\phi(v) = \log\left(\frac{p_{0}(v)p_{1}(v)}{(1-p_{0}(v))(1-p_{1}(v))}\right) = \beta^t v.

A propensity model for E(A=1 \mid V) is also fitted using a logistic regression working model

\mathrm{logit}\{E(A=1 \mid V=v)\} = \gamma^t v.

If both the odds-product model and the propensity model are correct the estimator is efficient. Further, the estimator is consistent in the union model, i.e., the estimator is double-robust in the sense that only one of the two models needs to be correctly specified to get a consistent estimate.

Usage

riskreg(
  formula,
  nuisance = ~1,
  propensity = ~1,
  target = ~1,
  data,
  weights,
  type = "rr",
  optimal = TRUE,
  std.err = TRUE,
  start = NULL,
  mle = FALSE,
  ...
)

Arguments

formula

formula (see details below)

nuisance

nuisance model (formula)

propensity

propensity model (formula)

target

(optional) target model (formula)

data

data.frame

weights

optional weights

type

type of association measure ('rd' or 'rr')

optimal

If TRUE optimal weights are calculated

std.err

If TRUE standard errors are calculated

start

optional starting values

mle

Semi-parametric (double-robust) estimate or MLE (TRUE gives MLE)

...

additional arguments to unconstrained optimization routine (nlminb)

Details

The 'formula' argument should be given as response ~ exposure | target-formula | nuisance-formula or response ~ exposure | target | nuisance | propensity

E.g., riskreg(y ~ a | 1 | x+z | x+z, data=...)

Alternatively, the model can be specified using the target, nuisance, and propensity arguments: riskreg(y ~ a, target=~1, nuisance=~x+z, ...)

The riskreg_fit function can be used with matrix inputs rather than formulas.

Value

An object of class 'riskreg.targeted' is returned. See targeted-class for more details about this class and its generic functions.

Author(s)

Klaus K. Holst

References

Richardson, T. S., Robins, J. M., & Wang, L. (2017). On modeling and estimation for the relative risk and risk difference. Journal of the American Statistical Association, 112(519), 1121–1130. http://dx.doi.org/10.1080/01621459.2016.1192546

Examples

m <- lvm(a[-2] ~ x,
         z ~ 1,
         lp.target[1] ~ 1,
         lp.nuisance[-1] ~ 2*x)
distribution(m,~a) <- binomial.lvm("logit")
m <- binomial.rr(m, "y","a","lp.target","lp.nuisance")
d <- sim(m,5e2,seed=1)

I <- model.matrix(~1, d)
X <- model.matrix(~1+x, d)
with(d, riskreg_mle(y, a, I, X, type="rr"))

with(d, riskreg_fit(y, a, nuisance=X, propensity=I, type="rr"))
riskreg(y ~ a | 1, nuisance=~x ,  data=d, type="rr")

## Model with same design matrix for nuisance and propensity model:
with(d, riskreg_fit(y, a, nuisance=X, type="rr"))

## a <- riskreg(y ~ a, target=~z, nuisance=~x,  propensity=~x, data=d, type="rr")
a <- riskreg(y ~ a | z, nuisance=~x,  propensity=~x, data=d, type="rr")
a
predict(a, d[1:5,])

riskreg(y ~ a, nuisance=~x,  data=d, type="rr", mle=TRUE)

Binary regression models with right censored outcomes

Description

Binary regression models with right censored outcomes

Usage

riskreg_cens(
  response,
  censoring,
  treatment = NULL,
  prediction = NULL,
  data,
  newdata,
  tau,
  type = "risk",
  M = 1,
  call.response = "phreg",
  args.response = list(),
  call.censoring = "phreg",
  args.censoring = list(),
  preprocess = NULL,
  efficient = TRUE,
  control = list(),
  ...
)

Arguments

response

Response formula (e.g., Surv(time, event) ~ D + W).

censoring

Censoring formula (e.g., Surv(time, event == 0) ~ D + A + W)).

treatment

Optional treatment model (ml_model)

prediction

Optional prediction model (ml_model)

data

data.frame.

newdata

Optional data.frame. In this case the uncentered influence function evaluated in 'newdata' is returned with nuisance parameters obtained from 'data'.

tau

Time-point of interest, see Details.

type

"risk", "treatment", "rmst", "brier"

M

Number of folds in cross-fitting (M=1 is no cross-fitting).

call.response

Model call for the response model (e.g. "mets::phreg").

args.response

Additional arguments to the response model.

call.censoring

Similar to call.response.

args.censoring

Similar to args.response.

preprocess

(optional) Data pre-processing function.

efficient

If FALSE an IPCW estimator is returned

control

See details

...

Additional arguments to lower level data pre-processing functions.

Details

The one-step estimator depends on the calculation of an integral with respect to the martingale process corresponding to the counting process N(t) = I(C > min(T, tau)). This can be decomposed into an integral with respect to the counting process, dN_c(t), and the compensator, d\Lambda_c(t), where the latter term can be computationally intensive to calculate. Rather than calculating this integral at all observed time points, a coarser evaluation can be used, controlled by setting control=list(sample=N). With N=0 the (computationally intensive) standard evaluation is used.

Value

estimate object

Author(s)

Klaus K. Holst, Andreas Nordland
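
Examples

A hypothetical sketch with simulated right-censored data; it follows the Usage section above, assumes the mets package is available for the Cox ('phreg') nuisance models, and the variable names and parameter values are illustrative only.

if (requireNamespace("mets", quietly=TRUE)) {
  ## illustrative simulation of right-censored survival data
  n <- 500
  w <- rnorm(n)
  a <- rbinom(n, 1, 0.5)
  time <- rexp(n, rate = exp(-1 + 0.3 * a + 0.2 * w))
  cens <- rexp(n, rate = 0.2)
  dat <- data.frame(time = pmin(time, cens),
                    status = as.numeric(time <= cens),
                    a = a, w = w)
  ## risk P(T <= tau) at tau = 2
  riskreg_cens(response = survival::Surv(time, status) ~ a + w,
               censoring = survival::Surv(time, status == 0) ~ a + w,
               data = dat, tau = 2, type = "risk",
               call.response = "mets::phreg",
               call.censoring = "mets::phreg")
}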


Predictive model scoring

Description

Predictive model scoring

Usage

scoring(
  response,
  ...,
  type = "quantitative",
  levels = NULL,
  metrics = NULL,
  weights = NULL,
  names = NULL,
  messages = 1
)

Arguments

response

Observed response

...

model predictions (continuous predictions or class probabilities (matrices))

type

continuous or categorical response (the latter is automatically chosen if response is a factor, otherwise a continuous response is assumed)

levels

(optional) unique levels in response variable

metrics

which metrics to report

weights

optional frequency weights

names

optional names of the models (given as ...; alternatively these can be supplied as named arguments)

messages

controls amount of messages/warnings (0: none)

Value

Numeric matrix of dimension m x p, where m is the number of different models and p is the number of model metrics

Examples

data(iris)
set.seed(1)
dat <- csplit(iris,2)
g1 <- NB(Species ~ Sepal.Width + Petal.Length, data=dat[[1]])
g2 <- NB(Species ~ Sepal.Width, data=dat[[1]])
pr1 <- predict(g1, newdata=dat[[2]], wide=TRUE)
pr2 <- predict(g2, newdata=dat[[2]], wide=TRUE)
table(colnames(pr1)[apply(pr1,1,which.max)], dat[[2]]$Species)
table(colnames(pr2)[apply(pr2,1,which.max)], dat[[2]]$Species)
scoring(dat[[2]]$Species, pr1=pr1, pr2=pr2)
## quantitative response:
scoring(response=1:10, prediction=rnorm(1:10))

SuperLearner wrapper for ml_model

Description

SuperLearner wrapper for ml_model

Usage

SL(
  formula = ~.,
  ...,
  SL.library = c("SL.mean", "SL.glm"),
  binomial = FALSE,
  data = NULL
)

Arguments

formula

Model design

...

Additional arguments for SuperLearner::SuperLearner

SL.library

character vector of prediction algorithms

binomial

boolean specifying binomial or gaussian family (default FALSE)

data

Optional data.frame

Value

ml_model object

Author(s)

Klaus Kähler Holst
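
Examples

A minimal sketch (assumes the SuperLearner package is installed); the returned ml_model object is used like the ML wrapper above.

if (requireNamespace("SuperLearner", quietly=TRUE)) {
  m <- SL(Sepal.Length ~ Petal.Length + Species,
          SL.library = c("SL.mean", "SL.glm"))
  m$estimate(iris)
  head(predict(m, newdata=iris))
}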


Softmax transformation

Description

Softmax transformation

Usage

softmax(x, log = FALSE, ref = TRUE, ...)

Arguments

x

Input matrix (e.g., linear predictors of multinomial logistic model)

log

Return on log-scale (default FALSE)

ref

Add reference level (add 0 column to x)

...

Additional arguments to lower level functions

Value

Numeric matrix of dimension n x p, where n= nrow(x) and p = ncol(x) + (ref==TRUE)
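
Examples

A minimal sketch: transform linear predictors of a multinomial model into class probabilities.

## linear predictors for two non-reference classes
lp <- matrix(rnorm(10), ncol=2)
p <- softmax(lp)   # ref=TRUE adds a reference column of zeros
rowSums(p)         # each row sums to one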


Solve ODE

Description

Solve ODE with Runge-Kutta method (RK4)

Usage

solve_ode(ode_ptr, input, init, par = 0)

Arguments

ode_ptr

pointer (externalptr) to C++ function or an R function

input

Input matrix. 1st column specifies the time points

init

Initial conditions

par

Parameters defining the ODE (passed to ode_ptr)

Details

The external pointer should be created with the function targeted::specify_ode.

Value

Matrix with solution

Author(s)

Klaus Kähler Holst

See Also

specify_ode

Examples

example(specify_ode)

Specify Ordinary Differential Equation (ODE)

Description

Define compiled code for ordinary differential equation.

Usage

specify_ode(code, fname = NULL, pname = c("dy", "x", "y", "p"))

Arguments

code

string with the body of the function definition (see details)

fname

Optional name of the exported C++ function

pname

Vector of variable names (results, inputs, states, parameters)

Details

The model (code) should be specified as the body of a C++ function. The following variables are defined by default (see the pname argument):

dy

Vector with derivatives, i.e. the rhs of the ODE (the result).

x

Vector with the first element being the time, and the following elements additional exogenous input variables.

y

Vector with the dependent variable

p

Parameter vector

The ODE has the form y'(t) = f_{p}(x(t), y(t)). All variables are treated as Armadillo (http://arma.sourceforge.net/) vectors/matrices.

As an example consider the Lorenz equations:

\frac{dx_{t}}{dt} = \sigma(y_{t}-x_{t}), \quad \frac{dy_{t}}{dt} = x_{t}(\rho-z_{t})-y_{t}, \quad \frac{dz_{t}}{dt} = x_{t}y_{t}-\beta z_{t}

We can specify this model as

ode <- 'dy(0) = p(0)*(y(1)-y(0));
        dy(1) = y(0)*(p(1)-y(2));
        dy(2) = y(0)*y(1)-p(2)*y(2);'
dy <- specify_ode(ode)

As an example of a model with exogenous inputs consider the following ODE:

y'(t) = \beta_{0} + \beta_{1}y(t) + \beta_{2}y(t)x(t) + \beta_{3}x(t)\cdot t

This could be specified as

mod <- 'double t = x(0);
        dy = p(0) + p(1)*y + p(2)*x(1)*y + p(3)*x(1)*t;'
dy <- specify_ode(mod)

Value

pointer (externalptr) to C++ function

Author(s)

Klaus Kähler Holst

See Also

solve_ode
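
Examples

A sketch combining specify_ode and solve_ode for the Lorenz system from the Details; compilation requires a working C++ toolchain, and the parameter values (sigma=10, rho=28, beta=8/3) are the classical choices, used here only for illustration.

ode <- 'dy(0) = p(0)*(y(1)-y(0));
        dy(1) = y(0)*(p(1)-y(2));
        dy(2) = y(0)*y(1)-p(2)*y(2);'
f <- specify_ode(ode)
tt <- seq(0, 25, length.out = 1e4)   # time grid (first column of the input)
y <- solve_ode(f, input = cbind(tt), init = c(1, 1, 1), par = c(10, 28, 8/3))
head(y)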


targeted class object

Description

The functions riskreg and ate return an object of class targeted.

An object of class 'targeted' is a list with at least the following components:

estimate

An estimate object with the target parameter estimates (see estimate.default)

opt

Object returned from the applied optimization routine

npar

number of parameters of the model (target and nuisance)

type

String describing the model

Value

objects of the S3 class 'targeted'

S3 generics

The following S3 generic functions are available for an object of class targeted:

coef

Extract target coefficients of the estimated model.

vcov

Extract the variance-covariance matrix of the target parameters.

IC

Extract the estimated influence function.

print

Print estimates of the target parameters.

summary

Extract information on both target parameters and estimated nuisance model.


See Also

riskreg, ate

Examples

## See example(riskreg) for examples
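
A brief sketch of the S3 generics applied to a fitted object (assuming the riskreg example has been run first):

## a <- riskreg(y ~ a | 1, nuisance=~x, data=d, type="rr")
## coef(a)     # target parameter estimates
## vcov(a)     # variance-covariance matrix of the target parameters
## IC(a)       # estimated influence function
## summary(a)  # target and nuisance model information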