Title: | Utility Functions for Statistical Analysis Report Generation and Monte Carlo Studies |
---|---|
Description: | Helper functions for creating formatted summary of regression models, writing publication-ready tables to latex files, and running Monte Carlo experiments. |
Authors: | Youyi Fong [cre], Krisztian Sebestyen [aut], Han Sunwoo [aut], Jason Becker [ctb], Bendix Carstensen [ctb], Daryl Morris [ctb], Josh Pasek [ctb], Dennis Chao [ctb], Andri Signorell [ctb], Sue Li [ctb], Jonathan Bartlett [ctb], Christophe Dutang [ctb] |
Maintainer: | Youyi Fong <[email protected]> |
License: | GPL (>= 2) |
Version: | 2024.11-01 |
Built: | 2024-12-04 07:01:05 UTC |
Source: | CRAN |
Calculate age, by Jason P Becker, modified very slightly in how arguments are passed to the function.
age_calc(dob, enddate = Sys.Date(), units = c("days","months","years"), precise = TRUE)
age_calc(dob, enddate = Sys.Date(), units = c("days","months","years"), precise = TRUE)
dob |
POSIXlt or Date. Birthday |
enddate |
POSIXlt or Date. Date to compute age |
units |
string. Choose a unit. |
precise |
Boolean. |
Jason P Becker
http://blog.jsonbecker.com/2013/12/calculating-age-with-precision-in-r.html
age_calc (dob=strptime("29OCT2002", format="%d%b%Y"), enddate=strptime("30OCT2003", format="%d%b%Y"), units='years', precise=TRUE) age_calc (dob=strptime("29OCT2002", format="%d%b%Y"), enddate=strptime("30DEC2003", format="%d%b%Y"), units='years', precise=FALSE)
age_calc (dob=strptime("29OCT2002", format="%d%b%Y"), enddate=strptime("30OCT2003", format="%d%b%Y"), units='years', precise=TRUE) age_calc (dob=strptime("29OCT2002", format="%d%b%Y"), enddate=strptime("30DEC2003", format="%d%b%Y"), units='years', precise=FALSE)
AUC methods.
## S3 method for class 'auc' coef(object, ...) ## S3 method for class 'auc' predict(object, newdata, case.percentage = NULL, ...) ## S3 method for class 'auc' print(x, ...) ## S3 method for class 'auc' summary(object, ...) ## S3 method for class 'auc' trainauc(fit, training.data = NULL, ...) ## S3 method for class 'auc' ratio(fit) ## S3 method for class 'glm' trainauc(fit, ...) ## S3 method for class 'glm' ratio(fit)
## S3 method for class 'auc' coef(object, ...) ## S3 method for class 'auc' predict(object, newdata, case.percentage = NULL, ...) ## S3 method for class 'auc' print(x, ...) ## S3 method for class 'auc' summary(object, ...) ## S3 method for class 'auc' trainauc(fit, training.data = NULL, ...) ## S3 method for class 'auc' ratio(fit) ## S3 method for class 'glm' trainauc(fit, ...) ## S3 method for class 'glm' ratio(fit)
fit |
an object that inherits from class 'auc' such as 'rauc' or 'sauc' |
object |
an object that inherits from class 'auc' such as 'rauc' or 'sauc' |
x |
an object that inherits from class 'auc' such as |
newdata |
data at which to predict |
case.percentage |
used for class prediction, defaults to NULL |
training.data |
data frame used to compute auc based on a fit obtained by a call to |
... |
arguments passed to or from methods |
Youyi Fong [email protected]
Krisztian Sebestyen
cbinduneven binds together a list of matrixes/dataframes of different lengths, rows are matched by names binary returns binary representation of an integer. binary2 returns binary representatin of an integer with leading 0, the length of string is n. mysystem can call any exe file that is in the PATH f2c convert temperature from f to c/
mytable (..., exclude = if (useNA == "no") c(NA, NaN), useNA = "ifany", dnn = list.names(...), deparse.level = 1) cbinduneven(li) binary(i) multi.outer (f, ... ) myreshapelong(dat, cols.to.be.stacked, label.cols.to.be.stacked, new.col.name) binary2(i, n) f2c(f) ftoi(f) keepWarnings(expr) meanmed(x, na.rm = FALSE) myaggregate(x, by, FUN, new.col.name = "aggregate.value", ...) myreshapewide(formula, dat, idvar, keep.extra.col=FALSE) mysapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE, ret.mat = TRUE) myscale(x) mysystem(cmd, ...) mytapply(X, INDEX, FUN = NULL, ..., simplify = TRUE) read.sv(file, header = TRUE, ...) read.tsv(file, header = TRUE, sep = "\t", ...) table.prop(x,y=NULL,digit=1,style=2,whole.table.add.to.1=FALSE,useNA="ifany", add.perc=FALSE, add.total.column = FALSE) table.cases (case,group,include.all=TRUE,desc="cases") table.cases.3(case,group1,group2) unix() mycor (x, use = "everything", method = c("pearson", "kendall", "spearman"), alternative = c("two.sided", "less", "greater"), exact = NULL, conf.level = 0.95, continuity = FALSE, digits.coef=2, digits.pval=3, ...)
mytable (..., exclude = if (useNA == "no") c(NA, NaN), useNA = "ifany", dnn = list.names(...), deparse.level = 1) cbinduneven(li) binary(i) multi.outer (f, ... ) myreshapelong(dat, cols.to.be.stacked, label.cols.to.be.stacked, new.col.name) binary2(i, n) f2c(f) ftoi(f) keepWarnings(expr) meanmed(x, na.rm = FALSE) myaggregate(x, by, FUN, new.col.name = "aggregate.value", ...) myreshapewide(formula, dat, idvar, keep.extra.col=FALSE) mysapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE, ret.mat = TRUE) myscale(x) mysystem(cmd, ...) mytapply(X, INDEX, FUN = NULL, ..., simplify = TRUE) read.sv(file, header = TRUE, ...) read.tsv(file, header = TRUE, sep = "\t", ...) table.prop(x,y=NULL,digit=1,style=2,whole.table.add.to.1=FALSE,useNA="ifany", add.perc=FALSE, add.total.column = FALSE) table.cases (case,group,include.all=TRUE,desc="cases") table.cases.3(case,group1,group2) unix() mycor (x, use = "everything", method = c("pearson", "kendall", "spearman"), alternative = c("two.sided", "less", "greater"), exact = NULL, conf.level = 0.95, continuity = FALSE, digits.coef=2, digits.pval=3, ...)
exclude |
exclude |
dnn |
dnn |
deparse.level |
deparse.level |
add.total.column |
tbdi |
use |
tbdi |
method |
tbdi |
alternative |
tbdi |
exact |
tbdi |
conf.level |
tbdi |
continuity |
tbdi |
digits.coef |
tbdi |
digits.pval |
tbdi |
cols.to.be.stacked |
tbdi |
label.cols.to.be.stacked |
tbdi |
li |
a list |
i |
tbdi |
n |
tbdn |
f |
In multi.out, f is a function. |
case |
vector of 0/1 |
group |
vector of multi-group indicators |
formula |
a formula object. |
expr |
tbdexpr |
x |
tbdx |
na.rm |
tbdna.rm |
desc |
tbdby |
by |
tbdby |
whole.table.add.to.1 |
Boolean |
new.col.name |
tbdnew.col.name |
... |
tbd... |
dat |
tbddat |
idvar |
tbdidvar |
X |
tbdX |
simplify |
tbdsimplify |
USE.NAMES |
tbdUSE.NAMES |
ret.mat |
tbdret.mat |
cmd |
tbdcmd |
INDEX |
tbdINDEX |
file |
tbdfile |
header |
tbdheader |
sep |
tbdsep |
y |
tbdy |
digit |
tbddigit |
style |
tbdstyle |
FUN |
tbdFUN |
keep.extra.col |
tbdFUN |
useNA |
tbdFUN |
add.perc |
tbdFUN |
include.all |
tbdFUN |
group1 |
tbdFUN |
group2 |
tbdFUN |
binary(5) ### 101 binary2(5, 4) a=data.frame("x"=1:2) b=data.frame("y"=3:5);#rownames(b)[3]="" cbinduneven(list(a,b)) ## Not run: # the formula in myreshapewide can only have one variable in the right hand side myreshapewide(fi~week, dat, c("ptid","stim")) myreshapelong(dat.201.neut, cols.to.be.stacked=c("MN.3","SF162","SVA.MLV"), label.cols.to.be.stacked="antigen", new.col.name="y") myaggregate(subset(dat.poc, select=c(HIV, trt)), list(dat.poc$f), function(x) with(x, c(fisher.test(HIV, trt)$estimate, fisher.test(HIV, trt)$p.value))) ## End(Not run)
binary(5) ### 101 binary2(5, 4) a=data.frame("x"=1:2) b=data.frame("y"=3:5);#rownames(b)[3]="" cbinduneven(list(a,b)) ## Not run: # the formula in myreshapewide can only have one variable in the right hand side myreshapewide(fi~week, dat, c("ptid","stim")) myreshapelong(dat.201.neut, cols.to.be.stacked=c("MN.3","SF162","SVA.MLV"), label.cols.to.be.stacked="antigen", new.col.name="y") myaggregate(subset(dat.poc, select=c(HIV, trt)), list(dat.poc$f), function(x) with(x, c(fisher.test(HIV, trt)$estimate, fisher.test(HIV, trt)$p.value))) ## End(Not run)
This function plots a smoothed line of how the average value of Y changes with X in order to check functional form for logistic regression.
binaryloess(x, y, scale = c("logit", "linear"), span = 0.7, weights = NULL, ...)
binaryloess(x, y, scale = c("logit", "linear"), span = 0.7, weights = NULL, ...)
x |
tbdx |
y |
tbdy |
scale |
tbdscale |
span |
smoothing parameter, passed to loess. If less than 1, the neighbourhood includes proportion a of the points. If greater than 1, all points are used, with the maximum distance assumed to be a^(1/p) times the actual maximum distance for p explanatory variables.
|
weights |
sampling weights, passed to loess |
... |
passed to plotting function |
This function comes from Jonathan Bartlett ()https://thestatsgeek.com/2014/09/13/checking-functional-form-in-logistic-regression-using-loess/).
set.seed(1234) n <- 1000 x <- rnorm(n) xb <- -2+x pr <- exp(xb)/(1+exp(xb)) y=rbern(n, pr) par(mfrow=c(1,2)) binaryloess(x, y, scale = "logit", span = 0.7, weights = NULL, ylab="logit(p)") binaryloess(x, y, scale = "linear", span = 0.7, weights = NULL, ylab="prob")
set.seed(1234) n <- 1000 x <- rnorm(n) xb <- -2+x pr <- exp(xb)/(1+exp(xb)) y=rbern(n, pr) par(mfrow=c(1,2)) binaryloess(x, y, scale = "logit", span = 0.7, weights = NULL, ylab="logit(p)") binaryloess(x, y, scale = "linear", span = 0.7, weights = NULL, ylab="prob")
A slightly modified test of the proportional hazards assumption for a Cox regression model fit (coxph). This version corrects some conservativeness of the test.
cox.zph.2(fit, transform = "km", global = TRUE, exact=TRUE)
cox.zph.2(fit, transform = "km", global = TRUE, exact=TRUE)
fit |
fit |
transform |
transform |
global |
global |
exact |
Boolean. If FALSE, this function is an identical copy of cox.zph. If TRUE, it computes the variance of the test statistic exactly, instead of approximately. |
When the model uses time-dependent covariates, the approximation used in Grambsch and Therneau resulted in conservativeness of the test. This is "fixed" here at a cost of up to 2.5 times longer execution time.
Fong, Y. and Halloran, M Elizabeth and Gilbert, P. Using Time-Dependent Age Group in Cox Regression Analysis of Vaccine Efficacy Trials, Just Another Epi Journal, in prep.
library(survival) fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data=ovarian) temp <- cox.zph(fit) print(temp) temp.2 <- cox.zph.2(fit) print(temp.2)
library(survival) fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data=ovarian) temp <- cox.zph(fit) print(temp) temp.2 <- cox.zph.2(fit) print(temp.2)
Cross validation utility functions.
sample.for.cv (dat, v, seed) get.kfold.splits (dat, k, seed) kfold.split (k, n1, n0) ran.kfold.split(k, n1, n0, replicates) lpo.split(n1, n0) get.splits (dat, cv.scheme=c("LPO","5fold","50xrandom4:1"), seed)
sample.for.cv (dat, v, seed) get.kfold.splits (dat, k, seed) kfold.split (k, n1, n0) ran.kfold.split(k, n1, n0, replicates) lpo.split(n1, n0) get.splits (dat, cv.scheme=c("LPO","5fold","50xrandom4:1"), seed)
dat |
a data frame. One of the columns must be named y and y should be 0/1 with 1 for case and 0 for control |
v |
v-fold cross validation |
seed |
seed for random number generators |
k |
var.equal |
n1 |
var.equal |
n0 |
var.equal |
replicates |
var.equal |
cv.scheme |
var.equal |
sample.for.cv: case and controls are sampled separately.
sample.for.cv returns a list of two vector of integers: train and test, which refer to the rows of dat
Deming regression fit. Assume x and y variances are the same. Slightly modified from MethComp R package.
Deming(x, y, vr = sdr^2, sdr = sqrt(vr), boot = TRUE, keep.boot = FALSE, alpha = 0.05)
Deming(x, y, vr = sdr^2, sdr = sqrt(vr), boot = TRUE, keep.boot = FALSE, alpha = 0.05)
x |
tbd |
y |
tbdy |
vr |
tbdvr |
sdr |
tbdsdr |
boot |
tbdboot |
keep.boot |
tbdkeep.boot |
alpha |
tbdalpha |
## Not run: set.seed(1) x=rnorm(100,0,1) y=x+rnorm(100,0,.5) x=x+rnorm(100,0,.5) fit=Deming(x,y, boot=TRUE) summary(fit) plot(x,y) abline(fit) # compare with lm fit fit.1=lm(y~x, data.frame(x,y)) summary(fit.1) abline(fit.1, col=2) ## End(Not run)
## Not run: set.seed(1) x=rnorm(100,0,1) y=x+rnorm(100,0,.5) x=x+rnorm(100,0,.5) fit=Deming(x,y, boot=TRUE) summary(fit) plot(x,y) abline(fit) # compare with lm fit fit.1=lm(y~x, data.frame(x,y)) summary(fit.1) abline(fit.1, col=2) ## End(Not run)
Makes a heatmap representation of correaltion coefficients easier.
DMHeatMap(x, Rowv = TRUE, Colv = if (symm) "Rowv" else TRUE, distfun = dist, hclustfun = hclust, dendrogram = c("both", "row", "column", "none"), symm = FALSE, scale = c("none", "row", "column"), na.rm = TRUE, revC = identical(Colv, "Rowv"), add.expr, breaks, symbreaks = min(x < 0, na.rm = TRUE) || scale != "none", col = "heat.colors", colsep, rowsep, sepcolor = "white", sepwidth = c(0.05, 0.05), cellnote, notecex = 1, notecol = "cyan", na.color = par("bg"), trace = c("column", "row", "both", "none"), tracecol = "cyan", hline = median(breaks), vline = median(breaks), linecol = tracecol, margins = c(5, 5), ColSideColors, RowSideColors, cexRow = 0.2 + 1/log10(nr), cexCol = 0.2 + 1/log10(nc), labRow = NULL, labCol = NULL, labColor = NULL, axis = TRUE, heatmapOnly = FALSE, key = TRUE, keysize = 1.5, density.info = c("histogram", "density", "none"), denscol = tracecol, symkey = min(x < 0, na.rm = TRUE) || symbreaks, densadj = 0.25, main = NULL, xlab = NULL, ylab = NULL, lmat = NULL, lhei = NULL, lwid = NULL, lower.left.only = TRUE, legend = TRUE, legend.x = "topright", verbose = FALSE, ...)
DMHeatMap(x, Rowv = TRUE, Colv = if (symm) "Rowv" else TRUE, distfun = dist, hclustfun = hclust, dendrogram = c("both", "row", "column", "none"), symm = FALSE, scale = c("none", "row", "column"), na.rm = TRUE, revC = identical(Colv, "Rowv"), add.expr, breaks, symbreaks = min(x < 0, na.rm = TRUE) || scale != "none", col = "heat.colors", colsep, rowsep, sepcolor = "white", sepwidth = c(0.05, 0.05), cellnote, notecex = 1, notecol = "cyan", na.color = par("bg"), trace = c("column", "row", "both", "none"), tracecol = "cyan", hline = median(breaks), vline = median(breaks), linecol = tracecol, margins = c(5, 5), ColSideColors, RowSideColors, cexRow = 0.2 + 1/log10(nr), cexCol = 0.2 + 1/log10(nc), labRow = NULL, labCol = NULL, labColor = NULL, axis = TRUE, heatmapOnly = FALSE, key = TRUE, keysize = 1.5, density.info = c("histogram", "density", "none"), denscol = tracecol, symkey = min(x < 0, na.rm = TRUE) || symbreaks, densadj = 0.25, main = NULL, xlab = NULL, ylab = NULL, lmat = NULL, lhei = NULL, lwid = NULL, lower.left.only = TRUE, legend = TRUE, legend.x = "topright", verbose = FALSE, ...)
x |
tbd |
axis |
tbd |
heatmapOnly |
tbd |
verbose |
tbd |
legend.x |
tbd |
legend |
tbd |
Rowv |
tbdRowv |
Colv |
tbdColv |
distfun |
tbddistfun |
hclustfun |
tbdhclustfun |
dendrogram |
tbddendrogram |
symm |
tbdsymm |
scale |
tbdscale |
na.rm |
tbdna.rm |
revC |
tbdrevC |
add.expr |
tbdadd.expr |
breaks |
tbdbreaks |
symbreaks |
tbdsymbreaks |
col |
tbdcol |
colsep |
tbdcolsep |
rowsep |
tbdrowsep |
sepcolor |
tbdsepcolor |
sepwidth |
tbdsepwidth |
cellnote |
tbdcellnote |
notecex |
tbdnotecex |
notecol |
tbdnotecol |
na.color |
tbdna.color |
trace |
tbdtrace |
tracecol |
tbdtracecol |
hline |
tbdhline |
vline |
tbdvline |
linecol |
tbdlinecol |
margins |
tbdmargins |
ColSideColors |
tbdColSideColors |
RowSideColors |
tbdRowSideColors |
cexRow |
tbdcexRow |
cexCol |
tbdcexCol |
labRow |
tbdlabRow |
labCol |
tbdlabCol |
labColor |
tbdlabColor |
key |
tbdkey |
keysize |
tbdkeysize |
density.info |
tbddensity.info |
denscol |
tbddenscol |
symkey |
tbdsymkey |
densadj |
tbddensadj |
main |
tbdmain |
xlab |
tbdxlab |
ylab |
tbdylab |
lmat |
tbdlmat |
lhei |
tbdlhei |
lwid |
tbdlwid |
lower.left.only |
tbdlower.left.only |
... |
tbd... |
cor=matrix(runif(15),5,3) breaks=c(-1,-.7,-.5,-.3,-.1,.1,.3,.5,.7,1) hU=DMHeatMap(cor, trace="none", symm=FALSE,dendrogram="none", col=RColorBrewer::brewer.pal( length(breaks)-1,"RdYlGn"), distfun = function(c) as.dist(1 - c), cexRow =1.5, cexCol =1.5, lmat=rbind( c(2, 1), c(4,3) ), lhei=c(4, 1 ), breaks=breaks, margins=c(2,2), key = FALSE, Rowv=NA, lower.left.only=FALSE)
cor=matrix(runif(15),5,3) breaks=c(-1,-.7,-.5,-.3,-.1,.1,.3,.5,.7,1) hU=DMHeatMap(cor, trace="none", symm=FALSE,dendrogram="none", col=RColorBrewer::brewer.pal( length(breaks)-1,"RdYlGn"), distfun = function(c) as.dist(1 - c), cexRow =1.5, cexCol =1.5, lmat=rbind( c(2, 1), c(4,3) ), lhei=c(4, 1 ), breaks=breaks, margins=c(2,2), key = FALSE, Rowv=NA, lower.left.only=FALSE)
Counting the number of masks in a rectangular region
get_count_from_xy_coor(file, topleft, bottomright, image, plot)
get_count_from_xy_coor(file, topleft, bottomright, image, plot)
file |
_sizes_coordinates.txt |
topleft |
topleft (x,y) coordiate for a rectangular box |
bottomright |
bottomright: bottomright (x,y) coordiate for a rectangular box |
image |
image: an image for plotting |
plot |
plot: plot=TRUE shows image with rectangular box |
This function counts cells inside of rectangular box made by the topleft and bottomright xy-coordinates.
The number of masks inside of the rectangular box
Sunwoo Han
#get_count_from_xy_coor(file='M926910_Position1_CD3-BUV395_sizes_coordinates.txt', #topleft=c(500,0), bottomright=c(1392,500), #image='M926910_Position1_CD3-BUV395.tiff', plot=TRUE)
#get_count_from_xy_coor(file='M926910_Position1_CD3-BUV395_sizes_coordinates.txt', #topleft=c(500,0), bottomright=c(1392,500), #image='M926910_Position1_CD3-BUV395.tiff', plot=TRUE)
Go through a folder and read all files and combine the results into a multidimensional array.
get.sim.res (dir, res.name="res", verbose=TRUE) MCsummary (dir, res.name = "res", exclude.some = TRUE, exclude.col = 1, verbose = TRUE) getFormattedMCSummary (path, sim, nn, fit.method, exclude.some = TRUE, exclude.col = 1, verbose = TRUE, coef.0 = NULL, digit1 = 2, sum.est = c("mean", "median"), sum.sd = c("median", "mean"), style = 1, keep.intercept = FALSE)
get.sim.res (dir, res.name="res", verbose=TRUE) MCsummary (dir, res.name = "res", exclude.some = TRUE, exclude.col = 1, verbose = TRUE) getFormattedMCSummary (path, sim, nn, fit.method, exclude.some = TRUE, exclude.col = 1, verbose = TRUE, coef.0 = NULL, digit1 = 2, sum.est = c("mean", "median"), sum.sd = c("median", "mean"), style = 1, keep.intercept = FALSE)
dir |
directory of MC result files |
path |
partial path to the directory of MC result files |
res.name |
name of the R object saved in the files, default is res, but may be others |
verbose |
Boolean |
sim |
a string to denote simulation setting |
nn |
a vector of sample sizes |
fit.method |
a string to denote fitting method. sim, nn and fit.method together forms the name of the directory containing MC result files |
exclude.col |
column number |
exclude.some |
whether to exclude MC results that are extreme |
coef.0 |
simulation truth |
digit1 |
digits |
sum.est |
use mean or median as location estimate summary |
sum.sd |
use mean or median as sd estimate summary |
style |
integer |
keep.intercept |
whether to include intercept in the table |
Depends on package abind to combine arrays from files.
A multidimensional array.
getK
calculates the kernel matrix between X and itself and returns a n by n matrix. Alternatively, it calculates the kernel matrix between X and X2 and returns a n by n2 matrix.
getK (X,kernel,para=NULL,X2=NULL,C = NULL)
getK (X,kernel,para=NULL,X2=NULL,C = NULL)
X |
covariate matrix with dimension n by d. Note this is not the paired difference of covariate matrix. |
kernel |
string specifying type of kernel:
polynomial or p (1 + <x,y>)^para, |
para |
parameter of the kernel fucntion. for |
X2 |
optional second covariate matrix with dimension n2 by d |
C |
logical. If TRUE, kernels are computed by custom routines in C, which may be more memory efficient, and faster too for ibs and hamming kernels. |
IBS stands for 'Identical By State'. If 'x','y' are in in {0,1,2} then
IBS(x,y) = 0 if |x-y|=2, 1 if |x-y|=1, 2 if |x-y|=0, or IBS(x,y) = 2.0 - |x-y|.
K(u,v) = sum(IBS(u[i],v[i])) / 2K where K = length(u).
The 'hamming' kernel is the equivalent of the 'ibs' kernel for binary data.
Note that 'hamming' kernel is based on hamming similarity(!), not on dissimilarity distance.
Within in the code, C is default to TRUE for ibs and hamming kernels and FALSE otherwise.
A kernel matrix.
Youyi Fong [email protected]
Krisztian Sebestyen [email protected]
Shuxin Yin
X = cbind(x1=rnorm(n=5), x2=rnorm(n=5)) dim(X) X2 = cbind(x1=rnorm(n=3), x2=rnorm(n=3)) dim(X2) K = getK(X,"linear") dim(K) K = getK(X,"linear",X2=X2) dim(K) K1 = getK(X2,"l",X2=X) dim(K1) all(K==t(K1)) # RBF kernel K = getK(X,"rbf",para=1,X2=X2) K1 = getK(X2,"r",para=1,X2=X) all(K==t(K1)) # IBS kernel for ternary data X <- as.matrix(expand.grid(0:2,0:2)) K = getK(X,kernel = 'ibs') # add weight w = runif(ncol(X)) K = getK(X,kernel = 'ibs',para = w) # IBS kernel for binary data via option 'h' for 'hamming similarity measure' X <- as.matrix(expand.grid(0:1,0:1)) K=getK(X,kernel = 'h')
X = cbind(x1=rnorm(n=5), x2=rnorm(n=5)) dim(X) X2 = cbind(x1=rnorm(n=3), x2=rnorm(n=3)) dim(X2) K = getK(X,"linear") dim(K) K = getK(X,"linear",X2=X2) dim(K) K1 = getK(X2,"l",X2=X) dim(K1) all(K==t(K1)) # RBF kernel K = getK(X,"rbf",para=1,X2=X2) K1 = getK(X2,"r",para=1,X2=X) all(K==t(K1)) # IBS kernel for ternary data X <- as.matrix(expand.grid(0:2,0:2)) K = getK(X,kernel = 'ibs') # add weight w = runif(ncol(X)) K = getK(X,kernel = 'ibs',para = w) # IBS kernel for binary data via option 'h' for 'hamming similarity measure' X <- as.matrix(expand.grid(0:1,0:1)) K=getK(X,kernel = 'h')
Estimate the total, direct, and indirect effects using IORW method (inverse odds ratio weighting) and compute 95
iorw(formula.effect, formula.mediators, data, family = NULL, nboot = 10000, numCores = 1, save.steps = FALSE, verbose = FALSE) ## S3 method for class 'iorw' print(x, ...)
iorw(formula.effect, formula.mediators, data, family = NULL, nboot = 10000, numCores = 1, save.steps = FALSE, verbose = FALSE) ## S3 method for class 'iorw' print(x, ...)
formula.effect |
a formula object for the total and direct effect regression. The first term on the right is assumed to be the binary treatment/exposure variable. |
formula.mediators |
a formula object for logistic regression. It should be of the form: ~ mediation marker1 + mediation marker2. |
data |
a data frame. |
family |
if Cox regression, leave as NULL; otherwise, it will be passed to glm(). |
nboot |
an integer. Number of bootstrap replicates. |
numCores |
an interger. Number of cores to use for parallel procesing. |
save.steps |
boolean. Whether or not to save the fits from the three steps and the weights. |
x |
Object of type iorw |
verbose |
boolean. |
... |
Additional arguments passed to the print function. |
Code by Cowling and Lim was downloaded from https://datadryad.org/stash/dataset/doi:10.5061/dryad.cv37539
If a bootstrap replicate generates warnings during regression, NA will be returned for that replicate. The number of such occurrences is recorded in an attribute of boot.perc in the return value.
It does not handle sampling weights yet.
Point estimates and percentile bootstrap confidence intervals.
Youyi Fong, based on code by Cowling and Lim
Cowling, B. J., Lim, W. W., Perera, R. A., Fang, V. J., Leung, G. M., Peiris, J. M., & Tchetgen Tchetgen, E. J. (2019). Influenza hemagglutination-inhibition antibody titer as a mediator of vaccine-induced protection for influenza B. Clinical Infectious Diseases, 68(10), 1713-1717.
Nguyen, Q. C., Osypuk, T. L., Schmidt, N. M., Glymour, M. M., & Tchetgen Tchetgen, E. J. (2015). Practical guidance for conducting mediation analysis with multiple mediators using inverse odds ratio weighting. American journal of epidemiology, 181(5), 349-356.
Tchetgen Tchetgen, E. J. (2013). Inverse odds ratio-weighted estimation for causal mediation analysis. Statistics in medicine, 32(26), 4567-4580.
Imai, K., Keele, L., & Tingley, D. (2010). A general approach to causal mediation analysis. Psychological methods, 15(4), 309.
#### Cox regression # without adjusting for baseline markers library(survival) formula.effect=Surv(surv_time, flu)~vaccine+age formula.mediators=~log2(postvax.B.Brisbane/5) res.1=iorw(formula.effect, formula.mediators, kid, nboot=10, numCores=1); res.1 stopifnot(max(abs(res.1$boot[1,] - c(0.2029779,0.6070105,0.3039110,0.4283389,0.2124268)))<1e-6) # adjust for baseline markers formula.effect=Surv(surv_time, flu)~vaccine+log2(prevax.B.Brisbane)+age formula.mediators=~log2(postvax.B.Brisbane/5) res.2=iorw(formula.effect, formula.mediators, kid, nboot=10, numCores=1); res.2 #### Logistic regression # without adjusting for baseline markers formula.effect=flu~vaccine+age formula.mediators=~log2(postvax.B.Brisbane/5) res.3=iorw(formula.effect, formula.mediators, kid, family=binomial(), nboot=10, numCores=1); res.3 stopifnot(max(abs(res.3$boot[1,] - c(0.1960024,0.6154349,0.2937164,0.4145470,0.2168644)))<1e-6) # adjust for baseline markers formula.effect=flu~vaccine+log2(prevax.B.Brisbane)+age formula.mediators=~log2(postvax.B.Brisbane/5) res.4=iorw(formula.effect, formula.mediators, kid, family=binomial(), nboot=10, numCores=1); res.4
#### Cox regression # without adjusting for baseline markers library(survival) formula.effect=Surv(surv_time, flu)~vaccine+age formula.mediators=~log2(postvax.B.Brisbane/5) res.1=iorw(formula.effect, formula.mediators, kid, nboot=10, numCores=1); res.1 stopifnot(max(abs(res.1$boot[1,] - c(0.2029779,0.6070105,0.3039110,0.4283389,0.2124268)))<1e-6) # adjust for baseline markers formula.effect=Surv(surv_time, flu)~vaccine+log2(prevax.B.Brisbane)+age formula.mediators=~log2(postvax.B.Brisbane/5) res.2=iorw(formula.effect, formula.mediators, kid, nboot=10, numCores=1); res.2 #### Logistic regression # without adjusting for baseline markers formula.effect=flu~vaccine+age formula.mediators=~log2(postvax.B.Brisbane/5) res.3=iorw(formula.effect, formula.mediators, kid, family=binomial(), nboot=10, numCores=1); res.3 stopifnot(max(abs(res.3$boot[1,] - c(0.1960024,0.6154349,0.2937164,0.4145470,0.2168644)))<1e-6) # adjust for baseline markers formula.effect=flu~vaccine+log2(prevax.B.Brisbane)+age formula.mediators=~log2(postvax.B.Brisbane/5) res.4=iorw(formula.effect, formula.mediators, kid, family=binomial(), nboot=10, numCores=1); res.4
Influenza immune response biomarkers dataset.
data("kid")
data("kid")
A data frame with 736 observations on the following 10 variables.
hhID
a numeric vector
age
a numeric vector
intervention
a character vector
vaccine
a numeric vector
vaccine.date
a Date
postvax.date
a Date
prevax.B.Brisbane
a numeric vector
postvax.B.Brisbane
a numeric vector
surv_time
a numeric vector
flu
a numeric vector
Cowling, B. J., Lim, W. W., Perera, R. A., Fang, V. J., Leung, G. M., Peiris, J. M., & Tchetgen Tchetgen, E. J. (2019). Influenza hemagglutination-inhibition antibody titer as a mediator of vaccine-induced protection for influenza B. Clinical Infectious Diseases, 68(10), 1713-1717.
Utility functions by Youyi Fong and Krisz Sebestyen, and some functions copied from other packages for convenience (acknowledged on their manual pages).
Most useful functions: mypostscript/mypdf, mytex,
See the Index link below for a list of available functions.
The package depends on Hmisc. The main reason for that, besides the usefulness of the package, is Hmisc depends on ggplot2, which also define
Returns a data frame that is suitable for time-dependent covariate Cox model fit.
make.timedep.dataset(dat, X, d, baseline.ageyrs, t.1, t.2 = NULL)
make.timedep.dataset(dat, X, d, baseline.ageyrs, t.1, t.2 = NULL)
dat |
data frame |
X |
string. Name of the followup time column in dat. Unit needs to be years. |
d |
string. Name of the followup time column in dat. |
baseline.ageyrs |
string. Name of the followup time column in dat. |
t.1 |
numerical. Cutoff for age group |
t.2 |
numerical. Second cutoff for age group |
The function assumes that the followup length is such that only one change of age group is possible.
Returns a data frame with the following columns added: tstart, tstop, .timedep.agegrp, .baseline.agegrp
tstart |
left bound of time interval |
tstop |
right bound of time interval |
.timedep.agegrp |
time-dependent age group |
.baseline.agegrp |
baseline age group |
Youyi Fong
Therneau, T. and Crowson, C. Using Time Dependent Covariates and Time Dependent Coefficients in the Cox Model. A vignette from the R package surival.
library(survival) n=3000; followup.length=5; incidence.density=0.015; age.sim="continuous" dat.0=sim.dat.tvarying.two(n, followup.length, incidence.density, age.sim, seed=1) dat=subset(dat.0, for.non.tvarying.ana, select=c(ptid, X, d, baseline.age, trt)) dat.timedep = make.timedep.dataset (dat, "X", "d", "baseline.age", 6) coxph(Surv(tstart,tstop,d) ~ trt*.timedep.agegrp, dat.timedep)
library(survival) n=3000; followup.length=5; incidence.density=0.015; age.sim="continuous" dat.0=sim.dat.tvarying.two(n, followup.length, incidence.density, age.sim, seed=1) dat=subset(dat.0, for.non.tvarying.ana, select=c(ptid, X, d, baseline.age, trt)) dat.timedep = make.timedep.dataset (dat, "X", "d", "baseline.age", 6) coxph(Surv(tstart,tstop,d) ~ trt*.timedep.agegrp, dat.timedep)
H calculates entropy.
as.binary(n, base = 2, r = FALSE) binom.coef(n, m) expit(x) logDiffExp(logx1, logx2) logit(x) logMeanExp(logx, B = NULL) logSumExp(logx) logSumExpFor2(logx, logy) permn(x, fun = NULL, ...) Stirling2(n, m) interpolate(pt1, pt2, x)
as.binary(n, base = 2, r = FALSE) binom.coef(n, m) expit(x) logDiffExp(logx1, logx2) logit(x) logMeanExp(logx, B = NULL) logSumExp(logx) logSumExpFor2(logx, logy) permn(x, fun = NULL, ...) Stirling2(n, m) interpolate(pt1, pt2, x)
n |
tbdn |
base |
tbdbase |
r |
tbdr |
m |
tbdm |
pt1 |
a vector of length 2 |
pt2 |
a vector of length 2 |
x |
tbdx |
logx1 |
tbdlogx1 |
logx2 |
tbdlogx2 |
logx |
tbdlogx |
B |
tbdB |
logy |
tbdlogy |
fun |
tbdfun |
... |
tbd... |
H(rep(1/5,5)) H(rep(3,5))
H(rep(1/5,5)) H(rep(3,5))
concatList returns a string that concatenates the elements of the input list or array
AR1(p, w) concatList(lis, sep = "") EXCH(p, rho) fill.jagged.array(a) getMidPoints(x) getUpperRight(matri, func = NULL) last(x, n = 1, ...) mix(a, b) ## S3 method for class 'data.frame' rep(x, times = 1, ...) ## S3 method for class 'matrix' rep(x, times = 1, each = 1, by.row = TRUE, ...) ## S3 method for class 'matrix.block' rep(x, times = 2, ...) shift.left(x, k = 1) shift.right(x, k = 1) thin.rows(dat, thin.factor = 10) ThinRows(dat, thin.factor = 10) tr(m)
AR1(p, w) concatList(lis, sep = "") EXCH(p, rho) fill.jagged.array(a) getMidPoints(x) getUpperRight(matri, func = NULL) last(x, n = 1, ...) mix(a, b) ## S3 method for class 'data.frame' rep(x, times = 1, ...) ## S3 method for class 'matrix' rep(x, times = 1, each = 1, by.row = TRUE, ...) ## S3 method for class 'matrix.block' rep(x, times = 2, ...) shift.left(x, k = 1) shift.right(x, k = 1) thin.rows(dat, thin.factor = 10) ThinRows(dat, thin.factor = 10) tr(m)
p |
tbdp |
w |
tbdw |
lis |
list or array |
sep |
tbdsep |
rho |
tbdrho |
a |
tbda |
x |
tbdx |
matri |
tbdmatri |
func |
tbdfunc |
n |
tbdn |
... |
tbd... |
b |
tbdb |
times |
tbdtimes |
each |
tbdeach |
by.row |
tbdby.row |
k |
tbdk |
dat |
tbddat |
thin.factor |
tbdthin.factor |
m |
tbdm |
concatList(1:3,"_")
concatList(1:3,"_")
DXD computes D %*% X %*% D, where D is a diagonal matrix. tXDX computes t(X) %*% D %*% X. symprod computes S %*% X for symmetric S. txSy computes t(x) %*% S %*% y for symmetric S.
DXD(d1, X, d2) tXDX(X,D) symprod(S, X) txSy(x, S, y) .as.double(x, stripAttributes = FALSE)
DXD(d1, X, d2) tXDX(X,D) symprod(S, X) txSy(x, S, y) .as.double(x, stripAttributes = FALSE)
d1 |
a diagonal matrix or an array |
d2 |
a diagonal matrix or an array |
x |
array |
y |
array |
S |
symmetric matrix |
X |
matix |
D |
matix |
stripAttributes |
boolean |
.as.double does not copying whereas as.double(x) for older versions of R when using .C(DUP = FALSE) make duplicate copy of x. In addition, even if x is a 'double', since x has attributes (dim(x)) as.double(x) duplicates
The functions do not check whether S is symmetric. If it is not symmetric, then the result will be wrong. DXD offers a big gain, while symprod and txSy gains are more incremental.
Krisztian Sebestyen
d1=1:3 d2=4:6 X=matrix(1:9,3,3) all(DXD(d1, X, d2) == diag(d1) %*% X %*% diag(d2)) S=matrix(c(1,2,3,2,4,5,3,5,8),3,3) X=matrix(1:9,3,3) all( symprod(S, X) == S %*% X ) x=1:3 y=4:6 S=matrix(c(1,2,3,2,4,5,3,5,8),3,3) txSy(x, S, y) == drop(t(x)%*%S%*%y)
d1=1:3 d2=4:6 X=matrix(1:9,3,3) all(DXD(d1, X, d2) == diag(d1) %*% X %*% diag(d2)) S=matrix(c(1,2,3,2,4,5,3,5,8),3,3) X=matrix(1:9,3,3) all( symprod(S, X) == S %*% X ) x=1:3 y=4:6 S=matrix(c(1,2,3,2,4,5,3,5,8),3,3) txSy(x, S, y) == drop(t(x)%*%S%*%y)
Misc functions. summ computes iterative sum, sort of like diff.
pava (x, wt = rep(1, length(x))) summ(x) empty2na(x) ## S3 method for class 'pcc' predict(object, newdat, ...) rank.inv.norm(x) INT(x) dec_to_binary (x,d)
pava (x, wt = rep(1, length(x))) summ(x) empty2na(x) ## S3 method for class 'pcc' predict(object, newdat, ...) rank.inv.norm(x) INT(x) dec_to_binary (x,d)
x |
tbdx |
d |
number of digits in the returned binary representation, including leading 0's |
wt |
tbdvar.equal |
object |
tbdvar.equal |
newdat |
tbdvar.equal |
... |
tbdvar.equal |
rank.inv.norm: rank-based inverse normal/gaussian transformation
dec_to_binary covert a decimal number to a binary representation with d digits
summ returns
An implementation of Westfall and Young
p.adj.perm(p.unadj, p.perms, alpha = 0.05)
p.adj.perm(p.unadj, p.perms, alpha = 0.05)
p.unadj |
p.unadj |
p.perms |
p.perms |
alpha |
alpha |
This implementation is not as fast as the implementation from the package multtest. But ususally the step to create p.perms is the rate-limiting step.
The smallest of the Westfall and Young FWER-controlling multitesting adjusted p values coincides with the p value for testing a global null without any assumptions. But for the multitesting adjustment to hold, it requires the subset pivotality condition.
Sue Li, [email protected]
Westfall, P. H., & Young, S. S. (1993). Resampling-based multiple testing: Examples and methods for p-value adjustment (Vol. 279). John Wiley & Sons.
Westfall, P. H., & Troendle, J. F. (2008). Multiple testing with minimal assumptions. Biometrical Journal: Journal of Mathematical Methods in Biosciences, 50(5), 745-755.
mypostscript and mypdf sets the width and height based on mfrow input.
smoothed.scaled.hist (dat.ls, bin_width, scale.factors=NULL, cols=NULL, legend=NULL, cex.legend=1, ...) myplot (object, ...) ## S3 method for class 'loess' myplot(object, xlab="x", ylab="fitted", ...) whiskers (x, s, ...) abline.pt.slope(pt1, slope, x2=NULL, ...) abline.pts(pt1, pt2 = NULL) butterfly.plot(dat, dat2 = NULL, add = FALSE, xaxislabels = rep("", 4), x.ori = 0, xlab = "", ylab = "", cex.axis = 1, ...) empty.plot() add.mtext.label (text, cex = 1.4, adj = -0.2) mydev.off(file = "temp", ext = c("pdf"), res = 200, mydev = NULL, silent = TRUE) getMfrow(len) myhist (x, add.norm=TRUE, col.norm="blue", ...) myforestplot(dat, xlim = NULL, xlab = "", main = "", col.1 = "red", col.2 = "blue", plot.labels = TRUE, order = FALSE, decreasing = FALSE, vline = TRUE, cols = NULL, log = "", null.val = NULL) my.interaction.plot(dat, x.ori = 0, xaxislabels = rep("", 2), cex.axis = 1, add = FALSE, xlab = "", ylab = "", pcol = NULL, lcol = NULL, ...) myboxplot(object, ...) ## S3 method for class 'formula' myboxplot(formula, data, cex = 0.5, xlab = "", ylab = NULL, main = "", box = TRUE, at = NULL, na.action = NULL, p.val = NULL, pch = 1, col = "white", col.points = 1, border = 1, test = "", friedman.test.formula = NULL, reshape.formula = NULL, reshape.id = NULL, jitter = TRUE, add.interaction = FALSE, drop.unused.levels = TRUE, bg.pt = NULL, add = FALSE, seed = 1, write.p.at.top = FALSE, ...) ## S3 method for class 'data.frame' myboxplot(object, cex = 0.5, ylab = "", xlab = "", main = "", box = TRUE, at = NULL, pch = 1, col = 1, test = "", paired = FALSE, ...) ## S3 method for class 'list' myboxplot(object, paired = FALSE, ...) abline.shade.2(x, col=c(0,1,0)) abline.shade(pt, type = 5, col = c(0, 1, 0), alpha = 0.3) mylegend(legend, x, y=NULL, lty = NULL, bty = "n", ...) mymatplot(x, y, type = "b", lty = c(1, 2, 1, 2, 1, 2), pch = NULL, col = rep(c("darkgray", "black"), each = 3), xlab = NULL, ylab = "", draw.x.axis = TRUE, bg = NA, lwd = 1, at = NULL, make.legend = TRUE, legend = NULL, impute.missing.for.line = TRUE, legend.x = 9, legend.title = NULL, legend.cex = 1, legend.lty = lty, legend.inset = 0, xaxt = "s", y.intersp = 1.5, x.intersp = 0.3, text.width = NULL, add = FALSE, ... ) mypairs(dat, ladder = FALSE, show.data.cloud = TRUE, ladder.add.line = T, ladder.add.text = T, ...) wtd.hist (x, breaks = "Sturges", freq = NULL, probability = !freq, include.lowest = TRUE, right = TRUE, density = NULL, angle = 45, col = NULL, border = NULL, main = paste("Histogram of", xname), xlim = range(breaks), ylim = NULL, xlab = xname, ylab, axes = TRUE, plot = TRUE, labels = FALSE, nclass = NULL, weight = NULL, ...) mylines(x, y, type = "l", ...) myfigure(mfrow = c(1, 1), mfcol = NULL, width = NULL, height = NULL, oma = NULL, mar = NULL, main.outer = FALSE, bg=NULL) mypdf(...) mypng(...) mytiff(...) mypostscript(file = "temp", mfrow = c(1, 1), mfcol = NULL, width = NULL, height = NULL, ext = c("eps", "pdf", "png", "tiff"), oma = NULL, mar = NULL, main.outer = FALSE, save2file = TRUE, res = 200, silent = TRUE, ...) panel.cor(x, y, digits = 2, prefix = "", cex.cor, cor., leading0 = FALSE, cex.cor.dep = TRUE, ...) panel.hist(x, ...) panel.nothing(x, ...) corplot(object, ...) ## Default S3 method: corplot(object, y, ...) ## S3 method for class 'formula' corplot(formula, data, main = "", method = c("pearson", "spearman"), col = 1, cex = 0.5, add.diagonal.line = TRUE, add.lm.fit = FALSE, add.loess.fit = FALSE, col.lm = 2, add.deming.fit = FALSE, col.deming = 4, add = FALSE, digit.cor = 2, log = "", same.xylim = FALSE, xlim = NULL, ylim = NULL, ...)
smoothed.scaled.hist (dat.ls, bin_width, scale.factors=NULL, cols=NULL, legend=NULL, cex.legend=1, ...) myplot (object, ...) ## S3 method for class 'loess' myplot(object, xlab="x", ylab="fitted", ...) whiskers (x, s, ...) abline.pt.slope(pt1, slope, x2=NULL, ...) abline.pts(pt1, pt2 = NULL) butterfly.plot(dat, dat2 = NULL, add = FALSE, xaxislabels = rep("", 4), x.ori = 0, xlab = "", ylab = "", cex.axis = 1, ...) empty.plot() add.mtext.label (text, cex = 1.4, adj = -0.2) mydev.off(file = "temp", ext = c("pdf"), res = 200, mydev = NULL, silent = TRUE) getMfrow(len) myhist (x, add.norm=TRUE, col.norm="blue", ...) myforestplot(dat, xlim = NULL, xlab = "", main = "", col.1 = "red", col.2 = "blue", plot.labels = TRUE, order = FALSE, decreasing = FALSE, vline = TRUE, cols = NULL, log = "", null.val = NULL) my.interaction.plot(dat, x.ori = 0, xaxislabels = rep("", 2), cex.axis = 1, add = FALSE, xlab = "", ylab = "", pcol = NULL, lcol = NULL, ...) myboxplot(object, ...) ## S3 method for class 'formula' myboxplot(formula, data, cex = 0.5, xlab = "", ylab = NULL, main = "", box = TRUE, at = NULL, na.action = NULL, p.val = NULL, pch = 1, col = "white", col.points = 1, border = 1, test = "", friedman.test.formula = NULL, reshape.formula = NULL, reshape.id = NULL, jitter = TRUE, add.interaction = FALSE, drop.unused.levels = TRUE, bg.pt = NULL, add = FALSE, seed = 1, write.p.at.top = FALSE, ...) ## S3 method for class 'data.frame' myboxplot(object, cex = 0.5, ylab = "", xlab = "", main = "", box = TRUE, at = NULL, pch = 1, col = 1, test = "", paired = FALSE, ...) ## S3 method for class 'list' myboxplot(object, paired = FALSE, ...) abline.shade.2(x, col=c(0,1,0)) abline.shade(pt, type = 5, col = c(0, 1, 0), alpha = 0.3) mylegend(legend, x, y=NULL, lty = NULL, bty = "n", ...) mymatplot(x, y, type = "b", lty = c(1, 2, 1, 2, 1, 2), pch = NULL, col = rep(c("darkgray", "black"), each = 3), xlab = NULL, ylab = "", draw.x.axis = TRUE, bg = NA, lwd = 1, at = NULL, make.legend = TRUE, legend = NULL, impute.missing.for.line = TRUE, legend.x = 9, legend.title = NULL, legend.cex = 1, legend.lty = lty, legend.inset = 0, xaxt = "s", y.intersp = 1.5, x.intersp = 0.3, text.width = NULL, add = FALSE, ... ) mypairs(dat, ladder = FALSE, show.data.cloud = TRUE, ladder.add.line = T, ladder.add.text = T, ...) wtd.hist (x, breaks = "Sturges", freq = NULL, probability = !freq, include.lowest = TRUE, right = TRUE, density = NULL, angle = 45, col = NULL, border = NULL, main = paste("Histogram of", xname), xlim = range(breaks), ylim = NULL, xlab = xname, ylab, axes = TRUE, plot = TRUE, labels = FALSE, nclass = NULL, weight = NULL, ...) mylines(x, y, type = "l", ...) myfigure(mfrow = c(1, 1), mfcol = NULL, width = NULL, height = NULL, oma = NULL, mar = NULL, main.outer = FALSE, bg=NULL) mypdf(...) mypng(...) mytiff(...) mypostscript(file = "temp", mfrow = c(1, 1), mfcol = NULL, width = NULL, height = NULL, ext = c("eps", "pdf", "png", "tiff"), oma = NULL, mar = NULL, main.outer = FALSE, save2file = TRUE, res = 200, silent = TRUE, ...) panel.cor(x, y, digits = 2, prefix = "", cex.cor, cor., leading0 = FALSE, cex.cor.dep = TRUE, ...) panel.hist(x, ...) panel.nothing(x, ...) corplot(object, ...) ## Default S3 method: corplot(object, y, ...) ## S3 method for class 'formula' corplot(formula, data, main = "", method = c("pearson", "spearman"), col = 1, cex = 0.5, add.diagonal.line = TRUE, add.lm.fit = FALSE, add.loess.fit = FALSE, col.lm = 2, add.deming.fit = FALSE, col.deming = 4, add = FALSE, digit.cor = 2, log = "", same.xylim = FALSE, xlim = NULL, ylim = NULL, ...)
digit.cor |
number of digits to print correlation |
col.points |
color of points |
dat.ls |
named list of vectors. A histogram is made for each vector. |
bin_width |
width of bin for histograms |
scale.factors |
named vector of scale factors to scale the histogram counts by |
cex.legend |
cex for legend |
silent |
tbdadd |
legend.lty |
tbdadd |
cex.cor.dep |
tbdadd |
add.loess.fit |
tbdadd |
leading0 |
tbdadd |
null.val |
tbdadd |
write.p.at.top |
tbdadd |
text.width |
tbdadd |
text |
tbdadd |
cex |
tbdcex |
adj |
tbdpt2 |
file |
tbdfile |
ext |
tbdext |
res |
resolution. |
add.norm |
Boolean, whether to add normal approximation density line |
col.norm |
string, color of added normal density line |
pt1 |
tbdpt1 |
s |
tbdslope |
ladder |
tbdslope |
slope |
tbdslope |
friedman.test.formula |
tbdslope |
reshape.id |
tbdslope |
impute.missing.for.line |
tbdslope |
cor. |
tbdslope |
mydev |
tbdslope |
jitter |
Boolean |
add.interaction |
Boolean |
... |
tbd... |
xaxt |
tbdpt2 |
breaks |
tbdpt2 |
freq |
tbdpt2 |
bg.pt |
tbdpt2 |
probability |
tbdpt2 |
include.lowest |
tbdpt2 |
right |
tbdpt2 |
density |
tbdpt2 |
angle |
tbdpt2 |
border |
tbdpt2 |
axes |
tbdpt2 |
plot |
tbdpt2 |
labels |
tbdpt2 |
nclass |
tbdpt2 |
weight |
tbdpt2 |
pt2 |
tbdpt2 |
pt |
tbdpt2 |
alpha |
tbdpt2 |
dat |
tbddat |
lwd |
line width. |
x.intersp |
controls the look of legend. |
y.intersp |
controls the look of legend. |
legend.inset |
legend inset |
dat2 |
tbddat2 |
add |
tbdadd |
log |
log |
add.lm.fit |
lm fit |
add.deming.fit |
add |
col.lm |
col |
col.deming |
col |
reshape.formula |
a formula object. |
xaxislabels |
tbdxaxislabels |
x.ori |
tbdx.ori |
xlab |
tbdxlab |
ylab |
tbdylab |
cex.axis |
tbdcex.axis |
len |
tbdlen |
same.xylim |
Boolean. Whether xlim and ylim should be the same |
xlim |
tbdxlim |
ylim |
tbdxlim |
main |
tbdmain |
col.1 |
tbdcol.1 |
col.2 |
tbdcol.2 |
pcol |
tbdpcol |
lcol |
tbdlcol |
object |
tbdobject |
formula |
tbdformula |
data |
tbddata |
box |
tbdbox |
at |
tbdat |
pch |
tbdpch |
col |
tbdcol |
test |
string. For example, "t","w","f","k", "tw" |
legend |
tbdlegend |
x |
tbdx |
lty |
tbdlty |
bty |
tbdbty |
type |
tbdtype |
make.legend |
tbdmake.legend |
legend.x |
tbdlegend.x |
legend.title |
tbdlegend.title |
legend.cex |
tbdlegend.cex |
draw.x.axis |
tbddraw.x.axis |
bg |
tbdbg |
method |
tbdmethod |
mfrow |
tbdmfrow |
mfcol |
tbdmfcol |
width |
tbdwidth |
height |
tbdheight |
oma |
tbdoma |
mar |
tbdmar |
main.outer |
tbdmain.outer |
save2file |
tbdsave2file |
y |
tbdy |
digits |
tbddigits |
prefix |
tbdprefix |
cex.cor |
cex cor |
plot.labels |
Boolean |
order |
Boolean |
decreasing |
Boolean |
add.diagonal.line |
tbdadd.diagonal.line |
x2 |
tbdadd.diagonal.line |
vline |
tbdadd.diagonal.line |
cols |
tbdadd.diagonal.line |
na.action |
tbdadd.diagonal.line |
drop.unused.levels |
tbdadd.diagonal.line |
p.val |
tbdx |
seed |
tbdx |
paired |
tbdx |
show.data.cloud |
tbdx |
ladder.add.line |
tbdx |
ladder.add.text |
tbdx |
myboxplot shows data points along with boxes. The data poins are jittered and the pattern of jittering is made reproducible in repeated calls. The test can only take one type of test currently.
myforestplot is modified from code from Allan deCamp/SCHARP. dat should have three columns. first column should be point estimate, second and third lci and uci, fourth p value. col.1 is the color used for CIs that do not include null, col.2 is used for CIs that do include null. If order is TRUE, the rows are ordered by the first column of dat. descreasing can be used to change the behavior of order.
corplot.formula uses MethComp::Deming by Bendix Carstensen to fit Deming regression.
wtd.hist is copied from weights package, author: Josh Pasek.
mymatplot will use na.approx (zoo) to fill in NA before plotting in order to draw continuous lines. The filled-in values will not be shown as points.
smoothed.scaled.hist draws histograms and overlay densities on top.
set.seed(1) x=1:50+rnorm(50,0,4) y=1:50+rnorm(50,0,4) dat=data.frame(x, y) corplot(y~x,dat,add.lm.fit=TRUE,add.deming.fit=TRUE,col.lm="red",col.deming="blue") dat=data.frame(y=c(1:10,2:11), x=rep(c("a","b"),each=10), ptid=c(1:10,1:10)) par(mfrow=c(1,2)) myboxplot(y~x, dat, test="w", jitter=FALSE) myboxplot(y~x, dat, test="f", add.interaction=TRUE, reshape.formula=y~x, reshape.id="ptid") myboxplot(list(jitter(1:10), jitter(3:12)), test="w") myboxplot(list(jitter(1:10), jitter(3:12)), test="w", paired=TRUE) smoothed.scaled.hist(list(A=rnorm(100,0,1)), bin_width=0.1, xlab="x") smoothed.scaled.hist(list(A=rnorm(100,0,1), B=rnorm(500,10,2)), bin_width=0.1, xlab="x") ## Not run: myfigure(mfrow=c(1,2)) plot(1:10) plot(1:10) mydev.off(ext="png,pdf", file="tmp") ## End(Not run) #myboxplot x axis may look weird if log="xy"
set.seed(1) x=1:50+rnorm(50,0,4) y=1:50+rnorm(50,0,4) dat=data.frame(x, y) corplot(y~x,dat,add.lm.fit=TRUE,add.deming.fit=TRUE,col.lm="red",col.deming="blue") dat=data.frame(y=c(1:10,2:11), x=rep(c("a","b"),each=10), ptid=c(1:10,1:10)) par(mfrow=c(1,2)) myboxplot(y~x, dat, test="w", jitter=FALSE) myboxplot(y~x, dat, test="f", add.interaction=TRUE, reshape.formula=y~x, reshape.id="ptid") myboxplot(list(jitter(1:10), jitter(3:12)), test="w") myboxplot(list(jitter(1:10), jitter(3:12)), test="w", paired=TRUE) smoothed.scaled.hist(list(A=rnorm(100,0,1)), bin_width=0.1, xlab="x") smoothed.scaled.hist(list(A=rnorm(100,0,1), B=rnorm(500,10,2)), bin_width=0.1, xlab="x") ## Not run: myfigure(mfrow=c(1,2)) plot(1:10) plot(1:10) mydev.off(ext="png,pdf", file="tmp") ## End(Not run) #myboxplot x axis may look weird if log="xy"
roundup prints a specified number of digits after decimal point even if 0s are needed at the end. formatInt prints a specified number of digits before decimal point even if 0s are needed at the beginning.
myprint(object, ...) ## Default S3 method: myprint(..., newline = TRUE, digits = 3, print.name=TRUE) ## S3 method for class 'matrix' myprint(object, ...) formatInt(x, digits, fill = "0", ...) prettyprint (value, digit=2) make.latex.coef.table(models, model.names = NULL, row.major = FALSE, round.digits = NULL) mysanitize.text(str) mysanitize.numbers(x) mytex(dat = NULL, file.name = "temp", digits = NULL, display = NULL, align = "r", include.rownames = TRUE, include.colnames = TRUE, col.headers = NULL, comment = FALSE, floating = FALSE, lines = TRUE, hline.after = NULL, add.to.row = NULL, sanitize.text.function = NULL, append = FALSE, preamble = "", input.foldername = NULL, save2input.only = NULL, caption = NULL, label = paste("tab", last(strsplit(file.name, "/")[[1]]), sep = " "), table.placement = "h!", add.clear.page.between.tables = FALSE, longtable = FALSE, verbose = FALSE, silent = TRUE, ...) mytex.begin(file.name, preamble = "") mytex.end(file.name) mywrite(x, ...) mywrite.csv(x, file = "tmp", row.names = FALSE, digits = NULL, silent = TRUE, ...) roundup(value, digits, na.to.empty = TRUE, remove.leading0 = FALSE) formatDouble(value, digits, na.to.empty = TRUE, remove.leading0 = FALSE)
myprint(object, ...) ## Default S3 method: myprint(..., newline = TRUE, digits = 3, print.name=TRUE) ## S3 method for class 'matrix' myprint(object, ...) formatInt(x, digits, fill = "0", ...) prettyprint (value, digit=2) make.latex.coef.table(models, model.names = NULL, row.major = FALSE, round.digits = NULL) mysanitize.text(str) mysanitize.numbers(x) mytex(dat = NULL, file.name = "temp", digits = NULL, display = NULL, align = "r", include.rownames = TRUE, include.colnames = TRUE, col.headers = NULL, comment = FALSE, floating = FALSE, lines = TRUE, hline.after = NULL, add.to.row = NULL, sanitize.text.function = NULL, append = FALSE, preamble = "", input.foldername = NULL, save2input.only = NULL, caption = NULL, label = paste("tab", last(strsplit(file.name, "/")[[1]]), sep = " "), table.placement = "h!", add.clear.page.between.tables = FALSE, longtable = FALSE, verbose = FALSE, silent = TRUE, ...) mytex.begin(file.name, preamble = "") mytex.end(file.name) mywrite(x, ...) mywrite.csv(x, file = "tmp", row.names = FALSE, digits = NULL, silent = TRUE, ...) roundup(value, digits, na.to.empty = TRUE, remove.leading0 = FALSE) formatDouble(value, digits, na.to.empty = TRUE, remove.leading0 = FALSE)
digit |
tbddigit |
silent |
tbdnewline |
input.foldername |
tbdnewline |
object |
tbdnewline |
newline |
tbdnewline |
print.name |
tbddigits |
save2input.only |
Boolean |
include.colnames |
Boolean |
col.headers |
string. Column headers |
comment |
Boolean, whether to include the version and timestamp comment |
hline.after |
vector |
add.to.row |
a list |
sanitize.text.function |
a function |
str |
tbdvalue |
remove.leading0 |
tbdvalue |
caption |
tbdvalue |
longtable |
tbdvalue |
label |
default to be the same as file.name stem |
table.placement |
tbdvalue |
na.to.empty |
tbdvalue |
value |
tbdvalue |
digits |
tbddigits |
fill |
tbdfill |
models |
tbdmodels |
model.names |
tbdmodel.names |
row.major |
tbdrow.major |
round.digits |
tbdround.digits |
dat |
tbddat |
file.name |
tbdfile.name |
display |
tbddisplay |
align |
tbdalign |
append |
tbdappend |
preamble |
tbdpreamble |
include.rownames |
tbdinclude.rownames |
floating |
tbdfloating |
lines |
tbdlines |
... |
tbd... |
verbose |
tbd... |
x |
tbdx |
file |
tbdfile |
row.names |
tbdrow.names |
add.clear.page.between.tables |
tbdrow.names |
roundup (3.1, 2) # 3.10 formatInt(3, 2) # 03 ## Not run: # demo of dimnames tab=diag(1:4); rownames(tab)<-colnames(tab)<-1:4; names(dimnames(tab))=c("age","height") # for greek letter in the labels, we need sanitize.text.function=identity rownames(tab)[1]="$\alpha$" # note that to use caption, floating needs to be TRUE mytex (tab, file="tmp1", sanitize.text.function=identity, caption="This is a caption .........................", caption.placement="top", floating=TRUE) # col.headers has to have the RIGHT number of columns # but align is more flexible, may not need to include the rownames col tab=diag(1:4); rownames(tab)<-colnames(tab)<-1:4 mytex (tab, file="tmp", include.rownames = TRUE, align=c("c","c","c|","c","c"), col.headers= "\hline\n & \multicolumn{2}{c|}{Vaccine} & \multicolumn{2}{c}{Control} \\ \n") # not include rownames mytex (tab, file="tmp", include.rownames = FALSE, align=c("c","c","c|","c","c"), col.headers= "\hline\n \multicolumn{2}{c|}{Vaccine} & \multicolumn{2}{c}{Control} \\ \n") # It should work even if some rownames are duplicated tab=diag(1:4); rownames(tab)=rep(1,4); colnames(tab)<-1:4 mytex (tab, file="tmp", include.rownames = TRUE, align=c("c","c|","c","c"), col.headers= "\hline\n & \multicolumn{2}{c|}{Vaccine} & \multicolumn{2}{c}{Control} \\ \n") # add.to.rows tab=diag(1:4); rownames(tab)<-1:4; colnames(tab)<-c("a","b","c","d") mytex (tab, file="tmp", add.to.row=list( list(0,2), c(" \multicolumn{5}{l}{Heading 1} \\ \n", "\hline\n \multicolumn{5}{l}{Heading 2}\\ \n" )) ) ## End(Not run)
roundup (3.1, 2) # 3.10 formatInt(3, 2) # 03 ## Not run: # demo of dimnames tab=diag(1:4); rownames(tab)<-colnames(tab)<-1:4; names(dimnames(tab))=c("age","height") # for greek letter in the labels, we need sanitize.text.function=identity rownames(tab)[1]="$\alpha$" # note that to use caption, floating needs to be TRUE mytex (tab, file="tmp1", sanitize.text.function=identity, caption="This is a caption .........................", caption.placement="top", floating=TRUE) # col.headers has to have the RIGHT number of columns # but align is more flexible, may not need to include the rownames col tab=diag(1:4); rownames(tab)<-colnames(tab)<-1:4 mytex (tab, file="tmp", include.rownames = TRUE, align=c("c","c","c|","c","c"), col.headers= "\hline\n & \multicolumn{2}{c|}{Vaccine} & \multicolumn{2}{c}{Control} \\ \n") # not include rownames mytex (tab, file="tmp", include.rownames = FALSE, align=c("c","c","c|","c","c"), col.headers= "\hline\n \multicolumn{2}{c|}{Vaccine} & \multicolumn{2}{c}{Control} \\ \n") # It should work even if some rownames are duplicated tab=diag(1:4); rownames(tab)=rep(1,4); colnames(tab)<-1:4 mytex (tab, file="tmp", include.rownames = TRUE, align=c("c","c|","c","c"), col.headers= "\hline\n & \multicolumn{2}{c|}{Vaccine} & \multicolumn{2}{c}{Control} \\ \n") # add.to.rows tab=diag(1:4); rownames(tab)<-1:4; colnames(tab)<-c("a","b","c","d") mytex (tab, file="tmp", add.to.row=list( list(0,2), c(" \multicolumn{5}{l}{Heading 1} \\ \n", "\hline\n \multicolumn{5}{l}{Heading 2}\\ \n" )) ) ## End(Not run)
Generate samples from random variables.
dbern(x, prob, log = FALSE) dcorbern(x, p, a, log = FALSE) dmixnorm(x, mix.p, sd1, sd2, log = FALSE) dnorm.norm.gamma(x, p, same.distr = FALSE, log = FALSE) rbern(n, prob, generalized = FALSE) rbigamma(n, shape.1, shape.2, rate.1, rate.2, rho) rbilogistic(n, loc.1, loc.2, scale.1, scale.2, rho) rejective.sampling(N, n, pik) rnorm.ar(n, sd, rho) rnorm.norm.gamma(n, mu.0, lambda, alpha, beta) rmixnorm (n, mix.p, mu1, mu2, sd1, sd2) rdoublexp(n, location=0, scale=1) ddoublexp(x, location=0, scale=1) qdoublexp(p, location=0, scale=1) pdoublexp(q, location=0, scale=1) rbidoublexp(n, loc.1, loc.2, scale.1, scale.2, rho)
dbern(x, prob, log = FALSE) dcorbern(x, p, a, log = FALSE) dmixnorm(x, mix.p, sd1, sd2, log = FALSE) dnorm.norm.gamma(x, p, same.distr = FALSE, log = FALSE) rbern(n, prob, generalized = FALSE) rbigamma(n, shape.1, shape.2, rate.1, rate.2, rho) rbilogistic(n, loc.1, loc.2, scale.1, scale.2, rho) rejective.sampling(N, n, pik) rnorm.ar(n, sd, rho) rnorm.norm.gamma(n, mu.0, lambda, alpha, beta) rmixnorm (n, mix.p, mu1, mu2, sd1, sd2) rdoublexp(n, location=0, scale=1) ddoublexp(x, location=0, scale=1) qdoublexp(p, location=0, scale=1) pdoublexp(q, location=0, scale=1) rbidoublexp(n, loc.1, loc.2, scale.1, scale.2, rho)
q |
tbdx |
location |
tbdx |
scale |
tbdx |
x |
tbdx |
prob |
tbdprob |
log |
tbdlog |
p |
tbdp |
a |
tbda |
mix.p |
tbdmix.p |
sd1 |
tbdsd1 |
sd2 |
tbdsd2 |
same.distr |
tbdsame.distr |
n |
tbdn |
generalized |
tbdgeneralized |
N |
tbdN |
pik |
tbdpik |
mu |
tbdmu |
mu1 |
tbdmu |
mu2 |
tbdmu |
sd |
tbdsd |
alpha |
tbdalpha |
mu.0 |
tbdmu.0 |
lambda |
tbdlambda |
beta |
tbdbeta |
loc.1 |
tbdbeta |
loc.2 |
tbdbeta |
scale.1 |
tbdbeta |
scale.2 |
tbdbeta |
rate.1 |
tbdbeta |
rate.2 |
tbdbeta |
shape.1 |
tbdbeta |
shape.2 |
tbdbeta |
rho |
tbdbeta |
rbern generates Bernoulli random variables.
rbilogistic generates a bivariate logistic distribution for correlation coefficient 0.5, or [-0.271, 0.478].
In the former case it is generated by calling rbilogis
, part of the VGAM package; in the latter case it is generated via the AMH copular.
rnorm.ar simulate autoregressive normal random variables, correlation is rho^d between x_1 and x_(1+d)
set.seed(1) rbern(n=10, p=1/2) rbern(n=2, p=c(.999,.001)) ## Not run: tmp=replicate(1e4, rnorm.cor(10, 1, .81)) round(cor(t(tmp)),2) ## End(Not run)
set.seed(1) rbern(n=10, p=1/2) rbern(n=2, p=c(.999,.001)) ## Not run: tmp=replicate(1e4, rnorm.cor(10, 1, .81)) round(cor(t(tmp)),2) ## End(Not run)
getFormattedSummary prints a table of regression coefficient estimates and standard errors.
getFormattedSummary(fits, type = 12, est.digits = 2, se.digits = 2, robust, random = FALSE, VE = FALSE, to.trim = FALSE, rows = NULL, coef.direct = FALSE, trunc.large.est = TRUE, scale.factor = 1, p.digits = 3, remove.leading0 = FALSE, p.adj.method = "fdr", ...) getVarComponent(object, ...) getFixedEf(object, ...) risk.cal(risk, binary.outcome, weights = NULL, ngroups = NULL, cuts = NULL, main = "", add = FALSE, show.emp.risk = TRUE, lcol = 2, ylim = NULL, scale = c("logit", "risk")) interaction.table(fit, v1, v2, v1.type = "continuous", v2.type = "continuous", logistic.regression = TRUE) ## S3 method for class 'coxph' getFixedEf(object, exp=FALSE,robust=FALSE, ...) ## S3 method for class 'gam' getFixedEf(object, ...) ## S3 method for class 'gee' getFixedEf(object, exp = FALSE, ...) ## S3 method for class 'geese' getFixedEf(object, robust = TRUE, ...) ## S3 method for class 'tps' getFixedEf(object, exp=FALSE, robust=TRUE, ...) ## S3 method for class 'glm' getFixedEf(object, exp = FALSE, robust = TRUE, ret.robcov = FALSE, ...) ## S3 method for class 'svyglm' getFixedEf(object, exp = FALSE, robust = TRUE, ...) ## S3 method for class 'svy_vglm' getFixedEf(object, exp = FALSE, robust = TRUE, ...) ## S3 method for class 'svycoxph' getFixedEf(object, exp = FALSE, robust = TRUE, ...) ## S3 method for class 'inla' getFixedEf(object, ...) ## S3 method for class 'lm' getFixedEf(object, exp = F, ...) ## S3 method for class 'lme' getFixedEf(object, ...) ## S3 method for class 'logistf' getFixedEf(object, exp = FALSE, ...) ## S3 method for class 'matrix' getFixedEf(object, ...) ## S3 method for class 'MIresult' getFixedEf(object, exp = FALSE, ...) ## S3 method for class 'hyperpar.inla' getVarComponent(object, transformation = NULL, ...) ## S3 method for class 'matrix' getVarComponent(object, ...) ## S3 method for class 'geese' coef(object, ...) ## S3 method for class 'tps' coef(object, ...) ## S3 method for class 'geese' predict(object, x, ...) ## S3 method for class 'tps' predict(object, newdata = NULL, type = c("link", "response"), ...) ## S3 method for class 'geese' residuals(object, y, x,...) ## S3 method for class 'geese' vcov(object, ...) ## S3 method for class 'tps' vcov(object, robust, ...) ## S3 method for class 'logistf' vcov(object, ...)
getFormattedSummary(fits, type = 12, est.digits = 2, se.digits = 2, robust, random = FALSE, VE = FALSE, to.trim = FALSE, rows = NULL, coef.direct = FALSE, trunc.large.est = TRUE, scale.factor = 1, p.digits = 3, remove.leading0 = FALSE, p.adj.method = "fdr", ...) getVarComponent(object, ...) getFixedEf(object, ...) risk.cal(risk, binary.outcome, weights = NULL, ngroups = NULL, cuts = NULL, main = "", add = FALSE, show.emp.risk = TRUE, lcol = 2, ylim = NULL, scale = c("logit", "risk")) interaction.table(fit, v1, v2, v1.type = "continuous", v2.type = "continuous", logistic.regression = TRUE) ## S3 method for class 'coxph' getFixedEf(object, exp=FALSE,robust=FALSE, ...) ## S3 method for class 'gam' getFixedEf(object, ...) ## S3 method for class 'gee' getFixedEf(object, exp = FALSE, ...) ## S3 method for class 'geese' getFixedEf(object, robust = TRUE, ...) ## S3 method for class 'tps' getFixedEf(object, exp=FALSE, robust=TRUE, ...) ## S3 method for class 'glm' getFixedEf(object, exp = FALSE, robust = TRUE, ret.robcov = FALSE, ...) ## S3 method for class 'svyglm' getFixedEf(object, exp = FALSE, robust = TRUE, ...) ## S3 method for class 'svy_vglm' getFixedEf(object, exp = FALSE, robust = TRUE, ...) ## S3 method for class 'svycoxph' getFixedEf(object, exp = FALSE, robust = TRUE, ...) ## S3 method for class 'inla' getFixedEf(object, ...) ## S3 method for class 'lm' getFixedEf(object, exp = F, ...) ## S3 method for class 'lme' getFixedEf(object, ...) ## S3 method for class 'logistf' getFixedEf(object, exp = FALSE, ...) ## S3 method for class 'matrix' getFixedEf(object, ...) ## S3 method for class 'MIresult' getFixedEf(object, exp = FALSE, ...) ## S3 method for class 'hyperpar.inla' getVarComponent(object, transformation = NULL, ...) ## S3 method for class 'matrix' getVarComponent(object, ...) ## S3 method for class 'geese' coef(object, ...) ## S3 method for class 'tps' coef(object, ...) ## S3 method for class 'geese' predict(object, x, ...) ## S3 method for class 'tps' predict(object, newdata = NULL, type = c("link", "response"), ...) ## S3 method for class 'geese' residuals(object, y, x,...) ## S3 method for class 'geese' vcov(object, ...) ## S3 method for class 'tps' vcov(object, robust, ...) ## S3 method for class 'logistf' vcov(object, ...)
... |
tbd... |
object |
tbdobject |
fit |
tbdfit |
coef.direct |
tbdfit |
robust |
Boolean, whether to return robust variance estimate |
exp |
tbdexp |
remove.leading0 |
tbdexp |
p.adj.method |
tbdexp |
cuts |
tbdfits |
ret.robcov |
tbdfits |
fits |
tbdfits |
type |
tbdtype |
est.digits |
tbdest.digits |
se.digits |
tbdse.digits |
p.digits |
tbdse.digits |
random |
tbdrandom |
VE |
tbdrandom |
transformation |
tbdtransformation |
weights |
tbdv1 |
v1 |
tbdv1 |
v2 |
tbdv2 |
v1.type |
tbdv1.type |
v2.type |
tbdv2.type |
logistic.regression |
tbdlogistic.regression |
newdata |
tbdx |
x |
tbdx |
y |
tbdy |
to.trim |
tbdy |
rows |
tbdy |
risk |
tbdfit |
binary.outcome |
tbdfit |
ngroups |
tbdfit |
main |
tbdfit |
add |
tbdfit |
show.emp.risk |
tbdfit |
lcol |
tbdfit |
ylim |
tbdfit |
scale |
tbdfit |
trunc.large.est |
tbdfit |
scale.factor |
tbdfit |
getFormattedSummary: from a list of fits, say lmer, inla fits, return formatted summary controlled by "type". For a matrix, return Monte Carlo variance random=TRUE returns variance components type=1: est type=2: est (se) type=3: est (2.5 percent, 97.5 percent) type=4: est se
getFixedEf returns a matrix, first column coef, second column se,
getFixedEf.matrix used to get mean and sd from a jags or winbugs sample, getVarComponent.matrix and getFixedEf.matrix do the same thing. Each column of samples is a variable
interaction.table expects coef and vcov to work with fit.
## Annette Dobson (1990) "An Introduction to Generalized Linear Models". ## Page 9: Plant Weight Data. ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14) trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69) group <- gl(2, 10, 20, labels = c("Ctl","Trt")) weight <- c(ctl, trt) lm.D9 <- lm(weight ~ group) glm.D9 <- glm(weight ~ group) getFormattedSummary (list(lm.D9, glm.D9), robust=FALSE)
## Annette Dobson (1990) "An Introduction to Generalized Linear Models". ## Page 9: Plant Weight Data. ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14) trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69) group <- gl(2, 10, 20, labels = c("Ctl","Trt")) weight <- c(ctl, trt) lm.D9 <- lm(weight ~ group) glm.D9 <- glm(weight ~ group) getFormattedSummary (list(lm.D9, glm.D9), robust=FALSE)
ROC/AUC methods.
fastauc
calculates the AUC using a sort operation, instead of summing over pairwise differences in R. computeRoc
computes an ROC curve. plotRoc
plots an ROC curve. addRoc
adds an ROC curve to a plot. classification.error
computes classification error
fastauc (score, outcome, t0 = 0, t1 = 1, reverse.sign.if.nece = TRUE, quiet = FALSE) computeRoc (score, outcome, reverse.sign.if.nece = TRUE, cutpoints = NULL) plotRoc(x, add = FALSE, type = "l", diag.line=TRUE,...) addRoc (x,...) classification.error(score, outcome, threshold=NULL, verbose=FALSE)
fastauc (score, outcome, t0 = 0, t1 = 1, reverse.sign.if.nece = TRUE, quiet = FALSE) computeRoc (score, outcome, reverse.sign.if.nece = TRUE, cutpoints = NULL) plotRoc(x, add = FALSE, type = "l", diag.line=TRUE,...) addRoc (x,...) classification.error(score, outcome, threshold=NULL, verbose=FALSE)
score |
a vector. Linear combination or score. |
outcome |
a vector of 0 and 1. Outcome. |
t0 |
a number between 0 and 1 that is the lower boundary of pAUC |
t1 |
a number between 0 and 1 that is the upper boundary of pAUC |
reverse.sign.if.nece |
a boolean. If TRUE, score is multiplied by -1 if AUC is less than 0.5. |
x |
a list of two elements: sensitivity and specificity. |
diag.line |
boolean. If TRUE, a diagonal line is plotted |
add |
boolean. If TRUE, add to existing plot. If FALSE, create a new plot. |
quiet |
boolean |
cutpoints |
cutpoints |
threshold |
threshold |
verbose |
boolean |
type |
line type for |
... |
arguments passed to |
These functions originally come from Thomas Lumley and Tianxi Cai et al.
computeRoc
returns a list of sensitivity and specificity. plotRoc
and addRoc
plots ROC curves.
Shuxin Yin
Youyi Fong [email protected]
Krisztian Sebestyen
n=1e2 score=c(rnorm(n/2,1), rnorm(n/2,0)) outcome=rep(1:0, each=n/2) # cannot print due to r cmd check #plotRoc(computeRoc(score, outcome)) # commented out b/c slower on pc and cause note when r cmd check ## test, fastauc2 is a version without all the checking #score=rnorm(1e5) #outcome=rbinom(1e5,1,.5) #system.time(for (i in 1:1e2) fastauc(score,outcome)) # 4.9 sec #system.time(for (i in 1:1e2) fastauc2(score,outcome)) # 3.8 sec
n=1e2 score=c(rnorm(n/2,1), rnorm(n/2,0)) outcome=rep(1:0, each=n/2) # cannot print due to r cmd check #plotRoc(computeRoc(score, outcome)) # commented out b/c slower on pc and cause note when r cmd check ## test, fastauc2 is a version without all the checking #score=rnorm(1e5) #outcome=rbinom(1e5,1,.5) #system.time(for (i in 1:1e2) fastauc(score,outcome)) # 4.9 sec #system.time(for (i in 1:1e2) fastauc2(score,outcome)) # 3.8 sec
sim.dat.tvarying.three simulates from a model with time varing age group variale of three levels, sim.dat.tvarying.two two.
sim.dat.tvarying.three(n, followup.length, incidence.density, age.sim = c("tvaryinggroup", "baselinegroup", "continuous","bt"), random.censoring.rate = 0.05, seed) sim.dat.tvarying.two(n, followup.length, incidence.density, age.sim = c("tvaryinggroup", "baselinegroup", "continuous","bt"), random.censoring.rate = 0.05, seed)
sim.dat.tvarying.three(n, followup.length, incidence.density, age.sim = c("tvaryinggroup", "baselinegroup", "continuous","bt"), random.censoring.rate = 0.05, seed) sim.dat.tvarying.two(n, followup.length, incidence.density, age.sim = c("tvaryinggroup", "baselinegroup", "continuous","bt"), random.censoring.rate = 0.05, seed)
n |
integer. Sample size. |
followup.length |
numeric. Length of followup, in years. |
incidence.density |
numeric. Incidence rate per year. |
age.sim |
string. Choose between one of three possibilities. tvaryinggroup: age group is time-varying covariate; baselinegroup: age group is a baseline covariate; continuous: age is a continuous covariate; bt: age group by treatment interaction uses baseline age group, while age group main effect uses time-dependent age group |
random.censoring.rate |
numeric. Amount of random censoring. |
seed |
integer. Random number generator seed. |
In sim.dat.tvarying.three, baseline age is uniformly distributed between 2.0 and 16.0, and divivded into three groups at 6 and 12. In sim.dat.tvarying.two, baseline age is uniformly distributed between 2.0 and 12.0, and divivded into two groups at 6.
Return a data frame with the following columns:
ptid |
subject identifier |
trt |
treatment indicator 0/1 |
for.non.tvarying.ana |
Boolean, used to subset dataset for non-time dependent analysis |
C |
censoring time |
baseline.age |
age years at baseline |
agegrp |
a factor with levels |
baseline.agegrp |
a factor with levels |
tstart |
left bound of time interval |
tstop |
right bound of time interval |
d |
event indicator |
X |
followup time, in years |
Youyi Fong
library(survival) dat=sim.dat.tvarying.three(n=6000,followup.length=3, incidence.density=0.05, age.sim="tvaryinggroup", seed=1) f.tvarying = Surv(tstart,tstop,d) ~ trt*agegrp f = Surv(X,d) ~ trt*baseline.agegrp fits=list() fits[["tvarying"]]=coxph(f.tvarying, dat) fits[["baseline"]]=coxph(f, subset(dat, for.non.tvarying.ana)) fits
library(survival) dat=sim.dat.tvarying.three(n=6000,followup.length=3, incidence.density=0.05, age.sim="tvaryinggroup", seed=1) f.tvarying = Surv(tstart,tstop,d) ~ trt*agegrp f = Surv(X,d) ~ trt*baseline.agegrp fits=list() fits[["tvarying"]]=coxph(f.tvarying, dat) fits[["baseline"]]=coxph(f, subset(dat, for.non.tvarying.ana)) fits
H calculates entropy.
H(p, logbase = c("e", "2")) mutual.info(two.way.table, logbase = c("e", "2")) cor.mixed(x, ...) ## Default S3 method: cor.mixed(x, na.fun, method=c("pearson","spearman"), ...) ## S3 method for class 'vector' cor.mixed(x, y, na.fun, method=c("pearson","spearman"), ...) ## S3 method for class 'formula' cor.mixed(formula, data, na.fun, method=c("pearson","spearman"), ...) skew (x, na.rm = FALSE) info.cor(two.way.table) yule.y(two.by.two.matrix) kappacor(two.by.two.matrix, weight = c(1, 1), maximum = FALSE) l.measure(two.by.two.matrix)
H(p, logbase = c("e", "2")) mutual.info(two.way.table, logbase = c("e", "2")) cor.mixed(x, ...) ## Default S3 method: cor.mixed(x, na.fun, method=c("pearson","spearman"), ...) ## S3 method for class 'vector' cor.mixed(x, y, na.fun, method=c("pearson","spearman"), ...) ## S3 method for class 'formula' cor.mixed(formula, data, na.fun, method=c("pearson","spearman"), ...) skew (x, na.rm = FALSE) info.cor(two.way.table) yule.y(two.by.two.matrix) kappacor(two.by.two.matrix, weight = c(1, 1), maximum = FALSE) l.measure(two.by.two.matrix)
p |
either a count vector or a probability vector, but can not be a vector of membership indicator |
logbase |
tbdlogbase |
na.rm |
tbdlogbase |
two.way.table |
tbdtwo.way.table |
x |
tbdx |
... |
tbd... |
na.fun |
tbdna.fun |
method |
tbdmethod |
y |
tbdy |
formula |
tbdformula |
data |
tbddata |
two.by.two.matrix |
tbdtwo.by.two.matrix |
weight |
tbdweight |
maximum |
tbdmaximum |
H(rep(1/5,5)) H(rep(3,5))
H(rep(1/5,5)) H(rep(3,5))
%+% concatenates its arguments and returns a string.
a %.% b contain(s1, s2) trim (x, trim.trailing=TRUE, trim.leading=TRUE) escapeUnderline(name) fileStem(file.name) firstIndex(s1, s2) getExt(file.name) getFileStem(file.name) getStem(file.name) lastIndex(s1, s2) remove.prefix(s, sep = "_")
a %.% b contain(s1, s2) trim (x, trim.trailing=TRUE, trim.leading=TRUE) escapeUnderline(name) fileStem(file.name) firstIndex(s1, s2) getExt(file.name) getFileStem(file.name) getStem(file.name) lastIndex(s1, s2) remove.prefix(s, sep = "_")
a |
a |
b |
b |
s1 |
s1 |
s2 |
s2 |
name |
name |
file.name |
file.name |
s |
s |
sep |
sep |
x |
sep |
trim.leading |
sep |
trim.trailing |
sep |
x=1 x %.% "b" %.% "c"
x=1 x %.% "b" %.% "c"
Testing functions.
hosmerlem(y, yhat, g = 10) quick.t.test(x, y, var.equal = FALSE) signtest(x) tukey.mtest(mu, ms, n) vector.t.test(mean.x, mean.y, var.x, var.y, n) myfisher.test(x,y,...) mycor.test(x, method = c("pearson", "kendall", "spearman"), idx = NULL)
hosmerlem(y, yhat, g = 10) quick.t.test(x, y, var.equal = FALSE) signtest(x) tukey.mtest(mu, ms, n) vector.t.test(mean.x, mean.y, var.x, var.y, n) myfisher.test(x,y,...) mycor.test(x, method = c("pearson", "kendall", "spearman"), idx = NULL)
... |
tbd |
y |
tbdy |
yhat |
tbdyhat |
g |
tbdg |
x |
tbdx |
var.equal |
tbdvar.equal |
method |
tbdmethod |
mu |
tbdmu |
ms |
tbdms |
n |
tbdn |
mean.x |
tbdmean.x |
mean.y |
tbdmean.y |
var.x |
tbdvar.x |
var.y |
tbdvar.y |
idx |
tbdvar.y |
signtest(runif(10))
signtest(runif(10))
Vaccine efficacy plots.
VEplot (object, ...) ## S3 method for class 'cox.zph' VEplot(object, resid = TRUE, se = TRUE, df = 4, nsmo = 40, var, ylab="VE", xlab="Time", xaxt="s", cex.axis=1, ...) ## S3 method for class 'glm' VEplot(object, X1, X2, x, ...) ## S3 method for class 'cox.zph' myplot(object, resid = TRUE, se = TRUE, df = 4, nsmo = 40, var, coef.transform=NULL, ylab=NULL, xlab="Time", xaxt="s", cex.axis=1, ...)
VEplot (object, ...) ## S3 method for class 'cox.zph' VEplot(object, resid = TRUE, se = TRUE, df = 4, nsmo = 40, var, ylab="VE", xlab="Time", xaxt="s", cex.axis=1, ...) ## S3 method for class 'glm' VEplot(object, X1, X2, x, ...) ## S3 method for class 'cox.zph' myplot(object, resid = TRUE, se = TRUE, df = 4, nsmo = 40, var, coef.transform=NULL, ylab=NULL, xlab="Time", xaxt="s", cex.axis=1, ...)
object |
An object |
resid |
Boolean, whether to plot residuals |
se |
Boolean, whether to plot confidence band |
df |
degrees of freedom |
nsmo |
number of points used to plot the fitted spline |
var |
estimated variance matrix from the Cox model fit |
xlab |
x label |
xaxt |
x axis |
cex.axis |
cex for axis |
ylab |
y label |
coef.transform |
a function to transform Cox hazard ratio estimate |
X1 |
a matrix of dimension k by p, where k is the length of x (see below) and p is the length of coef(object) |
X2 |
a matrix of dimension k by p, where k is the length of x (see below) and p is the length of coef(object) |
x |
a vector of length k that represents the x coordinate of the VE plot |
... |
additional parameters |
VEplot and myplot.cox.zph are extensions of survival::plot.cox.zph to plot VE curve and other transformations.
myplot.cox.zph adds the following parameters to the original list of parameters in plot.cox.zph: coef.transform: a function to transform the coefficients ylab: y axis label xlab: x axis label
VEplot.glm computes a series of k VEs: for i in 1...k, VE[i] = P(Y=1|X1[i,])/P(Y=1|X2[i,]). It returns a 3 by k matrix, whose first row contains VE estimates and the second and third rows contain lower and upper bounds, respectively.
Youyi Fong, Dennis Chao
Durham, Longini, Halloran, Clemens, Azhar and Rao (1998) "Estimation of vaccine efficacy in the presence of waning: application to cholera vaccines." American Journal of Epidemiology 147(10): 948-959.
library(survival) vfit <- coxph(Surv(time,status) ~ trt + factor(celltype) + karno + age, data=veteran, x=TRUE) temp <- cox.zph(vfit) par(mfrow=c(2,2)) for (v in c("trt","age")) { VEplot(temp, var=v, resid=FALSE, main=v, ylab="VE", cex.axis=1.5) plot(temp, var=v, resid=FALSE, main=v) } library(survival) fit <- glm(status ~ trt + trt*age, data=veteran) summary(fit) age=seq(min(veteran$age),max(veteran$age),length=10) out = VEplot(fit, X1=cbind(1,1,age,1*age), X2=cbind(1,0,age,0*age), x=age) out
library(survival) vfit <- coxph(Surv(time,status) ~ trt + factor(celltype) + karno + age, data=veteran, x=TRUE) temp <- cox.zph(vfit) par(mfrow=c(2,2)) for (v in c("trt","age")) { VEplot(temp, var=v, resid=FALSE, main=v, ylab="VE", cex.axis=1.5) plot(temp, var=v, resid=FALSE, main=v) } library(survival) fit <- glm(status ~ trt + trt*age, data=veteran) summary(fit) age=seq(min(veteran$age),max(veteran$age),length=10) out = VEplot(fit, X1=cbind(1,1,age,1*age), X2=cbind(1,0,age,0*age), x=age) out