Title: | Stochastic Process Model for Analysis of Longitudinal and Time-to-Event Outcomes |
---|---|
Description: | Utilities to estimate parameters of the models with survival functions induced by stochastic covariates. Miscellaneous functions for data preparation and simulation are also provided. For more information, see: (i)"Stochastic model for analysis of longitudinal data on aging and mortality" by Yashin A. et al. (2007), Mathematical Biosciences, 208(2), 538-551, <DOI:10.1016/j.mbs.2006.11.006>; (ii) "Health decline, aging and mortality: how are they related?" by Yashin A. et al. (2007), Biogerontology 8(3), 291(302), <DOI:10.1007/s10522-006-9073-3>. |
Authors: | I. Zhbannikov, Liang He, K. Arbeev, I. Akushevich, A. Yashin. |
Maintainer: | Ilya Y. Zhbannikov <[email protected]> |
License: | GPL |
Version: | 1.7.12 |
Built: | 2024-12-10 06:46:19 UTC |
Source: | CRAN |
function loading results in global environment
assign_to_global(pos = 1, what, value)
assign_to_global(pos = 1, what, value)
pos |
defaults to 1 which equals an assingment to global environment |
what |
variable to assign to |
value |
value to assign |
This is the longitudinal genetic dataset.
Liang He
An internal function to compute m and gamma based on continuous-time model (Yashin et. al., 2007)
func1(tt, y, a, f1, Q, f, b, theta)
func1(tt, y, a, f1, Q, f, b, theta)
tt |
tt - time |
y |
y |
a |
a (see Yashin et. al, 2007) |
f1 |
f1 (see Yashin et. al, 2007) |
Q |
Q (see Yashin et. al, 2007) |
f |
f (see Yashin et. al, 2007) |
b |
b (see Yashin et. al, 2007) |
theta |
theta |
list(m, gamma) Next values of m and gamma (see Yashin et. al, 2007)
An internal function to obtain column index by its name
get.column.index(x, col.name)
get.column.index(x, col.name)
x |
Dataset |
col.name |
Column name |
column index(es) in the provided dataset
An internal function to compute next Y based on continous-time model (Yashin et. al., 2007)
getNextY.cont(y1, t1, t2, a, f1, Q, f, b, theta)
getNextY.cont(y1, t1, t2, a, f1, Q, f, b, theta)
y1 |
y1 |
t1 |
t1 |
t2 |
t2 |
a |
a (see Yashin et. al, 2007) |
f1 |
f1 (see Yashin et. al, 2007) |
Q |
Q (see Yashin et. al, 2007) |
f |
f (see Yashin et. al, 2007) |
b |
b (see Yashin et. al, 2007) |
theta |
theta (see Yashin et. al, 2007) |
y.next Next value of Y
An internal function to compute next value of physiological variable Y
getNextY.cont2(y1, t1, t2, b, a, f1)
getNextY.cont2(y1, t1, t2, b, a, f1)
y1 |
y1 |
t1 |
t1 |
t2 |
t2 |
b |
b (see Yashin et. al, 2007) |
a |
a (see Yashin et. al, 2007) |
f1 |
f1 (see Yashin et. al, 2007) |
y.next Next value of y
An internal function to compute the next value of physiological variable Y based on discrete-time model (Akushevich et. al., 2005)
getNextY.discr(y1, u, R, Sigma)
getNextY.discr(y1, u, R, Sigma)
y1 |
y1 |
u |
u (see Akushevich et. al, 2005) |
R |
R (see Akushevich et. al, 2005) |
Sigma |
Sigma (see Akushevich et. al, 2005) |
y.next Next value of y
An internal function to compute next m based on dicrete-time model
getNextY.discr.m(y1, u, R)
getNextY.discr.m(y1, u, R)
y1 |
y1 |
u |
u |
R |
R |
m Next value of m (see Yashin et. al, 2007)
An internal function to compute previous value of physiological variable Y based on discrete-time model
getPrevY.discr(y2, u, R, Sigma)
getPrevY.discr(y2, u, R, Sigma)
y2 |
y2 |
u |
u |
R |
R |
Sigma |
Sigma |
y1 Previous value of y
An internal function to compute previous m based on discrete-time model
getPrevY.discr.m(y2, u, R)
getPrevY.discr.m(y2, u, R)
y2 |
y2 |
u |
u |
R |
R |
m Next value of m (see Yashin et. al, 2007)
This is the longitudinal dataset.
Ilya Y Zhbannikov [email protected]
Likelihood-ratio test
LRTest(LA, L0, df = 1)
LRTest(LA, L0, df = 1)
LA |
Log-likelihood for alternative hyphotesis |
L0 |
Log-likelihood for null hyphotesis |
df |
Degrees of freedom for Chi-square test |
p-value of LR test.
An internal function to compute m from
m(y, t1, t2, a, f1)
m(y, t1, t2, a, f1)
y |
Current value of Y |
t1 |
t1 |
t2 |
t2 |
a |
a (see Yashin et. al, 2007) |
f1 |
f1 (see Yashin et. al, 2007) |
m m (see Yashin et. al, 2007)
An internal function which construct short data format from a given long
make.short.format( x, col.id = 1, col.status = 2, col.t1 = 3, col.t2 = 4, col.cov = 5 )
make.short.format( x, col.id = 1, col.status = 2, col.t1 = 3, col.t2 = 4, col.cov = 5 )
x |
Dataset |
col.id |
Column ID index |
col.status |
Column status index |
col.t1 |
Column t1 index |
col.t2 |
Column t2 index |
col.cov |
Column covariates indices |
column index(es) in the provided dataset
An internal function to compute mu
mu(y, mu0, b, Q, theta, tt)
mu(y, mu0, b, Q, theta, tt)
y |
Current value of y |
mu0 |
mu0 (see Yashin et. al, 2007) |
b |
b (see Yashin et. al, 2007) |
Q |
Q (see Yashin et. al, 2007) |
theta |
theta (see Yashin et. al, 2007) |
tt |
t (time) |
mu Next value of mu
Data pre-processing for analysis with stochastic process model methodology.
prepare_data( x, col.id = NA, col.status = NA, col.age = NA, col.age.event = NA, covariates = NA, interval = 1, verbose = FALSE )
prepare_data( x, col.id = NA, col.status = NA, col.age = NA, col.age.event = NA, covariates = NA, interval = 1, verbose = FALSE )
x |
A path to the file with table of follow-up oservations (longitudinal table). File formats: csv, sas7bdat |
col.id |
A name of column containing subject ID. This ID should be the same in both x (longitudinal) and y (vital statistics) tables. None: if col.id not provided, the first column of the x and first column of the y will be used by default. |
col.status |
A name of the column containing status variable (0/1, which is an indicator of death/censoring). Note: if not provided - then the column #2 from the y (vital statistics) dataset will be used. |
col.age |
A name of age column (also called 't1'). This column represents a time (age) of measurement. If not provided then the 3rd column from the longitudinal dataset (x) will be used. |
col.age.event |
A name of 'event' column. The event column indicates a time when the even occured (e.g. system failure). Note: if not provided then the 3rd column from the y (vital statistics) dataset will be used. |
covariates |
A list of covariates (physiological variables). If covariates not provided, then all columns from longitudinal table having index > 3 will be used as covariates. |
interval |
A number of breaks between observations for data for discrete model. This interval must be integer and should be equal or greater than 1. Default = 1 unit of time. |
verbose |
A verbosing output indicator. Default=FALSE. |
A list of two elements: first element contains a preprocessed data for continuous model, with arbitrary intervals between observations and second element contains a prepocessed data table for a discrete model (with constant intervals between observations).
## Not run: library(stpm) data <- prepare_data(x=system.file("extdata","longdat.csv",package="stpm")) head(data[[1]]) head(data[[2]]) ## End(Not run)
## Not run: library(stpm) data <- prepare_data(x=system.file("extdata","longdat.csv",package="stpm")) head(data[[1]]) head(data[[2]]) ## End(Not run)
Prepares continuouts-time dataset.
prepare_data_cont( merged.data, col.status.ind, col.id.ind, col.age.ind, col.age.event.ind, col.covar.ind, verbose, dt )
prepare_data_cont( merged.data, col.status.ind, col.id.ind, col.age.ind, col.age.event.ind, col.covar.ind, verbose, dt )
merged.data |
a longitudinal study dataset. |
col.status.ind |
index of "status" column. |
col.id.ind |
subject id column index. |
col.age.ind |
index of the age column. |
col.age.event.ind |
an index of the column which represents the time in which event occured. |
col.covar.ind |
a set of column indexes which represent covariates. |
verbose |
turns on/off verbosing output. |
dt |
interval between observations. |
Prepares discrete-time dataset.
prepare_data_discr( merged.data, interval, col.status.ind, col.id.ind, col.age.ind, col.age.event.ind, col.covar.ind, verbose )
prepare_data_discr( merged.data, interval, col.status.ind, col.id.ind, col.age.ind, col.age.event.ind, col.covar.ind, verbose )
merged.data |
a longitudinal study dataset. |
interval |
interval between observations. |
col.status.ind |
index of "status" column. |
col.id.ind |
subject id column index. |
col.age.ind |
index of the age column. |
col.age.event.ind |
an index of the column which represents the time in which event occured. |
col.covar.ind |
a set of column indexes which represent covariates. |
verbose |
turns on/off verbosing output. Filling the last cell |
An internal function to compute sigma square analytically
sigma_sq(t1, t2, b)
sigma_sq(t1, t2, b)
t1 |
t1 |
t2 |
t2 |
b |
b (see Yashin et. al, 2007) |
sigma_square (see Akushevich et. al, 2005)
Multi-dimension simulation function for data with partially observed covariates (multidimensional GenSPM) with arbitrary intervals
sim_pobs( N = 10, aH = -0.05, aL = -0.01, f1H = 60, f1L = 80, QH = 2e-08, QL = 2.5e-08, fH = 60, fL = 80, bH = 4, bL = 5, mu0H = 8e-06, mu0L = 1e-05, thetaH = 0.08, thetaL = 0.1, p = 0.25, ystart = 80, tstart = 30, tend = 105, dt = 1, sd0 = 1, mode = "observed", gomp = FALSE, nobs = NULL )
sim_pobs( N = 10, aH = -0.05, aL = -0.01, f1H = 60, f1L = 80, QH = 2e-08, QL = 2.5e-08, fH = 60, fL = 80, bH = 4, bL = 5, mu0H = 8e-06, mu0L = 1e-05, thetaH = 0.08, thetaL = 0.1, p = 0.25, ystart = 80, tstart = 30, tend = 105, dt = 1, sd0 = 1, mode = "observed", gomp = FALSE, nobs = NULL )
N |
Number of individuals. |
aH |
A k by k matrix, which characterize the rate of the adaptive response when Z = 1. |
aL |
A k by k matrix, which characterize the rate of the adaptive response when Z = 0. |
f1H |
A particular state, which if a deviation from the normal (or optimal) when Z = 1. This is a vector with length of k. |
f1L |
A particular state, which if a deviation from the normal (or optimal) when Z = 0. This is a vector with length of k. |
QH |
A matrix k by k, which is a non-negative-definite symmetric matrix when Z = 1. |
QL |
A matrix k by k, which is a non-negative-definite symmetric matrix when Z = 0. |
fH |
A vector-function (with length k) of the normal (or optimal) state when Z = 1. |
fL |
A vector-function (with length k) of the normal (or optimal) state when Z = 0. |
bH |
A diffusion coefficient, k by k matrix when Z = 1. |
bL |
A diffusion coefficient, k by k matrix when Z = 0. |
mu0H |
mortality at start period of time when Z = 1. |
mu0L |
mortality at start period of time when Z = 0. |
thetaH |
A displacement coefficient of the Gompertz function when Z = 1. |
thetaL |
A displacement coefficient of the Gompertz function when Z = 0. |
p |
A proportion of carriers in a sumulated population (default p = 0.25). |
ystart |
A vector with length equal to number of dimensions used, defines starting values of covariates. |
tstart |
A number that defines starting time (30 by default). |
tend |
A number, defines final time (105 by default). |
dt |
A discrete step size between two observations. A random uniform value is then added to this step size. |
sd0 |
A standard deviation for modelling the next physiological variable (covariate) value. |
mode |
Can have the following values: "observed" (default), "unobserved". This represents a type of group to simulate: a group with observed variable Z, or group with unbobserved variable Z. |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 and Q are used: mu0 = mu0*exp(theta*t). |
nobs |
A number of observations (lines) for individual observations. |
A table with simulated data.
Arbeev, K.G. et al (2009). Genetic model for longitudinal studies of aging, health, and longevity
Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.
library(stpm) dat <- sim_pobs(N=50) head(dat)
library(stpm) dat <- sim_pobs(N=50) head(dat)
Multi-dimensional simulation function for continuous-time SPM.
simdata_cont( N = 10, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 1e-05, theta = 0.08, ystart = 80, tstart = 30, tend = 105, dt = 1, sd0 = 1, nobs = NULL, gomp = TRUE, format = "long" )
simdata_cont( N = 10, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 1e-05, theta = 0.08, ystart = 80, tstart = 30, tend = 105, dt = 1, sd0 = 1, nobs = NULL, gomp = TRUE, format = "long" )
N |
Number of individuals. |
a |
A k by k matrix, represents the adaptive capacity of the organism |
f1 |
A trajectory that corresponds to the long-term average value of the stochastic process Y(t), which describes a trajectory of individual covariate (physiological variable) influenced by different factors represented by a random Wiener process W(t). This is a vector with length of k. |
Q |
A matrix k by k, which is a non-negative-definite symmetric matrix, represents a sensitivity of risk function to deviation from the norm. |
f |
A vector with length of k, represents the normal (or optimal) state of physiological variable. |
b |
A diffusion coefficient, k by k matrix, characterizes a strength of the random disturbances from Wiener process W(t). |
mu0 |
A baseline mortality. |
theta |
A displacement coefficient. |
ystart |
A vector with length equal of k, defines starting values of covariates. |
tstart |
A number that defines starting time (30 by default). |
tend |
A number, defines final time (105 by default). |
dt |
A discrete step size between two observations. A random uniform value is then added to this step size. |
sd0 |
a standard deviation for modelling the next covariate value. |
nobs |
A number of observations (lines) for individual observations. |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 and Q are used: mu0 = mu0*exp(theta*t). |
format |
Data format: "long" (default), "short". |
A table with simulated data.
Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.
library(stpm) dat <- simdata_cont(N=50) head(dat)
library(stpm) dat <- simdata_cont(N=50) head(dat)
Multi-dimension simulation function
simdata_discr( N = 100, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 1e-05, theta = 0.08, ystart = 80, tstart = 30, tend = 105, dt = 1, nobs = NULL, format = "long" )
simdata_discr( N = 100, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 1e-05, theta = 0.08, ystart = 80, tstart = 30, tend = 105, dt = 1, nobs = NULL, format = "long" )
N |
Number of individuals |
a |
A k by k matrix, which characterize the rate of the adaptive response. |
f1 |
A particular state, which is a deviation from the normal (or optimal). This is a vector with length of k. |
Q |
A matrix k by k, which is a non-negative-definite symmetric matrix. |
f |
A vector-function (with length k) of the normal (or optimal) state. |
b |
A diffusion coefficient, k by k matrix. |
mu0 |
mortality at start period of time. |
theta |
A displacement coefficient of the Gompertz function. |
ystart |
A vector with length equal to number of dimensions used, defines starting values of covariates. Default ystart = 80. |
tstart |
Starting time (age). Can be a number (30 by default) or a vector of two numbers: c(a, b) - in this case, starting value of time is simulated via uniform(a,b) distribution. |
tend |
A number, defines final time (105 by default). |
dt |
A time step (1 by default). |
nobs |
A number, defines a number of observations (lines) for an individual, NULL by default. |
format |
Data format: "long" (default), "short". |
A table with simulated data.
Akushevich I., Kulminski A. and Manton K. (2005), Life tables with covariates: Dynamic model for Nonlinear Analysis of Longitudinal Data. Mathematical Population Studies, 12(2), pp.: 51-80. <DOI:10.1080/08898480590932296>.
library(stpm) data <- simdata_discr(N=100) head(data)
library(stpm) data <- simdata_discr(N=100) head(data)
This script simulates data using familial frailty model. We use the following variation: gamma(mu, ssq), where mu is the mean and ssq is sigma square. See: https://www.rocscience.com/help/swedge/webhelp/swedge/Gamma_Distribution.htm
simdata_gamma_frailty( N = 10, f = list(at = "-0.05", f1t = "80", Qt = "2e-8", ft = "80", bt = "5", mu0t = "1e-3"), step = 1, tstart = 30, tend = 105, ystart = 80, sd0 = 1, nobs = NULL, gamma_mu = 1, gamma_ssq = 0.5 )
simdata_gamma_frailty( N = 10, f = list(at = "-0.05", f1t = "80", Qt = "2e-8", ft = "80", bt = "5", mu0t = "1e-3"), step = 1, tstart = 30, tend = 105, ystart = 80, sd0 = 1, nobs = NULL, gamma_mu = 1, gamma_ssq = 0.5 )
N |
Number of individuals. |
f |
a list of formulas that define age (time) - dependency. Default: list(at="a", f1t="f1", Qt="Q*exp(theta*t)", ft="f", bt="b", mu0t="mu0*exp(theta*t)") |
step |
An interval between two observations, a random uniformally-distributed value is then added to this step. |
tstart |
Starting time (age). Can be a number (30 by default) or a vector of two numbers: c(a, b) - in this case, starting value of time is simulated via uniform(a,b) distribution. |
tend |
A number, defines final time (105 by default). |
ystart |
A starting value of covariates. |
sd0 |
A standard deviation for modelling the next covariate value, sd0 = 1 by default. |
nobs |
A number of observations (lines) for individual observations. |
gamma_mu |
A parameter which is a mean value, default = 1 |
gamma_ssq |
A sigma squared, default = 0.5. |
A table with simulated data.
Yashin, A. et al (2007), Health decline, aging and mortality: how are they related? Biogerontology, 8(3), 291-302.<DOI:10.1007/s10522-006-9073-3>.
library(stpm) dat <- simdata_gamma_frailty(N=10) head(dat)
library(stpm) dat <- simdata_gamma_frailty(N=10) head(dat)
Simulation function for continuous trait with time-dependant coefficients.
simdata_time_dep( N = 10, f = list(at = "-0.05", f1t = "80", Qt = "2e-8", ft = "80", bt = "5", mu0t = "1e-3"), step = 1, tstart = 30, tend = 105, ystart = 80, sd0 = 1, nobs = NULL, format = "short" )
simdata_time_dep( N = 10, f = list(at = "-0.05", f1t = "80", Qt = "2e-8", ft = "80", bt = "5", mu0t = "1e-3"), step = 1, tstart = 30, tend = 105, ystart = 80, sd0 = 1, nobs = NULL, format = "short" )
N |
Number of individuals. |
f |
a list of formulas that define age (time) - dependency. Default: list(at="a", f1t="f1", Qt="Q*exp(theta*t)", ft="f", bt="b", mu0t="mu0*exp(theta*t)") |
step |
An interval between two observations, a random uniformally-distributed value is then added to this step. |
tstart |
Starting time (age). Can be a number (30 by default) or a vector of two numbers: c(a, b) - in this case, starting value of time is simulated via uniform(a,b) distribution. |
tend |
A number, defines final time (105 by default). |
ystart |
A starting value of covariates. |
sd0 |
A standard deviation for modelling the next covariate value, sd0 = 1 by default. |
nobs |
A number of observations (lines) for individual observations. |
format |
Data format: "short" (default), "long". |
A table with simulated data.
Yashin, A. et al (2007), Health decline, aging and mortality: how are they related? Biogerontology, 8(3), 291-302.<DOI:10.1007/s10522-006-9073-3>.
library(stpm) dat <- simdata_time_dep(N=100) head(dat)
library(stpm) dat <- simdata_time_dep(N=100) head(dat)
A central function that estimates Stochastic Process Model parameters a from given dataset.
spm( x, model = "discrete", formulas = list(at = "a", f1t = "f1", Qt = "Q", ft = "f", bt = "b", mu0t = "mu0"), start = NULL, tol = NULL, stopifbound = FALSE, lb = NULL, ub = NULL, pinv.tol = 0.01, theta.range = seq(0.01, 0.2, by = 0.001), verbose = FALSE, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08) )
spm( x, model = "discrete", formulas = list(at = "a", f1t = "f1", Qt = "Q", ft = "f", bt = "b", mu0t = "mu0"), start = NULL, tol = NULL, stopifbound = FALSE, lb = NULL, ub = NULL, pinv.tol = 0.01, theta.range = seq(0.01, 0.2, by = 0.001), verbose = FALSE, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08) )
x |
A dataset: is the output from prepare_data(...) function and consists of two separate data tables: (1) a data table for continuous-time model and (2) a data table for discrete-time model. |
model |
A model type. Choices are: "discrete", "continuous" or "time-dependent". |
formulas |
A list of parameter formulas used in the "time-dependent" model.
Default: |
start |
A starting values of coefficients in the "time-dependent" model. |
tol |
A tolerance threshold for matrix inversion (NULL by default). |
stopifbound |
A flag (default=FALSE) if it is set then the optimization stops when any of the parametrs achives lower or upper boundary. |
lb |
Lower boundary, default |
ub |
Upper boundary, default |
pinv.tol |
A tolerance threshold for matrix pseudo-inverse. Default: 0.01. |
theta.range |
A user-defined range of the parameter |
verbose |
A verbosing output indicator (FALSE by default). |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 and Q are used: mu0 = mu0*exp(theta*t), Q = Q*exp(theta*t). |
opts |
A list of options for |
For "discrete" (dmodel) and "continuous" (cmodel) model types: (1) a list of model parameter estimates for the discrete model type described in "Life tables with covariates: Dynamic Model for Nonlinear Analysis of Longitudinal Data", Akushevich et al, 2005.<DOI:10.1080/08898480590932296>, and (2) a list of model parameter estimates for the continuous model type described in "Stochastic model for analysis of longitudinal data on aging and mortality", Yashin et al, 2007, Math Biosci.<DOI:10.1016/j.mbs.2006.11.006>.
For the "time-dependent" model (model parameters depend on time): a set of model parameter estimates.
Yashin, A. et al (2007), Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.
Akushevich I., Kulminski A. and Manton K. (2005). Life tables with covariates: Dynamic model for Nonlinear Analysis of Longitudinal Data. Mathematical Popu-lation Studies, 12(2), pp.: 51-80. <DOI: 10.1080/08898480590932296>.
Yashin, A. et al (2007), Health decline, aging and mortality: how are they related? Biogerontology, 8(3), 291-302.<DOI:10.1007/s10522-006-9073-3>.
## Not run: library(stpm) data.continuous <- simdata_cont(N=1000) data.discrete <- simdata_discr(N=1000) data <- list(data.continuous, data.discrete) p.discr.model <- spm(data) p.discr.model p.cont.model <- spm(data, model="continuous") p.cont.model p.td.model <- spm(data, model="time-dependent",f=list(at="aa*t+bb", f1t="f1", Qt="Q", ft="f", bt="b", mu0t="mu0"), start=list(a=-0.001, bb=0.05, f1=80, Q=2e-8, f=80, b=5, mu0=1e-3)) p.td.model ## End(Not run)
## Not run: library(stpm) data.continuous <- simdata_cont(N=1000) data.discrete <- simdata_discr(N=1000) data <- list(data.continuous, data.discrete) p.discr.model <- spm(data) p.discr.model p.cont.model <- spm(data, model="continuous") p.cont.model p.td.model <- spm(data, model="time-dependent",f=list(at="aa*t+bb", f1t="f1", Qt="Q", ft="f", bt="b", mu0t="mu0"), start=list(a=-0.001, bb=0.05, f1=80, Q=2e-8, f=80, b=5, mu0=1e-3)) p.td.model ## End(Not run)
This function implements a analytical solution to estimate the parameters in the continuous SPM model by assuming all the parameters are constants.
spm_con_1d( spm_data, a = NA, b = NA, q = NA, f = NA, f1 = NA, mu0 = NA, theta = NA, lower = c(), upper = c(), control = list(xtol_rel = 1e-06), global = FALSE, verbose = TRUE, ahessian = FALSE )
spm_con_1d( spm_data, a = NA, b = NA, q = NA, f = NA, f1 = NA, mu0 = NA, theta = NA, lower = c(), upper = c(), control = list(xtol_rel = 1e-06), global = FALSE, verbose = TRUE, ahessian = FALSE )
spm_data |
A dataset for the SPM model. See the STPM package for more details about the format. |
a |
The initial value for the paramter |
b |
The initial value for the paramter |
q |
The initial value for the paramter |
f |
The initial value for the paramter |
f1 |
The initial value for the paramter |
mu0 |
The initial value for the paramter |
theta |
The initial value for the paramter |
lower |
A vector of the lower bound of the parameters. |
upper |
A vector of the upper bound of the parameters. |
control |
A list of the control parameters for the optimization paramters. |
global |
A logical variable indicating whether the MLSL (TRUE) or the L-BFGS (FALSE) algorithm is used for the optimization. |
verbose |
A logical variable indicating whether initial information is printed. |
ahessian |
A logical variable indicating whether the approximate (FALSE) or analytical (TRUE) Hessian is returned. |
est The estimates of the parameters.
hessian The Hessian matrix of the estimates.
lik The minus log-likelihood.
con A number indicating the convergence. See the 'nloptr' package for more details.
message Extra message about the convergence. See the 'nloptr' package for more details.
He, L., Zhbannikov, I., Arbeev, K. G., Yashin, A. I., and Kulminski, A.M., 2017. Genetic stochastic process model for detecting pleiotropic and interaction effects with longitudinal data.
{ library(stpm) dat <- simdata_cont(N=500) colnames(dat) <- c("id", "xi", "t1", "t2", "y", "y.next") res <- spm_con_1d(as.data.frame(dat), a=-0.05, b=2, q=1e-8, f=80, f1=90, mu0=1e-3, theta=0.08) }
{ library(stpm) dat <- simdata_cont(N=500) colnames(dat) <- c("id", "xi", "t1", "t2", "y", "y.next") res <- spm_con_1d(as.data.frame(dat), a=-0.05, b=2, q=1e-8, f=80, f1=90, mu0=1e-3, theta=0.08) }
This function implements a continuous genetic SPM model by assuming all the parameters are constants.
spm_con_1d_g( spm_data, gene_data, a = NA, b = NA, q = NA, f = NA, f1 = NA, mu0 = NA, theta = NA, effect = c("a"), lower = c(), upper = c(), control = list(xtol_rel = 1e-06), global = FALSE, verbose = TRUE, ahessian = FALSE, method = "lbfgs", method.hessian = "L-BFGS-B" )
spm_con_1d_g( spm_data, gene_data, a = NA, b = NA, q = NA, f = NA, f1 = NA, mu0 = NA, theta = NA, effect = c("a"), lower = c(), upper = c(), control = list(xtol_rel = 1e-06), global = FALSE, verbose = TRUE, ahessian = FALSE, method = "lbfgs", method.hessian = "L-BFGS-B" )
spm_data |
A dataset for the SPM model. See the STPM pacakge for more details about the format. |
gene_data |
A two column dataset containing the genotypes for the individuals in spm_data.
The first column |
a |
The initial value for the paramter |
b |
The initial value for the paramter |
q |
The initial value for the paramter |
f |
The initial value for the paramter |
f1 |
The initial value for the paramter |
mu0 |
The initial value for the paramter |
theta |
The initial value for the paramter |
effect |
A character vector of the parameters that are linked to genotypes.
The vector can contain any combination of |
lower |
A vector of the lower bound of the parameters. |
upper |
A vector of the upper bound of the parameters. |
control |
A list of the control parameters for the optimization paramters. |
global |
A logical variable indicating whether the MLSL (TRUE) or the L-BFGS (FALSE) algorithm is used for the optimization. |
verbose |
A logical variable indicating whether initial information is printed. |
ahessian |
A logical variable indicating whether the approximate (FALSE) or analytical (TRUE) Hessian is returned. |
method |
Optimization method.
Can be one of the following: lbfgs, mlsl, mma, slsqp, tnewton, varmetric.
Default: |
method.hessian |
Optimization method for hessian calculation (if ahessian=F).
Default: |
est The estimates of the parameters.
hessian The Hessian matrix of the estimates.
hessian The Hessian matrix of the estimates.
lik The minus log-likelihood.
con A number indicating the convergence. See the 'nloptr' package for more details.
message Extra message about the convergence. See the 'nloptr' package for more details.
beta The coefficients of the genetic effect on the parameters to be linked to genotypes.
He, L., Zhbannikov, I., Arbeev, K. G., Yashin, A. I., and Kulminski, A.M., 2017. Genetic stochastic process model for detecting pleiotropic and interaction effects with longitudinal data.
## Not run: library(stpm) data(ex_spmcon1dg) res <- spm_con_1d_g(ex_data$spm_data, ex_data$gene_data, a = -0.02, b=0.2, q=0.01, f=3, f1=3, mu0=0.01, theta=1e-05, upper=c(-0.01,3,0.1,10,10,0.1,1e-05), lower=c(-1,0.01,0.00001,1,1,0.001,1e-05), effect=c('q')) ## End(Not run)
## Not run: library(stpm) data(ex_spmcon1dg) res <- spm_con_1d_g(ex_data$spm_data, ex_data$gene_data, a = -0.02, b=0.2, q=0.01, f=3, f1=3, mu0=0.01, theta=1e-05, upper=c(-0.01,3,0.1,10,10,0.1,1e-05), lower=c(-1,0.01,0.00001,1,1,0.001,1e-05), effect=c('q')) ## End(Not run)
Continuous multi-dimensional optimization with linear terms in mu only
spm_cont_lin( dat, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 2e-05, theta = 0.08, stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, pinv.tol = 0.01, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08) )
spm_cont_lin( dat, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 2e-05, theta = 0.08, stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, pinv.tol = 0.01, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08) )
dat |
A data table. |
a |
A starting value of the rate of adaptive response to any deviation of Y from f1(t). |
f1 |
A starting value of the average age trajectories of the variables which process is forced to follow. |
Q |
Starting values of the linear hazard term. |
f |
A starting value of the "optimal" value of variable which corresponds to the minimum of hazard rate at a respective time. |
b |
A starting value of a diffusion coefficient representing a strength of the random disturbance from Wiener Process. |
mu0 |
A starting value of the baseline hazard. |
theta |
A starting value of the parameter theta (axe displacement of Gompertz function). |
stopifbound |
Estimation stops if at least one parameter achieves lower or upper boundaries. #'Check the NLopt website for a description of the algorithms. Default: NLOPT_LN_NELDERMEAD |
lb |
Lower bound of parameters under estimation. |
ub |
Upper bound of parameters under estimation. The program stops when the number of function evaluations exceeds maxeval. Default: 500. |
verbose |
An indicator of verbosing output. |
pinv.tol |
A tolerance value for pseudo-inverse of matrix gamma (see Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.) |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 is used: mu0 = mu0*exp(theta*t). |
opts |
A list of options for |
spm_continuous
runs much slower that discrete but more precise and can handle time
intervals with different lengths.
A set of estimated parameters a, f1, Q, f, b, mu0, theta and
additional variable limit
which indicates if any parameter
achieved lower or upper boundary conditions (FALSE by default).
status Optimization status (see documentation for nloptr package).
LogLik A logarithm likelihood.
objective A value of objective function (given by nloptr).
message A message given by nloptr optimization function (see documentation for nloptr package).
Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.
library(stpm) set.seed(123) #Reading the data: data <- simdata_cont(N=2) head(data) #Parameters estimation: pars <- spm_cont_lin(dat=data,a=-0.05, f1=80, Q=2e-8, f=80, b=5, mu0=2e-5) pars
library(stpm) set.seed(123) #Reading the data: data <- simdata_cont(N=2) head(data) #Parameters estimation: pars <- spm_cont_lin(dat=data,a=-0.05, f1=80, Q=2e-8, f=80, b=5, mu0=2e-5) pars
Continuous multi-dimensional optimization with quadratic and linear terms
spm_cont_quad_lin( dat, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 2e-05, theta = 0.08, Q1 = 1e-08, stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, pinv.tol = 0.01, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08) )
spm_cont_quad_lin( dat, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 2e-05, theta = 0.08, Q1 = 1e-08, stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, pinv.tol = 0.01, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08) )
dat |
A data table. |
a |
A starting value of the rate of adaptive response to any deviation of Y from f1(t). |
f1 |
A starting value of the average age trajectories of the variables which process is forced to follow. |
Q |
Starting values of the quadratic hazard term. |
f |
A starting value of the "optimal" value of variable which corresponds to the minimum of hazard rate at a respective time. |
b |
A starting value of a diffusion coefficient representing a strength of the random disturbance from Wiener Process. |
mu0 |
A starting value of the baseline hazard. |
theta |
A starting value of the parameter theta (axe displacement of Gompertz function). |
Q1 |
Q for linear term |
stopifbound |
Estimation stops if at least one parameter achieves lower or upper boundaries. #'Check the NLopt website for a description of the algorithms. Default: NLOPT_LN_NELDERMEAD |
lb |
Lower bound of parameters under estimation. |
ub |
Upper bound of parameters under estimation. The program stops when the number of function evaluations exceeds maxeval. Default: 500. |
verbose |
An indicator of verbosing output. |
pinv.tol |
A tolerance value for pseudo-inverse of matrix gamma (see Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.) |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 is used: mu0 = mu0*exp(theta*t). |
opts |
A list of options for |
spm_continuous
runs much slower that discrete but more precise and can handle time
intervals with different lengths.
A set of estimated parameters a, f1, Q, f, b, mu0, theta and
additional variable limit
which indicates if any parameter
achieved lower or upper boundary conditions (FALSE by default).
status Optimization status (see documentation for nloptr package).
LogLik A logarithm likelihood.
objective A value of objective function (given by nloptr).
message A message given by nloptr optimization function (see documentation for nloptr package).
Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.
library(stpm) set.seed(123) #Reading the data: data <- simdata_cont(N=2) head(data) #Parameters estimation: pars <- spm_cont_quad_lin(dat=data,a=-0.05, f1=80, Q=2e-8, f=80, b=5, mu0=2e-5, Q1=1e-08) pars
library(stpm) set.seed(123) #Reading the data: data <- simdata_cont(N=2) head(data) #Parameters estimation: pars <- spm_cont_quad_lin(dat=data,a=-0.05, f1=80, Q=2e-8, f=80, b=5, mu0=2e-5, Q1=1e-08) pars
Continuous multi-dimensional optimization
spm_continuous( dat, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 2e-05, theta = 0.08, stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, pinv.tol = 0.01, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08), logmu0 = FALSE )
spm_continuous( dat, a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 2e-05, theta = 0.08, stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, pinv.tol = 0.01, gomp = FALSE, opts = list(algorithm = "NLOPT_LN_NELDERMEAD", maxeval = 100, ftol_rel = 1e-08), logmu0 = FALSE )
dat |
A data table. |
a |
A starting value of the rate of adaptive response to any deviation of Y from f1(t). |
f1 |
A starting value of the average age trajectories of the variables which process is forced to follow. |
Q |
Starting values of the quadratic hazard term. |
f |
A starting value of the "optimal" value of variable which corresponds to the minimum of hazard rate at a respective time. |
b |
A starting value of a diffusion coefficient representing a strength of the random disturbance from Wiener Process. |
mu0 |
A starting value of the baseline hazard. |
theta |
A starting value of the parameter theta (axe displacement of Gompertz function). |
stopifbound |
Estimation stops if at least one parameter achieves lower or upper boundaries. #'Check the NLopt website for a description of the algorithms. Default: NLOPT_LN_NELDERMEAD |
lb |
Lower bound of parameters under estimation. |
ub |
Upper bound of parameters under estimation. The program stops when the number of function evaluations exceeds maxeval. Default: 500. |
verbose |
An indicator of verbosing output. |
pinv.tol |
A tolerance value for pseudo-inverse of matrix gamma (see Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.) |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 is used: mu0 = mu0*exp(theta*t). |
opts |
A list of options for |
logmu0 |
Natural logarith of baseline mortality.
Default: |
spm_continuous
runs much slower that discrete but more precise and can handle time
intervals with different lengths.
A set of estimated parameters a, f1, Q, f, b, mu0, theta and
additional variable limit
which indicates if any parameter
achieved lower or upper boundary conditions (FALSE by default).
status Optimization status (see documentation for nloptr package).
LogLik A logarithm likelihood.
objective A value of objective function (given by nloptr).
message A message given by nloptr optimization function (see documentation for nloptr package).
Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.
library(stpm) set.seed(123) #Reading the data: data <- simdata_cont(N=2) head(data) #Parameters estimation: pars <- spm_continuous(dat=data,a=-0.05, f1=80, Q=2e-8, f=80, b=5, mu0=2e-5) pars
library(stpm) set.seed(123) #Reading the data: data <- simdata_cont(N=2) head(data) #Parameters estimation: pars <- spm_continuous(dat=data,a=-0.05, f1=80, Q=2e-8, f=80, b=5, mu0=2e-5) pars
Discrete multi-dimensional optimization
spm_discrete( dat, theta_range = seq(0.02, 0.2, by = 0.001), tol = NULL, verbose = FALSE )
spm_discrete( dat, theta_range = seq(0.02, 0.2, by = 0.001), tol = NULL, verbose = FALSE )
dat |
A data table. |
theta_range |
A range of |
tol |
A tolerance threshold for matrix inversion (NULL by default). |
verbose |
An indicator of verbosing output. |
This function is way more faster that continuous spm_continuous_MD(...)
(but less precise) and used mainly in
estimation a starting point for the spm_continuous_MD(...)
.
A list of two elements ("dmodel", "cmodel"): (1) estimated parameters u, R, b, Sigma, Q, mu0, theta for discrete-time model and (2) estimated parameters a, f1, Q, f, b, mu0, theta for continuous-time model. Note: b and mu0 from first list are different from b and mu0 from the second list.
Akushevich I., Kulminski A. and Manton K. (2005), Life tables with covariates: Dynamic model for Nonlinear Analysis of Longitudinal Data. Mathematical Population Studies, 12(2), pp.: 51-80. <DOI:10.1080/08898480590932296>.
library(stpm) data <- simdata_discr(N=10) #Parameters estimation pars <- spm_discrete(data) pars
library(stpm) data <- simdata_discr(N=10) #Parameters estimation pars <- spm_discrete(data) pars
Continuous-time multi-dimensional optimization for SPM with partially observed covariates (multidimensional GenSPM)
spm_pobs( x = NULL, y = NULL, aH = -0.05, aL = -0.01, f1H = 60, f1L = 80, QH = 2e-08, QL = 2.5e-08, fH = 60, fL = 80, bH = 4, bL = 5, mu0H = 8e-06, mu0L = 1e-05, thetaH = 0.08, thetaL = 0.1, p = 0.25, stopifbound = FALSE, algorithm = "NLOPT_LN_NELDERMEAD", lb = NULL, ub = NULL, maxeval = 500, verbose = FALSE, pinv.tol = 0.01, mode = "observed", gomp = TRUE, ftol_rel = 1e-06 )
spm_pobs( x = NULL, y = NULL, aH = -0.05, aL = -0.01, f1H = 60, f1L = 80, QH = 2e-08, QL = 2.5e-08, fH = 60, fL = 80, bH = 4, bL = 5, mu0H = 8e-06, mu0L = 1e-05, thetaH = 0.08, thetaL = 0.1, p = 0.25, stopifbound = FALSE, algorithm = "NLOPT_LN_NELDERMEAD", lb = NULL, ub = NULL, maxeval = 500, verbose = FALSE, pinv.tol = 0.01, mode = "observed", gomp = TRUE, ftol_rel = 1e-06 )
x |
A data table with genetic component. |
y |
A data table without genetic component. |
aH |
A k by k matrix. Characterizes the rate of the adaptive response for Z = 1. |
aL |
A k by k matrix. Characterize the rate of the adaptive response for Z = 0. |
f1H |
A deviation from the norm (or optimal) state for Z = 1. This is a vector of length k. |
f1L |
A deviation from the norm (or optimal) for Z = 0. This is a vector of length k. |
QH |
A matrix k by k, which is a non-negative-definite symmetric matrix for Z = 1. |
QL |
A matrix k by k, which is a non-negative-definite symmetric matrix for Z = 0. |
fH |
A vector with length of k. Represents the normal (or optimal) state for Z = 1. |
fL |
A vector with length of k. Represents the normal (or optimal) state for Z = 0. |
bH |
A diffusion coefficient, k by k matrix for Z = 1. |
bL |
A diffusion coefficient, k by k matrix for Z = 0. |
mu0H |
A baseline mortality for Z = 1. |
mu0L |
A baseline mortality for Z = 0. |
thetaH |
A displacement coefficient for Z = 1. |
thetaL |
A displacement coefficient for Z = 0. |
p |
a hyphotetical percentage of presence of partially observed covariate in a population (default p=0.25). |
stopifbound |
If TRUE then estimation stops if at least one parameter achieves lower or upper boundaries. |
algorithm |
An optimization algorithm used, can be one of those provided by |
lb |
Lower bound of parameter values. |
ub |
Upper bound of parameter values. |
maxeval |
Maximum number of iterations of the algorithm for |
verbose |
An indicator of verbosing output (FALSE by default). |
pinv.tol |
A tolerance value for pseudo-inverse of matrix gamma (see Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.) |
mode |
Can be one of the following: "observed" (default), "unobserved" or "combined". mode = "observed" represents analysing only dataset with observed variable Z. mode = "unobserved" represents analysing only dataset of unobserved variable Z. mode = "combined" denoted joint analysis of both observed and unobserved datasets. |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 is used: mu0 = mu0*exp(theta*t). |
ftol_rel |
Relative tolerance threshold for likelihood function (defalult: 1e-6), see http://ab-initio.mit.edu/wiki/index.php/NLopt_Reference |
A set of estimated parameters aH, aL, f1H, f1H, QH, QL, fH, fL, bH, bL, mu0H, mu0L, thetaH, thetaL, p and
additional variable limit
which indicates if any parameter
achieved lower or upper boundary conditions (FALSE by default).
Arbeev, K.G. et al (2009). Genetic model for longitudinal studies of aging, health, and longevity
Yashin, A.I. et al (2007). Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.<DOI:10.1016/j.mbs.2006.11.006>.
## Not run: library(stpm) #Reading the data: data <- sim_pobs(N=1000) head(data) #Parameters estimation: pars <- spm_pobs(x=data) pars ## End(Not run)
## Not run: library(stpm) #Reading the data: data <- sim_pobs(N=1000) head(data) #Parameters estimation: pars <- spm_pobs(x=data) pars ## End(Not run)
A data projection with previously estimated or user-defined parameters. Projections are constructed for a cohort with fixed or normally distributed initial covariates.
spm_projection( x, N = 100, ystart = 80, model = "discrete", tstart = 30, tend = 105, dt = 1, sd0 = 1, nobs = NULL, gomp = TRUE, format = "short" )
spm_projection( x, N = 100, ystart = 80, model = "discrete", tstart = 30, tend = 105, dt = 1, sd0 = 1, nobs = NULL, gomp = TRUE, format = "short" )
x |
A list of parameters from output of the |
N |
A number of individuals to simulate, N=100 by default. |
ystart |
A vector of starting values of covariates (variables), ystart=80 by default. |
model |
A model type. Choices are: "discrete", "continuous" or "time-dependent". |
tstart |
Start time (age), default=30. Can be an interval: c(a, b) - in this case,
the starting time is sumulated via |
tend |
End time (age), default=105. |
dt |
A time interval between observations, dt=1 by default. |
sd0 |
A standard deviation value for simulation of the next value of variable. sd0=1 by default. |
nobs |
A number of observations (lines) for i-th individual. |
gomp |
A flag (FALSE by default). When it is set, then time-dependent exponential form of mu0 and Q are used: mu0 = mu0*exp(theta*t), Q = Q*exp(theta*t). Only for continous-time SPM. |
format |
Data format: "short" (default), "long". |
An object of 'spm.projection' class with two elements. (1) A simulated data set. (2) A summary statistics which includes (i) age-specific means of state variables and (ii) Survival probabilities.
Yashin, A. et al (2007), Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.
Akushevich I., Kulminski A. and Manton K. (2005). Life tables with covariates: Dynamic model for Nonlinear Analysis of Longitudinal Data. Mathematical Popu-lation Studies, 12(2), pp.: 51-80. <DOI: 10.1080/08898480590932296>.
Yashin, A. et al (2007), Health decline, aging and mortality: how are they related? Biogerontology, 8(3), 291-302.<DOI:10.1007/s10522-006-9073-3>.
## Not run: library(stpm) set.seed(123) # Setting up the model model.par <- list() model.par$a <- matrix(c(-0.05, 1e-3, 2e-3, -0.05), nrow=2, ncol=2, byrow=TRUE) model.par$f1 <- matrix(c(90, 35), nrow=1, ncol=2) model.par$Q <- matrix(c(1e-8, 1e-9, 1e-9, 1e-8), nrow=2, ncol=2, byrow=TRUE) model.par$f <- matrix(c(80, 27), nrow=1, ncol=2) model.par$b <- matrix(c(6, 2), nrow=2, ncol=2) model.par$mu0 <- 1e-6 model.par$theta <- 0.09 # Projection # Discrete-time model data.proj.discrete <- spm_projection(model.par, N=5000, ystart=c(80, 27)) plot(data.proj.discrete$stat$srv.prob) # Continuous-time model data.proj.continuous <- spm_projection(model.par, N=5000, ystart=c(80, 27), model="continuous") plot(data.proj.continuous$stat$srv.prob) # Time-dependent model model.par <- list(at = "-0.05", f1t = "80", Qt = "2e-8", ft= "80", bt = "5", mu0t = "1e-5*exp(0.11*t)") data.proj.time_dependent <- spm_projection(model.par, N=500, ystart=80, model="time-dependent") plot(data.proj.time_dependent$stat$srv.prob, xlim = c(30,105)) ## End(Not run)
## Not run: library(stpm) set.seed(123) # Setting up the model model.par <- list() model.par$a <- matrix(c(-0.05, 1e-3, 2e-3, -0.05), nrow=2, ncol=2, byrow=TRUE) model.par$f1 <- matrix(c(90, 35), nrow=1, ncol=2) model.par$Q <- matrix(c(1e-8, 1e-9, 1e-9, 1e-8), nrow=2, ncol=2, byrow=TRUE) model.par$f <- matrix(c(80, 27), nrow=1, ncol=2) model.par$b <- matrix(c(6, 2), nrow=2, ncol=2) model.par$mu0 <- 1e-6 model.par$theta <- 0.09 # Projection # Discrete-time model data.proj.discrete <- spm_projection(model.par, N=5000, ystart=c(80, 27)) plot(data.proj.discrete$stat$srv.prob) # Continuous-time model data.proj.continuous <- spm_projection(model.par, N=5000, ystart=c(80, 27), model="continuous") plot(data.proj.continuous$stat$srv.prob) # Time-dependent model model.par <- list(at = "-0.05", f1t = "80", Qt = "2e-8", ft= "80", bt = "5", mu0t = "1e-5*exp(0.11*t)") data.proj.time_dependent <- spm_projection(model.par, N=500, ystart=80, model="time-dependent") plot(data.proj.time_dependent$stat$srv.prob, xlim = c(30,105)) ## End(Not run)
A function for the model with time-dependent model parameters.
spm_time_dep( x, start = list(a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 0.001), frm = list(at = "a", f1t = "f1", Qt = "Q", ft = "f", bt = "b", mu0t = "mu0"), stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, opts = NULL, lrtest = FALSE )
spm_time_dep( x, start = list(a = -0.05, f1 = 80, Q = 2e-08, f = 80, b = 5, mu0 = 0.001), frm = list(at = "a", f1t = "f1", Qt = "Q", ft = "f", bt = "b", mu0t = "mu0"), stopifbound = FALSE, lb = NULL, ub = NULL, verbose = FALSE, opts = NULL, lrtest = FALSE )
x |
Input data table. |
start |
A list of starting parameters, default:
|
frm |
A list of formulas that define age (time) - dependency.
Default: |
stopifbound |
Estimation stops if at least one parameter
achieves lower or upper boundaries. Default: |
lb |
Lower bound of parameters under estimation. |
ub |
Upper bound of parameters under estimation. |
verbose |
Turns on verbosing output. |
opts |
A list of options for |
lrtest |
Indicates should Likelihood-Ratio test be performed.
Possible values: |
A set of estimates of a
, f1
, Q
,
f
, b
, mu0
.
status Optimization status (see documentation for nloptr package).
LogLik A logarithm likelihood.
objective A value of objective function (given by nloptr).
message A message given by nloptr
optimization function
(see documentation for nloptr package).
Yashin, A. et al (2007), Health decline, aging and mortality: how are they related? Biogerontology, 8(3), 291-302.<DOI:10.1007/s10522-006-9073-3>.
library(stpm) set.seed(123) #Data preparation: n <- 5 data <- simdata_time_dep(N=n) # Estimation: opt.par <- spm_time_dep(data) opt.par
library(stpm) set.seed(123) #Data preparation: n <- 5 data <- simdata_time_dep(N=n) # Estimation: opt.par <- spm_time_dep(data) opt.par
Multiple Data Imputation with SPM
spm.impute( x, id = 1, case = 2, t1 = 3, t2 = 3, covariates = 4, minp = 5, theta_range = seq(0.01, 0.2, by = 0.001) )
spm.impute( x, id = 1, case = 2, t1 = 3, t2 = 3, covariates = 4, minp = 5, theta_range = seq(0.01, 0.2, by = 0.001) )
x |
A longitudinal dataset with missing observations |
id |
A name (text) or index (numeric) of ID column. Default: 1 |
case |
A case status column name (text) or index (numeric). Default: 2 |
t1 |
A t1 (or t if short format is used) column name (text) or index (numeric). Default: 3 |
t2 |
A t2 column name (if long format is used) (text) or index (numeric). Default: 4 |
covariates |
A list of covariate column names or indices. Default: 5 |
minp |
Number of imputations. Default: 5 |
theta_range |
A range of parameter theta used for optimization, default: seq(0.01, 0.15, by=0.001). |
A list(imputed, imputations)
imputed An imputed dataset.
imputations Temporary imputed datasets used in multiple imputaitons.
## Not run: library(stpm) ##Data preparation ## data <- simdata_discr(N=1000, dt = 2) miss.id <- sample(x=dim(data)[1], size=round(dim(data)[1]/4)) # ~25% missing data incomplete.data <- data incomplete.data[miss.id,5] <- NA incomplete.data[miss.id-1,6] <- NA ## End of data preparation ## # Estimate parameters from the complete dataset # p <- spm_discrete(data, theta_range = seq(0.075, 0.09, by=0.001)) p ##### Multiple imputation with SPM ##### imp.data <- spm.impute(x=incomplete.data, minp=5, theta_range=seq(0.075, 0.09, by=0.001))$imputed head(imp.data) ## Estimate SPM parameters from imputed data and compare them to the p ## pp.test <- spm_discrete(imp.data, theta_range = seq(0.075, 0.09, by=0.001)) pp.test ## End(Not run)
## Not run: library(stpm) ##Data preparation ## data <- simdata_discr(N=1000, dt = 2) miss.id <- sample(x=dim(data)[1], size=round(dim(data)[1]/4)) # ~25% missing data incomplete.data <- data incomplete.data[miss.id,5] <- NA incomplete.data[miss.id-1,6] <- NA ## End of data preparation ## # Estimate parameters from the complete dataset # p <- spm_discrete(data, theta_range = seq(0.075, 0.09, by=0.001)) p ##### Multiple imputation with SPM ##### imp.data <- spm.impute(x=incomplete.data, minp=5, theta_range=seq(0.075, 0.09, by=0.001))$imputed head(imp.data) ## Estimate SPM parameters from imputed data and compare them to the p ## pp.test <- spm_discrete(imp.data, theta_range = seq(0.075, 0.09, by=0.001)) pp.test ## End(Not run)
Utilities to estimate parameters of the models with survival functions induced by stochastic covariates. Miscellaneous functions for data preparation and simulation are also provided. For more information, see: "Stochastic model for analysis of longitudinal data on aging and mortality" by Yashin A. et al, 2007, Mathematical Biosciences, 208(2), 538-551 <DOI:10.1016/j.mbs.2006.11.006>.
I. Y. Zhbannikov, Liang He, K. G. Arbeev, I. Akushevich, A. I. Yashin.
Yashin, A. et al (2007), Stochastic model for analysis of longitudinal data on aging and mortality. Mathematical Biosciences, 208(2), 538-551.
Akushevich I., Kulminski A. and Manton K. (2005). Life tables with covariates: Dynamic model for Nonlinear Analysis of Longitudinal Data. Mathematical Popu-lation Studies, 12(2), pp.: 51-80. <DOI: 10.1080/08898480590932296>.
Yashin, A. et al (2007), Health decline, aging and mortality: how are they related? Biogerontology, 8(3), 291-302.<DOI:10.1007/s10522-006-9073-3>.
## Not run: library(stpm) #Prepare data for optimization data <- prepare_data(x=system.file("extdata","longdat.csv",package="stpm"), covariates="BMI") #Parameters estimation (default model: discrete-time): p.discr.model <- spm(data) p.discr.model # Continuous-time model: p.cont.model <- spm(data, model="continuous") p.cont.model #Model with time-dependent coefficients: data <- prepare_data(x=system.file("extdata","longdat.csv",package="stpm"), covariates="BMI") p.td.model <- spm(data, model="time-dependent") p.td.model ## End(Not run)
## Not run: library(stpm) #Prepare data for optimization data <- prepare_data(x=system.file("extdata","longdat.csv",package="stpm"), covariates="BMI") #Parameters estimation (default model: discrete-time): p.discr.model <- spm(data) p.discr.model # Continuous-time model: p.cont.model <- spm(data, model="continuous") p.cont.model #Model with time-dependent coefficients: data <- prepare_data(x=system.file("extdata","longdat.csv",package="stpm"), covariates="BMI") p.td.model <- spm(data, model="time-dependent") p.td.model ## End(Not run)
Returns string w/o leading or trailing whitespace
trim(x)
trim(x)
x |
a string to trim |
Returns string w/o leading whitespace
trim.leading(x)
trim.leading(x)
x |
a string to trim |
Returns string w/o trailing whitespace
trim.trailing(x)
trim.trailing(x)
x |
a string to trim |
Vital (mortality) statistics.
Ilya Y Zhbannikov [email protected]