Title: | Fitting Semi-Parametric Generalized log-Gamma Regression Models |
---|---|
Description: | Set of tools to fit a linear multiple or semi-parametric regression models with the possibility of non-informative random right-censoring. Under this setup, the localization parameter of the response variable distribution is modeled by using linear multiple regression or semi-parametric functions, whose non-parametric components may be approximated by natural cubic spline or P-splines. The supported distribution for the model error is a generalized log-gamma distribution which includes the generalized extreme value and standard normal distributions as important special cases. Inference is based on penalized likelihood and bootstrap methods. Also, some numerical and graphical devices for diagnostic of the fitted models are offered. |
Authors: | Carlos Alberto Cardozo Delgado <[email protected]> and G. Paula and L. Vanegas |
Maintainer: | Carlos Alberto Cardozo Delgado <[email protected]> |
License: | GPL-3 |
Version: | 0.2.2 |
Built: | 2024-11-22 06:54:52 UTC |
Source: | CRAN |
bootglg
is used to generate parametric bootstrap inference, such as, estimated standard errors and approximate confidence intervals for a generalized log-gamma regression.
bootglg(formula, data, B = 500, alpha = 0.05, type = "normal", plt_den = FALSE)
bootglg(formula, data, B = 500, alpha = 0.05, type = "normal", plt_den = FALSE)
formula |
a symbolic description of the systematic component of the model to be fitted. |
data |
data.frame, contains the variables in the formula object. |
B |
integer, represents the number of bootstrap replications. Default value is 500. |
alpha |
numeric, represents a confidence level for the bootstrap intervals. Default value is 0.05. |
type |
character, indicates the type of bootstrap confidence interval for the estimated parameters. The options are: 'normal', 't_student' or 'bootstrap_t'. These intervals used the bootstrap estimated standard error of the ML estimates of the parameters. Other kind of bootstrap intervals are the percentile-type intervals. We offer the option 'BCa'. It is a bias-corrected and accelerated percentile interval. The default value for the 'type' argument is 'normal'. |
plt_den |
boolean value, to request a density-type plot of the bootstrap estimates. Default value is FALSE. |
ml_estimates
is a vector of maximum likelihood estimates associated with the coefficients of linear structure, scale, and shape parameters.
boot_mean_estimates
is a vector of mean of the bootstrap estimates associated with the coefficients of linear structure, scale, and shape parameters.
boot_bias_estimates
is a vector of bootstrap estimate of bias associated with the coefficients of linear structure, scale, and shape estimators.
boot_sd_estimates
is a vector of bootstrap standard errors of the estimates associated with the coefficients of linear structure, scale, and shape estimators.
type
indicates the type of confidence intervals.
intervals
array of the confidence intervals of the coefficients of linear structure, scale, and shape.
Carlos Alberto Cardozo Delgado <[email protected]>
Cardozo C. A., Paula G. and Vanegas L. sglg: An R package to fit semi-parametric generalized log-gamma regression models. In preparation.
Efron B and Tibshirani R (1993). An introduction to the Bootstrap. Chapman & Hall, Inc.
################################################################################################## set.seed(1) n <- 300 x1 <- runif(n, 0, 1) t_beta <- 1.2 t_sigma <- 0.5 t_lambda <- 0.7 error <- rglg(n, 0, t_sigma, t_lambda) y1 <- t_beta*x1 + error data <- data.frame(y1, x1) # The following examples are based on 50 bootstrap replications. # A 90% bootstrap confidence interval with the method 'normal'. bootglg(y1 ~ x1 - 1, data = data, type='normal', B = 50, alpha = 0.1) # A 95% bootstrap confidence interval with the method 't_student'. bootglg(y1 ~ x1 - 1, data = data, type='t_student', B = 50) # A 95% bootstrap confidence interval with the method 'bootstrap_t'. bootglg(y1 ~ x1 - 1, data = data, type='bootstrap_t', B = 50) # A 98% bootstrap confidence interval with the method 'BCa'. # bootglg(y1 ~ x1 - 1, data = data, type='BCa', B = 50, alpha = 0.02) #################################################################################################
################################################################################################## set.seed(1) n <- 300 x1 <- runif(n, 0, 1) t_beta <- 1.2 t_sigma <- 0.5 t_lambda <- 0.7 error <- rglg(n, 0, t_sigma, t_lambda) y1 <- t_beta*x1 + error data <- data.frame(y1, x1) # The following examples are based on 50 bootstrap replications. # A 90% bootstrap confidence interval with the method 'normal'. bootglg(y1 ~ x1 - 1, data = data, type='normal', B = 50, alpha = 0.1) # A 95% bootstrap confidence interval with the method 't_student'. bootglg(y1 ~ x1 - 1, data = data, type='t_student', B = 50) # A 95% bootstrap confidence interval with the method 'bootstrap_t'. bootglg(y1 ~ x1 - 1, data = data, type='bootstrap_t', B = 50) # A 98% bootstrap confidence interval with the method 'BCa'. # bootglg(y1 ~ x1 - 1, data = data, type='BCa', B = 50, alpha = 0.02) #################################################################################################
deBoor
builds the basis matrix and penalty matrix to approximate a smooth function using
cubic B-spline cubic.
deBoor2(t, knots)
deBoor2(t, knots)
t |
a vector of values. |
knots |
a set of internal knot. |
nknot number of knots.
knots set of knots.
N basis matrix.
K penalty matrix.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
set.seed(1) t_1 <- runif(120) range(t_1) t_2 <- t_1 + 2 #runif(120,2,3) range(t_2) knot <- 10 dB1 <- deBoor2(t_1,knot) dB2 <- deBoor2(t_2,knot) dB1$knots dB2$knots plot(0,0,xlim=c(-0.5,3.5)) points(dB1$knots,rep(0,length(dB1$knots)),pch=20) delta <- dB2$knots[1] - dB1$knots[1] points(dB2$knots-delta,rep(0,length(dB2$knots)),pch=2,col= 'blue') dB1$K dB2$K zeros <- vector() plot(t_1,dB1$N[,1],pch=20) for(j in 1:knot){ points(t_1,dB1$N[,j],pch=20,col=j) zeros[j] <- sum(dB1$N[,j]==0) } zeros/120 cond_tNN <- vector() KnotS <- 3:50 for(j in KnotS){ dB1 <- deBoor2(t_1,j) print(dB1$knots[2]- dB1$knots[1]) min_max <- range(eigen(t(dB1$N)%*%dB1$N)$values) cond_tNN[j-2] <- min_max[1]/min_max[2] } cond_tNN plot(KnotS,cond_tNN,pch=20,ylim=c(0,0.07))
set.seed(1) t_1 <- runif(120) range(t_1) t_2 <- t_1 + 2 #runif(120,2,3) range(t_2) knot <- 10 dB1 <- deBoor2(t_1,knot) dB2 <- deBoor2(t_2,knot) dB1$knots dB2$knots plot(0,0,xlim=c(-0.5,3.5)) points(dB1$knots,rep(0,length(dB1$knots)),pch=20) delta <- dB2$knots[1] - dB1$knots[1] points(dB2$knots-delta,rep(0,length(dB2$knots)),pch=2,col= 'blue') dB1$K dB2$K zeros <- vector() plot(t_1,dB1$N[,1],pch=20) for(j in 1:knot){ points(t_1,dB1$N[,j],pch=20,col=j) zeros[j] <- sum(dB1$N[,j]==0) } zeros/120 cond_tNN <- vector() KnotS <- 3:50 for(j in KnotS){ dB1 <- deBoor2(t_1,j) print(dB1$knots[2]- dB1$knots[1]) min_max <- range(eigen(t(dB1$N)%*%dB1$N)$values) cond_tNN[j-2] <- min_max[1]/min_max[2] } cond_tNN plot(KnotS,cond_tNN,pch=20,ylim=c(0,0.07))
deviance_residuals
is used to generate deviance residuals for a generalized log-gamma regression model.
deviance_residuals(object, ...)
deviance_residuals(object, ...)
object |
an object of the class sglg. This object is returned from the call to glg(), sglg(), survglg() or ssurvglg(). |
... |
other arguments. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
# Example 1 n <- 300 error <- rglg(n,0,1,1) y <- 0.5 + error fit <- glg(y~1,data=as.data.frame(y)) deviance_residuals(fit) # Example 2 n <- 300 error <- rglg(n,0,1,1) x <- runif(n,-3,3) y <- 0.5 + 2*x + error fit <- glg(y~x,data=as.data.frame(y,x)) deviance_residuals(fit)
# Example 1 n <- 300 error <- rglg(n,0,1,1) y <- 0.5 + error fit <- glg(y~1,data=as.data.frame(y)) deviance_residuals(fit) # Example 2 n <- 300 error <- rglg(n,0,1,1) x <- runif(n,-3,3) y <- 0.5 + 2*x + error fit <- glg(y~x,data=as.data.frame(y,x)) deviance_residuals(fit)
dglg
is used to calculate the density distribution function of a generalized log-gamma variable at x.
dglg(x, location, scale, shape)
dglg(x, location, scale, shape)
x |
numeric, a real number. |
location |
numeric, represent the location parameter of a generalized log-gamma distribution. Default value is 0. |
scale |
numeric, represent the scale parameter of a generalized log-gamma distribution. Default value is 1. |
shape |
numeric, represent the shape parameter of a generalized log-gamma distribution. Default value is 1. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
x <- seq(-4,4,length=100) dglg(x,location=0,scale=1,shape=1) plot(x,dglg(x,location=0,scale=1,shape=1),type="l",xlab="x",ylab="Density")
x <- seq(-4,4,length=100) dglg(x,location=0,scale=1,shape=1) plot(x,dglg(x,location=0,scale=1,shape=1),type="l",xlab="x",ylab="Density")
entropy
is used to obtain the entropy for a generalized log-gamma distribution.
entropy(mu, sigma, lambda)
entropy(mu, sigma, lambda)
mu |
numeric, represent the location parameter of a generalized log-gamma distribution. Default value is 0. |
sigma |
numeric, represent the scale parameter of a generalized log-gamma distribution. Default value is 1. |
lambda |
numeric, represent the shape parameter of a generalized log-gamma distribution. Default value is 1. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
entropy(0,1,-1) # Extreme value type I distribution, maximum case. entropy(0,1,1) # Extreme value type I distribution, minimum case. entropy(0,1,0.077) # Standard normal distribution.
entropy(0,1,-1) # Extreme value type I distribution, maximum case. entropy(0,1,1) # Extreme value type I distribution, minimum case. entropy(0,1,0.077) # Standard normal distribution.
Build a Normal probability plot with simulated envelope for a deviance-type residuals in semiparametric or multiple linear generalized log-gamma regression models.
envelope.sglg(fit, Rep)
envelope.sglg(fit, Rep)
fit |
an object of the class sglg. This object is returned from the call to glg(), sglg(). |
Rep |
a positive integer. This is the number of replications on which to build the simulated envelope. Default is Rep=50. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
Ortega, E., Paula, G. A. and Bolfarine, H. (2008) Deviance residuals in generalized log-gamma regression models with censored observations. Journal of Statistical Computation and Simulation, 78, 747-764.
rows <- 120 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 0.5 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) error <- rglg(rows, 0, 1, t_lambda) y1 <- X %*%t_beta + t_sigma * error data.example <- data.frame(y1,X) fit <- glg(y1 ~ x1 + x2 - 1,data=data.example) envelope.sglg(fit,Rep=50)
rows <- 120 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 0.5 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) error <- rglg(rows, 0, 1, t_lambda) y1 <- X %*%t_beta + t_sigma * error data.example <- data.frame(y1,X) fit <- glg(y1 ~ x1 + x2 - 1,data=data.example) envelope.sglg(fit,Rep=50)
glg
is used to fit a multiple linear regression model suitable for analysis of data sets in which the response variable is continuous, strictly positive, and asymmetric.
In this setup, the location parameter of the response variable is explicitly modeled by a linear function of the parameters.
glg( formula, data, shape = -0.75, Tolerance = 5e-05, Maxiter = 1000, format = "complete", envelope = FALSE )
glg( formula, data, shape = -0.75, Tolerance = 5e-05, Maxiter = 1000, format = "complete", envelope = FALSE )
formula |
a symbolic description of the systematic component of the model to be fitted. |
data |
a data frame with the variables in the model. |
shape |
an optional value for the shape parameter of the error distribution of a generalized log-gamma distribution. Default value is 0.2. |
Tolerance |
an optional positive value, which represents the convergence criterion. Default value is 1e-04. |
Maxiter |
an optional positive integer giving the maximal number of iterations for the estimating process. Default value is 1e03. |
format |
an optional string value that indicates if you want a simple or a complete report of the estimating process. Default value is 'complete'. |
envelope |
an optional and internal logical value that indicates if the glg function will be employed for build an envelope plot. Default value is 'FALSE'. |
mu a vector of parameter estimates associated with the location parameter.
sigma estimate of the scale parameter associated with the model.
lambda estimate of the shape parameter associated with the model.
interval estimate of a 95% confidence interval for each estimate parameters associated with the model.
Deviance the deviance associated with the model.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
set.seed(22) rows <- 300 x1 <- rbinom(rows, 1, 0.5) x2 <- runif(rows, 0, 1) X <- cbind(x1,x2) t_beta <- c(0.5, 2) t_sigma <- 1 ###################### # # # Extreme value case # # # ###################### t_lambda <- 1 error <- rglg(rows, 0, 1, t_lambda) y1 <- error y1 <- X %*%t_beta + t_sigma*error data.example <- data.frame(y1,X) data.example <- data.frame(y1) fit <- glg(y1 ~ x1 + x2 - 1, data=data.example) logLik(fit) # -449.47 # Time: 14 milliseconds summary(fit) deviance_residuals(fit) ############################# # # # Normal case: A limit case # # # ############################# # When the parameter lambda goes to zero the GLG tends to a standard normal distribution. set.seed(8142031) y1 <- X %*%t_beta + t_sigma * rnorm(rows) data.example <- data.frame(y1, X) fit0 <- glg(y1 ~ x1 + x2 - 1,data=data.example) logLik(fit0) fit0$AIC fit0$mu ############################################ # # # A comparison with a normal linear model # # # ############################################ fit2 <- lm(y1 ~ x1 + x2 - 1,data=data.example) logLik(fit2) AIC(fit2) coefficients(fit2)
set.seed(22) rows <- 300 x1 <- rbinom(rows, 1, 0.5) x2 <- runif(rows, 0, 1) X <- cbind(x1,x2) t_beta <- c(0.5, 2) t_sigma <- 1 ###################### # # # Extreme value case # # # ###################### t_lambda <- 1 error <- rglg(rows, 0, 1, t_lambda) y1 <- error y1 <- X %*%t_beta + t_sigma*error data.example <- data.frame(y1,X) data.example <- data.frame(y1) fit <- glg(y1 ~ x1 + x2 - 1, data=data.example) logLik(fit) # -449.47 # Time: 14 milliseconds summary(fit) deviance_residuals(fit) ############################# # # # Normal case: A limit case # # # ############################# # When the parameter lambda goes to zero the GLG tends to a standard normal distribution. set.seed(8142031) y1 <- X %*%t_beta + t_sigma * rnorm(rows) data.example <- data.frame(y1, X) fit0 <- glg(y1 ~ x1 + x2 - 1,data=data.example) logLik(fit0) fit0$AIC fit0$mu ############################################ # # # A comparison with a normal linear model # # # ############################################ fit2 <- lm(y1 ~ x1 + x2 - 1,data=data.example) logLik(fit2) AIC(fit2) coefficients(fit2)
This function provides some useful statistics to assess the quality of fit of generalized log-gamma probabilistic model, including the statistics Cramer-von Mises and Anderson-Darling. It can also calculate other goodness of fit such as Hannan-Quin Information Criterion and Kolmogorov-Smirnov test.
gnfit(starts, data)
gnfit(starts, data)
starts |
numeric vector. Initial parameters to maximize the likelihood function |
data |
numeric vector. A sample of a generalized log-gamma distribution. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
## Not run: set.seed(1) # The size of the sample must be median or large to obtain a good estimates n <- 100 sample <- rglg(n,location=0,scale=0.5,shape=0.75) # This step takes a few minutes. result <- gnfit(starts=c(0.1,0.75,1),data=sample) result ## End(Not run)
## Not run: set.seed(1) # The size of the sample must be median or large to obtain a good estimates n <- 100 sample <- rglg(n,location=0,scale=0.5,shape=0.75) # This step takes a few minutes. result <- gnfit(starts=c(0.1,0.75,1),data=sample) result ## End(Not run)
Gu
builds the basis matrix and penalty matrix to approximate a smooth function using
natural cubic splines based on the Gu basis form.
Gu(t, knot)
Gu(t, knot)
t |
the covariate. |
knot |
a integer value that represent the number of knots of the natural cubic spline. |
nknot number of knots.
knots set of knots.
N basis matrix.
K penalty matrix.
Carlos Alberto Cardozo Delgado <[email protected]>
Wood, S. (2006) Generalized additive models: An R introduction. Chapman and Hall.
Carlos Alberto Cardozo Delgado. Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
t <- runif(1000) knot <- 6 N_gu <- Gu(t,knot)
t <- runif(1000) knot <- 6 N_gu <- Gu(t,knot)
influence.sglg extracts from a object of class sglg the local influence measures and displays their graphs versus the index of the observations.
## S3 method for class 'sglg' influence(model, ...)
## S3 method for class 'sglg' influence(model, ...)
model |
an object of the class sglg. This object is returned from the call to glg(), sglg(), survglg() or ssurvglg(). |
... |
other arguments. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
rows <- 100 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) error <- rglg(rows, 0, 1, t_lambda) y1 <- X %*%t_beta + t_sigma * error data.example <- data.frame(y1,X) fit1 <- glg(y1 ~ x1 + x2 - 1,data=data.example) influence(fit1)
rows <- 100 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) error <- rglg(rows, 0, 1, t_lambda) y1 <- X %*%t_beta + t_sigma * error data.example <- data.frame(y1,X) fit1 <- glg(y1 ~ x1 + x2 - 1,data=data.example) influence(fit1)
logLik.sglg extracts log-likehood from a model from an object of class 'sglg'.
## S3 method for class 'sglg' logLik(object, ...)
## S3 method for class 'sglg' logLik(object, ...)
object |
an object of the class sglg. This object is returned from the call to glg(), sglg(), survglg() or ssurvglg() function. |
... |
other arguments. |
lss
is used to obtain the mean, median, mode, variance, coefficient of variation, skewness and kurtosis for a generalized log-gamma distribution.
lss(mu = 0, sigma = 1, lambda = 1)
lss(mu = 0, sigma = 1, lambda = 1)
mu |
numeric, represents the location parameter of a generalized log-gamma distribution. Default value is 0. |
sigma |
numeric, represents the scale parameter of a generalized log-gamma distribution. Default value is 1. |
lambda |
numeric, represents the shape parameter of a generalized log-gamma distribution. Default value is 1. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
National Institute of Standards and Technology, NIST. Engineering Statistics Handbook. https://www.itl.nist.gov/div898/handbook/eda/section3/eda366g.htm
lss(0,1,-1) # Extreme value type I distribution, maximum case. lss(0,1,1) # Extreme value type I distribution, minimum case. lss(0,1,0.01) # Standard normal distribution.
lss(0,1,-1) # Extreme value type I distribution, maximum case. lss(0,1,1) # Extreme value type I distribution, minimum case. lss(0,1,0.01) # Standard normal distribution.
order_glg
is used to obtain a random sample of the K-th order statistics from a generalized log-gamma distribution.
order_glg(size, mu, sigma, lambda, k, n, alpha = 0.05)
order_glg(size, mu, sigma, lambda, k, n, alpha = 0.05)
size |
numeric, represents the size of the sample. |
mu |
numeric, represents the location parameter. Default value is 0. |
sigma |
numeric, represents the scale parameter. Default value is 1. |
lambda |
numeric, represents the shape parameter. Default value is 1. |
k |
numeric, represents the K-th smallest value from a sample. |
n |
numeric, represents the size of the sample to compute the order statistic from. |
alpha |
numeric, (1 - alpha) represents the confidence of an interval for the population median of the distribution of the k-th order statistic. Default value is 0.05. |
A list with a random sample of order statistics from a generalized log-gamma distribution, the value of its join probability density function evaluated in the random sample and a (1 - alpha) confidence interval for the population median of the distribution of the k-th order statistic.
Carlos Alberto Cardozo Delgado <[email protected]>.
Gentle, J, Computational Statistics, First Edition. Springer - Verlag, 2009.
Naradajah, S. and Rocha, R. (2016) Newdistns: An R Package for New Families of Distributions, Journal of Statistical Software.
# A random sample of size 10 of order statistics from a Extreme Value Distribution. order_glg(10,0,1,1,1,50) ## Not run: # A small comparison between two random sampling methods of order statistics # Method 1 m <- 10 output <- rep(0,m) order_sample <- function(m,n,k){ for(i in 1:m){ sample <- rglg(n) order_sample <- sort(sample) output[i] <- order_sample[k] } return(output) } N <- 10000 n <- 200 k <- 100 system.time(order_sample(N,n,k)) sample_1 <- order_sample(N,n,k) hist(sample_1) summary(sample_1) # Method 2 system.time(order_glg(N,0,1,1,k,n)) sample_2 <- order_glg(N,0,1,1,k,n)$sample hist(sample_2) summary(sample_2) ## End(Not run)
# A random sample of size 10 of order statistics from a Extreme Value Distribution. order_glg(10,0,1,1,1,50) ## Not run: # A small comparison between two random sampling methods of order statistics # Method 1 m <- 10 output <- rep(0,m) order_sample <- function(m,n,k){ for(i in 1:m){ sample <- rglg(n) order_sample <- sort(sample) output[i] <- order_sample[k] } return(output) } N <- 10000 n <- 200 k <- 100 system.time(order_sample(N,n,k)) sample_1 <- order_sample(N,n,k) hist(sample_1) summary(sample_1) # Method 2 system.time(order_glg(N,0,1,1,k,n)) sample_2 <- order_glg(N,0,1,1,k,n)$sample hist(sample_2) summary(sample_2) ## End(Not run)
pglg
is used to calculate the cumulative distribution function of a generalized log-gamma variable at x.
pglg(x, location, scale, shape)
pglg(x, location, scale, shape)
x |
numeric, a vector of real values. |
location |
numeric, represents the location parameter of a generalized log-gamma distribution. Default value is 0. |
scale |
numeric, represents the scale parameter of a generalized log-gamma distribution. Default value is 1. |
shape |
numeric, represents the shape parameter of a generalized log-gamma distribution. Default value is 1. |
A vector with the same size of x with the cumulative probability values of a generalized log-gamma distribution.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
x <- runif(3,-1,1) pglg(sort(x),location=0, scale=1, shape=1)
x <- runif(3,-1,1) pglg(sort(x),location=0, scale=1, shape=1)
plotnpc
displays a graph of a fitted nonparametric effect, either natural cubic spline or P-spline, from an object of class sglg.
plotnpc(fit, conf_lev)
plotnpc(fit, conf_lev)
fit |
an object of the class sglg. This object is returned from the call to glg(), sglg(), survglg() or ssurvglg(). |
conf_lev |
is the confidence level of the asymptotic confidence band. Default value is 0.05. |
Carlos Alberto Cardozo Delgado <[email protected]>
Eilers P.H.C. and Marx B.D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science. 11, 89-121.
Wood, S. (2017). Additive generalized models: An R introduction. Chapman and Hall.
set.seed(1) n <- 300 error <- rglg(n,0,0.5,1) t <- as.matrix((2*1:n - 1)/(2*n)) colnames(t) <- "t" f_t <- cos(4*pi*t) y <- 0.8 + f_t + error colnames(y) <- "y" data <- as.data.frame(cbind(y,1,t)) fit1 <- sglg(y ~ 1,npc=t,data=data,basis = "deBoor",alpha0=0.0001) summary(fit1) # The adjusted (black) non-linear component plotnpc(fit1,conf_lev=0.02)
set.seed(1) n <- 300 error <- rglg(n,0,0.5,1) t <- as.matrix((2*1:n - 1)/(2*n)) colnames(t) <- "t" f_t <- cos(4*pi*t) y <- 0.8 + f_t + error colnames(y) <- "y" data <- as.data.frame(cbind(y,1,t)) fit1 <- sglg(y ~ 1,npc=t,data=data,basis = "deBoor",alpha0=0.0001) summary(fit1) # The adjusted (black) non-linear component plotnpc(fit1,conf_lev=0.02)
plotsurv.sglg
is used to plot simultaneously the Kaplan-Meier and parametric estimators of the survival function.
plotsurv.sglg(fit)
plotsurv.sglg(fit)
fit |
an object of the class sglg. This object is returned from the call to survglg() or ssurvglg(). |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos A. Cardozo, G. Paula and L. Vanegas. Semi-parametric accelerated failure time models with generalized log-gamma erros. In preparation.
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
require(survival) rows <- 240 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) s <- t_sigma^2 a <- 1/s t_ini1 <- exp(X %*% t_beta) * rgamma(rows, scale = s, shape = a) cens.time <- rweibull(rows, 0.6, 14) delta1 <- ifelse(t_ini1 > cens.time, 1, 0) obst1 <- t_ini1 for (i in 1:rows) { if (delta1[i] == 1) { obst1[i] <- cens.time[i] } } data.example <- data.frame(obst1,delta1,X) fit3 <- survglg(Surv(log(obst1),delta1) ~ x1 + x2 - 1, data=data.example,shape=0.9) plotsurv.sglg(fit3)
require(survival) rows <- 240 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) s <- t_sigma^2 a <- 1/s t_ini1 <- exp(X %*% t_beta) * rgamma(rows, scale = s, shape = a) cens.time <- rweibull(rows, 0.6, 14) delta1 <- ifelse(t_ini1 > cens.time, 1, 0) obst1 <- t_ini1 for (i in 1:rows) { if (delta1[i] == 1) { obst1[i] <- cens.time[i] } } data.example <- data.frame(obst1,delta1,X) fit3 <- survglg(Surv(log(obst1),delta1) ~ x1 + x2 - 1, data=data.example,shape=0.9) plotsurv.sglg(fit3)
qglg
is used to calculate the quantile function of a generalized log-gamma variable at x.
qglg(x, location, scale, shape)
qglg(x, location, scale, shape)
x |
numeric, a vector with values between 0 and 1. |
location |
numeric, represents the location parameter of a generalized log-gamma distribution. Default value is 0. |
scale |
numeric, represents the scale parameter of a generalized log-gamma distribution. Default value is 1. |
shape |
numeric, represents the shape parameter of a generalized log-gamma distribution. Default value is 1. |
A vector with the same size of x with the quantile values of a generalized log-gamma distribution.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
# Calculating the quartiles of a glg(0,1,-1) distribution x <- c(0.25, 0.5, 0.75) qglg(x, location = 0, scale = 1, shape = -1)
# Calculating the quartiles of a glg(0,1,-1) distribution x <- c(0.25, 0.5, 0.75) qglg(x, location = 0, scale = 1, shape = -1)
quantile_residuals
is used to generate quantile residuals for a generalized log-gamma regression model.
quantile_residuals(fit)
quantile_residuals(fit)
fit |
is an object sglg. This object is returned from the call to glg(), sglg(), survglg() or ssurvglg(). |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
# Example 1 n <- 400 set.seed(4) error <- rglg(n,0,0.5,1) y <- as.data.frame(0.5 + error) names(y) <- "y" fit_0 <- glg(y~1,data=y) fit_0$mu fit_0$sigma fit_0$lambda quantile_residuals(fit_0) # Example 2 n <- 500 set.seed(6) error <- rglg(n,0,0.5,1) x1 <- runif(n,-2,2) beta <- c(0.5,2) y <- cbind(1,x1)%*%beta + error data <- data.frame(y=y,x1=x1) fit_1 <- glg(y~x1,data=data) fit_1$mu fit_1$sigma fit_1$lambda quantile_residuals(fit_1)
# Example 1 n <- 400 set.seed(4) error <- rglg(n,0,0.5,1) y <- as.data.frame(0.5 + error) names(y) <- "y" fit_0 <- glg(y~1,data=y) fit_0$mu fit_0$sigma fit_0$lambda quantile_residuals(fit_0) # Example 2 n <- 500 set.seed(6) error <- rglg(n,0,0.5,1) x1 <- runif(n,-2,2) beta <- c(0.5,2) y <- cbind(1,x1)%*%beta + error data <- data.frame(y=y,x1=x1) fit_1 <- glg(y~x1,data=data) fit_1$mu fit_1$sigma fit_1$lambda quantile_residuals(fit_1)
residuals.sglg extracts the deviance-type residuals for a model from an object of class 'sglg'.
## S3 method for class 'sglg' residuals(object, ...)
## S3 method for class 'sglg' residuals(object, ...)
object |
an object of the class sglg. This object is returned from the call to glg(), sglg(), survglg() or ssurvglg() function. |
... |
other arguments. |
rglg
is used to generate random numbers for a generalized log-gamma distribution.
rglg(n, location, scale, shape)
rglg(n, location, scale, shape)
n |
numeric, size of the random sample. |
location |
numeric, represents the location parameter of a generalized log-gamma distribution. Default value is 0. |
scale |
numeric, represents the scale parameter of a generalized log-gamma distribution. Default value is 1. |
shape |
numeric, represents the shape parameter of a generalized log-gamma distribution. Default value is 1. |
A vector of size n with the generalized log-gamma random values.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
u <- rglg(100, location = 0, scale = 1, shape = -1)
u <- rglg(100, location = 0, scale = 1, shape = -1)
sglg
is used to fit a semi-parametric regression model suitable for analysis of data sets in which the response variable is continuous, strictly positive, and asymmetric.
In this setup, the location parameter of the response variable is explicitly modeled by semi-parametric functions, whose nonparametric components may be approximated by
natural cubic splines or cubic P-splines.
sglg( formula, npc, basis, data, shape = 0.2, method, alpha0, Knot, Tolerance = 5e-05, Maxiter = 1000, format = "complete" )
sglg( formula, npc, basis, data, shape = 0.2, method, alpha0, Knot, Tolerance = 5e-05, Maxiter = 1000, format = "complete" )
formula |
a symbolic description of the systematic component of the model to be fitted. See details for further information. |
npc |
a matrix with the nonparametric variables of the systematic part of the model to be fitted. Must be included the names of each variables. |
basis |
a name of the cubic spline basis to be used in the model. Supported basis include deBoor and Gu basis which are a B-spline basis and a natural cubic spline basis, respectively. |
data |
an optional data frame, list containing the variables in the model. |
shape |
an optional value for the shape parameter of the error distribution of a generalized log-gamma distribution. Default value is 0.2. |
method |
There are two possibles algorithms to estimate the parameters. The default algorithm is 'FS' Fisher-Scoring, the other option is 'GSFS' an adequate combination between the block matrix version of non-linear Gauss-Seidel algorithm and Fisher-Scoring algorithm. |
alpha0 |
is a vector of positive values for the smoothing parameters alpha. Default vector with 1 in each entry. |
Knot |
is a vector of the number of knots in each non-linear component of the model. |
Tolerance |
an optional positive value, which represents the convergence criterion. Default value is 5e-05. |
Maxiter |
an optional positive integer giving the maximal number of iterations for the estimating process. Default value is 1e03. |
format |
an optional string value that indicates if you want a simple or a complete report of the estimating process. Default value is 'complete'. |
mu a vector of parameter estimates associated with the location parameter.
sigma estimate of the scale parameter associated with the model.
lambda estimate of the shape parameter associated with the model.
interval estimate of a 95% confidence interval for each estimate parameters associated with the model.
Deviance the deviance associated with the model.
Carlos Alberto Cardozo Delgado <[email protected]>
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
set.seed(1) rows<- 300 t_beta <- c(0.5,2) t_sigma <- 0.5 t_lambda <- 1 x1 <- runif(rows,-3,3) x2 <- rbinom(rows,1,0.5) X <- cbind(x1,x2) t <- as.matrix((2*1:rows - 1)/(2*rows)) colnames(t) <- "t" f_t <- cos(4*pi*t) error <- rglg(rows,0,1,t_lambda) y <- X %*%t_beta + f_t + t_sigma*error colnames(y) <- "y" data <- data.frame(y,X,t) fit1 <- sglg(y ~ x1 + x2 - 1, npc = t, data = data, basis = "deBoor", alpha0 = 0.1) logLik(fit1) # -288.1859 time: 90 milliseconds quantile_residuals(fit1) fit2 <- sglg(y ~ x1 + x2 - 1, npc=t, data=data, basis = "Gu", alpha0=0.005) logLik(fit2) ################################################# # An example with two non-parametric components # ################################################# set.seed(2) t_2 <- as.matrix(rnorm(rows, sd = 0.5)) colnames(t_2) <- 't_2' f_t_2 <- exp(t_2) error <- rglg(rows,0,1,t_lambda) y_2 <- X %*%t_beta + f_t + f_t_2 + t_sigma*error colnames(y_2) <- 'y_2' data2 <- data.frame(y_2,X,t,t_2) npcs <- cbind(t,t_2) fit3 <- sglg(y_2 ~ x1 + x2 - 1, npc = npcs, data = data2, alpha0 = c(0.45,0.65)) logLik(fit3) #############################################################################
set.seed(1) rows<- 300 t_beta <- c(0.5,2) t_sigma <- 0.5 t_lambda <- 1 x1 <- runif(rows,-3,3) x2 <- rbinom(rows,1,0.5) X <- cbind(x1,x2) t <- as.matrix((2*1:rows - 1)/(2*rows)) colnames(t) <- "t" f_t <- cos(4*pi*t) error <- rglg(rows,0,1,t_lambda) y <- X %*%t_beta + f_t + t_sigma*error colnames(y) <- "y" data <- data.frame(y,X,t) fit1 <- sglg(y ~ x1 + x2 - 1, npc = t, data = data, basis = "deBoor", alpha0 = 0.1) logLik(fit1) # -288.1859 time: 90 milliseconds quantile_residuals(fit1) fit2 <- sglg(y ~ x1 + x2 - 1, npc=t, data=data, basis = "Gu", alpha0=0.005) logLik(fit2) ################################################# # An example with two non-parametric components # ################################################# set.seed(2) t_2 <- as.matrix(rnorm(rows, sd = 0.5)) colnames(t_2) <- 't_2' f_t_2 <- exp(t_2) error <- rglg(rows,0,1,t_lambda) y_2 <- X %*%t_beta + f_t + f_t_2 + t_sigma*error colnames(y_2) <- 'y_2' data2 <- data.frame(y_2,X,t,t_2) npcs <- cbind(t,t_2) fit3 <- sglg(y_2 ~ x1 + x2 - 1, npc = npcs, data = data2, alpha0 = c(0.45,0.65)) logLik(fit3) #############################################################################
Tool that supports the estimation of the shape parameter in semi-parametric or multiple linear accelerated failure time model with generalized log-gamma errors under the presence of censored data. The estimation is based on the profiled likelihood function for the shape parameter of the model.
shape(formula, npc, data, interval, semi, step)
shape(formula, npc, data, interval, semi, step)
formula |
a symbolic description of the systematic component of the model to be fitted. |
npc |
a data frame with potential nonparametric variables of the systematic part of the model to be fitted. |
data |
a data frame which contains the variables in the model. |
interval |
an optional numerical vector of length 2. In this interval is the maximum likelihood estimate of the shape parameter of the model. By default is [0.1,1.5]. |
semi |
a logical value. TRUE means that the model has a non-parametric component. By default is FALSE. |
step |
an optional positive value. This parameter represents the length of the step of the partition of the interval parameter. By default is 0.1. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph. D. thesis. Sao Paulo University.
rows <- 200 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) s <- t_sigma^2 a <- 1/s t_ini1 <- exp(X %*% t_beta) * rgamma(rows, scale = s, shape = a) cens.time <- rweibull(rows, 0.3, 14) delta <- ifelse(t_ini1 > cens.time, 1, 0) obst1 = t_ini1 for (i in 1:rows) { if (delta[i] == 1) { obst1[i] = cens.time[i] } } example <- data.frame(obst1,delta,X) lambda <- shape(Surv(log(obst1),delta) ~ x1 + x2 - 1, data=example) lambda # To change interval or step or both options lambda <- shape(Surv(log(obst1),delta) ~ x1 + x2 - 1, data=example, interval=c(0.95,1.3), step=0.05) lambda
rows <- 200 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 t_lambda <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) s <- t_sigma^2 a <- 1/s t_ini1 <- exp(X %*% t_beta) * rgamma(rows, scale = s, shape = a) cens.time <- rweibull(rows, 0.3, 14) delta <- ifelse(t_ini1 > cens.time, 1, 0) obst1 = t_ini1 for (i in 1:rows) { if (delta[i] == 1) { obst1[i] = cens.time[i] } } example <- data.frame(obst1,delta,X) lambda <- shape(Surv(log(obst1),delta) ~ x1 + x2 - 1, data=example) lambda # To change interval or step or both options lambda <- shape(Surv(log(obst1),delta) ~ x1 + x2 - 1, data=example, interval=c(0.95,1.3), step=0.05) lambda
Tool that supports the selection of the smoothing parameters in semi-parametric generalized log-gamma models. The selection is based on the AIC, BIC, or Generalized Cross Validation methods.
smoothp(formula, npc, data, method = "PAIC", basis, interval, step)
smoothp(formula, npc, data, method = "PAIC", basis, interval, step)
formula |
a symbolic description of the systematic component of the model to be fitted. |
npc |
a data frame with potential nonparametric variables of the systematic part of the model to be fitted. |
data |
a data frame which contains the variables in the model. |
method |
There are three possible criteria to estimate the smoothing parameters: Penalized Akaike Criterion 'PAIC', Penalized Bayesian Criterion 'PBIC' and Generalized Cross Validation 'GCV'. The default method is 'PAIC'. |
basis |
a name of the cubic spline basis to be used in the model. Supported basis include deBoor and Gu basis. |
interval |
an optional numerical vector of length 2. In this interval is the maximum likelihood estimate of the shape parameter of the model. By default is [0.1,2]. |
step |
an optional positive value. This parameter represents the length of the step of the partition of the interval parameter. By default is 0.2. |
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph.D. thesis. Sao Paulo University.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
set.seed(1) rows<- 150 t_beta <- c(0.5,2) t_sigma <- 0.5 t_lambda <- 1 x1 <- runif(rows,-3,3) x2 <- rbinom(rows,1,0.5) X <- cbind(x1,x2) t <- as.matrix((2*1:rows - 1)/(2*rows)) colnames(t) <- "t" f_t <- cos(4*pi*t) error <- rglg(rows,0,1,t_lambda) y <- X %*%t_beta + f_t + t_sigma*error colnames(y) <- "y" data <- data.frame(y,X,t) fit1 <- sglg(y ~ x1 + x2 - 1,npc=t,data=data,basis = "deBoor",alpha0=1) fit1$AIC # We can get (probably) better values of alpha with the function smoothp smoothp(y ~ x1 + x2 - 1,npc=t,data=data,basis = "deBoor") fit2 <- sglg(y ~ x1 + x2 - 1,npc=t,data=data,basis = "Gu",alpha0=0.5) fit2$BIC # Again using the smooth function smoothp(y ~ x1 + x2 - 1,npc=t,data=data,basis = "Gu",method='PBIC') ################################################# # An example with two non-parametric components # ################################################# set.seed(2) t_2 <- as.matrix(rnorm(rows,sd=0.5)) colnames(t_2) <- 't_2' f_t_2 <- exp(t_2) error <- rglg(rows,0,1,t_lambda) y_2 <- X %*%t_beta + f_t + f_t_2 + t_sigma*error colnames(y_2) <- 'y_2' data2 <- data.frame(y_2,X,t,t_2) npcs <- cbind(t,t_2) # Some intuition about the best alpha values smoothp(y ~ x1 + x2 - 1,npc=npcs,data=data, method='GCV')
set.seed(1) rows<- 150 t_beta <- c(0.5,2) t_sigma <- 0.5 t_lambda <- 1 x1 <- runif(rows,-3,3) x2 <- rbinom(rows,1,0.5) X <- cbind(x1,x2) t <- as.matrix((2*1:rows - 1)/(2*rows)) colnames(t) <- "t" f_t <- cos(4*pi*t) error <- rglg(rows,0,1,t_lambda) y <- X %*%t_beta + f_t + t_sigma*error colnames(y) <- "y" data <- data.frame(y,X,t) fit1 <- sglg(y ~ x1 + x2 - 1,npc=t,data=data,basis = "deBoor",alpha0=1) fit1$AIC # We can get (probably) better values of alpha with the function smoothp smoothp(y ~ x1 + x2 - 1,npc=t,data=data,basis = "deBoor") fit2 <- sglg(y ~ x1 + x2 - 1,npc=t,data=data,basis = "Gu",alpha0=0.5) fit2$BIC # Again using the smooth function smoothp(y ~ x1 + x2 - 1,npc=t,data=data,basis = "Gu",method='PBIC') ################################################# # An example with two non-parametric components # ################################################# set.seed(2) t_2 <- as.matrix(rnorm(rows,sd=0.5)) colnames(t_2) <- 't_2' f_t_2 <- exp(t_2) error <- rglg(rows,0,1,t_lambda) y_2 <- X %*%t_beta + f_t + f_t_2 + t_sigma*error colnames(y_2) <- 'y_2' data2 <- data.frame(y_2,X,t,t_2) npcs <- cbind(t,t_2) # Some intuition about the best alpha values smoothp(y ~ x1 + x2 - 1,npc=npcs,data=data, method='GCV')
ssurvglg
is used to fit a semi-parametric regression model in which the response variable is continuous, strictly positive, asymmetric and there are right censored observations.
In this setup, the location parameter of the logarithm of the variable is explicitly modeled by semi-parametric functions, whose nonparametric components may be approximated by
natural cubic splines or cubic P-splines.
ssurvglg(formula, npc, basis, data, shape, alpha0, Maxiter, Tolerance)
ssurvglg(formula, npc, basis, data, shape, alpha0, Maxiter, Tolerance)
formula |
a symbolic description of the systematic component of the model to be fitted. See details for further information. |
npc |
a data frame with potential nonparametric variables of the systematic part of the model to be fitted. |
basis |
a name of the cubic spline basis to be used in the model. Supported basis include deBoor and Gu basis which are a B-spline basis and a natural cubic spline basis, respectively. |
data |
an optional data frame, list containing the variables in the model. |
shape |
an optional value for the shape parameter of the model. |
alpha0 |
is a vector of initial values for the smoothing parameter alpha. |
Maxiter |
an optional positive integer giving the maximal number of iterations for the estimating process. Default value is 1e03. |
Tolerance |
an optional positive value, which represents the convergence criterion. Default value is 1e-04. |
mu a vector of parameter estimates asociated with the location parameter.
sigma estimate of the scale parameter associated with the model.
lambda estimate of the shape parameter associated with the model.
interval estimate of a 95% confidence interval for each estimate parameters associated with the model.
Deviance the deviance associated with the model.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos A. Cardozo, G. Paula and L. Vanegas. Semi-parametric accelerated failure time models with generalized log-gamma erros: Censored case. In preparation.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
require(survival) rows <- 150 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 0.5 t_lambda <- 1 set.seed(8142030) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(rows, 0, 1) X <- cbind(x1,x2) t_knot1 <- 6 ts1 <- seq(0, 1, length = t_knot1) t_g1 <- 0.4 * sin(pi * ts1) BasisN <- function(n, knot) { N <- matrix(0, n, knot) m <- n/knot block <- rep(1,m) for (i in 1:knot) { l <- (i - 1) * m + 1 r <- i * m N[l:r, i] <- block } return(N) } s_N1 <- BasisN(rows, length(ts1)) x3 <- s_N1 %*% ts1 colnames(x3) <- 'x3' sys <- X %*% t_beta + s_N1%*%t_g1 t_ini1 <- exp(sys) * rweibull(rows,1/t_sigma,1) cens.time <- rweibull(rows, 1.5, 14) delta <- ifelse(t_ini1 > cens.time, 1, 0) obst1 = t_ini1 for(i in 1:rows) { if (delta[i] == 1) { obst1[i] = cens.time[i] } } d_example <- data.frame(obst1, delta, X, x3) fit4 <- ssurvglg(Surv(log(obst1),delta)~ x1 + x2 - 1,npc=x3,data = d_example,shape=0.9) summary(fit4)
require(survival) rows <- 150 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 0.5 t_lambda <- 1 set.seed(8142030) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(rows, 0, 1) X <- cbind(x1,x2) t_knot1 <- 6 ts1 <- seq(0, 1, length = t_knot1) t_g1 <- 0.4 * sin(pi * ts1) BasisN <- function(n, knot) { N <- matrix(0, n, knot) m <- n/knot block <- rep(1,m) for (i in 1:knot) { l <- (i - 1) * m + 1 r <- i * m N[l:r, i] <- block } return(N) } s_N1 <- BasisN(rows, length(ts1)) x3 <- s_N1 %*% ts1 colnames(x3) <- 'x3' sys <- X %*% t_beta + s_N1%*%t_g1 t_ini1 <- exp(sys) * rweibull(rows,1/t_sigma,1) cens.time <- rweibull(rows, 1.5, 14) delta <- ifelse(t_ini1 > cens.time, 1, 0) obst1 = t_ini1 for(i in 1:rows) { if (delta[i] == 1) { obst1[i] = cens.time[i] } } d_example <- data.frame(obst1, delta, X, x3) fit4 <- ssurvglg(Surv(log(obst1),delta)~ x1 + x2 - 1,npc=x3,data = d_example,shape=0.9) summary(fit4)
summary.sglg extracts displays the summary of the fitted model including parameter estimates, associated (approximated) standard errors and goodness-of-fit statistics from a model from an object of class 'sglg'.
## S3 method for class 'sglg' summary(object, ...)
## S3 method for class 'sglg' summary(object, ...)
object |
an object of the class sglg. This object is returned from the call to glg(), sglg(), survglg() or ssurvglg() function. |
... |
other arguments. |
survglg
is used to fit a multiple linear regression model in which the response variable is continuous, strictly positive, asymmetric and there are right censored observations.
In this setup, the location parameter of the logarithm of the response variable is modeled by a linear model of the parameters.
survglg(formula, data, shape, Maxiter, Tolerance)
survglg(formula, data, shape, Maxiter, Tolerance)
formula |
a symbolic description of the systematic component of the model to be fitted. See details for further information. |
data |
an optional data frame, list containing the variables in the model. |
shape |
an optional value for the shape parameter of the model. |
Maxiter |
an optional positive integer giving the maximal number of iterations for the estimating process. Default value is 1e03. |
Tolerance |
an optional positive value, which represents the convergence criterion. Default value is 1e-04. |
mu a vector of parameter estimates asociated with the location parameter.
sigma estimate of the scale parameter associated with the model.
lambda estimate of the shape parameter associated with the model.
interval estimate of a 95% confidence interval for each estimate parameters associated with the model.
Deviance the deviance associated with the model.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos A. Cardozo, G. Paula and L. Vanegas. Semi-parametric accelerated failure time models with generalized log-gamma erros. In preparation.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
require(survival) rows <- 240 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) s <- t_sigma^2 a <- 1/s t_ini1 <- exp(X %*% t_beta) * rgamma(rows, scale = s, shape = a) cens.time <- rweibull(rows, 0.3, 14) delta1 <- ifelse(t_ini1 > cens.time, 1, 0) obst1 <- t_ini1 for (i in 1:rows) { if (delta1[i] == 1) { obst1[i] <- cens.time[i] } } data.example <- data.frame(obst1,delta1,X) fit3 <- survglg(Surv(log(obst1),delta1) ~ x1 + x2 - 1, data=data.example,shape=0.9) logLik(fit3) summary(fit3)
require(survival) rows <- 240 columns <- 2 t_beta <- c(0.5, 2) t_sigma <- 1 set.seed(8142031) x1 <- rbinom(rows, 1, 0.5) x2 <- runif(columns, 0, 1) X <- cbind(x1,x2) s <- t_sigma^2 a <- 1/s t_ini1 <- exp(X %*% t_beta) * rgamma(rows, scale = s, shape = a) cens.time <- rweibull(rows, 0.3, 14) delta1 <- ifelse(t_ini1 > cens.time, 1, 0) obst1 <- t_ini1 for (i in 1:rows) { if (delta1[i] == 1) { obst1[i] <- cens.time[i] } } data.example <- data.frame(obst1,delta1,X) fit3 <- survglg(Surv(log(obst1),delta1) ~ x1 + x2 - 1, data=data.example,shape=0.9) logLik(fit3) summary(fit3)
survival_gg
is used to obtain the value of survival, hazard and cumulative hazard functions of a generalized gamma distribution at a positive value.
survival_gg(x, mu, sigma, lambda)
survival_gg(x, mu, sigma, lambda)
x |
numeric, represent a vector of positive values. Default value is 1. |
mu |
numeric, represents the location parameter of a generalized gamma distribution. Default value is 0. |
sigma |
numeric, represents the scale parameter of a generalized gamma distribution. Default value is 1. |
lambda |
numeric, represents the shape parameter of a generalized gamma distribution. Default value is 1. |
A list of three vectors, survival, hazard, and cumulative hazard values of a generalized gamma distribution.
Carlos Alberto Cardozo Delgado <[email protected]>
Carlos Alberto Cardozo Delgado, Semi-parametric generalized log-gamma regression models. Ph.D. thesis. Sao Paulo University.
Jerald F. Lawless (2003). Statistical Models and Methods for Lifetime Data. Second Edition. John-Wiley & Sons
survival_gg(0.0001,0,1,1) # Extreme value type I distribution, maximum case. times <- seq(0.05,7,by=0.05) plot(times, survival_gg(times,0,1,1)$survival_value,type='l') plot(times, survival_gg(times,0,1,1)$hazard_value,type='l') plot(times, survival_gg(times,0,1,1)$cumulative_hazard_value,type='l')
survival_gg(0.0001,0,1,1) # Extreme value type I distribution, maximum case. times <- seq(0.05,7,by=0.05) plot(times, survival_gg(times,0,1,1)$survival_value,type='l') plot(times, survival_gg(times,0,1,1)$hazard_value,type='l') plot(times, survival_gg(times,0,1,1)$cumulative_hazard_value,type='l')