Title: | Quantile Regression for Random Variables on the Unit Interval |
---|---|
Description: | Employs a two-parameter family of distributions for modelling random variables on the (0, 1) interval by applying the cumulative distribution function (cdf) of one parent distribution to the quantile function of another. |
Authors: | Yiyun Shou [aut, cre], Michael Smithson [aut] |
Maintainer: | Yiyun Shou <[email protected]> |
License: | GPL-3 |
Version: | 1.3.1-2 |
Built: | 2024-12-24 06:43:25 UTC |
Source: | CRAN |
Package: | cdfquantreg |
Type: | Package |
Date: | 2022-05-19 |
License: | GPL-3 |
The cdfquantreg package includes 36 members of a two-parameter family of distributions for modelling random variables on the (0, 1) interval (see cdfqrFamily). This family has explicit pdfs, cdfs, and quantile functions. The two parameters consist of a location parameter and a dispersion parameter. The location parameter models the median and the dispersion parameter models the spread of other quantiles around the median (see Smithson and Shou, 2016, for details about the distribution family and the models). Separate submodels may be specified for the location and for the dispersion parameters, permitting different or overlapping sets of predictors in each.
The package offers maximum likelihood (see cdfquantreg) and bootstrap (see qrBoot) estimation methods. All model functions return S3 objects. In addition to the usual goodness-of-fit information, the package provides root-mean-squared errors on both the raw and logit scales, and the gradient. Model diagnostics include raw, Pearson, and deviance residuals (see residuals.cdfqr), and dfbetas (see influence.cdfqr).
For each distribution, the package provides evaluations of the pdf (dq), cdf (pq), and quantile (qq), as well as random samples from any of them (rq). Evaluations of skew and kurtosis (qrPwlm) also are available using probability-weighted L-moments.
Yiyun Shou and Michael Smithson
Maintainer: Yiyun Shou ([email protected])
Shou, Y. and Smithson, M. (2019). cdfquantreg: An R Package for CDF-Quantile Regression. Journal of Statistical Software, 88(1), pp. 1–30. doi: 10.18637/jss.v088.i01
Data from a study investigating judgment under ambiguity and conflict.
Ambdata
A data frame with 166 rows and 2 variables:
subject ID
Rating in each judgment scenario
Index for judgment scenarios
https://pubmed.ncbi.nlm.nih.gov/16594767/
Likelihood Ratio Tests for fitted cdfqr Objects.
## S3 method for class 'cdfqr'
anova(object, ..., test = "LRT")
object |
The fitted cdfqr model. |
... |
One or more cdfqr model objects for model comparison. |
test |
The model comparison test; currently only 'LRT' is implemented. |
data(cdfqrExampleData)
fit_null <- cdfquantreg(crc99 ~ 1 | 1, 't2', 't2', data = JurorData)
fit_mod1 <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData)
anova(fit_null, fit_mod1)
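As a rough illustration of the arithmetic behind a likelihood ratio test between two nested models, the sketch below uses base R only. The log-likelihood values and parameter counts are made-up placeholders rather than output from the models above, and the helper name `lrt_sketch` is hypothetical.

```r
# Likelihood ratio test between two nested models (base R only).
# All numeric inputs below are hypothetical placeholders.
lrt_sketch <- function(loglik_null, loglik_full, df_null, df_full) {
  stat <- 2 * (loglik_full - loglik_null)   # LRT statistic
  df   <- df_full - df_null                 # extra parameters in the larger model
  p    <- pchisq(stat, df = df, lower.tail = FALSE)
  c(statistic = stat, df = df, p.value = p)
}

res <- lrt_sketch(loglik_null = 20.0, loglik_full = 26.5,
                  df_null = 2, df_full = 4)
print(res)
```

A small p-value suggests that the additional predictors improve the fit beyond what the null model achieves.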
Data from a study investigating the relationship between stress and anxiety.
AnxStrData
A data frame with 166 rows and 2 variables:
Scores on Anxiety subscale
Scores on Stress subscale
https://pubmed.ncbi.nlm.nih.gov/16594767/
Likelihood functions for generating OpenBUGS model file.
bugsLikelihood(fd, sd)
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the sub-family distribution. |
A string to be written in the BUGS model file.
bugsLikelihood('t2','t2')
Generating OpenBUGS model file
bugsModel(formula, fd, sd, random = NULL, modelname = "bugmodel", wd = getwd())
formula |
A formula object, with the DV on the left of an ~ operator, and predictors on the right. For the part on the right of '~', the specification of submodels can be separated by '|'. So |
fd |
A string that specifies the parent distribution (see cdfqrFamily). |
sd |
A string that specifies the sub-family distribution. |
random |
Character or vector of characters that indicates the random effect factors. |
modelname |
The name of the model file; optional. |
wd |
The working directory in which OpenBUGS will work (i.e., generate the model files and chain information). |
A model ‘.txt’ file is generated in the specified working directory. The function also returns a list of values:
Default initial values for MCMC two chain procedure.
A list of variables that are included in the estimation.
a list of characters that specify the nodes to be monitored.
## Not run:
# Need write access in the working directory before executing the code.
# No random component
bugsModel(y ~ x1 | x2, 't2', 't2', random = NULL)
# Random component as subject ID
bugsModel(y ~ x1 | x2, 't2', 't2', random = 'ID')
## End(Not run)
Density function, distribution function, quantile function, and random generation of variates for a specified cdf-quantile distribution.
cdfft(q, sigma, theta, fd, sd, mu = NULL, inner = TRUE, version)
pdfft(y, sigma, theta, fd, sd, mu = NULL, inner = TRUE, version)
qqft(p, sigma, theta, fd, sd, mu = NULL, inner = TRUE, version)
rqft(n, sigma, theta, fd, sd, mu = NULL, inner = TRUE, version)
q |
vector of quantiles. |
sigma |
vector of standard deviations. |
theta |
vector of skewness. |
fd |
A string that specifies the parent distribution. At the moment, only "arcsinh", "cauchit" and "t2" can be used. See details. |
sd |
A string that specifies the child distribution. At the moment, only "arcsinh", "cauchy" and "t2" can be used. See details. |
mu |
vector of means if 3-parameter case is used. |
inner |
A logic value that indicates if the inner ( |
version |
A string that indicates which version will be used. "V" is the tilt parameter function, while "W" indicates the Jones-Pewsey transformation. |
y |
vector of quantiles. |
p |
vector of probabilities. |
n |
Number of random samples. |
pdfft gives the density, rqft generates random variates, qqft gives the quantile function, and cdfft gives the cumulative distribution function of the specified distribution.
Control Optimization Parameters for CDF-Quantile Probability Distributions.
cdfqr.control(method = "BFGS", maxit = 5000, trace = FALSE)
method |
Character string specifying the method argument passed to optim. |
maxit |
Integer specifying the maxit argument (maximal number of iterations) passed to optim. |
trace |
Logical or integer controlling whether tracing information on the progress of the optimization should be produced. |
A list with the arguments specified.
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData,
                   control = cdfqr.control(trace = TRUE))
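Since the three control arguments are documented as being passed to optim, the following base-R sketch shows how values of this kind reach optim on a toy problem. The list `ctrl` is only a hypothetical stand-in for what cdfqr.control() returns, and `toy_nll` is an illustrative objective; how the package forwards these values internally is not shown in this reference.

```r
# Hypothetical stand-in for the list returned by cdfqr.control().
ctrl <- list(method = "BFGS", maxit = 5000, trace = 0)

# Toy objective: negative log-likelihood of a normal sample with unknown
# mean and log standard deviation (illustration only).
set.seed(1)
y <- rnorm(200, mean = 2, sd = 0.5)
toy_nll <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

opt <- optim(par = c(1, 0), fn = toy_nll, method = ctrl$method,
             control = list(maxit = ctrl$maxit, trace = ctrl$trace))
opt$convergence   # 0 means optim reported successful convergence
opt$par           # estimated mean and log standard deviation
```

Raising `maxit` or switching `method` in the same way is the usual first step when a fit stops before converging.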
The cdfquantreg family consists of the currently available distributions that can be used to fit quantile regression models via the cdfquantreg() function.
cdfqrFamily(shape = "all")
shape |
To show all distributions or the set of distribution for a specific type of shape. Can be |
The cdfquantreg package includes a two-parameter family of distributions for
modeling random variables on the (0, 1) interval by applying the cumulative
distribution function (cdf) of one “parent” distribution to the
quantile function of another.
The naming of these distributions is “parent - child” or
“fd - sd”, where “fd” is the parent distribution, and “sd”
is the child distribution.
The distributions have four characteristic shapes: Logit-logistic, bimodal, trimodal, and finite-tailed.
Here is the list of currently available distributions.
Bimodal Shape Distributions
Distribution | R input | Alternative Input | Shape |
Burr VII-ArcSinh | fd = "burr7", sd = "arcsinh" | family = "burr7-arcsinh" | Bimodal |
Burr VII-Cauchy | fd = "burr7", sd = "cauchy" | family = "burr7-cauchy" | Bimodal |
Burr VII-T2 | fd = "burr7", sd = "t2" | family = "burr7-t2" | Bimodal |
Burr VIII-ArcSinh | fd = "burr8", sd = "arcsinh" | family = "burr8-arcsinh" | Bimodal |
Burr VIII-Cauchy | fd = "burr8", sd = "cauchy" | family = "burr8-cauchy" | Bimodal |
Burr VIII-T2 | fd = "burr8", sd = "t2" | family = "burr8-t2" | Bimodal |
Logit-ArcSinh | fd = "logit", sd = "arcsinh" | family = "logit-arcsinh" | Bimodal |
Logit-Cauchy | fd = "logit", sd = "cauchy" | family = "logit-cauchy" | Bimodal |
Logit-T2 | fd = "logit", sd = "t2" | family = "logit-t2" | Bimodal |
T2-ArcSinh | fd = "t2", sd = "arcsinh" | family = "t2-arcsinh" | Bimodal |
T2-Cauchy | fd = "t2", sd = "cauchy" | family = "t2-cauchy" | Bimodal |
Trimodal Shape Distributions
Distribution | R input | Alternative Input | Shape |
ArcSinh-Burr VII | fd = "arcsinh", sd = "burr7" | family = "arcsinh-burr7" | Trimodal |
ArcSinh-Burr VIII | fd = "arcsinh", sd = "burr8" | family = "arcsinh-burr8" | Trimodal |
ArcSinh-Logistic | fd = "arcsinh", sd = "logistic" | family = "arcsinh-logistic" | Trimodal |
ArcSinh-T2 | fd = "arcsinh", sd = "t2" | family = "arcsinh-t2" | Trimodal |
Cauchit-Burr VII | fd = "cauchit", sd = "burr7" | family = "cauchit-burr7" | Trimodal |
Cauchit-Burr VIII | fd = "cauchit", sd = "burr8" | family = "cauchit-burr8" | Trimodal |
Cauchit-Logistic | fd = "cauchit", sd = "logistic" | family = "cauchit-logistic" | Trimodal |
Cauchit-T2 | fd = "cauchit", sd = "t2" | family = "cauchit-t2" | Trimodal |
T2-Burr VII | fd = "t2", sd = "burr7" | family = "t2-burr7" | Trimodal |
T2-Burr VIII | fd = "t2", sd = "burr8" | family = "t2-burr8" | Trimodal |
T2-Logistic | fd = "t2", sd = "logistic" | family = "t2-logistic" | Trimodal |
Logit-logistic Shape Distributions
Distribution | R input | Alternative Input | Shape |
Burr VII-Burr VII | fd = "burr7", sd = "burr7" | family = "burr7-burr7" | Logit-logistic |
Burr VII-Burr VIII | fd = "burr7", sd = "burr8" | family = "burr7-burr8" | Logit-logistic |
Burr VII-Logistic | fd = "burr7", sd = "logistic" | family = "burr7-logistic" | Logit-logistic |
Burr VIII-Burr VII | fd = "burr8", sd = "burr7" | family = "burr8-burr7" | Logit-logistic |
Burr VIII-Burr VIII | fd = "burr8", sd = "burr8" | family = "burr8-burr8" | Logit-logistic |
Burr VIII-Logistic | fd = "burr8", sd = "logistic" | family = "burr8-logistic" | Bimodal |
Logit-Burr VII | fd = "logit", sd = "burr7" | family = "logit-burr7" | Logit-logistic |
Logit-Burr VIII | fd = "logit", sd = "burr8" | family = "logit-burr8" | Logit-logistic |
Logit-Logistic | fd = "logit", sd = "logistic" | family = "logit-logistic" | Logit-logistic |
Finite-tailed Shape Distributions
Distribution | R input | Alternative Input | Shape |
ArcSinh-ArcSinh | fd = "arcsinh", sd = "arcsinh" | family = "arcsinh-arcsinh" | Finite-tailed |
ArcSinh-Cauchy | fd = "arcsinh", sd = "cauchy" | family = "arcsinh-cauchy" | Finite-tailed |
Cauchit-ArcSinh | fd = "cauchit", sd = "arcsinh" | family = "cauchit-arcsinh" | Finite-tailed |
Cauchit-Cauchy | fd = "cauchit", sd = "cauchy" | family = "cauchit-cauchy" | Finite-tailed |
T2-T2 | fd = "t2", sd = "t2" | family = "t2-t2" | Finite-tailed |
Kumaraswamy Distribution
Distribution | R input | Alternative Input | Shape |
Kumaraswamy | fd = "", sd = "" | family = "-" | |
A list of the distributions that are available in the current version of the package.
cdfqrFamily()
cdfquantreg is the main function to fit a cdf quantile regression with a variety of distributions.
cdfquantreg(formula, fd = NULL, sd = NULL, data, family = NULL, start = NULL, control = cdfqr.control(...), ...)
formula |
A formula object, with the dependent variable (DV) on the left of an ~ operator, and predictors on the right. For the part on the right of '~', the specification of the location and dispersion submodels can be separated by '|'. So |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the child distribution. |
data |
The data in a data.frame format |
family |
If 'fd' and 'sd' are not provided, the name of a member of the family of distributions can be provided (See |
start |
The starting values for model fitting. If not provided, default values will be used. |
control |
Control optimization parameters (See |
... |
Currently ignored. |
The cdfquantreg function fits a quantile regression model with a distribution from the cdf-quantile family selected by the user (Smithson and Shou, 2015). The model is specified in a two-part formula, one part containing the predictors of the location parameter, and the second part containing the predictors of the dispersion parameter. The models are fitted in two stages: the first uses the Nelder-Mead algorithm, and the second takes the estimates from the first stage and applies the BFGS algorithm to refine them.
An object of class cdfquantreg will be returned. Generic functions such as summary, print (e.g., print.cdfqr) and coef can be used to extract output (see summary.cdfqr for more details about the generic functions that can be used).
Class of object is a list with the following output:
A named vector of coefficients.
Raw residuals, the difference between the fitted values and the data.
The fitted values, including full model fitted values, fitted values for the mean component, and fitted values for the dispersion component.
The model root mean squared errors
The root mean squared errors between the logit of the fitted values, and the logit of the response values.
The variance-covariance matrix of the coefficient estimates.
Akaike's Information Criterion and Bayesian Information Criterion.
The deviance for the model.
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, fd = 't2', sd = 't2', data = JurorData)
summary(fit)
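The two-stage fitting strategy described for cdfquantreg (Nelder-Mead first, then BFGS started from those estimates) can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it assumes a logit-logistic member with cdf plogis((qlogis(y) - mu) / sigma), a log link for the dispersion, and simulated data; the function name `nll` and all coefficient values are hypothetical.

```r
# Two-stage fit (Nelder-Mead, then BFGS) of a logit-logistic model in base R.
# Assumed form: cdf G(y) = plogis((qlogis(y) - mu) / sigma), with
# mu = b0 + b1 * x and sigma = exp(d0). Simulated data only.
set.seed(42)
n <- 500
x <- rnorm(n)
mu_true <- 0.5 + 1.0 * x
sigma_true <- exp(-0.3)
y <- plogis(mu_true + sigma_true * rlogis(n))   # draw via the quantile function

nll <- function(par) {
  mu    <- par[1] + par[2] * x
  sigma <- exp(par[3])
  z     <- (qlogis(y) - mu) / sigma
  # log density = log f(z) - log(sigma) - log(y) - log(1 - y)
  -sum(dlogis(z, log = TRUE) - log(sigma) - log(y) - log(1 - y))
}

stage1 <- optim(c(0, 0, 0), nll, method = "Nelder-Mead")          # rough search
stage2 <- optim(stage1$par, nll, method = "BFGS", hessian = TRUE) # refinement

est <- setNames(stage2$par, c("b0", "b1", "d0"))
se  <- sqrt(diag(solve(stage2$hessian)))   # approximate standard errors
round(cbind(estimate = est, se = se), 3)
```

Starting the gradient-based stage from the derivative-free estimates is a common way to reduce sensitivity to poor starting values.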
cdfquantregC is a function to fit a censored cdf quantile regression with a variety of distributions.
cdfquantregC(formula, fd = NULL, sd = NULL, data, family = NULL, censor = "DB", c1 = NULL, c2 = NULL, start = NULL, control = cdfqr.control(...), ...)
formula |
A formula object, with the dependent variable (DV) on the left of an ~ operator, and predictors on the right. For the part on the right of '~', the specification of the location and dispersion submodels can be separated by '|'. So |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the child distribution. |
data |
The data in a data.frame format |
family |
If 'fd' and 'sd' are not provided, the name of a member of the family of distributions can be provided (See |
censor |
A string variable to indicate how many censored point is used- only left censored |
c1 |
The left-censoring value; if NULL, the minimum value in the data will be used. |
c2 |
The right-censoring value; if NULL, the maximum value in the data will be used. |
start |
The starting values for model fitting. If not provided, default values will be used. |
control |
Control optimization parameters (See |
... |
Currently ignored. |
The cdfquantreg function fits a quantile regression model with a distribution from the cdf-quantile family selected by the user (Smithson and Shou, 2015). The model is specified in a two-part formula, one part containing the predictors of the location parameter, and the second part containing the predictors of the dispersion parameter. The models are fitted in two stages: the first uses the Nelder-Mead algorithm, and the second takes the estimates from the first stage and applies the BFGS algorithm to refine them.
An object of class cdfquantreg will be returned. Generic functions such as summary, print (e.g., print.cdfqr) and coef can be used to extract output (see summary.cdfqr for more details about the generic functions that can be used).
Class of object is a list with the following output:
A named vector of coefficients.
Raw residuals, the difference between the fitted values and the data.
The fitted values, including full model fitted values, fitted values for the mean component, and fitted values for the dispersion component.
The model root mean squared errors
The root mean squared errors between the logit of the fitted values, and the logit of the response values.
The variance-covariance matrix of the coefficient estimates.
Akaike's Information Criterion and Bayesian Information Criterion.
The deviance for the model.
data(cdfqrExampleData)
fit <- cdfquantregC(crc99 ~ vert | confl, c1 = 0.001, c2 = 0.999,
                    fd = 't2', sd = 't2', data = JurorData)
summary(fit)
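To give a feel for how censoring at c1 and c2 can enter a likelihood, here is a base-R sketch of a standard doubly censored log-likelihood. It is an assumption-laden illustration rather than the package's code: it assumes a logit-logistic response, treats values at or below c1 (at or above c2) as censored so that they contribute cdf mass instead of density, and uses the hypothetical helper `cens_nll` on simulated data.

```r
# Doubly censored log-likelihood sketch for a logit-logistic response.
# Assumption: observations at or below c1 (at or above c2) are censored and
# contribute cdf mass rather than density.
cens_nll <- function(par, y, c1, c2) {
  mu <- par[1]
  sigma <- exp(par[2])
  G  <- function(v) plogis((qlogis(v) - mu) / sigma)              # cdf
  lg <- function(v) dlogis((qlogis(v) - mu) / sigma, log = TRUE) -
    log(sigma) - log(v) - log(1 - v)                               # log pdf
  left  <- y <= c1
  right <- y >= c2
  mid   <- !left & !right
  -(sum(left) * log(G(c1)) +
      sum(right) * log(1 - G(c2)) +
      sum(lg(y[mid])))
}

set.seed(7)
latent <- plogis(0.3 + 0.8 * rlogis(400))
y <- pmin(pmax(latent, 0.1), 0.9)      # censor at c1 = 0.1 and c2 = 0.9
fit <- optim(c(0, 0), cens_nll, y = y, c1 = 0.1, c2 = 0.9, method = "BFGS")
c(mu = fit$par[1], sigma = exp(fit$par[2]))
```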
cdfquantregFT is a function to fit a cdf quantile regression with a variety of finite-tailed distributions. It can account for data that have boundary values.
cdfquantregFT(formula, fd = NULL, sd = NULL, mu.fo = NULL, inner = FALSE, version = "V", data, family = NULL, start = NULL, ssn = 20, control = cdfqr.control(...), ...)
formula |
A formula object, with the dependent variable (DV) on the left of an ~ operator, and predictors on the right. For the part on the right of '~', the specification of the dispersion (sigma; first) and skewness (theta; second) submodels can be separated by '|'. So |
fd |
A string that specifies the parent distribution. At the moment, only "arcsinh", "cauchit" and "t2" can be used. See details. |
sd |
A string that specifies the child distribution. At the moment, only "arcsinh", "cauchy" and "t2" can be used. See details. |
mu.fo |
A formula object to indicate the predictors for the location submodel if the 3-parameter distribution is used, only input as |
inner |
A logic value that indicates if the inner ( |
version |
A string that indicates which version will be used. "V" is the tilt transformation, while "W" indicates the Jones-Pewsey transformation. |
data |
The data in a data.frame format |
family |
If 'fd' and 'sd' are not provided, the name of a member of the family of distributions can be provided (see below for details of family functions). |
start |
The starting values for model fitting. If not provided, default values will be used. |
ssn |
The number of searches for optimal starting values to be performed. If the model does not converge, increase this number. |
control |
Control optimization parameters (See |
... |
Currently ignored. |
The cdfquantregFT function fits a quantile regression model with a distribution from the finite-tailed cdf-quantile family. Here is the list of currently available distributions.
Bimodal Shape Distributions
Distribution | R input | Alternative Input | Available Version |
ArcSinh-ArcSinh | fd = "arcsinh", sd = "arcsinh" | family = "arcsinh-arcsinh" | "V", "W" |
ArcSinh-Cauchy | fd = "arcsinh", sd = "cauchy" | family = "arcsinh-cauchy" | "V", "W" |
Cauchit-ArcSinh | fd = "cauchit", sd = "arcsinh" | family = "cauchit-arcsinh" | "V", "W" |
Cauchit-Cauchy | fd = "cauchit", sd = "cauchy" | family = "cauchit-cauchy" | "V", "W" |
T2-T2 | fd = "t2", sd = "t2" | family = "t2-t2" | "V", "W" |
An object of class cdfqrFT will be returned. Generic functions such as summary, print and coef can be used to extract output (see summary.cdfqr for more details about the generic functions that can be used).
Class of object is a list with the following output:
A named vector of coefficients.
Raw residuals, the difference between the fitted values and the data.
The fitted values, including full model fitted values, fitted values for the mean component, and fitted values for the dispersion component.
The model root mean squared errors
The root mean squared errors between the logit of the fitted values, and the logit of the response values.
The variance-covariance matrix of the coefficient estimates.
Akaike's Information Criterion and Bayesian Information Criterion.
The deviance for the model.
data(cdfqrExampleData)
fit <- cdfquantregFT(pnurse ~ Ambulance | Ambulance, fd = "arcsinh", sd = "arcsinh",
                     inner = FALSE, version = "V", data = yoon)
summary(fit)
cdfquantregH is a function to fit a zero/one-inflated CDF-quantile regression with a variety of distributions.
cdfquantregH(formula, zero.fo = ~1, one.fo = ~1, fd = NULL, sd = NULL, data, family = NULL, type = "ZI", start = NULL, control = cdfqr.control(...), ...)
formula |
A formula object, with the dependent variable (DV) on the left of an ~ operator, and predictors on the right. For the part on the right of '~', the specification of the location and dispersion submodels can be separated by '|'. So |
zero.fo |
A formula object to indicate the predictors for the zero component, only input as |
one.fo |
A formula object to indicate the predictors for the one component, only input as |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the child distribution. |
data |
The data in a data.frame format |
family |
If 'fd' and 'sd' are not provided, the name of a member of the family of distributions can be provided (See |
type |
A string variable to indicate whether the model is zero-inflated |
start |
The starting values for model fitting. If not provided, default values will be used. |
control |
Control optimization parameters (See |
... |
Currently ignored. |
The cdfquantreg function fits a quantile regression model with a distribution from the cdf-quantile family selected by the user (Smithson and Shou, 2015). The model is specified in a two-part formula, one part containing the predictors of the location parameter, and the second part containing the predictors of the dispersion parameter. The models are fitted in two stages: the first uses the Nelder-Mead algorithm, and the second takes the estimates from the first stage and applies the BFGS algorithm to refine them.
An object of class cdfqrH will be returned. Generic functions such as summary, print (e.g., print.cdfqr) and coef can be used to extract output (see summary.cdfqr for more details about the generic functions that can be used).
Class of object is a list with the following output:
A named vector of coefficients.
Raw residuals, the difference between the fitted values and the data.
The fitted values, including full model fitted values, fitted values for the mean component, and fitted values for the dispersion component.
The variance-covariance matrix of the coefficient estimates.
Akaike's Information Criterion and Bayesian Information Criterion.
data(cdfqrExampleData)

# For one-inflated model
ipcc_high <- subset(IPCC, mid == 1 & high == 1 & prob != 0)
fit <- cdfquantregH(prob ~ valence | valence, one.fo = ~valence,
                    fd = 't2', sd = 't2', type = "OI", data = ipcc_high)
summary(fit)

# For zero-inflated model
ipcc_low <- subset(IPCC, mid == 0 & high == 0 & prob != 1)
fit <- cdfquantregH(prob ~ valence | valence, zero.fo = ~valence,
                    fd = 't2', sd = 't2', type = "ZI", data = ipcc_low)

# For zero- and one-inflated model
ipcc_mid <- subset(IPCC, mid == 1 & high == 0)
fit <- cdfquantregH(prob ~ valence | valence, zero.fo = ~valence, one.fo = ~valence,
                    fd = 't2', sd = 't2', type = "ZO", data = ipcc_mid)
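The general idea of a zero-inflated model can be sketched in base R as a point mass at 0 mixed with a continuous density on (0, 1). The code below is only a sketch under assumptions that this reference does not confirm: a logit link for the probability of a zero, a logit-logistic continuous part, intercept-only submodels, and the hypothetical helper `zi_nll` applied to simulated data.

```r
# Zero-inflated likelihood sketch: a point mass at 0 mixed with a
# logit-logistic density on (0, 1). Intercept-only submodels for brevity.
zi_nll <- function(par, y) {
  p0    <- plogis(par[1])            # probability of an exact zero (assumed logit link)
  mu    <- par[2]
  sigma <- exp(par[3])
  is0   <- y == 0
  yc    <- y[!is0]                   # continuous part of the response
  z     <- (qlogis(yc) - mu) / sigma
  ll_cont <- dlogis(z, log = TRUE) - log(sigma) - log(yc) - log(1 - yc)
  -(sum(is0) * log(p0) + sum(log1p(-p0) + ll_cont))
}

set.seed(11)
n <- 600
is_zero <- rbinom(n, 1, 0.2) == 1
y <- ifelse(is_zero, 0, plogis(-0.4 + 0.7 * rlogis(n)))
fit <- optim(c(0, 0, 0), zi_nll, y = y, method = "BFGS")
c(p0 = plogis(fit$par[1]), mu = fit$par[2], sigma = exp(fit$par[3]))
```

Because this likelihood factorizes, the estimated zero probability should agree closely with the observed proportion of zeros.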
Density function, distribution function, quantile function, and random generation of variates for a specified cdf-quantile distribution.
dq(x, mu, sigma, fd, sd)
rq(n, mu, sigma, fd, sd)
qq(p, mu, sigma, fd, sd)
pq(q, mu, sigma, fd, sd)
x |
vector of quantiles. |
mu |
vector of means. |
sigma |
vector of standard deviations. |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the sub-family distribution. |
n |
Number of random samples. |
p |
vector of probabilities. |
q |
vector of quantiles. |
dq gives the density, rq generates random variates, qq gives the quantile function, and pq gives the cumulative distribution function of the specified distribution.
x <- rq(5, mu = 0.5, sigma = 1, 't2', 't2'); x
dq(x, mu = 0.5, sigma = 1, 't2', 't2')
qtil <- pq(x, mu = 0.5, sigma = 1, 't2', 't2'); qtil
qq(qtil, mu = 0.5, sigma = 1, 't2', 't2')
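To make the relationship between the four functions concrete, the sketch below writes out a density, cdf, quantile function, and random generator for one family member in base R. It assumes the logit-logistic case with cdf plogis((qlogis(y) - mu) / sigma); the `*_ll` names are hypothetical and are not functions of the package, whose internal parameterisation may differ.

```r
# Base-R sketch of density, cdf, quantile and random generation for the
# logit-logistic member, assuming cdf G(y) = plogis((qlogis(y) - mu) / sigma).
# The *_ll names are hypothetical and are not functions of the package.
p_ll <- function(q, mu, sigma) plogis((qlogis(q) - mu) / sigma)
q_ll <- function(p, mu, sigma) plogis(mu + sigma * qlogis(p))
d_ll <- function(x, mu, sigma) {
  dlogis((qlogis(x) - mu) / sigma) / (sigma * x * (1 - x))
}
r_ll <- function(n, mu, sigma) q_ll(runif(n), mu, sigma)   # inverse-cdf sampling

set.seed(3)
x <- r_ll(5, mu = 0.5, sigma = 1)
p <- p_ll(x, mu = 0.5, sigma = 1)
round_trip <- all.equal(q_ll(p, mu = 0.5, sigma = 1), x)   # quantile inverts cdf
total_mass <- integrate(d_ll, 0, 1, mu = 0.5, sigma = 1)$value
med <- q_ll(0.5, mu = 0.5, sigma = 1)                      # equals plogis(mu)
```

Under this assumed form, the median depends only on mu, which is consistent with the description of the location parameter as modelling the median.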
Probability of Human Extinction Study
ExtEvent
A data frame with 1170 rows and 11 variables:
Subject ID
Gender of subjects: '0' is male, '1' is female
The nation the participants come from
effect coding for nation
effect coding for nation
political orientation of subjects
The format of probability elicitation
The order of the probability judgment task.
Social conservatism question on attitude toward gun ownership.
Probability estimates for general threats.
Probability estimates for the greatest threat.
https://www.michaelsmithson.online/
Influence Diagnosis (dfbetas) For Fitted Cdfqr Object
## S3 method for class 'cdfqr'
influence(model, method = "dfbeta", type = c("full", "location", "dispersion", "skew", "zero", "one"), what = "full", plot = FALSE, id = FALSE, ...)

## S3 method for class 'cdfqr'
dfbeta(model, type = c("full", "location", "dispersion", "skew", "zero", "one"), what = "full", ...)

## S3 method for class 'cdfqr'
dfbetas(model, type = c("full", "location", "dispersion", "skew", "zero", "one"), what = "full", ...)
model |
A cdfqr model object |
method |
Currently only 'dfbeta' method is available. |
type |
A string that indicates whether the results for all parameters are to be returned, or only the submodel's parameters returned. |
what |
For influence statistics based on coefficient values, indicates the predictor variables that need to be tested. |
plot |
Whether a plot is needed. |
id |
for plot only, if TRUE, the case ids will be displayed in the plot. |
... |
Pass onto other functions or currently ignored |
A matrix, each row of which contains the estimated influence on parameters when that row's observation is removed from the sample.
lm.influence, influence.measures
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData)
# This takes some time, especially if the data set is large.
infl <- influence(fit)
plot(infl[, 2])
## Not run:
# Same as influence(fit)
dfbetval <- dfbetas(fit)
## End(Not run)
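The returned matrix is described as holding, for each row, the estimated influence on the parameters when that observation is removed. The leave-one-out idea can be sketched with an ordinary lm() fit in base R; the data are simulated and the example only illustrates the refit-and-compare logic, not how the package computes its statistics.

```r
# Leave-one-out influence on coefficients, illustrated with lm() in base R.
# Row i holds coef(full fit) - coef(fit without case i).
set.seed(5)
dat <- data.frame(x = rnorm(30))
dat$y <- 1 + 2 * dat$x + rnorm(30, sd = 0.5)
dat$y[30] <- dat$y[30] + 5                     # plant one unusual case

full <- lm(y ~ x, data = dat)
loo <- t(sapply(seq_len(nrow(dat)), function(i) {
  coef(full) - coef(lm(y ~ x, data = dat[-i, ]))
}))
head(loo)
plot(abs(loo[, "(Intercept)"]), type = "h",
     xlab = "case", ylab = "|change in intercept|")
```

Refitting once per observation is why this kind of diagnostic becomes slow for large data sets.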
The IPCC data-set comprises the lower, best, and upper estimates for the phrases "likely" and "unlikely" in six IPCC report sentences.
IPCC
A data frame with 4014 rows and 8 variables:
Subject ID number
Experimental conditions
Valence of the sentences
raw probability estimates
Linear transformed prob into (0, 1) interval
Distinguish lower, best and upper estimates
Distinguish lower, best and upper estimates
IPCC question number
https://pubmed.ncbi.nlm.nih.gov/19207697/
The IPCC-wide data-set comprises the best estimates for the phrases "likely" and "unlikely" in six IPCC report sentences.
IPCC_Wide
A data frame with 4014 rows and 8 variables:
Each column indicates the estimates for one sentence.
Each column indicates the estimates for one sentence.
Each column indicates the estimates for one sentence.
Each column indicates the estimates for one sentence.
Each column indicates the estimates for one sentence.
Each column indicates the estimates for one sentence.
https://pubmed.ncbi.nlm.nih.gov/19207697/
The IPCC-AUS data-set comprises the best estimates for the phrases in IPCC report sentences.
IPCCAUS
A data frame with 4014 rows and 8 variables:
Subject ID
Gender of subjects: '0' is male, '1' is female
age of subjects
personal probability.
nominated probability.
https://pubmed.ncbi.nlm.nih.gov/19207697/
Juror Judgment Study.
JurorData
A data frame with 104 rows and 3 variables:
crc99: The confidence ratings, rescaled into the (0, 1) interval to avoid 0 and 1 values.
vert: The dummy variable coding the verdict type conditions.
confl: The dummy variable coding the conflict conditions.
doi:10.1375/pplt.2004.11.1.154
Plot Fitted Values/Residuals of A cdfqr Object or Distribution
## S3 method for class 'cdfqr'
plot(x, mu = NULL, sigma = NULL, theta = NULL, fd = NULL, sd = NULL, n = 10000, inner = TRUE, version = "V", type = c("fitted"), ...)
x |
If the plot is based on the fitted values, provide a fitted cdfqr object; alternatively, mu, sigma, and the distribution can be specified. |
mu |
Location parameter value |
sigma |
Sigma parameter value |
theta |
Skew parameter value |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the sub-family distribution. |
n |
The number of random variates to be generated for user specified plot. |
inner |
If finite-tailed distribution is used: a logic value that indicates if the inner ( |
version |
If a finite-tailed distribution is used: a string that indicates which version will be used. "V" is the tilt parameter function, while "W" indicates the Jones-Pewsey transformation. |
type |
Currently only fitted values are available for generating plots. |
... |
other plot parameters pass onto |
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData)
plot(fit)
Methods for obtaining the fitted/predicted values for a fitted cdfqr object.
## S3 method for class 'cdfqr'
predict(object, newdata = NULL,
  type = c("full", "mu", "sigma", "theta", "one", "zero"), quant = 0.5, ...)
## S3 method for class 'cdfqr'
fitted(object,
  type = c("full", "mu", "sigma", "theta", "one", "zero"), plot = FALSE, ...)
object |
A cdfqr model fit object |
newdata |
Optional. A data frame in which to look for variables with which to predict. If not provided, the fitted values are returned |
type |
A character that indicates whether the full model prediction/fitted values are needed, or values for the 'mu' and 'sigma' submodel only. |
quant |
A number or a numeric vector (must be in (0, 1)) to specify the quantile(s) of the predicted value (when 'newdata' is provided, and predicted values for responses are required). The default is to use median to predict response values. |
... |
currently ignored |
plot |
Logical; whether a plot of the fitted values is needed. |
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData)
plot(predict(fit))
qrBoot provides a simple bootstrapping method for estimating the parameters of a cdf quantile regression model.
qrBoot(object, rn, f = coef, R = 500, ci = 0.95)
object |
The fitted cdfqr model object |
rn |
The sample size of bootstrap samples |
f |
A function that takes a cdfqr object as its single argument; it is applied to each updated (refitted) cdfqr object to compute the statistics of interest. The default is coef. |
R |
Number of bootstrap samples. |
ci |
The confidence level for the bootstrap confidence intervals. |
A matrix that includes the original statistics, bootstrap means, and bootstrap confidence intervals
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData)
qrBoot(fit, rn = 50, R = 50)
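The general idea behind a bootstrap summary of this kind can be sketched in base R. This is only an illustration and not the package's implementation: boot_summary, stat_fn, and the toy data are hypothetical names, an ordinary lm fit stands in for a cdfqr model, and the use of percentile intervals is an assumption.

```r
# Hypothetical helper (not the package's qrBoot): resample rows with
# replacement, recompute a statistic, and summarise the bootstrap draws.
boot_summary <- function(data, stat_fn, R = 500, rn = nrow(data), ci = 0.95) {
  original <- stat_fn(data)
  draws <- replicate(R, {
    idx <- sample.int(nrow(data), size = rn, replace = TRUE)
    stat_fn(data[idx, , drop = FALSE])
  })
  # One row per statistic, one column per bootstrap sample
  draws <- matrix(draws, nrow = length(original))
  alpha <- (1 - ci) / 2
  cbind(
    original  = original,
    boot_mean = rowMeans(draws),
    lower     = apply(draws, 1, quantile, probs = alpha),      # percentile CI (assumed)
    upper     = apply(draws, 1, quantile, probs = 1 - alpha)
  )
}

set.seed(42)
toy <- data.frame(x = rnorm(60))              # illustrative data only
toy$y <- 1 + 2 * toy$x + rnorm(60)
res <- boot_summary(toy, function(d) coef(lm(y ~ x, data = d)), R = 200)
print(res)
```

The result has one row per statistic and columns for the original estimate, the bootstrap mean, and the interval limits, which mirrors the layout described for the returned matrix.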
Give the Gradient Function for CDF-Quantile Distribution models
qrGrad(fd, sd)
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the sub-family distribution. |
grad The gradient function of parameter estimates, given a specified cdf-quantile distribution
qrGrad('t2','t2')
Function to give the (negative) log likelihood for fitting cdfquantile distributions.
qrLogLik(y, mu, sigma, fd, sd, total = TRUE)
y |
the vector to be evaluated. |
mu |
mean of the distribution. |
sigma |
sigma of the distribution. |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the sub-family distribution. |
total |
whether the sum of the log-likelihood is calculated |
The negative log likelihood for fitting the data with a cdfquantile distribution.
y <- rbeta(20, 0.5, 0.5)
qrLogLik(y, mu = 0.5, sigma = 1, 't2', 't2')
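As a rough illustration of what a negative log-likelihood with a total switch computes, the sketch below uses a Beta density as a stand-in for a cdf-quantile density. The helper neg_loglik and its shape1 and shape2 arguments are hypothetical, and returning per-observation negative log-densities when total = FALSE is an assumption rather than documented behaviour of qrLogLik.

```r
# Hypothetical helper (not the package's qrLogLik). A Beta density stands in
# for a cdf-quantile density purely for illustration.
neg_loglik <- function(y, shape1, shape2, total = TRUE) {
  ll <- dbeta(y, shape1, shape2, log = TRUE)  # log-density of each observation
  if (total) -sum(ll) else -ll                # summed or per-observation (assumed)
}

set.seed(123)
y <- rbeta(20, 0.5, 0.5)
nll_total <- neg_loglik(y, 0.5, 0.5)
nll_each  <- neg_loglik(y, 0.5, 0.5, total = FALSE)
print(nll_total)
```

A function of this shape can be handed to a general-purpose optimiser, which minimises the summed value over the parameters.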
Calculate the skew and kurtosis statistics based on probability weighted moments, via simulation method.
qrPwlm(x, n = NULL, mu = NULL, sigma = NULL, fd = NULL, sd = NULL)
x |
The vector of values for the calculation of Skewness and Kurtosis. |
n |
The number of samples drawn in the simulation. The higher this value, the greater the accuracy. |
mu |
vector of means. |
sigma |
vector of standard deviations. |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the sub-family distribution. |
This function computes the L-moment measures of skew and kurtosis, which may be computed via linear combinations of probability-weighted moments (Greenwood, Landwehr, Matalas and Wallis, 1979).
The tau3 (skew) and tau4 (kurtosis) values of the L-moments.
Greenwood, J. A., Landwehr, J. M., Matalas, N. C., & Wallis, J. R. (1979). Probability weighted moments: definition and relation to parameters of several distributions expressable in inverse form. Water Resources Research, 15(5), 1049-1054.
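The linear combinations mentioned above can be sketched in base R for a single sample. This is a minimal illustration, not the package's qrPwlm: the helper l_moment_ratios is a hypothetical name, and it uses the standard unbiased sample estimates of the probability-weighted moments b0 to b3 instead of the simulation approach described for the package function.

```r
# Hypothetical helper (not the package's qrPwlm): sample L-skewness (tau3) and
# L-kurtosis (tau4) from unbiased probability-weighted moment estimates.
l_moment_ratios <- function(x) {
  x <- sort(x)
  n <- length(x)
  stopifnot(n >= 4)
  i <- seq_len(n)
  # Probability-weighted moments b0..b3 of the ordered sample
  b0 <- mean(x)
  b1 <- sum((i - 1) / (n - 1) * x) / n
  b2 <- sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
  b3 <- sum((i - 1) * (i - 2) * (i - 3) /
              ((n - 1) * (n - 2) * (n - 3)) * x) / n
  # L-moments as linear combinations of the PWMs
  l2 <- 2 * b1 - b0
  l3 <- 6 * b2 - 6 * b1 + b0
  l4 <- 20 * b3 - 30 * b2 + 12 * b1 - b0
  c(tau3 = l3 / l2, tau4 = l4 / l2)
}

set.seed(1)
print(l_moment_ratios(rbeta(1000, 2, 5)))  # illustrative right-skewed sample
```

For an evenly spaced, symmetric sample such as 1:10, both ratios are zero, which gives a quick sanity check of the formulas.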
qrPwlm(n = 1000, mu = 0.5, sigma = 1, fd = 't2', sd = 't2')
qrStart is the function for generating starting values for a cdf-quantile GLM null model.
qrStart(ydata, fd = NULL, sd = NULL, skew = FALSE)
ydata |
The variable to be modeled |
fd |
A string that specifies the parent distribution. |
sd |
A string that specifies the sub-family distribution. |
skew |
If true, the starting values will be generated for the finite-tailed distribution case. |
The start values for the location parameter in a null model are the median of the empirical distribution, and a starting value for the dispersion parameter based on a specific quantile of the empirical distribution, specified according to the theoretical distribution on which the model is based. The start values for all new predictor coefficients in both the location and dispersion submodels are assigned the value 0.1.
A vector that consists of the initial values for mu and sigma.
x <- rbeta(100, 1, 2)
qrStart(x, fd = 't2', sd = 't2')
#[1] -0.5938286 1.3996999
Register method for cdfqr object functions.
## S3 method for class 'cdfqr'
residuals(object, type = c("raw", "pearson", "deviance"), ...)
object |
The cdfqr model object |
type |
The type of residuals to be extracted: "raw", "pearson", or "deviance". |
... |
currently ignored |
residuals of a specified type.
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData)
residuals(fit, "pearson")
scaleTR is a function that rescales values of a variable into the (0, 1) interval.
scaleTR(y, high = NULL, low = NULL, data = NULL, N = NULL, scale = 0.5)
y |
A numeric vector, or a variable in a dataframe. |
high |
The highest possible value of that variable. The value should be equal to or greater than the maximum value of y. If not supplied, the maximum value of y will be used. |
low |
The lowest possible value of that variable. The value should be equal to or smaller than the minimum value of y. If not supplied, the minimum value of y will be used. |
data |
A dataframe that contains the variable y. |
N |
An integer, normally the sample size or the number of values. If not supplied, the length of y will be used. |
scale |
A compressing parameter that determines the extent to which the boundary values are pushed away from the boundary. See details. |
scaleTR uses the method suggested by Smithson and Verkuilen (2006) and applies a linear transformation that maps values into the open interval (0, 1). It first transforms the values from their original scale by taking $y' = (y - a)/(b - a)$, where a is the lowest possible value of that variable and b is the highest possible value of that variable. Next, it compresses the range to avoid zeros and ones by taking $y'' = [y'(N - 1) + c]/N$, where N is the sample size and c is the compressing parameter. The smaller the value of c, the closer the boundary values come to zero and one, and the greater their impact on the estimation of the dispersion parameters in the cdf quantile model.
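The two steps can be sketched in base R as follows. This is an illustration only and not the package's scaleTR: the helper rescale_open_unit is a hypothetical name, and the formulas (y - a)/(b - a) and [y'(N - 1) + c]/N are taken from Smithson and Verkuilen (2006) on the assumption that the function follows them as described.

```r
# Hypothetical helper (not the package's scaleTR): rescale y into the open
# interval (0, 1) in two steps, assuming the Smithson & Verkuilen (2006) formulas.
rescale_open_unit <- function(y, low = min(y), high = max(y),
                              N = length(y), scale = 0.5) {
  stopifnot(is.numeric(y), high > low, all(y >= low), all(y <= high))
  y1 <- (y - low) / (high - low)   # step 1: map [low, high] onto [0, 1]
  (y1 * (N - 1) + scale) / N       # step 2: pull exact 0s and 1s inside (0, 1)
}

ratings <- c(1, 2, 4, 7)           # illustrative ratings on a 1-7 scale
scaled <- rescale_open_unit(ratings, low = 1, high = 7)
print(scaled)
```

With N = 4 and scale = 0.5, the boundary ratings 1 and 7 map to 0.125 and 0.875, so no value sits exactly on 0 or 1.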
y <- rnorm(20, 0, 1)
ynew <- scaleTR(y)
Give the S3 Methods for CDF-Quantile Distribution Models
## S3 method for class 'cdfqr'
summary(object, ...)
## S3 method for class 'cdfqr'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'cdfqr'
coef(object, type = "full", ...)
## S3 method for class 'cdfqr'
vcov(object, type = "full", ...)
## S3 method for class 'cdfqr'
update(object, formula., zero.fo., one.fo., mu.fo., ..., evaluate = TRUE)
## S3 method for class 'cdfqr'
confint(object, parm, level = 0.95, submodel = "full", ...)
## S3 method for class 'cdfqr'
formula(x, ...)
## S3 method for class 'cdfqr'
nobs(object, ...)
## S3 method for class 'cdfqr'
deviance(object, ...)
## S3 method for class 'cdfqrH'
logLik(object, ...)
## S3 method for class 'cdfqrH'
confint(object, parm, level = 0.95,
  type = c("full", "mean", "sigma", "zero", "one"), ...)
## S3 method for class 'cdfqrFT'
confint(object, parm, level = 0.95, submodel = "full", ...)
... |
Pass onto other functions or currently ignored |
x, object |
The fitted cdfqr model. |
digits |
Number of digits to be retained in printed output. |
type, submodel |
The parts of the coefficients or variance-covariance matrix to be extracted. Can be "full", "mean", or "sigma". |
formula. |
Changes to the formula. See |
zero.fo., one.fo., mu.fo. |
Changes to the formulas for the zero/one components of hurdle models, and for the location submodel of finite-tailed models. |
evaluate |
If true, evaluate the updated model; otherwise, return the call for the new model. |
parm |
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
data(cdfqrExampleData)
fit <- cdfquantreg(crc99 ~ vert | confl, 't2', 't2', data = JurorData)
summary(fit)
print(fit)
logLik(fit)
coef(fit)
deviance(fit)
vcov(fit)
confint(fit)
# Update the model
fit2 <- update(fit, crc99 ~ vert*confl | confl)
summary(fit2)
Data from Modeling Proportion of Patient Time in Emergency Ward Stages
yoon
A data frame with 1170 rows and 11 variables:
case identification
day of the week (0 = Sunday)
0 = walk-in; 1 = ambulance-arrival
triage level
1 = triage level 1
1 = triage level 2
1 = triage level 3
1 = triage level 4
1 = triage level 5
1 = laboratory test(s) conducted
1 = x-ray conducted
1 = other intervention
length of stay in minutes
length of stay in hours
proportion of time in registration stage
proportion of time in triage stage
proportion of time in nursing care stage
proportion of time in consultation with physician(s)
proportion of time in decisional stage
preg + ptriage
pphysician + pdecis
pnurse/(pnurse + pregptriage)
pphysdecis /(pphysdecis + pregptriage)