Title: | Regression Models with Break-Points / Change-Points Estimation (with Possibly Random Effects) |
---|---|
Description: | Fitting regression models where, in addition to possible linear terms, one or more covariates have segmented (i.e., broken-line or piece-wise linear) or stepmented (i.e. piece-wise constant) effects. Multiple breakpoints for the same variable are allowed. The estimation method is discussed in Muggeo (2003, <doi:10.1002/sim.1545>) and illustrated in Muggeo (2008, <https://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf>). An approach for hypothesis testing is presented in Muggeo (2016, <doi:10.1080/00949655.2016.1149855>), and interval estimation for the breakpoint is discussed in Muggeo (2017, <doi:10.1111/anzs.12200>). Segmented mixed models, i.e. random effects in the change point, are discussed in Muggeo (2014, <doi:10.1177/1471082X13504721>). Estimation of piecewise-constant relationships and changepoints (mean-shift models) is discussed in Fasola et al. (2018, <doi:10.1007/s00180-017-0740-4>). |
Authors: | Vito M. R. Muggeo [aut, cre] |
Maintainer: | Vito M. R. Muggeo <[email protected]> |
License: | GPL |
Version: | 2.1-3 |
Built: | 2024-12-25 06:29:18 UTC |
Source: | CRAN |
Estimation and inference of regression models with piecewise linear relationships, also known as segmented regression models, with a number of break-points fixed or to be ‘selected’. Random effects changepoints are also allowed since version 1.6-0, and since version 2.0-0 it is also possible to fit regression models with piecewise constant (or ‘stepmented’) relationships.
Package: | segmented |
Type: | Package |
Version: | 2.1-3 |
Date: | 2024-10-25 |
License: | GPL |
Package segmented
aims to estimate linear and generalized linear models (and virtually any regression model) having one or more segmented or stepmented relationships in the linear predictor. Estimates of the slopes and
breakpoints are provided along with standard errors. The package includes testing/estimating
functions and methods to print, summarize and plot the results.
The algorithms used by segmented
are not grid-search. They are iterative procedures (Muggeo, 2003; Fasola et al., 2018) that need starting values only for the breakpoint parameters and therefore they are quite efficient even
with several breakpoints to be estimated. Moreover since version 0.2-9.0, segmented
implements
the bootstrap restarting (Wood, 2001) to make the algorithms less sensitive to the starting values (which can be also omitted by the user) .
Since version 0.5-0.0 a default method segmented.default
has been added. It may be employed to include segmented relationships in general regression models where specific methods do not exist. Examples include quantile, Cox, and lme regressions where the random effects do not refer to the breakpoints; see segmented.lme
to include random changepoints. segmented.default
includes some examples.
Since version 1.0-0 the estimating algorithm has been slight modified and it appears to be much stabler (in examples with noisy segmented relationhips and flat log likelihoods) then previous versions.
Hypothesis testing (about the existence of the breakpoint) and confidence intervals are performed via appropriate methods and functions.
A tentative approach to deal with unknown number of breakpoints is also provided, see option fix.npsi
in seg.control
. Also, as version 1.3-0, the selgmented
function has been introduced to select the number of breakpoints via the BIC or sequential hypothesis testing.
Since version 1.6-0, estimation of segmented mixed models has been introduced, see segmented.lme
and related function.
Since version 2.0-0, it is possible to fit segmented relationships with constraints on the slopes, see segreg
.
Finally, since 2.0-0, it is possible to fit (G)LM wherein one or more covariates have a stepmented (i.e. a step-function like) relationship, see stepmented
.
Vito M.R. Muggeo <[email protected]>
Muggeo V.M.R., Atkins D.C., Gallop R.J., Dimidjian S. (2014) Segmented mixed models with random changepoints: a maximum likelihood approach with application to treatment for depression study. Statistical Modelling, 14, 293-313.
Muggeo, V.M.R. (2017) Interval estimation for the breakpoint in segmented regression: a smoothed score-based approach. Australian & New Zealand Journal of Statistics, 59, 311–322.
Fasola S, Muggeo V.M.R., Kuchenhoff, H. (2018) A heuristic, iterative algorithm for change-point detection in abrupt change models, Computational Statistics, 2, 997–1015.
Muggeo, V.M.R. (2016) Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling. J of Statistical Computation and Simulation 86, 3059–3067.
Davies, R.B. (1987) Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 74, 33–43.
Seber, G.A.F. and Wild, C.J. (1989) Nonlinear Regression. Wiley, New York.
Bacon D.W., Watts D.G. (1971) Estimating the transistion between two intersecting straight lines. Biometrika 58: 525 – 534.
Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. Statistics in Medicine 22, 3055–3071.
Muggeo, V.M.R. (2008) Segmented: an R package to fit regression models with broken-line relationships. R News 8/1, 20–25.
Muggeo, V.M.R., Adelfio, G. (2011) Efficient change point detection in genomic sequences of continuous measurements. Bioinformatics 27, 161–166.
Wood, S. N. (2001) Minimizing model fitting objectives that contain spurious local minima by bootstrap restarting. Biometrics 57, 240–244.
Muggeo, V.M.R. (2010) Comment on ‘Estimating average annual per cent change in trend analysis’ by Clegg et al., Statistics in Medicine; 28, 3670-3682. Statistics in Medicine, 29, 1958–1960.
Computes the average annual per cent change to summarize piecewise linear relationships in segmented regression models.
aapc(ogg, parm, exp.it = FALSE, conf.level = 0.95, wrong.se = TRUE, .vcov=NULL, .coef=NULL, ...)
aapc(ogg, parm, exp.it = FALSE, conf.level = 0.95, wrong.se = TRUE, .vcov=NULL, .coef=NULL, ...)
ogg |
the fitted model returned by |
parm |
the single segmented variable of interest. It can be missing if the model includes a single segmented covariate. If missing and |
exp.it |
logical. If |
conf.level |
the confidence level desidered. |
wrong.se |
logical, if |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
... |
further arguments to be passed on to |
To summarize the fitted piecewise linear relationship, Clegg et al. (2009) proposed the 'average annual per cent change' (AAPC)
computed as the sum of the slopes () weighted by corresponding covariate sub-interval width (
), namely
. Since the weights are the breakpoint differences, the standard error of the AAPC should account
for uncertainty in the breakpoint estimate, as discussed in Muggeo (2010) and implemented by
aapc()
.
aapc
returns a numeric vector including point estimate, standard error and confidence interval for the AAPC relevant to variable specified in parm
.
exp.it=TRUE
would be appropriate only if the response variable is the log of (any) counts.
Vito M. R. Muggeo, [email protected]
Clegg LX, Hankey BF, Tiwari R, Feuer EJ, Edwards BK (2009) Estimating average annual per cent change in trend analysis. Statistics in Medicine, 28; 3670-3682.
Muggeo, V.M.R. (2010) Comment on ‘Estimating average annual per cent change in trend analysis’ by Clegg et al., Statistics in Medicine; 28, 3670-3682. Statistics in Medicine, 29, 1958–1960.
set.seed(12) x<-1:20 y<-2-.5*x+.7*pmax(x-9,0)-.8*pmax(x-15,0)+rnorm(20)*.3 o<-lm(y~x) os<-segmented(o, psi=c(5,12)) aapc(os)
set.seed(12) x<-1:20 y<-2-.5*x+.7*pmax(x-9,0)-.8*pmax(x-15,0)+rnorm(20)*.3 o<-lm(y~x) os<-segmented(o, psi=c(5,12)) aapc(os)
Given a segmented model (typically returned by a segmented
method), broken.line
computes the fitted values (and relevant standard errors) for the specified ‘segmented’ relationship.
broken.line(ogg, term = NULL, link = TRUE, interc=TRUE, se.fit=TRUE, isV=FALSE, .vcov=NULL, .coef=NULL, ...)
broken.line(ogg, term = NULL, link = TRUE, interc=TRUE, se.fit=TRUE, isV=FALSE, .vcov=NULL, .coef=NULL, ...)
ogg |
A fitted object of class segmented (returned by any |
term |
Three options. i) A named list (whose name should be one of the segmented covariates in the model |
link |
Should the predictions be computed on the scale of the link function if |
interc |
Should the model intercept be added? (provided it exists). |
se.fit |
If |
isV |
A couple of logicals indicating if the segmented terms |
.vcov |
Optional. The full covariance matrix of estimates. If |
.coef |
The regression parameter estimates. If unspecified (i.e. |
... |
Additional arguments to be passed on to |
If term=NULL
or term
is a valid segmented covariate name,
predictions for that segmented variable are the relevant fitted values from the model. If term
is a (correctly named) list with numerical values, predictions corresponding to such specified values
are computed. If link=FALSE
and ogg
inherits from the class "glm", predictions and possible standard
errors are returned on the response scale. The standard errors come from the Delta method.
Argument link
is ignored whether ogg
does not inherit from the class "glm".
A list having one component if (if se.fit=FALSE
), and two components (if se.fit=TRUE
) list representing predictions and standard errors for the segmented covariate values.
This function was written when there was not predict.segmented
(which is more general).
Vito M. R. Muggeo
segmented
, predict.segmented
, plot.segmented
, vcov.segmented
set.seed(1234) z<-runif(100) y<-rpois(100,exp(2+1.8*pmax(z-.6,0))) o<-glm(y~z,family=poisson) o.seg<-segmented(o,seg.Z=~z) ## Not run: plot(z,y) ## Not run: points(z,broken.line(o.seg,link=FALSE)$fit,col=2) #ok, but use plot.segmented()!
set.seed(1234) z<-runif(100) y<-rpois(100,exp(2+1.8*pmax(z-.6,0))) o<-glm(y~z,family=poisson) o.seg<-segmented(o,seg.Z=~z) ## Not run: plot(z,y) ## Not run: points(z,broken.line(o.seg,link=FALSE)$fit,col=2) #ok, but use plot.segmented()!
Computes confidence intervals for the breakpoints in a fitted ‘segmented’ model.
## S3 method for class 'segmented' confint(object, parm, level=0.95, method=c("delta", "score", "gradient"), rev.sgn=FALSE, var.diff=FALSE, is=FALSE, digits=max(4, getOption("digits") - 1), .coef=NULL, .vcov=NULL, ...)
## S3 method for class 'segmented' confint(object, parm, level=0.95, method=c("delta", "score", "gradient"), rev.sgn=FALSE, var.diff=FALSE, is=FALSE, digits=max(4, getOption("digits") - 1), .coef=NULL, .vcov=NULL, ...)
object |
a fitted |
parm |
the segmented variable of interest. If missing the first segmented variable in |
level |
the confidence level required, default to 0.95. |
method |
which confidence interval should be computed. One of |
rev.sgn |
vector of logicals. The length should be equal to the length of |
var.diff |
logical. If |
is |
logical. If |
digits |
controls the number of digits to print when returning the output. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
... |
additional parameters referring to Score-based confidence intervals, such as |
confint.segmented
computes confidence limits for the breakpoints. Currently there are three options, see argument method
.
method="delta"
uses the standard error coming from the Delta
method for the ratio of two random variables. This value is an approximation (slightly) better than the
one reported in the ‘psi’ component of the list returned by any segmented
method. The resulting
confidence intervals are based on the asymptotic Normal distribution of the breakpoint
estimator which is reliable just for clear-cut kink relationships. See Details in segmented
. method="score"
or method="gradient"
compute the
confidence interval via profiling the Score or the Gradient statistics smoothed out by the induced smoothing paradigm, as discussed in the reference below.
A matrix including point estimate and confidence limits of the breakpoint(s) for the
segmented variable possibly specified in parm
.
Currently method="score"
or method="gradient"
only works for segmented linear model. For segmented generalized linear model, currently only method="delta"
is available.
Vito M.R. Muggeo
Muggeo, V.M.R. (2017) Interval estimation for the breakpoint in segmented regression: a smoothed score-based approach. Australian & New Zealand Journal of Statistics 59, 311–322.
segmented
and lines.segmented
to plot the estimated breakpoints with corresponding
confidence intervals.
set.seed(10) x<-1:100 z<-runif(100) y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+10*pmax(z-.5,0)+rnorm(100,0,2) out.lm<-lm(y~x) o<-segmented(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.4)) confint(o) #delta CI for the 1st variable confint(o, "x", method="score") #also method="g"
set.seed(10) x<-1:100 z<-runif(100) y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+10*pmax(z-.5,0)+rnorm(100,0,2) out.lm<-lm(y~x) o<-segmented(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.4)) confint(o) #delta CI for the 1st variable confint(o, "x", method="score") #also method="g"
Computes confidence intervals for all regression parameters, including the the breakpoint, in a fitted ‘segmented mixed’ model.
## S3 method for class 'segmented.lme' confint(object, parm, level = 0.95, obj.boot, ...)
## S3 method for class 'segmented.lme' confint(object, parm, level = 0.95, obj.boot, ...)
object |
A fit object returned by |
parm |
A vector of numbers indicating which parameters should be considered. If missing all parameters. |
level |
The confidence level. |
obj.boot |
The possible list including the bootstrap distributions of the regression coefficients. Such list is returned by |
... |
if |
If obj.boot
is provided or ...
includes the argument B>0
, confidence intervals are computed by exploiting the bootstrap
distributions.
A matrix (or a list of matrices if bootstrap ci are requested) including the confidence intervals for the model parameters.
All the functions for segmented mixed models (*.segmented.lme) are still at an experimental stage
Vito Muggeo
## Not run: confint(os) #asymptotic CI confint(os, B=50) #boot CIs #it is possible to obtain the boot distribution beforehand ob <-vcov(os, B=50, ret.b=TRUE) confint(os, obj.boot=ob) #boot CI ## End(Not run)
## Not run: confint(os) #asymptotic CI confint(os, B=50) #boot CIs #it is possible to obtain the boot distribution beforehand ob <-vcov(os, B=50, ret.b=TRUE) confint(os, obj.boot=ob) #boot CI ## End(Not run)
Computes confidence intervals for the changepoints (or jumpoints) in a fitted ‘stepmented’ model.
## S3 method for class 'stepmented' confint(object, parm, level=0.95, method=c("delta", "score", "gradient"), round=TRUE, cheb=FALSE, digits=max(4, getOption("digits") - 1), .coef=NULL, .vcov=NULL, ...)
## S3 method for class 'stepmented' confint(object, parm, level=0.95, method=c("delta", "score", "gradient"), round=TRUE, cheb=FALSE, digits=max(4, getOption("digits") - 1), .coef=NULL, .vcov=NULL, ...)
object |
a fitted |
parm |
the stepmented variable of interest. If missing the first stepmented variable in |
level |
the confidence level required, default to 0.95. |
method |
which confidence interval should be computed. One of |
round |
logical. Should the values (estimates and lower/upper limits) rounded to the smallest observed value? |
cheb |
logical. If |
digits |
controls the number of digits to print when returning the output. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
... |
additional arguments passed to |
confint.stepmented
computes confidence limits for the changepoints. Currently the only option is 'delta'
, i.e. to compute the approximate covariance matrix via a smoothing approximation (see vcov.stepmented
) and to build the limits using the standard Normal quantiles. Note that, the limits are rounded to the lowest observed value, thus the resulting confidence interval might not be symmetric if the stepmented covariate has not equispaced values.
A matrix including point estimate and confidence limits of the breakpoint(s) for the
stepmented variable possibly specified in parm
.
Currently only method='delta' is allowed.
Vito M.R. Muggeo
stepmented
and lines.segmented
to plot the estimated breakpoints with corresponding
confidence intervals.
set.seed(10) x<-1:100 z<-runif(100) y<-2+2.5*(x>45)-1.5*(x>70)+z+rnorm(100) o<-stepmented(y, npsi=2) confint(o) #round=TRUE is default confint(o, round=FALSE)
set.seed(10) x<-1:100 z<-runif(100) y<-2+2.5*(x>45)-1.5*(x>70)+z+rnorm(100) o<-stepmented(y, npsi=2) confint(o) #round=TRUE is default confint(o, round=FALSE)
Given a generalized linear model, the Davies' test can be employed to test for a non-constant regression parameter in the linear predictor.
davies.test(obj, seg.Z, k = 10, alternative = c("two.sided", "less", "greater"), type=c("lrt","wald"), values=NULL, dispersion=NULL)
davies.test(obj, seg.Z, k = 10, alternative = c("two.sided", "less", "greater"), type=c("lrt","wald"), values=NULL, dispersion=NULL)
obj |
a fitted model typically returned by |
seg.Z |
a formula with no response variable, such as |
k |
number of points where the test should be evaluated. See Details. |
alternative |
a character string specifying the alternative hypothesis (relevant to the slope difference parameter). |
type |
the test statistic to be used (only for GLM, default to lrt).
Ignored if |
values |
optional. The evaluation points where the Davies approximation is computed. See Details for default values. |
dispersion |
the dispersion parameter for the family to be used to compute the test statistic.
When |
davies.test
tests for a non-zero difference-in-slope parameter of a segmented
relationship. Namely, the null hypothesis is , where
is the difference-in-slopes,
i.e. the coefficient of the segmented function
. The hypothesis of interest
means no breakpoint.
Roughtly speaking, the procedure computes
k
‘naive’ (i.e. assuming
fixed and known the breakpoint) test statistics for the difference-in-slope,
seeks the ‘best’ value and corresponding naive p-value (according to the alternative hypothesis), and then corrects
the selected (minimum) p-value by means of the k
values of the test statistic.
If obj
is a LM, the Davies (2002) test is implemented. This approach works even for small samples.
If obj
represents a GLM fit, relevant methods are described in Davies (1987), and the Wald or the Likelihood ratio
test statistics can be used, see argument type
. This is an asymptotic test.
The k
evaluation points are k
equally spaced values between the second and the second-last
values of the variable reported in seg.Z
. k
should not be small; I find no important difference for k
larger than 10, so default
is k=10
.
A list with class 'htest
' containing the following components:
method |
title (character) |
data.name |
the regression model and the segmented variable being tested |
statistic |
the point within the range of the covariate in |
parameter |
number of evaluation points |
p.value |
the adjusted p-value |
process |
a two-column matrix including the evaluation points and corresponding values of the test statistic |
The Davies test is not aimed at obtaining the estimate of the breakpoint.
The Davies test is based on k
evaluation points, thus the value returned in the statistic
component
(and printed as "'best' at") is the best among the k
points, and typically it will differ from the maximum likelihood estimate
returned by segmented
. Use segmented
if you are interested in the point estimate.
To test for a breakpoint in linear models with small samples, it is suggested to use davies.test()
with
objects of class "lm". If obj
is a "glm"
object with gaussian family, davies.test()
will use
an approximate test resulting in smaller p-values when the sample is small.
However if the sample size is large (n>300), the exact Davies (2002) upper bound cannot be computed (as it relies on
gamma()
function) and the approximate upper bound of Davies (1987) is returned.
Strictly speaking, the Davies test is not confined to the segmented regression; the procedure can be applied when a nuisance parameter vanishes under the null hypothesis. The test is slightly conservative, as the computed p-value is actually an upper bound.
Results should change slightly with respect to previous versions where the evaluation points were computed
as k
equally spaced values between the second and the second last observed values of the segmented
variable.
Vito M.R. Muggeo
Davies, R.B. (1987) Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 74, 33–43.
Davies, R.B. (2002) Hypothesis testing when a nuisance parameter is present only under the alternative: linear model case. Biometrika 89, 484–489.
See also pscore.test
which is more powerful, especially when the signal-to-noise ratio is low.
## Not run: set.seed(20) z<-runif(100) x<-rnorm(100,2) y<-2+10*pmax(z-.5,0)+rnorm(100,0,3) o<-lm(y~z+x) davies.test(o,~z) davies.test(o,~x) o<-glm(y~z+x) davies.test(o,~z) #it works but the p-value is too small.. ## End(Not run)
## Not run: set.seed(20) z<-runif(100) x<-rnorm(100,2) y<-2+10*pmax(z-.5,0)+rnorm(100,0,3) o<-lm(y~z+x) davies.test(o,~z) davies.test(o,~x) o<-glm(y~z+x) davies.test(o,~z) #it works but the p-value is too small.. ## End(Not run)
The down
data frame has 30 rows and 3 columns.
Variable cases
means the number of babies with Down syndrome out of total number of births
births
for mothers with mean age age
.
data(down)
data(down)
A data frame with 30 observations on the following 3 variables.
age
the mothers' mean age.
births
count of total births.
cases
count of babies with Down syndrome.
Davison, A.C. and Hinkley, D. V. (1997) Bootstrap Methods and their Application. Cambridge University Press.
Geyer, C. J. (1991) Constrained maximum likelihood exemplified by isotonic convex logistic regression. Journal of the American Statistical Association 86, 717–724.
data(down)
data(down)
Displays breakpoint iteration values for segmented fits.
draw.history(obj, term, ...)
draw.history(obj, term, ...)
obj |
a segmented fit returned by any "segmented" method. |
term |
a character to mean the ‘segmented’ variable whose breakpoint values throughout iterations have to be displayed. |
... |
graphic parameters to be passed to |
For a given term
in a segmented fit, draw.history()
produces two plots. On the left panel it displays the different breakpoint
values obtained during the estimating process, since the starting values up to the final ones, while on the right panel the objective values at different iterations. When
bootstrap restarting is employed, draw.history()
produces two plots, the values of objective function
and the number of distinct solutions against the bootstrap replicates.
None.
Vito M.R. Muggeo
data(stagnant) os<-segmented(lm(y~x,data=stagnant),seg.Z=~x,psi=-.8) # draw.history(os) #diagnostics with boot restarting os<-segmented(lm(y~x,data=stagnant),seg.Z=~x,psi=-.8, control=seg.control(n.boot=0)) # draw.history(os) #diagnostics without boot restarting
data(stagnant) os<-segmented(lm(y~x,data=stagnant),seg.Z=~x,psi=-.8) # draw.history(os) #diagnostics with boot restarting os<-segmented(lm(y~x,data=stagnant),seg.Z=~x,psi=-.8, control=seg.control(n.boot=0)) # draw.history(os) #diagnostics without boot restarting
Computes fitted values at different levels of nesting for segmented mixed objects
## S3 method for class 'segmented.lme' fitted(object, level = 1, sort=TRUE, ...)
## S3 method for class 'segmented.lme' fitted(object, level = 1, sort=TRUE, ...)
object |
Object of class |
level |
the level to be considered. Currently only levels 0 or 1 are allowed. |
sort |
If |
... |
Ignored |
Currently it works only if level=1
A numeric object including the fitted values at the specified level of nesting.
All the functions for segmented mixed models (*.segmented.lme) are still at an experimental stage
Vito Muggeo
The globTempAnom
data frame includes the global surface temperature anomalies from 1850 to 2023.
data(globTempAnom)
data(globTempAnom)
The included variables are (clearly).
Year
the calendar year.
Anomaly
the temperature anomalies computed as differences of the annual (average) measurement with respect to the 20th century average (1901-2000).
Data refer to averages measurements referring to land and ocean surface of Northern and Southern hemisphere.
https://www.ncei.noaa.gov/access/monitoring/global-temperature-anomalies/anomalies
There are several references using such dataset, e.g.
Cahill, N., Rahmstorf, S., and Parnell, A. C. (2015). Change points of global temperature. Environmental Research Letters, 10: 1-6.
data(globTempAnom)
data(globTempAnom)
Computes the intercepts of each ‘segmented’ relationship in the fitted model.
intercept(ogg, parm, rev.sgn = FALSE, var.diff=FALSE, .vcov=NULL, .coef=NULL, digits = max(4, getOption("digits") - 2),...)
intercept(ogg, parm, rev.sgn = FALSE, var.diff=FALSE, .vcov=NULL, .coef=NULL, digits = max(4, getOption("digits") - 2),...)
ogg |
an object of class "segmented", returned by any |
parm |
the segmented variable whose intercepts have to be computed. If missing all the segmented variables in the model are considered. |
rev.sgn |
vector of logicals. The length should be equal to the length of |
var.diff |
Currently ignored as only point estimates are computed. |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
digits |
controls number of digits in the returned output. |
... |
Further arguments to be passed on to |
A broken-line relationship means that a regression equation exists in the intervals
' to
', '
to
', and so on.
intercept
computes point estimates of the intercepts of the different regression equations
for each segmented relationship in the fitted model.
intercept
returns a list of one-column matrices. Each matrix represents a segmented relationship.
Vito M. R. Muggeo, [email protected]
See also slope
to compute the slopes of the different regression equations
for each segmented relationship in the fitted model.
## see ?slope ## Not run: intercept(out.seg) ## End(Not run)
## see ?slope ## Not run: intercept(out.seg) ## End(Not run)
Draws bars relevant to breakpoint estimates (point estimate and confidence limits) on the current device
## S3 method for class 'segmented' lines(x, term, bottom = TRUE, shift=FALSE, conf.level = 0.95, k = 50, pch = 18, rev.sgn = FALSE, .vcov=NULL, .coef=NULL, ...)
## S3 method for class 'segmented' lines(x, term, bottom = TRUE, shift=FALSE, conf.level = 0.95, k = 50, pch = 18, rev.sgn = FALSE, .vcov=NULL, .coef=NULL, ...)
x |
an object of class |
term |
the segmented variable of the breakpoints being drawn. It may be unspecified when there is a single segmented variable. |
bottom |
logical, indicating if the bars should be plotted at the bottom ( |
shift |
logical, indicating if the bars should be ‘shifted’ on the y-axis before plotting. Useful for multiple breakpoints with overlapped confidence intervals. |
conf.level |
the confidence level of the confidence intervals for the breakpoints. |
k |
a positive integer regulating the vertical position of the drawn bars. See Details. |
pch |
either an integer specifying a symbol or a single character to be used
in plotting the point estimates of the breakpoints. See |
rev.sgn |
should the signs of the breakpoint estimates be changed before plotting? see Details. |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
... |
further arguments passed to |
lines.segmented
simply draws on the current device the point estimates and relevant
confidence limits of the estimated breakpoints from a "segmented" object. The y coordinate
where the bars are drawn is computed as usr[3]+h
if bottom=TRUE
or
usr[4]-h
when bottom=FALSE
, where h=(usr[4]-usr[3])/abs(k)
and
usr
are the extremes of the user coordinates of the plotting region.
Therefore for larger values of k
the bars are plotted on the edges.
The argument rev.sgn
allows to change the sign of the breakpoints before plotting. This may
be useful when a null-right-slope constraint is set.
plot.segmented
to plot the fitted segmented lines, and
points.segmented
to add the fitted joinpoints.
## See ?plot.segmented
## See ?plot.segmented
Draws bars relevant to breakpoint estimates (point estimate and confidence limits) on the current device
## S3 method for class 'stepmented' lines(x, term, bottom = TRUE, shift=FALSE, conf.level = 0.95, k = 50, pch = 18, .vcov=NULL, .coef=NULL, ...)
## S3 method for class 'stepmented' lines(x, term, bottom = TRUE, shift=FALSE, conf.level = 0.95, k = 50, pch = 18, .vcov=NULL, .coef=NULL, ...)
x |
an object of class |
term |
the stepmented variable of the breakpoints being drawn. It may be unspecified when there is a single stepmented variable. |
bottom |
logical, indicating if the bars should be plotted at the bottom ( |
shift |
logical, indicating if the bars should be ‘shifted’ on the y-axis before plotting. Useful for multiple breakpoints with overlapped confidence intervals. |
conf.level |
the confidence level of the confidence intervals for the breakpoints. |
k |
a positive integer regulating the vertical position of the drawn bars. See Details. |
pch |
either an integer specifying a symbol or a single character to be used
in plotting the point estimates of the breakpoints. See |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
... |
further arguments passed to |
lines.stepmented
simply draws on the current device the point estimates and relevant
confidence limits of the estimated breakpoints from a "stepmented" object. The y coordinates
where the bars are drawn is computed as usr[3]+h
if bottom=TRUE
or
usr[4]-h
when bottom=FALSE
, where h=(usr[4]-usr[3])/abs(k)
and
usr
are the extremes of the user coordinates of the plotting region.
Therefore for larger values of k
the bars are plotted on the edges.
plot.stepmented
to plot the fitted stepmented lines
## See ?plot.stepmented
## See ?plot.stepmented
This function builds the model matrix for segmented
fits.
## S3 method for class 'segmented' model.matrix(object, ...)
## S3 method for class 'segmented' model.matrix(object, ...)
object |
A segmented fit |
... |
additional arguments |
model.matrix.segmented
The design matrix for a segmented regression model with the specified formula and data
Vito Muggeo
See Also as model.matrix
This function builds the model matrix for stepmented
fits.
## S3 method for class 'stepmented' model.matrix(object, type=c("cdf","abs","none"), k=NULL, ...)
## S3 method for class 'stepmented' model.matrix(object, type=c("cdf","abs","none"), k=NULL, ...)
object |
A stepmented fit |
k |
The (negative) exponent of the sample size to approximate the absolute value; see |
type |
The approximation for the indicator function/absolute value. If |
... |
additional arguments |
If type="none"
, model.matrix.stepmented
return the design matrix including the indicator function values and ignoring the psi terms.
The design matrix for a stepmented regression model with the specified formula and data
Vito Muggeo
See Also as model.matrix
, vcov.stepmented
The plant
data frame has 103 rows and 3 columns.
data(plant)
data(plant)
A data frame with 103 observations on the following 3 variables:
y
measurements of the plant organ.
time
times where measurements took place.
group
three attributes of the plant organ, RKV
, RKW
, RWC
.
Three attributes of a plant organ measured over time where biological reasoning indicates likelihood of multiple breakpoints. The data are scaled to the maximum value for each attribute and all attributes are measured at each time.
The data have been kindly provided by Dr Zongjian Yang at School of Land, Crop and Food Sciences, The University of Queensland, Brisbane, Australia.
## Not run: data(plant) lattice::xyplot(y~time,groups=group,auto.key=list(space="right"), data=plant) o<-segreg(y~ 0+group+seg(time, by=group, npsi=2), data=plant) summary(o) par(mfrow=c(1,2)) plot(y~time, data=plant) plot(o, term=1:3, add=TRUE, leg=NA, psi.lines=TRUE) #add the lines to the current plot plot(o, term=1:3, col=3:5, res.col=3:5, res=TRUE, leg="bottomright") ## End(Not run)
## Not run: data(plant) lattice::xyplot(y~time,groups=group,auto.key=list(space="right"), data=plant) o<-segreg(y~ 0+group+seg(time, by=group, npsi=2), data=plant) summary(o) par(mfrow=c(1,2)) plot(y~time, data=plant) plot(o, term=1:3, add=TRUE, leg=NA, psi.lines=TRUE) #add the lines to the current plot plot(o, term=1:3, col=3:5, res.col=3:5, res=TRUE, leg="bottomright") ## End(Not run)
Takes a fitted segmented
object returned by segmented()
and plots (or adds)
the fitted broken-line relationship for the selected segmented term.
## S3 method for class 'segmented' plot(x, term, add=FALSE, res=FALSE, conf.level=0, interc=TRUE, link=TRUE, res.col=grey(.15, alpha = .4), rev.sgn=FALSE, const=NULL, shade=FALSE, rug=!add, dens.rug=FALSE, dens.col = grey(0.8), transf=I, isV=FALSE, is=FALSE, var.diff=FALSE, p.df="p", .vcov=NULL, .coef=NULL, prev.trend=FALSE, smoos=NULL, hide.zeros=FALSE, leg="topleft", psi.lines=FALSE, ...)
## S3 method for class 'segmented' plot(x, term, add=FALSE, res=FALSE, conf.level=0, interc=TRUE, link=TRUE, res.col=grey(.15, alpha = .4), rev.sgn=FALSE, const=NULL, shade=FALSE, rug=!add, dens.rug=FALSE, dens.col = grey(0.8), transf=I, isV=FALSE, is=FALSE, var.diff=FALSE, p.df="p", .vcov=NULL, .coef=NULL, prev.trend=FALSE, smoos=NULL, hide.zeros=FALSE, leg="topleft", psi.lines=FALSE, ...)
x |
a fitted |
term |
Numerical or character to indicate the segmented variable having the piece-wise relationship to be plotted. If there is a single segmented variable in the fitted model |
add |
when |
res |
when |
conf.level |
If greater than zero, it means the confidence level at which the pointwise confidence itervals have to be plotted. |
interc |
If |
link |
when |
res.col |
when |
rev.sgn |
when |
const |
constant to add to each fitted segmented relationship (on the scale of the linear predictor) before plotting.
If |
shade |
if |
rug |
when |
dens.rug |
when |
dens.col |
if |
transf |
A possible function to convert the fitted values before plotting. It is only effective
if |
isV |
logical value (to be passed to |
is |
logical value (to be passed to |
var.diff |
logical value to be passed to |
p.df |
degrees of freedom when |
.vcov |
The full covariance matrix of estimates to be used when |
.coef |
The regression parameter estimates. If unspecified (i.e. |
prev.trend |
logical. If |
smoos |
logical, indicating if the residuals (provided that |
hide.zeros |
logical, indicating if the residuals (provided that |
leg |
If the plot refers to segmented relationships in groups, i.e. |
psi.lines |
if |
... |
other graphics parameters to pass to plotting commands: ‘col’, ‘lwd’ and ‘lty’ (that
can be vectors and are recycled if necessary, see the example below) for the fitted piecewise lines; ‘ylab’, ‘xlab’, ‘main’, ‘sub’, ‘cex.axis’, ‘cex.lab’, ‘xlim’ and
‘ylim’ when a new plot is produced (i.e. when |
Produces (or adds to the current device) the fitted segmented relationship between the
response and the selected term
. If the fitted model includes just a single ‘segmented’ variable,
term
may be omitted.
The partial residuals are computed as ‘fitted + residuals’, where ‘fitted’ are the fitted values of the
segmented relationship relevant to the covariate specified in term
.
Notice that for GLMs the residuals are the response residuals if link=FALSE
and
the working residuals if link=TRUE
.
None.
For models with offset, partial residuals on the response scale are not defined. Thus plot.segmented
does not work
when link=FALSE
, res=TRUE
, and the fitted model includes an offset.
When term
is a vector and multiple segmented relationships are being drawn on the same plot, col
and res.col
can be vectors. Also pch
, cex
, lty
, and lwd
can be vectors, if specified.
Vito M. R. Muggeo
segmented
to fit the model, lines.segmented
to add the estimated breakpoints on the current plot.
points.segmented
to add the joinpoints of the segmented relationship.
predict.segmented
to compute standard errors and confidence intervals for predictions from a
"segmented" fit.
set.seed(1234) z<-runif(100) y<-rpois(100,exp(2+1.8*pmax(z-.6,0))) o<-glm(y~z,family=poisson) o.seg<-segmented(o) #single segmented covariate and one breakpoint: 'seg.Z' and 'psi' can be omitted par(mfrow=c(1,2)) plot(o.seg, conf.level=0.95, shade=TRUE) points(o.seg, link=TRUE, col=2) ## new plot plot(z,y) ## add the fitted lines using different colors and styles.. plot(o.seg,add=TRUE,link=FALSE,lwd=2,col=2:3, lty=c(1,3)) lines(o.seg,col=2,pch=19,bottom=FALSE,lwd=2) #for the CI for the breakpoint points(o.seg,col=4, link=FALSE) ## using the options 'is', 'isV', 'shade' and 'col.shade'. par(mfrow=c(1,2)) plot(o.seg, conf.level=.9, is=TRUE, isV=TRUE, col=1, shade = TRUE, col.shade=2) plot(o.seg, conf.level=.9, is=TRUE, isV=FALSE, col=2, shade = TRUE, res=TRUE, res.col=4, pch=3)
set.seed(1234) z<-runif(100) y<-rpois(100,exp(2+1.8*pmax(z-.6,0))) o<-glm(y~z,family=poisson) o.seg<-segmented(o) #single segmented covariate and one breakpoint: 'seg.Z' and 'psi' can be omitted par(mfrow=c(1,2)) plot(o.seg, conf.level=0.95, shade=TRUE) points(o.seg, link=TRUE, col=2) ## new plot plot(z,y) ## add the fitted lines using different colors and styles.. plot(o.seg,add=TRUE,link=FALSE,lwd=2,col=2:3, lty=c(1,3)) lines(o.seg,col=2,pch=19,bottom=FALSE,lwd=2) #for the CI for the breakpoint points(o.seg,col=4, link=FALSE) ## using the options 'is', 'isV', 'shade' and 'col.shade'. par(mfrow=c(1,2)) plot(o.seg, conf.level=.9, is=TRUE, isV=TRUE, col=1, shade = TRUE, col.shade=2) plot(o.seg, conf.level=.9, is=TRUE, isV=FALSE, col=2, shade = TRUE, res=TRUE, res.col=4, pch=3)
Takes a fitted segmented.lme
object returned by segmented.lme()
and plots (or adds)
the fitted broken-line relationship for the segmented term.
## S3 method for class 'segmented.lme' plot(x, level=1, id = NULL, res = TRUE, pop = FALSE, yscale = 1, xscale = 1, n.plot, pos.leg = "topright", vline = FALSE, lines = TRUE, by=NULL, add=FALSE, conf.level=0, withI=TRUE, vcov.=NULL, shade=FALSE, drop.var=NULL, text.leg=NULL, id.name=TRUE, ...)
## S3 method for class 'segmented.lme' plot(x, level=1, id = NULL, res = TRUE, pop = FALSE, yscale = 1, xscale = 1, n.plot, pos.leg = "topright", vline = FALSE, lines = TRUE, by=NULL, add=FALSE, conf.level=0, withI=TRUE, vcov.=NULL, shade=FALSE, drop.var=NULL, text.leg=NULL, id.name=TRUE, ...)
x |
Object of class |
level |
An integer giving the level of grouping to be used when computing the segmented relationship(s). |
id |
A scalar or vector meaning which subjects profiles have to be drawn. If |
res |
If |
pop |
if |
yscale |
If |
xscale |
If |
n.plot |
a vector to be passed to |
pos.leg |
a character ('topright', 'topleft',...) meaning the location of the legend. Set |
vline |
logical, if |
lines |
logical, if |
by |
A named list indicating covariate names and corresponding values affecting the fitted segmented relationship. For instance:
|
add |
If |
conf.level |
The confidence level for pointwise confidence intervals. Effective only if |
withI |
If |
vcov. |
The fixed effects covariance matrix. If |
shade |
If |
drop.var |
Possible coefficient names to be removed before computing the segmented relationship (E.g. the group-specific intercept.). |
text.leg |
If specified (and |
id.name |
If |
... |
additional arguments, such as |
The function plots the 'subject'-specific segmented profiles for the 'subjects' specificed in id
or, if level=0
, the fitted segmented relationship based on fixed effects only. The number of panels to drawn is actually the minimum between length(id)
and prod(n.plot)
, but if n.plot=c(1,1)
(or also simply n.plot=1
), the ‘individual’ profiles will be pictured on the same panel.
A single or multiple (depending on level
and id
) plot showing the fitted segmented profiles.
All the functions for segmented mixed models (*.segmented.lme) are still at an experimental stage
If by
is specified (and level=0
is set), a legend is also added in the plot reporting covariate(s) name and value affecting the segmented relationship. Set pos.leg=TRUE
to have no legend. On the other hand, use text.leg
to add legend reporting the covariate baseline values.
Vito Muggeo
## Not run: #continues example from segmented.lme plot(os, yscale=-1) #different y-scales plot(os2, n.plot=1, l.col=2:6, l.lwd=2) #all segmented profiles on the same plot ## End(Not run)
## Not run: #continues example from segmented.lme plot(os, yscale=-1) #different y-scales plot(os2, n.plot=1, l.col=2:6, l.lwd=2) #all segmented profiles on the same plot ## End(Not run)
Takes a fitted stepmented
object returned by stepmented()
and plots (or adds)
the fitted piecewise constant lines for the selected stepmented term.
## S3 method for class 'stepmented' plot(x, term, add = FALSE, res = TRUE, conf.level=0, interc = TRUE, add.fx = FALSE, psi.lines = TRUE, link=TRUE, const=NULL, res.col=grey(.15, alpha = .4), surf=FALSE, zero.cor=TRUE, heurs=TRUE, shade=FALSE, se.type=c("cdf","abs","none"), k=NULL, .vcov=NULL, leg="topleft", ...)
## S3 method for class 'stepmented' plot(x, term, add = FALSE, res = TRUE, conf.level=0, interc = TRUE, add.fx = FALSE, psi.lines = TRUE, link=TRUE, const=NULL, res.col=grey(.15, alpha = .4), surf=FALSE, zero.cor=TRUE, heurs=TRUE, shade=FALSE, se.type=c("cdf","abs","none"), k=NULL, .vcov=NULL, leg="topleft", ...)
x |
a fitted |
term |
the stepmented variable having the piece-wise constant relationship to be plotted.
If there is a single stepmented variable in the fitted model |
add |
when |
res |
when |
conf.level |
the confidence level for the pointwise confidence intervals for the expected values. |
interc |
if |
add.fx |
logical. If TRUE and the object fit also includes an additional term for the same stepmented variable, the plot also portrays such ‘additional’ term. |
psi.lines |
if |
link |
if |
const |
constant to add to each fitted segmented relationship (on the scale of the linear predictor) before plotting.
If |
res.col |
when |
surf |
if the object fit |
zero.cor |
see |
heurs |
logical; if |
shade |
if |
se.type |
which standard errors should be computed? see |
k |
The value to be passed to |
.vcov |
The estimate var-covariance matrix; if |
leg |
If the plot refers to stepmented relationships in groups, i.e. |
... |
other graphics parameters to pass to plotting commands: ‘col’, ‘lwd’ and ‘lty’ (that
can be vectors and are recycled if necessary, see the example below) for the fitted piecewise constant lines; ‘ylab’, ‘xlab’, ‘main’, ‘sub’, ‘cex.axis’, ‘cex.lab’, ‘xlim’ and ‘ylim’ when a new plot is produced (i.e. when |
Produces (or adds to the current device) the fitted step-function like relationship between the
response and the selected term
. If the fitted model includes just a single ‘stepmented’ variable,
term
may be omitted. If surf=TRUE
, and res=TRUE
the point widths are proportional to the partial residual values.
None.
Implementation of confidence intervals for the conditional means in stepmented regression is under development; conf.level>0
should be used with care, especially with multiple jumpoints.
Vito M. R. Muggeo
See Also as stepmented
#Following code in stepmented.. ## Not run: par(mfrow=c(1,3)) plot(os,"x") plot(os,"z") plot(os,"z", add.fx=TRUE, psi.lines=FALSE ) lines(os, "z") #display the 'surface' par(mfrow=c(1,3)) plot(os, surf=TRUE, col=1, res.col=2) plot(os, surf=TRUE, lty=2) plot(x,z) plot(os, surf=TRUE, add=TRUE, col=4, res=FALSE) ## End(Not run)
#Following code in stepmented.. ## Not run: par(mfrow=c(1,3)) plot(os,"x") plot(os,"z") plot(os,"z", add.fx=TRUE, psi.lines=FALSE ) lines(os, "z") #display the 'surface' par(mfrow=c(1,3)) plot(os, surf=TRUE, col=1, res.col=2) plot(os, surf=TRUE, lty=2) plot(x,z) plot(os, surf=TRUE, add=TRUE, col=4, res=FALSE) ## End(Not run)
Takes a fitted segmented
object returned by segmented()
and adds
on the current plot the joinpoints of the fitted broken-line relationships.
## S3 method for class 'segmented' points(x, term, interc = TRUE, link = TRUE, rev.sgn=FALSE, transf=I, .vcov=NULL, .coef=NULL, const=0, v=TRUE, ...)
## S3 method for class 'segmented' points(x, term, interc = TRUE, link = TRUE, rev.sgn=FALSE, transf=I, .vcov=NULL, .coef=NULL, const=0, v=TRUE, ...)
x |
an object of class |
term |
the segmented variable of interest. It may be unspecified when there is a single segmented variable. |
interc |
If |
link |
when |
rev.sgn |
when |
transf |
A possible function to convert the fitted values before plotting. |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
const |
A constant to be added (on the y-scale) to the values before transforming and plotting. |
v |
logical. If |
... |
other graphics parameters to pass on to |
We call 'joinpoint' the plane point having as coordinates the breakpoints (on the x scale) and the fitted values of
the segmented relationship at that breakpoints (on the y scale). points.segmented()
simply adds the fitted
joinpoints on the current plot. This could be useful to emphasize the changes of the piecewise linear relationship.
plot.segmented
to plot the fitted segmented lines.
## Not run: #see examples in ?plot.segmented ## End(Not run)
## Not run: #see examples in ?plot.segmented ## End(Not run)
Returns predictions and optionally associated quantities (standard errors or confidence intervals) from a fitted segmented model object.
## S3 method for class 'segmented' predict(object, newdata, se.fit=FALSE, interval=c("none","confidence", "prediction"), type = c("link", "response"), na.action=na.omit, level=0.95, .coef=NULL, ...)
## S3 method for class 'segmented' predict(object, newdata, se.fit=FALSE, interval=c("none","confidence", "prediction"), type = c("link", "response"), na.action=na.omit, level=0.95, .coef=NULL, ...)
object |
a fitted segmented model coming from |
newdata |
An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. |
se.fit |
Logical. Should the standard errors be returned? |
interval |
Which interval? See |
type |
Predictions on the link or response scale? Only if |
na.action |
How to deal with missing data, if |
level |
The confidence level. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
... |
further arguments. |
Basically predict.segmented
builds the right design matrix accounting for breakpoint and passes it
to predict.lm
or predict.glm
depending on the actual model fit object
.
predict.segmented
produces a vector of predictions with possibly associated standard errors or confidence intervals.
See predict.lm
or predict.glm
.
For segmented glm fits with offset, predict.segmented
returns the fitted values including the offset.
Vito Muggeo
segreg
, segmented
, plot.segmented
, broken.line
, predict.lm
, predict.glm
n=10 x=seq(-3,3,l=n) set.seed(1515) y <- (x<0)*x/2 + 1 + rnorm(x,sd=0.15) segm <- segmented(lm(y ~ x), ~ x, psi=0.5) predict(segm,se.fit = TRUE)$se.fit #wrong (smaller) st.errors (assuming known the breakpoint) olm<-lm(y~x+pmax(x-segm$psi[,2],0)) predict(olm,se.fit = TRUE)$se.fit
n=10 x=seq(-3,3,l=n) set.seed(1515) y <- (x<0)*x/2 + 1 + rnorm(x,sd=0.15) segm <- segmented(lm(y ~ x), ~ x, psi=0.5) predict(segm,se.fit = TRUE)$se.fit #wrong (smaller) st.errors (assuming known the breakpoint) olm<-lm(y~x+pmax(x-segm$psi[,2],0)) predict(olm,se.fit = TRUE)$se.fit
Returns predictions and optionally associated quantities (standard errors or confidence intervals) from a fitted stepmented model object.
## S3 method for class 'stepmented' predict(object, newdata, se.fit=FALSE, interval=c("none","confidence", "prediction"), type = c("link", "response"), na.action=na.omit, level=0.95, .coef=NULL, .vcov=NULL, apprx.fit=c("none","cdf"), apprx.se=c("cdf","none"), ...)
## S3 method for class 'stepmented' predict(object, newdata, se.fit=FALSE, interval=c("none","confidence", "prediction"), type = c("link", "response"), na.action=na.omit, level=0.95, .coef=NULL, .vcov=NULL, apprx.fit=c("none","cdf"), apprx.se=c("cdf","none"), ...)
object |
a fitted stepmented model coming from |
newdata |
An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. |
se.fit |
Logical. Should the standard errors be returned? |
interval |
Which interval? See |
type |
Predictions on the link or response scale? Only if |
na.action |
How to deal with missing data, if |
level |
The confidence level. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
.vcov |
The estimate covariance matrix. If unspecified (i.e. |
apprx.fit |
The approximation of the |
apprx.se |
The same abovementioned approximation to compute the standard error. |
... |
further arguments, for instance |
Basically predict.stepmented
builds the right design matrix accounting for breakpoint and passes it
to predict.lm
or predict.glm
depending on the actual model fit object
.
predict.stepmented
produces a vector of predictions with possibly associated standard errors or confidence intervals.
See predict.lm
, predict.glm
, or predict.segmented
.
For stepmented glm fits with offset, predict.stepmented
returns the fitted values including the offset.
Vito Muggeo
stepreg
, stepmented
, plot.stepmented
, predict.lm
, predict.glm
n=10 x=seq(-3,3,l=n) set.seed(1515) y <- (x<0)*x/2 + 1 + rnorm(x,sd=0.15) segm <- segmented(lm(y ~ x), ~ x, psi=0.5) predict(segm,se.fit = TRUE)$se.fit
n=10 x=seq(-3,3,l=n) set.seed(1515) y <- (x<0)*x/2 + 1 + rnorm(x,sd=0.15) segm <- segmented(lm(y ~ x), ~ x, psi=0.5) predict(segm,se.fit = TRUE)$se.fit
Printing the most important features and coefficients (including the breakpoints) of a segmented model.
## S3 method for class 'segmented' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'segmented' coef(object, include.psi=FALSE, ...)
## S3 method for class 'segmented' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'segmented' coef(object, include.psi=FALSE, ...)
x |
object of class |
digits |
number of digits to be printed |
object |
object of class |
include.psi |
logical. If |
... |
arguments passed to other functions |
Vito M.R. Muggeo
summary.segmented
, print.summary.segmented
Printing and extracting the most important features of a segmented mixed model.
## S3 method for class 'segmented.lme' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'segmented.lme' fixef(object, ...) ## S3 method for class 'segmented.lme' logLik(object, ...)
## S3 method for class 'segmented.lme' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'segmented.lme' fixef(object, ...) ## S3 method for class 'segmented.lme' logLik(object, ...)
x |
object of class |
digits |
number of digits to be printed |
object |
object of class |
... |
arguments passed to other functions |
Vito M.R. Muggeo
segmented.lme
, summary.segmented.lme
Given a (generalized) linear model, the (pseudo) Score statistic tests for the existence of one breakpoint.
pscore.test(obj, seg.Z, k = 10, alternative = c("two.sided", "less", "greater"), values=NULL, dispersion=NULL, df.t=NULL, more.break=FALSE, n.break=1, only.term=FALSE, break.type=c("break","jump"))
pscore.test(obj, seg.Z, k = 10, alternative = c("two.sided", "less", "greater"), values=NULL, dispersion=NULL, df.t=NULL, more.break=FALSE, n.break=1, only.term=FALSE, break.type=c("break","jump"))
obj |
a fitted model typically returned by |
seg.Z |
a formula with no response variable, such as |
k |
optional. Number of points (equi-spaced from the min to max) used to compute the pseudo Score statistic. See Details. |
alternative |
a character string specifying the alternative hypothesis (relevant to the slope difference parameter). |
values |
optional. The evaluation points where the Score test is computed. See Details for default values. |
dispersion |
optional. the dispersion parameter for the family to be used to compute the test statistic.
When |
df.t |
optional. The degress-of-freedom used to compute the p-value. When |
more.break |
optional, logical. If |
n.break |
optional. Number of breakpoints postuled under the alternative hypothesis. |
only.term |
logical. If |
break.type |
The kind of breakpoint being tested. |
pscore.test
tests for a non-zero difference-in-slope parameter of a segmented
relationship. Namely, the null hypothesis is , where
is the difference-in-slopes,
i.e. the coefficient of the segmented function
. The hypothesis of interest
means no breakpoint. Simulation studies have shown that such Score test is more powerful than the Davies test (see reference) when the alternative hypothesis is ‘one changepoint’. If there are two or more breakpoints (for instance, a sinusoidal-like relationships),
pscore.test
can have lower power, and davies.test
can perform better.
The dispersion
value, if unspecified, is taken from obj
. If obj
represents the fit under the null hypothesis (no changepoint), the dispersion parameter estimate will be usually larger, leading to a (potentially severe) loss of power.
The k
evaluation points are k
equally spaced values in the range of the segmented covariate. k
should not be small.
Specific values can be set via values
, although I have found no important difference due to number and location of the evaluation points, thus default is k=10
equally-spaced points. However, when the possible breakpoint is believed to lie into a specified narrower range, the user can specify k
values in that range leading to higher power in detecting it, i.e. typically lower p-value.
If obj
is a (segmented) lm object, the returned p-value comes from the t-distribution with appropriate degrees of freedom. Otherwise, namely if obj
is a (segmented) glm object, the p-value is computed wrt the Normal distribution.
A list with class 'htest
' containing the following components:
method |
title (character) |
data.name |
the regression model and the segmented variable being tested |
statistic |
the empirical value of the statistic |
parameter |
number of evaluation points |
p.value |
the p-value |
process |
the alternative hypothesis set |
Vito M.R. Muggeo
Muggeo, V.M.R. (2016) Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling. J of Statistical Computation and Simulation, 86, 3059–3067.
See also davies.test
.
## Not run: set.seed(20) z<-runif(100) x<-rnorm(100,2) y<-2+10*pmax(z-.5,0)+rnorm(100,0,3) o<-lm(y~z+x) #testing for one changepoint #use the simple null fit pscore.test(o,~z) #compare with davies.test(o,~z).. #use the segmented fit os<-segmented(o, ~z) pscore.test(os,~z) #smaller p-value, as it uses the dispersion under the alternative (from 'os') #test for the 2nd breakpoint in the variable z pscore.test(os,~z, more.break=TRUE) ## End(Not run)
## Not run: set.seed(20) z<-runif(100) x<-rnorm(100,2) y<-2+10*pmax(z-.5,0)+rnorm(100,0,3) o<-lm(y~z+x) #testing for one changepoint #use the simple null fit pscore.test(o,~z) #compare with davies.test(o,~z).. #use the segmented fit os<-segmented(o, ~z) pscore.test(os,~z) #smaller p-value, as it uses the dispersion under the alternative (from 'os') #test for the 2nd breakpoint in the variable z pscore.test(os,~z, more.break=TRUE) ## End(Not run)
Given the appropriate input values, the function computes the power (sample size) corresponding to the specifed sample size (power). If a segmented fit object is provided, the power is computed taking the parameter estimates as input values.
pwr.seg(oseg, pow, n, z = "1:n/n", psi, d, s, n.range = c(10,300), X = NULL, break.type=c("break","jump"), alpha = 0.01, round.n = TRUE, alternative = c("two.sided", "greater", "less"), msg = TRUE, ci.pow=0)
pwr.seg(oseg, pow, n, z = "1:n/n", psi, d, s, n.range = c(10,300), X = NULL, break.type=c("break","jump"), alpha = 0.01, round.n = TRUE, alternative = c("two.sided", "greater", "less"), msg = TRUE, ci.pow=0)
oseg |
The fitted segmented object. If provided, the power is computed at the model parameter estimates, and all the remaining arguments but |
pow |
The desired power level. If provided |
n |
The fixed sample size. If provided |
z |
The covariate understood to have a segmented effect. Default is |
psi |
The breakpoint value within the covariate range |
d |
The slope difference |
s |
The response standard deviation |
n.range |
When |
X |
The design matrix including additional linear variables in the regression equation. Default to |
break.type |
Type of breakpoint. |
alpha |
The type-I error probability. Default to 0.01. |
round.n |
logical. If |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided", "greater" or "less". Note, this refers to the sign of the slope difference. |
msg |
logical. If |
ci.pow |
Numerical. If |
The function exploits the sampling distribution of the pseudo Score statistic under the alternative hypothesis of one breakpoint.
The computed power or sample size, with or without message (depending on msg
)
Currently the function assumes just 1 breakpoint in one covariate
Nicoletta D'Angelo and Vito Muggeo
D'Angelo N, Muggeo V.M.R. (2021) Power analysis in segmented regression, working paper
https://www.researchgate.net/publication/355885747.
Muggeo, V.M.R. (2016) Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling. J of Statistical Computation and Simulation, 86, 3059–3067.
## pwr.seg(pow=.7, psi=.5, d=1.5, s=.5) #returns the sample size ## pwr.seg(n=219, psi=.5, d=1.5, s=.5) #returns the power ## pwr.seg(n=20,z="qnorm(p, 2,5)", psi=3, d=.5, s=2) #the covariate is N(2,5) ## pwr.seg(n=20,z="qexp(p)", psi=.1, d=.5, s=.1) #the covariate is Exp(1)
## pwr.seg(pow=.7, psi=.5, d=1.5, s=.5) #returns the sample size ## pwr.seg(n=219, psi=.5, d=1.5, s=.5) #returns the power ## pwr.seg(n=20,z="qnorm(p, 2,5)", psi=3, d=.5, s=2) #the covariate is N(2,5) ## pwr.seg(n=20,z="qexp(p)", psi=.1, d=.5, s=.1) #the covariate is Exp(1)
Function used to define a segmented (stepmented) term within the segreg (stepreg) formula. The function simply passes relevant information to proper fitter functions.
seg(x, npsi = 1, psi = NA, est = NA, R = NA, fixed.psi = NULL, by = NULL, f.x = I)
seg(x, npsi = 1, psi = NA, est = NA, R = NA, fixed.psi = NULL, by = NULL, f.x = I)
x |
The segmented/stepmented (numeric) covariate |
npsi |
The number of breakpoints/jumpoints to estimate. Default to |
psi |
Numerical vector indicating possible starting value(s) for the breakpoint(s). When provided, |
est |
Possible vector (of length equal to |
R |
Matrix to constrain the slopes. If provided, it overwrites the matrix (which is built internally) coming from the specification of |
fixed.psi |
Possible fixed breakpoint values to be accounted for in addition to the estimated ones; |
by |
A possible factor meaning an interaction with the segmented term |
f.x |
an optional function meaning a function to apply to the covariate before fitting |
The function is used within segreg
and stepreg
to 'build' information about the segmented relationships to fit.
The function simply returns the covariate with added attributes relevant to segmented term
If any value is provided in fix.psi
, the corresponding slope difference coefficient will be labelled by *.fixed.*
. The slope
function will compute the 'right' slopes also accounting for the fixed breakpoints.
Vito Muggeo
##see ?segreg
##see ?segreg
Auxiliary function as user interface for 'segmented' and 'stepmented' fitting. Typically only used when calling any 'segmented' or 'stepmented' method.
seg.control(n.boot=10, display = FALSE, tol = 1e-05, it.max = 30, fix.npsi=TRUE, K = 10, quant = FALSE, maxit.glm = NULL, h = 1.25, break.boot=5, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=TRUE, seed=NULL, fn.obj=NULL, digits=NULL, alpha = NULL, fc=.95, check.next=TRUE, tol.opt=NULL, fit.psi0=NULL, eta=NULL, min.nj=2)
seg.control(n.boot=10, display = FALSE, tol = 1e-05, it.max = 30, fix.npsi=TRUE, K = 10, quant = FALSE, maxit.glm = NULL, h = 1.25, break.boot=5, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=TRUE, seed=NULL, fn.obj=NULL, digits=NULL, alpha = NULL, fc=.95, check.next=TRUE, tol.opt=NULL, fit.psi0=NULL, eta=NULL, min.nj=2)
n.boot |
number of bootstrap samples used in the bootstrap restarting algorithm. If 0 the standard algorithm,
i.e. without bootstrap restart, is used. Default to 10 that appears to be sufficient in most of problems. However
when multiple breakpoints have to be estimated it is suggested to increase |
display |
logical indicating if the value of the objective function should be printed along with current breakpoint estimates at each iteration or at each bootstrap resample (but no more than 5 breakpoints are printed). If bootstrap restarting is employed, the values of objective and breakpoint estimates should not change at the last runs. |
tol |
positive convergence tolerance. |
it.max |
integer giving the maximal number of iterations. |
fix.npsi |
logical (it replaces previous argument |
K |
the number of quantiles (or equally-spaced values) to supply as starting values for the breakpoints
when the |
quant |
logical, indicating how the starting values should be selected. If |
maxit.glm |
integer giving the maximum number of inner IWLS iterations (see details). If |
h |
positive factor modifying the increments in breakpoint updates during the estimation process (see details). |
break.boot |
Integer, less than |
size.boot |
the size of the bootstrap samples. If |
jt |
logical. If |
nonParam |
if |
random |
if |
seed |
The seed to be passed on to |
fn.obj |
A character string to be used (optionally) only when |
digits |
optional. If specified it means the desidered number of decimal points of the breakpoint to be used during the iterative algorithm. |
alpha |
optional numerical values. The breakpoints are estimated within the quantiles |
fc |
A proportionality factor ( |
check.next |
logical, effective only for stepmented fit. If |
tol.opt |
Numerical value to be passed to |
fit.psi0 |
Possible list including preliminary values. |
eta |
Only for segmented/stepmented fits: starting values to be passed to |
min.nj |
How many observations (at least) should be in the covariate intervals induced by the breakpoints? |
Fitting a ‘segmented’ GLM model is attained via fitting iteratively standard GLMs. The number of (outer)
iterations is governed by it.max
, while the (maximum) number of (inner) iterations to fit the GLM at
each fixed value of psi is fixed via maxit.glm
. Usually three-four inner iterations may be sufficient.
When the starting value for the breakpoints is set to NA
for any segmented variable specified
in seg.Z
, K
values (quantiles or equally-spaced) are selected as starting values for the breakpoints.
Since version 0.2-9.0 segmented
implements the bootstrap restarting algorithm described in Wood (2001).
The bootstrap restarting is expected to escape the local optima of the objective function when the
segmented relationship is noisy and the loglikelihood can be flat. Notice bootstrap restart runs n.boot
iterations regardless of tol
that only affects convergence within the inner loop.
A list with the arguments as components.
Vito Muggeo
Muggeo, V.M.R., Adelfio, G. (2011) Efficient change point detection in genomic sequences of continuous measurements. Bioinformatics 27, 161–166.
Wood, S. N. (2001) Minimizing model fitting objectives that contain spurious local minima by bootstrap restarting. Biometrics 57, 240–244.
#decrease the maximum number inner iterations and display the #evolution of the (outer) iterations #seg.control(display = TRUE, maxit.glm=4)
#decrease the maximum number inner iterations and display the #evolution of the (outer) iterations #seg.control(display = TRUE, maxit.glm=4)
seg.lm.fit
is called by segmented.lm
to fit segmented linear
(gaussian) models. Likewise, seg.glm.fit
is called by segmented.glm
to fit
generalized segmented linear models, and seg.def.fit
is called by segmented.default
to fit
segmented relationships in general regression models (e.g., quantile regression and Cox regression). seg.lm.fit.boot
,
seg.glm.fit.boot
, and seg.def.fit.boot
are employed to perform bootstrap restart.
The functions segConstr.*
are called by segreg()
when some contraints are set on the slopes of the segmented relationships.
These functions should usually not be used directly by the user.
seg.lm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=FALSE) seg.lm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.glm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=FALSE) seg.glm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.def.fit(obj, Z, PSI, mfExt, opz, return.all.sol=FALSE) seg.def.fit.boot(obj, Z, PSI, mfExt, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.Ar.fit(obj, XREG, Z, PSI, opz, return.all.sol=FALSE) seg.Ar.fit.boot(obj, XREG, Z, PSI, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.num.fit(y, XREG, Z, PSI, w, opz, return.all.sol=FALSE) seg.num.fit.boot(y, XREG, Z, PSI, w, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) segConstr.lm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol = FALSE) segConstr.lm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) segConstr.glm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol = FALSE) segConstr.glm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot)
seg.lm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=FALSE) seg.lm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.glm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=FALSE) seg.glm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.def.fit(obj, Z, PSI, mfExt, opz, return.all.sol=FALSE) seg.def.fit.boot(obj, Z, PSI, mfExt, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.Ar.fit(obj, XREG, Z, PSI, opz, return.all.sol=FALSE) seg.Ar.fit.boot(obj, XREG, Z, PSI, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) seg.num.fit(y, XREG, Z, PSI, w, opz, return.all.sol=FALSE) seg.num.fit.boot(y, XREG, Z, PSI, w, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) segConstr.lm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol = FALSE) segConstr.lm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) segConstr.glm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol = FALSE) segConstr.glm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot)
y |
vector of observations of length |
XREG |
design matrix for standard linear terms. |
Z |
appropriate matrix including the segmented variables whose breakpoints have to be estimated. |
PSI |
appropriate matrix including the starting values of the breakpoints to be estimated. |
w |
possibe weights vector. |
offs |
possibe offset vector. |
opz |
a list including information useful for model fitting. |
n.boot |
the number of bootstrap samples employed in the bootstrap restart algorithm. |
break.boot |
Integer, less than |
size.boot |
the size of the bootstrap resamples. If |
jt |
logical. If |
nonParam |
if |
random |
if |
return.all.sol |
if |
obj |
the starting regression model where the segmented relationships have to be added. |
mfExt |
the model frame. |
The functions call iteratively lm.wfit
(or glm.fit
) with proper design matrix depending on
XREG
, Z
and PSI
. seg.lm.fit.boot
(and seg.glm.fit.boot
) implements the bootstrap restarting idea discussed in
Wood (2001).
A list of fit information.
These functions should usually not be used directly by the user.
Vito Muggeo
Wood, S. N. (2001) Minimizing model fitting objectives that contain spurious local minima by bootstrap restarting. Biometrics 57, 240–244.
##See ?segmented
##See ?segmented
Fits regression models with segmented relationships between the response and one or more explanatory variables. Break-point estimates are provided.
segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, ...) ## Default S3 method: segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'lm' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'glm' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'Arima' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'numeric' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, adjX=FALSE, weights=NULL, ...)
segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, ...) ## Default S3 method: segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'lm' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'glm' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'Arima' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, ...) ## S3 method for class 'numeric' segmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), model = TRUE, keep.class=FALSE, adjX=FALSE, weights=NULL, ...)
obj |
standard ‘linear’ model of class "lm", "glm" or "Arima", or potentially any regression
fit may be supplied since version 0.5-0 (see 'Details'). |
seg.Z |
the segmented variable(s), i.e. the continuous covariate(s) understood to have a piecewise-linear relationship with response. It is a formula with no response variable, such as |
psi |
starting values for the breakpoints to be estimated. If there is a single segmented variable specified in |
npsi |
A named vector or list meaning the number (and not locations) of breakpoints to be estimated. The starting values will be internally computed via the quantiles or equally spaced values, as specified in argument |
fixed.psi |
An optional named list meaning the breakpoints to be kept fixed during the estimation procedure. The names should be a subset of (or even the same) variables specified in |
control |
a list of parameters for controlling the fitting process.
See the documentation for |
model |
logical value indicating if the model.frame should be returned. |
keep.class |
logical value indicating if the final fit returned by |
... |
optional arguments (to be ignored safely). Notice specific arguments relevant to the original call (via |
adjX |
if |
weights |
the weights if |
Given a linear regression model usually of class "lm" or "glm" (or even a simple numeric/ts vector), segmented tries to estimate
a new regression model having broken-line relationships with the variables specified in seg.Z
.
A segmented (or broken-line) relationship is defined by the slope
parameters and the break-points where the linear relation changes. The number of breakpoints
of each segmented relationship is fixed via the psi
(or npsi
) argument, where initial
values for the break-points (or simply their number via npsi
) must be specified. The model
is estimated simultaneously yielding point estimates and relevant approximate
standard errors of all the model parameters, including the break-points.
Since version 0.2-9.0 segmented
implements the bootstrap restarting algorithm described in Wood (2001).
The bootstrap restarting is expected to escape the local optima of the objective function when the
segmented relationship is flat and the log likelihood can have multiple local optima.
Since version 0.5-0.0 the default method segmented.default
has been added to estimate segmented relationships in
general (besides "lm" and "glm" fits) regression models, such as Cox regression or quantile regression (for a single percentile).
The objective function to be minimized is the (minus) value extracted by the logLik
function or it may be passed on via
the fn.obj
argument in seg.control
. See example below. While the default method is expected to work with any regression
fit (where the usual coef()
, update()
, and logLik()
returns appropriate results), it is not recommended for
"lm" or "glm" fits (as segmented.default
is slower than the specific methods segmented.lm
and segmented.glm
), although
final results are the same. However the object returned by segmented.default
is not of class "segmented", as currently
the segmented methods are not guaranteed to work for ‘generic’ (i.e., besides "lm" and "glm") regression fits. The user
could try each "segmented" method on the returned object by calling it explicitly (e.g. via plot.segmented()
or confint.segmented()
wherein the regression coefficients and relevant covariance matrix have to be specified, see .coef
and .vcov
in plot.segmented()
, confint.segmented()
, slope()
).
segmented returns an object of class "segmented" which inherits
from the class of obj
, for instance "lm" or "glm".
An object of class "segmented" is a list containing the components of the
original object obj
with additionally the followings:
psi |
estimated break-points (sorted) and relevant (approximate) standard errors |
it |
number of iterations employed |
epsilon |
difference in the objective function when the algorithm stops |
model |
the model frame |
psi.history |
a list or a vector including the breakpoint estimates at each step |
seed |
the integer vector containing the seed just before the bootstrap resampling. Returned only if bootstrap restart is employed |
.. |
Other components are not of direct interest of the user |
At convergence, if the estimated breakpoints are too close each other or at the boundaries, the parameter point estimate could be returned, but without finite standard errors. To avoid that, segmented
revises the final breakpoint estimates to allow that at least min.nj
are within each interval of the segmented covariate. A warning message is printed if such adjustment is made. See min.nj
in seg.control
.
The algorithm will start if the it.max
argument returned by seg.control
is greater than zero. If it.max=0
segmented
will estimate a new linear model with
break-point(s) fixed at the values reported in psi
.Alternatively, it is also possible to set h=0
in seg.control()
. In this case, bootstrap restarting is unncessary, then to have breakpoints at mypsi
type
segmented(.., psi=mypsi, control=seg.control(h=0, n.boot=0, it.max=1))
In the returned fit object, ‘U.’ is put before the name of the segmented variable to mean the difference-in-slopes coefficient.
Methods specific to the class "segmented"
are
Others are inherited from the class "lm"
or "glm"
depending on the
class of obj
.
Vito M. R. Muggeo, [email protected]
Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. Statistics in Medicine 22, 3055–3071.
Muggeo, V.M.R. (2008) Segmented: an R package to fit regression models with broken-line relationships. R News 8/1, 20–25.
segmented.glm
for segmented GLM and segreg
to fit the models via a formula interface. segmented.lme
fits random changepoints (segmented mixed) models.
set.seed(12) xx<-1:100 zz<-runif(100) yy<-2+1.5*pmax(xx-35,0)-1.5*pmax(xx-70,0)+15*pmax(zz-.5,0)+rnorm(100,0,2) dati<-data.frame(x=xx,y=yy,z=zz) out.lm<-lm(y~x,data=dati) #the simplest example: the starting model includes just 1 covariate #.. and 1 breakpoint has to be estimated for that o<-segmented(out.lm) #1 breakpoint for x #the single segmented variable is not in the starting model, and thus.. #... you need to specify it via seg.Z, but no starting value for psi o<-segmented(out.lm, seg.Z=~z) #note the leftmost slope is constrained to be zero (since out.lm does not include z) #2 segmented variables, 1 breakpoint each (again no need to specify npsi or psi) o<-segmented(out.lm,seg.Z=~z+x) #1 segmented variable, but 2 breakpoints: you have to specify starting values (vector) for psi: o<-segmented(out.lm,seg.Z=~x,psi=c(30,60), control=seg.control(display=FALSE)) #.. or you can specify just the *number* of breakpoints #o<-segmented(out.lm,seg.Z=~x, npsi=2, control=seg.control(display=FALSE)) slope(o) #the slopes of the segmented relationship #2 segmented variables: starting values requested via a named list out.lm<-lm(y~z,data=dati) o1<-update(o,seg.Z=~x+z,psi=list(x=c(30,60),z=.3)) #..or by specifying just the *number* of breakpoints #o1<-update(o,seg.Z=~x+z, npsi=c(x=2,z=1)) #the default method leads to the same results (but it is slower) #o1<-segmented.default(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.3)) #o1<-segmented.default(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.3), # control=seg.control(fn.obj="sum(x$residuals^2)")) #automatic procedure to estimate breakpoints in the covariate x (starting from K quantiles) # Hint: increases number of iterations. Notice: bootstrap restart is not allowed! # However see ?selgmented for a better approach #o<-segmented.lm(out.lm,seg.Z=~x+z,psi=list(x=NA,z=.3), # control=seg.control(fix.npsi=FALSE, n.boot=0, tol=1e-7, it.max = 50, K=5, display=TRUE)) #assess the progress of the breakpoint estimates throughout the iterations ## Not run: par(mfrow=c(1,2)) draw.history(o, "x") draw.history(o, "z") ## End(Not run) #try to increase the number of iterations and re-assess the #convergence diagnostics # A simple segmented model with continuous responses and no linear covariates # No need to fit the starting lm model: segmented(yy, npsi=2) #NOTE: subsetting the vector works ( segmented(yy[-1],..) ) #only a single segmented covariate is allowed in seg.Z, and if seg.Z is unspecified, # the segmented variable is taken as 1:n/n # An example using the Arima method: ## Not run: n<-50 idt <-1:n #the time index mu<-50-idt +1.5*pmax(idt-30,0) set.seed(6969) y<-mu+arima.sim(list(ar=.5),n)*3.5 o<-arima(y, c(1,0,0), xreg=idt) os1<-segmented(o, ~idt, control=seg.control(display=TRUE)) #note using the .coef argument is mandatory! slope(os1, .coef=os1$coef) plot(y) plot(os1, add=TRUE, .coef=os1$coef, col=2) ## End(Not run) ################################################################ ################################################################ ######Four examples using the default method: ################################################################ ################################################################ ################################################################ #==> 1. Cox regression with a segmented relationship ################################################################ ## Not run: library(survival) data(stanford2) o<-coxph(Surv(time, status)~age, data=stanford2) os<-segmented(o, ~age, psi=40) #estimate the breakpoint in the age effect summary(os) #actually it means summary.coxph(os) plot(os) #it does not work plot.segmented(os) #call explicitly plot.segmented() to plot the fitted piecewise lines ################################################################ # ==> 2. Linear mixed model via the nlme package ################################################################ dati$g<-gl(10,10) #the cluster 'id' variable library(nlme) o<-lme(y~x+z, random=~1|g, data=dati) os<-segmented.default(o, ~x+z, npsi=list(x=2, z=1)) #summarizing results (note the '.coef' argument) slope(os, .coef=fixef(os)) plot.segmented(os, "x", .coef=fixef(os), conf.level=.95) confint.segmented(os, "x", .coef=fixef(os)) dd<-data.frame(x=c(20,50),z=c(.2,.6), g=1:2) predict.segmented(os, newdata=dd, .coef=fixef(os)) ################################################################ # ==> 3. segmented quantile regression via the quantreg package ################################################################ library(quantreg) data(Mammals) y<-with(Mammals, log(speed)) x<-with(Mammals, log(weight)) o<-rq(y~x, tau=.9) os<-segmented.default(o, ~x) #it does NOT work. It cannot compute the vcov matrix.. #Let's define the vcov.rq function.. (I don't know if it is the best option..) vcov.rq<-function(x,...) { V<-summary(x,cov=TRUE,se="nid",...)$cov rownames(V)<-colnames(V)<-names(x$coef) V} os<-segmented.default(o, ~x) #now it does work plot.segmented(os, res=TRUE, col=2, conf.level=.95) ################################################################ # ==> 4. segmented regression with the svyglm() (survey package) ################################################################ library(survey) data(api) dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc) o<-svyglm(api00~ell, design=dstrat) #specify as a string the objective function to be minimized. It can be obtained via svyvar() fn.x<- 'as.numeric(svyvar(resid(x, "pearson"), x$survey.design, na.rm = TRUE))' os<-segmented.default(o, ~ell, control=seg.control(fn.obj=fn.x, display=TRUE)) slope(os) plot.segmented(os, res=TRUE, conf.level=.9, shade=TRUE) ## End(Not run)
set.seed(12) xx<-1:100 zz<-runif(100) yy<-2+1.5*pmax(xx-35,0)-1.5*pmax(xx-70,0)+15*pmax(zz-.5,0)+rnorm(100,0,2) dati<-data.frame(x=xx,y=yy,z=zz) out.lm<-lm(y~x,data=dati) #the simplest example: the starting model includes just 1 covariate #.. and 1 breakpoint has to be estimated for that o<-segmented(out.lm) #1 breakpoint for x #the single segmented variable is not in the starting model, and thus.. #... you need to specify it via seg.Z, but no starting value for psi o<-segmented(out.lm, seg.Z=~z) #note the leftmost slope is constrained to be zero (since out.lm does not include z) #2 segmented variables, 1 breakpoint each (again no need to specify npsi or psi) o<-segmented(out.lm,seg.Z=~z+x) #1 segmented variable, but 2 breakpoints: you have to specify starting values (vector) for psi: o<-segmented(out.lm,seg.Z=~x,psi=c(30,60), control=seg.control(display=FALSE)) #.. or you can specify just the *number* of breakpoints #o<-segmented(out.lm,seg.Z=~x, npsi=2, control=seg.control(display=FALSE)) slope(o) #the slopes of the segmented relationship #2 segmented variables: starting values requested via a named list out.lm<-lm(y~z,data=dati) o1<-update(o,seg.Z=~x+z,psi=list(x=c(30,60),z=.3)) #..or by specifying just the *number* of breakpoints #o1<-update(o,seg.Z=~x+z, npsi=c(x=2,z=1)) #the default method leads to the same results (but it is slower) #o1<-segmented.default(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.3)) #o1<-segmented.default(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.3), # control=seg.control(fn.obj="sum(x$residuals^2)")) #automatic procedure to estimate breakpoints in the covariate x (starting from K quantiles) # Hint: increases number of iterations. Notice: bootstrap restart is not allowed! # However see ?selgmented for a better approach #o<-segmented.lm(out.lm,seg.Z=~x+z,psi=list(x=NA,z=.3), # control=seg.control(fix.npsi=FALSE, n.boot=0, tol=1e-7, it.max = 50, K=5, display=TRUE)) #assess the progress of the breakpoint estimates throughout the iterations ## Not run: par(mfrow=c(1,2)) draw.history(o, "x") draw.history(o, "z") ## End(Not run) #try to increase the number of iterations and re-assess the #convergence diagnostics # A simple segmented model with continuous responses and no linear covariates # No need to fit the starting lm model: segmented(yy, npsi=2) #NOTE: subsetting the vector works ( segmented(yy[-1],..) ) #only a single segmented covariate is allowed in seg.Z, and if seg.Z is unspecified, # the segmented variable is taken as 1:n/n # An example using the Arima method: ## Not run: n<-50 idt <-1:n #the time index mu<-50-idt +1.5*pmax(idt-30,0) set.seed(6969) y<-mu+arima.sim(list(ar=.5),n)*3.5 o<-arima(y, c(1,0,0), xreg=idt) os1<-segmented(o, ~idt, control=seg.control(display=TRUE)) #note using the .coef argument is mandatory! slope(os1, .coef=os1$coef) plot(y) plot(os1, add=TRUE, .coef=os1$coef, col=2) ## End(Not run) ################################################################ ################################################################ ######Four examples using the default method: ################################################################ ################################################################ ################################################################ #==> 1. Cox regression with a segmented relationship ################################################################ ## Not run: library(survival) data(stanford2) o<-coxph(Surv(time, status)~age, data=stanford2) os<-segmented(o, ~age, psi=40) #estimate the breakpoint in the age effect summary(os) #actually it means summary.coxph(os) plot(os) #it does not work plot.segmented(os) #call explicitly plot.segmented() to plot the fitted piecewise lines ################################################################ # ==> 2. Linear mixed model via the nlme package ################################################################ dati$g<-gl(10,10) #the cluster 'id' variable library(nlme) o<-lme(y~x+z, random=~1|g, data=dati) os<-segmented.default(o, ~x+z, npsi=list(x=2, z=1)) #summarizing results (note the '.coef' argument) slope(os, .coef=fixef(os)) plot.segmented(os, "x", .coef=fixef(os), conf.level=.95) confint.segmented(os, "x", .coef=fixef(os)) dd<-data.frame(x=c(20,50),z=c(.2,.6), g=1:2) predict.segmented(os, newdata=dd, .coef=fixef(os)) ################################################################ # ==> 3. segmented quantile regression via the quantreg package ################################################################ library(quantreg) data(Mammals) y<-with(Mammals, log(speed)) x<-with(Mammals, log(weight)) o<-rq(y~x, tau=.9) os<-segmented.default(o, ~x) #it does NOT work. It cannot compute the vcov matrix.. #Let's define the vcov.rq function.. (I don't know if it is the best option..) vcov.rq<-function(x,...) { V<-summary(x,cov=TRUE,se="nid",...)$cov rownames(V)<-colnames(V)<-names(x$coef) V} os<-segmented.default(o, ~x) #now it does work plot.segmented(os, res=TRUE, col=2, conf.level=.95) ################################################################ # ==> 4. segmented regression with the svyglm() (survey package) ################################################################ library(survey) data(api) dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc) o<-svyglm(api00~ell, design=dstrat) #specify as a string the objective function to be minimized. It can be obtained via svyvar() fn.x<- 'as.numeric(svyvar(resid(x, "pearson"), x$survey.design, na.rm = TRUE))' os<-segmented.default(o, ~ell, control=seg.control(fn.obj=fn.x, display=TRUE)) slope(os) plot.segmented(os, res=TRUE, conf.level=.9, shade=TRUE) ## End(Not run)
Fits linear mixed models with a segmented relationship between the response and a numeric covariate. Random effects are allowed in each model parameter, including the breakpoint.
## S3 method for class 'lme' segmented(obj, seg.Z, psi, npsi = 1, fixed.psi = NULL, control = seg.control(), model = TRUE, z.psi = ~1, x.diff = ~1, random = NULL, random.noG = NULL, start.pd = NULL, psi.link = c("identity", "logit"), start = NULL, data, fixed.parms = NULL,...)
## S3 method for class 'lme' segmented(obj, seg.Z, psi, npsi = 1, fixed.psi = NULL, control = seg.control(), model = TRUE, z.psi = ~1, x.diff = ~1, random = NULL, random.noG = NULL, start.pd = NULL, psi.link = c("identity", "logit"), start = NULL, data, fixed.parms = NULL,...)
obj |
A 'lme' fit returned by |
seg.Z |
A one-sided formula indicating the segmented variable, i.e. the quantitative variable having a segmented relationship with the response. In longitudinal studies typically it is the time. |
psi |
An optional starting value for the breakpoint. If missing a starting value is obtained via the nadir estimate of a quadratic fit. When provided it may be a single numeric value or a vector of length equal to the number of clusters (i.e. subjects). |
z.psi |
Optional. A one-sided formula meaning the covariates in the sub-regression model for the changepoint parameter. Default to |
x.diff |
Optional. A one-sided formula meaning the covariates in the sub-regression model for the difference-in-slopes parameter.
Default to |
npsi |
Ignored. Currently only |
fixed.psi |
Ignored. |
control |
A list returned by |
model |
Ignored. |
random |
A list, as the one supplied in
|
random.noG |
Ignored. |
start.pd |
An optional starting value for the variances of the random effects. It should be coherent with the
specification in |
psi.link |
The link function used to specify the sub-regression model for the breakpoint
while the logit link is
where |
start |
An optional list including the starting values for the difference-in-slopes parameter, delta0 and delta, and the changepoint parameter, kappa and kappa0. When provided, 'kappa0' overwrites 'psi'. If provided, the components 'delta' and 'kappa' should be named vectors with length and names
matching length and names in |
data |
the dataframe where the variables are stored. If missing, the dataframe of the |
fixed.parms |
An optional named vector representing the coefficients of the changepoint to be maintained fixed
during the estimation process. Allowed names are "G0" or any variable (in the dataframe) supposed to
affect the location of breakpoints.
For instance |
... |
Ignored |
The function fits segmented mixed regression models, i.e. segmented models with random effects also in the slope-difference and change-point parameters.
A list of class segmented.lme
with several components. The most relevant are
lme.fit |
The fitted lme object at convergence |
lme.fit.noG |
The fitted lme object at convergence assuming known the breakpoints |
psi.i |
The subject/cluster-specific change points (fixed + random). It includes 2 attributes: |
fixed.eta.psi |
The fixed-effect linear predictor for the change points regression equation. These values will different among 'clusters' only if at least one covariate has been specified in |
fixed.eta.delta |
The fixed-effect linear predictor of the slope difference regression equation. These values will different among 'clusters' only if at least one covariate has been specified in |
The function deals with estimation with a single breakpoint only.
Currently only one breakpoint (with or without random effects) can be estimated. If fit
is the segmented.lme fit, use VarCorr(fit$lme.fit)
to extract the random effect covariance matrix.
Vito M.R. Muggeo [email protected]
Muggeo V., Atkins D.C., Gallop R.J., Dimidjian S. (2014) Segmented mixed models with random changepoints: a maximum likelihood approach with application to treatment for depression study. Statistical Modelling, 14, 293-313.
Muggeo V. (2016) Segmented mixed models with random changepoints in R. Working paper available on RG. doi: 10.13140/RG.2.1.4180.8402
plot.segmented.lme
for the plotting method and segmented.default
(example 2) for segmented models with no random effects in breakpoints or slope difference.
## Not run: library(nlme) data(Cefamandole) Cefamandole$lTime <-log(Cefamandole$Time) Cefamandole$lconc <-log(Cefamandole$conc) o<-lme(lconc ~ lTime, random=~1|Subject, data=Cefamandole) os<-segmented.lme(o, ~lTime, random=list(Subject=pdDiag(~1+lTime+U+G0)), control=seg.control(n.boot=0, display=TRUE)) slope(os) #################################################### # covariate effect on the changepoint and slope diff #let's assume a new subject-specific covariates.. set.seed(69) Cefamandole$z <- rep(runif(6), rep(14,6)) Cefamandole$group <- gl(2,42,labels=c('a','b')) #Here 'group' affects the slopes and 'z' affects the changepoint o1 <-lme(lconc ~ lTime*group, random=~1|Subject, data=Cefamandole) os1 <- segmented(o1, ~lTime, x.diff=~group, z.psi=~z, random=list(Subject=pdDiag(~1+lTime+U+G0))) slope(os1, by=list(group="a")) #the slope estimates in group="a" (baseline level) slope(os1, by=list(group="b")) #the slope estimates in group="b" ################################################### # A somewhat "complicated" example: # i) strong heterogeneity in the changepoints # ii) No changepoint for the Subject #7 (added) d<-Cefamandole d$x<- d$lTime d$x[d$Subject==1]<- d$lTime[d$Subject==1]+3 d$x[d$Subject==5]<- d$lTime[d$Subject==5]+5 d$x[d$Subject==3]<- d$lTime[d$Subject==3]-5 d<-rbind(d, d[71:76,]) d$Subject <- factor(d$Subject, levels=c(levels(d$Subject),"7")) d$Subject[85:90] <- rep("7",6) o<-lme(lconc ~ x, random=~1|Subject, data=d) os2<-segmented.lme(o, ~x, random=list(Subject=pdDiag(~1+x+U+G0)), control=seg.control(n.boot=5, display=TRUE)) #plots with common x- and y- scales (to note heterogeneity in the changepoints) plot(os2, n.plot = c(3,3)) os2$psi.i attr(os2$psi.i, "is.break") #it is FALSE for Subject #7 #plots with subject-specific scales plot(os2, n.plot = c(3,3), xscale=-1, yscale = -1) ## End(Not run)
## Not run: library(nlme) data(Cefamandole) Cefamandole$lTime <-log(Cefamandole$Time) Cefamandole$lconc <-log(Cefamandole$conc) o<-lme(lconc ~ lTime, random=~1|Subject, data=Cefamandole) os<-segmented.lme(o, ~lTime, random=list(Subject=pdDiag(~1+lTime+U+G0)), control=seg.control(n.boot=0, display=TRUE)) slope(os) #################################################### # covariate effect on the changepoint and slope diff #let's assume a new subject-specific covariates.. set.seed(69) Cefamandole$z <- rep(runif(6), rep(14,6)) Cefamandole$group <- gl(2,42,labels=c('a','b')) #Here 'group' affects the slopes and 'z' affects the changepoint o1 <-lme(lconc ~ lTime*group, random=~1|Subject, data=Cefamandole) os1 <- segmented(o1, ~lTime, x.diff=~group, z.psi=~z, random=list(Subject=pdDiag(~1+lTime+U+G0))) slope(os1, by=list(group="a")) #the slope estimates in group="a" (baseline level) slope(os1, by=list(group="b")) #the slope estimates in group="b" ################################################### # A somewhat "complicated" example: # i) strong heterogeneity in the changepoints # ii) No changepoint for the Subject #7 (added) d<-Cefamandole d$x<- d$lTime d$x[d$Subject==1]<- d$lTime[d$Subject==1]+3 d$x[d$Subject==5]<- d$lTime[d$Subject==5]+5 d$x[d$Subject==3]<- d$lTime[d$Subject==3]-5 d<-rbind(d, d[71:76,]) d$Subject <- factor(d$Subject, levels=c(levels(d$Subject),"7")) d$Subject[85:90] <- rep("7",6) o<-lme(lconc ~ x, random=~1|Subject, data=d) os2<-segmented.lme(o, ~x, random=list(Subject=pdDiag(~1+x+U+G0)), control=seg.control(n.boot=5, display=TRUE)) #plots with common x- and y- scales (to note heterogeneity in the changepoints) plot(os2, n.plot = c(3,3)) os2$psi.i attr(os2$psi.i, "is.break") #it is FALSE for Subject #7 #plots with subject-specific scales plot(os2, n.plot = c(3,3), xscale=-1, yscale = -1) ## End(Not run)
segreg
(stepreg
) fits (generalized) linear segmented (stepmented) regression via a symbolic description of the linear predictor. This is an alternative but equivalent function, introduced since version 2.0-0 (segreg) and 2.1-0 (stepreg), to segmented.(g)lm
or stepmented.(g)lm
.
segreg(formula, data, subset, weights, na.action, family = lm, control = seg.control(), transf = NULL, contrasts = NULL, model = TRUE, x = FALSE, var.psi = TRUE, ...) stepreg(formula, data, subset, weights, na.action, family = lm, control = seg.control(), transf = NULL, contrasts = NULL, model = TRUE, x = FALSE, var.psi = FALSE, ...)
segreg(formula, data, subset, weights, na.action, family = lm, control = seg.control(), transf = NULL, contrasts = NULL, model = TRUE, x = FALSE, var.psi = TRUE, ...) stepreg(formula, data, subset, weights, na.action, family = lm, control = seg.control(), transf = NULL, contrasts = NULL, model = TRUE, x = FALSE, var.psi = FALSE, ...)
formula |
A standard model formula also including one or more 'segmented'/'stepmented' terms via the function |
data |
The possible dataframe where the variables are stored |
subset |
|
weights |
|
na.action |
a function which indicates what happen when the data contain NA values. See |
family |
The family specification, similar to |
control |
See |
transf |
an optional character string (with "y" as argument) meaning a function to apply to the response variable before fitting |
contrasts |
see |
model |
If |
x |
If |
var.psi |
logical, meaning if the standard errors for the breakpoint estimates should be returned in the object fit. If |
... |
Ignored |
The function allows to fit segmented/stepmented (G)LM regression models using a formula interface. Results will be the same of those coming from the traditional segmented.lm
and segmented.glm
(or stepmented.lm
or stepmented.glm
), but there are some additional facilities: i) it is possible to estimate strightforwardly the segmented/stepmented relationships in each level of a categorical variable, see argument by
in seg
; ii) it is possible to constrain some slopes of the segmented relationship, see argument est
or R
in seg
.
An object of class "segmented" (or "stepmented") which inherits from the class "lm" or "glm" depending on family
specification. See segmented.lm
.
Currently for fits returned by segreg
, confint.segmented
only works if method="delta"
.
Constraints on the mean levels (possibly via argument 'est' of seg
) are not yet allowed when calling stepreg
.
When the formula includes even a single segmented term with constraints (specified via the argument est
in seg()
), the relevant coefficients returned do not represent the slope differences as in segmented.lm
or segmented.glm
. The values depend on the constraints and are not usually interpretable. Use slope
the recover the actual slopes of the segmented relationships.
Vito Muggeo
Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. Statistics in Medicine 22, 3055-3071.
########################### #An example using segreg() ########################### set.seed(10) x<-1:100 z<-runif(100) w<-runif(100,-10,-5) y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+10*pmax(z-.5,0)+rnorm(100,0,2) ##the traditional approach out.lm<-lm(y~x+z+w) o<-segmented(out.lm, seg.Z=~x+z, psi=list(x=c(30,60),z=.4)) o1<-segreg(y ~ w+seg(x,npsi=2)+seg(z)) all.equal(fitted(o), fitted(o1)) #put some constraints on the slopes o2<-segreg(y ~ w+seg(x,npsi=2, est=c(0,1,0))+seg(z)) o3<-segreg(y ~ w+seg(x,npsi=2, est=c(0,1,0))+seg(z, est=c(0,1))) slope(o2) slope(o3) ##see ?plant for an additional example ########################### #An example using stepreg() ########################### ### Two stepmented covariates (with 1 and 2 breakpoints) n=100 x<-1:n/n z<-runif(n,2,5) w<-rnorm(n) mu<- 2+ 1*(x>.6)-2*(z>3)+3*(z>4) y<- mu + rnorm(n)*.8 os <-stepreg(y~seg(x)+seg(z,2)+w) #also includes 'w' as a possible linear term os summary(os) plot(os, "z", col=2:4) #plot the effect of z
########################### #An example using segreg() ########################### set.seed(10) x<-1:100 z<-runif(100) w<-runif(100,-10,-5) y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+10*pmax(z-.5,0)+rnorm(100,0,2) ##the traditional approach out.lm<-lm(y~x+z+w) o<-segmented(out.lm, seg.Z=~x+z, psi=list(x=c(30,60),z=.4)) o1<-segreg(y ~ w+seg(x,npsi=2)+seg(z)) all.equal(fitted(o), fitted(o1)) #put some constraints on the slopes o2<-segreg(y ~ w+seg(x,npsi=2, est=c(0,1,0))+seg(z)) o3<-segreg(y ~ w+seg(x,npsi=2, est=c(0,1,0))+seg(z, est=c(0,1))) slope(o2) slope(o3) ##see ?plant for an additional example ########################### #An example using stepreg() ########################### ### Two stepmented covariates (with 1 and 2 breakpoints) n=100 x<-1:n/n z<-runif(n,2,5) w<-rnorm(n) mu<- 2+ 1*(x>.6)-2*(z>3)+3*(z>4) y<- mu + rnorm(n)*.8 os <-stepreg(y~seg(x)+seg(z,2)+w) #also includes 'w' as a possible linear term os summary(os) plot(os, "z", col=2:4) #plot the effect of z
This function selects (and estimates) the number of breakpoints of the segmented relationship according to the BIC/AIC criterion or sequential hypothesis testing.
selgmented(olm, seg.Z, Kmax=2, type = c("score", "bic", "davies", "aic"), alpha = 0.05, control = seg.control(), refit = FALSE, stop.if = 5, return.fit = TRUE, bonferroni = FALSE, msg = TRUE, plot.ic = FALSE, th = NULL, G = 1, check.dslope = TRUE)
selgmented(olm, seg.Z, Kmax=2, type = c("score", "bic", "davies", "aic"), alpha = 0.05, control = seg.control(), refit = FALSE, stop.if = 5, return.fit = TRUE, bonferroni = FALSE, msg = TRUE, plot.ic = FALSE, th = NULL, G = 1, check.dslope = TRUE)
olm |
A starting |
seg.Z |
A one-side formula for the segmented variable. Only one term can be included, and it can be omitted if |
Kmax |
The maximum number of breakpoints being tested. If |
type |
Which criterion should be used? Options |
alpha |
The fixed type I error probability when sequential hypothesis testing is carried out (i.e. |
control |
See |
refit |
If |
stop.if |
An integer. If, when trying models with an increasing (when |
return.fit |
If |
bonferroni |
If |
msg |
If |
plot.ic |
If |
th |
When a large number of breakpoints is being tested, it could happen that 2 estimated breakpoints are too close each other, and only one can be retained. Thus if the difference between two breakpoints is less or equal to |
G |
Number of sub-intervals to consider to search for the breakpoints when |
check.dslope |
Logical. Effective only if |
The function uses properly the functions segmented
, pscore.test
or davies.test
to select the 'optimal' number of breakpoints 0,1,...,Kmax
. If type='bic'
or 'aic'
, the procedure stops if the last stop.if
fits have increasing values of the information criterion. Moreover, a breakpoint is removed if too close to other, actually if the difference between two consecutive estimates is less then th
. Finally, if check.dslope=TRUE
, breakpoints whose corresponding slope difference estimate is ‘small’ (i.e. -value larger then
alpha
or alpha/Kmax
) are also removed.
When the dataset is split into
groups, and the search is carried out separately within each group. This approach is fruitful when there are many breakpoints not evenly spaced in the covariate range and/or concentrated in some sub-intervals.
G=3
or 4
is recommended based on simulation evidence.
Note Kmax
is always tacitely reduced in order to have at least 1 residual df in the model with Kmax
changepoints. Namely, if , the maximal segmented model has
2*(Kmax + 1)
parameters, and therefore the largest Kmax
allowed is 8.
When type='score'
or 'davies'
, the function also returns the 'overall p-value' coming from combing the single p-values using the Fisher method. The pooled p-value, however, does not affect the final result which depends on the single p-values only.
The returned object depends on argument return.fit
. If FALSE
, the returned object is a list with some information on the compared models (i.e. the BIC values), otherwise a classical 'segmented'
object (see segmented
for details) with the component selection.psi
including the A/BIC values and
- if refit=TRUE
, psi.no.refit
that represents the breakpoint values before the last fit (with boot restarting)
- if G>1
, cutvalues
including the cutoffs values used to split the data.
If check.dslope=TRUE
, there is no guarantee that the final model has the lowest AIC/BIC. Namely the model with the best A/BIC could have ‘non-significant’ slope differences which will be removed (with the corresponding breakpoints) by the final model. Hence the possible plot (obtained via plot.ic=TRUE
) could be misleading. See Example 1 below.
Vito M. R. Muggeo
Muggeo V (2020) Selecting number of breakpoints in segmented regression: implementation in the R package segmented https://www.researchgate.net/publication/343737604
segmented
, pscore.test
, davies.test
set.seed(12) xx<-1:100 zz<-runif(100) yy<-2+1.5*pmax(xx-35,0)-1.5*pmax(xx-70,0)+15*pmax(zz-.5,0)+rnorm(100,0,2) dati<-data.frame(x=xx,y=yy,z=zz) out.lm<-lm(y~x,data=dati) os <-selgmented(out.lm) #selection (Kmax=2) via the Score test (default) os <-selgmented(out.lm, type="bic", Kmax=3) #BIC-based selection ## Not run: ######################################## #Example 1: selecting a large number of breakpoints b <- c(-1,rep(c(1.5,-1.5),l=15)) psi <- seq(.1,.9,l=15) n <- 2000 x <- 1:n/n X <- cbind(x, outer(x,psi,function(x,y)pmax(x-y,0))) mu <- drop(tcrossprod(X,t(b))) set.seed(113) y<- mu + rnorm(n)*.022 par(mfrow=c(1,2)) #select number of breakpoints via the BIC (and plot it) o<-selgmented(y, Kmax=20, type='bic', plot.ic=TRUE, check.dslope = FALSE) plot(o, res=TRUE, col=2, lwd=3) points(o) # select via the BIC + check on the slope differences (default) o1 <-selgmented(y, Kmax=20, type='bic', plot.ic=TRUE) #check.dslope = TRUE by default #note the plot of BIC is misleading.. But the number of psi is correct plot(o1, add=TRUE, col=3) points(o1, col=3, pch=3) ################################################## #Example 2: a large number of breakpoints not evenly spaced. b <- c(-1,rep(c(2,-2),l=10)) psi <- seq(.5,.9,l=10) n <- 2000 x <- 1:n/n X <- cbind(x, outer(x,psi,function(x,y)pmax(x-y,0))) mu <- drop(tcrossprod(X,t(b))) y<- mu + rnorm(n)*.02 #run selgmented with G>1. G=3 or 4 recommended. #note G=1 does not return the right number of breaks o1 <-selgmented(y, type="bic", Kmax=20, G=4) ## End(Not run)
set.seed(12) xx<-1:100 zz<-runif(100) yy<-2+1.5*pmax(xx-35,0)-1.5*pmax(xx-70,0)+15*pmax(zz-.5,0)+rnorm(100,0,2) dati<-data.frame(x=xx,y=yy,z=zz) out.lm<-lm(y~x,data=dati) os <-selgmented(out.lm) #selection (Kmax=2) via the Score test (default) os <-selgmented(out.lm, type="bic", Kmax=3) #BIC-based selection ## Not run: ######################################## #Example 1: selecting a large number of breakpoints b <- c(-1,rep(c(1.5,-1.5),l=15)) psi <- seq(.1,.9,l=15) n <- 2000 x <- 1:n/n X <- cbind(x, outer(x,psi,function(x,y)pmax(x-y,0))) mu <- drop(tcrossprod(X,t(b))) set.seed(113) y<- mu + rnorm(n)*.022 par(mfrow=c(1,2)) #select number of breakpoints via the BIC (and plot it) o<-selgmented(y, Kmax=20, type='bic', plot.ic=TRUE, check.dslope = FALSE) plot(o, res=TRUE, col=2, lwd=3) points(o) # select via the BIC + check on the slope differences (default) o1 <-selgmented(y, Kmax=20, type='bic', plot.ic=TRUE) #check.dslope = TRUE by default #note the plot of BIC is misleading.. But the number of psi is correct plot(o1, add=TRUE, col=3) points(o1, col=3, pch=3) ################################################## #Example 2: a large number of breakpoints not evenly spaced. b <- c(-1,rep(c(2,-2),l=10)) psi <- seq(.5,.9,l=10) n <- 2000 x <- 1:n/n X <- cbind(x, outer(x,psi,function(x,y)pmax(x-y,0))) mu <- drop(tcrossprod(X,t(b))) y<- mu + rnorm(n)*.02 #run selgmented with G>1. G=3 or 4 recommended. #note G=1 does not return the right number of breaks o1 <-selgmented(y, type="bic", Kmax=20, G=4) ## End(Not run)
Computes the slopes of each ‘segmented’ (or even ‘stepmented’) relationship in the fitted model.
slope(ogg, parm, conf.level = 0.95, rev.sgn=FALSE, APC=FALSE, .vcov=NULL, .coef=NULL, use.t=NULL, by=NULL, interc=TRUE, level=0, ..., digits = max(4, getOption("digits") - 2))
slope(ogg, parm, conf.level = 0.95, rev.sgn=FALSE, APC=FALSE, .vcov=NULL, .coef=NULL, use.t=NULL, by=NULL, interc=TRUE, level=0, ..., digits = max(4, getOption("digits") - 2))
ogg |
an object of class "segmented" or "segmented.lme", returned by any |
parm |
the segmented variable whose slopes have to be computed. If missing all the segmented variables are considered. |
conf.level |
the confidence level required. |
rev.sgn |
vector of logicals. The length should be equal to the length of |
APC |
logical. If |
.vcov |
The full covariance matrix of estimates. If unspecified (i.e. |
.coef |
The regression parameter estimates. If unspecified (i.e. |
use.t |
Which quantiles should be used to compute the confidence intervals? If |
by |
Only for |
interc |
logical, only for |
level |
Numeric, only for |
... |
Further arguments to be passed on to |
digits |
controls number of digits in the returned output. |
To fit broken-line relationships, segmented
uses a parameterization whose coefficients are not
the slopes. Therefore given an object "segmented"
, slope
computes point estimates,
standard errors, t-values and confidence intervals of the slopes of each segmented relationship in the fitted model.
slope
returns a list of matrices. Each matrix represents a segmented relationship and its number of rows equal
to the number of segments, while five columns summarize the results.
The returned summary is based on limiting Gaussian distribution for the model parameters involved
in the computations. Sometimes, even with large sample sizes such approximations are questionable
(e.g., with small difference-in-slope parameters) and the results returned by slope
might be unreliable. Therefore is responsability of the user to gauge the applicability of such asymptotic
approximations. Anyway, the t values may be not assumed for testing purposes
and they should be used just as guidelines to assess the estimate uncertainty.
Vito M. R. Muggeo, [email protected]
Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. Statistics in Medicine 22, 3055–3071.
See also davies.test
and pscore.test
to test for a nonzero difference-in-slope parameter.
set.seed(16) x<-1:100 y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+rnorm(100,0,3) out<-glm(y~1) out.seg<-segmented(out,seg.Z=~x,psi=list(x=c(20,80))) ## the slopes of the three segments.... slope(out.seg) rm(x,y,out,out.seg) # ## an heteroscedastic example.. set.seed(123) n<-100 x<-1:n/n y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) o<-lm(y~x) oseg<-segmented(o,seg.Z=~x,psi=.6) slope(oseg) slope(oseg,var.diff=TRUE) #better CI
set.seed(16) x<-1:100 y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+rnorm(100,0,3) out<-glm(y~1) out.seg<-segmented(out,seg.Z=~x,psi=list(x=c(20,80))) ## the slopes of the three segments.... slope(out.seg) rm(x,y,out,out.seg) # ## an heteroscedastic example.. set.seed(123) n<-100 x<-1:n/n y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) o<-lm(y~x) oseg<-segmented(o,seg.Z=~x,psi=.6) slope(oseg) slope(oseg,var.diff=TRUE) #better CI
The stagnant
data frame has 28 rows and 2 columns.
data(stagnant)
data(stagnant)
A data frame with 28 observations on the following 2 variables.
x
log of flow rate in g/cm sec.
y
log of band height in cm
Bacon and Watts report that such data were obtained by R.A. Cook during his investigation of the behaviour of stagnant surface layer height in a controlled flow of water.
Bacon D.W., Watts D.G. (1971) Estimating the transistion between two intersecting straight lines. Biometrika 58: 525 – 534.
Originally from the PhD thesis by R.A. Cook
data(stagnant) ## plot(stagnant)
data(stagnant) ## plot(stagnant)
step.lm.fit
is called by stepmented.lm
to fit stepmented linear
(gaussian) models. Likewise, step.glm.fit
is called by stepmented.glm
to fit
generalized stepmented linear models.
The step.*.fit.boot
functions are employed to perform bootstrap restarting.
These functions should usually not be used directly by the user.
step.lm.fit(y, x.lin, Xtrue, PSI, ww, offs, opz, return.all.sol=FALSE) step.lm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) step.glm.fit(y, x.lin, Xtrue, PSI, ww, offs, opz, return.all.sol=FALSE) step.glm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot)
step.lm.fit(y, x.lin, Xtrue, PSI, ww, offs, opz, return.all.sol=FALSE) step.lm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot) step.glm.fit(y, x.lin, Xtrue, PSI, ww, offs, opz, return.all.sol=FALSE) step.glm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE, break.boot=n.boot)
y |
vector of observations of length |
x.lin , XREG
|
design matrix for standard linear terms. |
Xtrue , Z
|
appropriate matrix including the stepmented variables whose breakpoints have to be estimated. |
PSI |
appropriate matrix including the starting values of the breakpoints to be estimated. |
ww , w
|
possibe weights vector. |
offs |
possibe offset vector. |
opz |
a list including information useful for model fitting. |
n.boot |
the number of bootstrap samples employed in the bootstrap restart algorithm. |
break.boot |
Integer, less than |
size.boot |
the size of the bootstrap resamples. If |
jt |
logical. If |
nonParam |
if |
random |
if |
return.all.sol |
if |
The functions call iteratively lm.wfit
(or glm.fit
) with proper design matrix depending on
XREG
, Z
and PSI
. step.lm.fit.boot
(and step.glm.fit.boot
) implements the bootstrap restarting idea discussed in
Wood (2001).
A list of fit information.
These functions should usually not be used directly by the user.
Vito Muggeo
Wood, S. N. (2001) Minimizing model fitting objectives that contain spurious local minima by bootstrap restarting. Biometrics 57, 240–244.
stepmented.lm
or stepmented.glm
##See ?stepmented
##See ?stepmented
Fits regression models with stepmented (i.e. piecewise-constant) relationships between the response and one or more explanatory variables. Break-point estimates are provided.
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ...) ## S3 method for class 'lm' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ...) ## S3 method for class 'glm' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ...) ## S3 method for class 'numeric' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ..., pertV=0, centerX=FALSE, adjX=NULL, weights=NULL) ## S3 method for class 'ts' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ..., pertV=0, centerX=FALSE, adjX=NULL)
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ...) ## S3 method for class 'lm' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ...) ## S3 method for class 'glm' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ...) ## S3 method for class 'numeric' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ..., pertV=0, centerX=FALSE, adjX=NULL, weights=NULL) ## S3 method for class 'ts' stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), keep.class=FALSE, var.psi=FALSE, ..., pertV=0, centerX=FALSE, adjX=NULL)
obj |
A standard ‘linear’ regression model of class "lm" or "glm". Alternatively, a simple "ts" object or a simple data vector may be supplied. |
seg.Z |
the stepmented variables(s), i.e. the numeric covariate(s) understood to have a piecewise-constant relationship with response. It is a formula with no response variable, such as |
psi |
starting values for the breakpoints to be estimated. If there is a single stepmented variable specified in |
npsi |
A named vector or list meaning the number (and not locations) of breakpoints to be estimated. The starting values will be internally computed via the quantiles or equally spaced values, as specified in argument |
fixed.psi |
An optional named list including the breakpoint values to be kept fixed during the estimation procedure. The names should be a subset of (or even the same) variables specified in |
control |
a list of parameters for controlling the fitting process.
See the documentation for |
keep.class |
logical value indicating if the final fit returned by |
... |
optional arguments (to be ignored safely). Notice specific arguments relevant to the original call (via |
pertV |
Only for |
centerX |
Only for |
adjX |
Only for |
var.psi |
logical. If |
weights |
possible weights to include in the estimation process (only for |
Given a linear regression model (usually of class "lm" or "glm"), stepmented tries to estimate
a new regression model having piecewise-constant (i.e. step-function like) relationships with the variables specified in seg.Z
.
A stepmented relationship is defined by the mean level
parameters and the break-points where the mean level changes. The number of breakpoints
of each stepmented relationship depends on the psi
argument, where initial
values for the break-points must be specified. The model
is estimated simultaneously yielding point estimates and relevant approximate
standard errors of all the model parameters, including the break-points.
stepmented
implements the algorithm described in Fasola et al. (2018) along with bootstrap restarting (Wood, 2001) to escape local optima. The procedure turns out to be particularly appealing and probably efficient when there are two or more covariates exhibiting different change points to be estimated.
The returned object is of class "stepmented" which inherits
from the class "lm" or "glm" depending on the class of obj
. When only.mean=FALSE
, it is a list having two 'stepmented' fits (for the mean and for the dispersion submodels).
An object of class "stepmented" is a list containing the components of the
original object obj
with additionally the followings:
psi |
estimated break-points and relevant (approximate) standard errors (on the continuum) |
psi.rounded |
the rounded estimated break-points (see Note, below) |
it |
number of iterations employed |
epsilon |
difference in the objective function when the algorithm stops |
model |
the model frame |
psi.history |
a list or a vector including the breakpoint estimates at each step |
seed |
the integer vector containing the seed just before the bootstrap resampling. Returned only if bootstrap restart is employed |
.. |
Other components are not of direct interest of the user |
The component psi.rounded
of the fit object includes the rounded changepoint values which are usually taken as the final estimates. More specifically, each column of psi.rounded
represents a changepoint and the corresponding rows are the range of the ‘optimal’ interval. The first row, i.e. the lower bound of the interval, is taken as point estimate. print.stepmented
, print.summary.stepmented
, and confint.stepmented
return the rounded (lower) value of the interval.
Also:
The algorithm will start if the it.max
argument returned by seg.control
is greater than zero. If it.max=0
stepmented
will estimate a new linear model with
break-point(s) fixed at the starting values reported in psi
. Alternatively, it is also possible to set h=0
in seg.control()
. In this case, bootstrap restarting is unncessary, then to have changepoints at mypsi
type
stepmented(.., psi=mypsi, control=seg.control(h=0, n.boot=0, it.max=1))
In the returned fit object, ‘U.’ is put before the name of the stepmented
variable to indicate the difference in the mean levels. slope
can be used to compute the actual mean levels corresponding to the different intervals.
Currently methods specific to the class "stepmented"
are
Others are inherited from the class "lm"
or "glm"
depending on the
class of obj
.
Vito M. R. Muggeo, [email protected] (based on original code by Salvatore Fasola)
Fasola S, Muggeo VMR, Kuchenhoff H (2018) A heuristic, iterative algorithm for change-point detection in abrupt change models, Computational Statistics 33, 997–1015
segmented
for segmented regression, lm
, glm
n=20 x<-1:n/n mu<- 2+ 1*(x>.6) y<- mu + rnorm(n)*.8 #fitting via regression model os <-stepmented(lm(y~1),~x) y<-ts(y) os1<- stepmented(y) #the 'ts' method os2<- stepmented(y, npsi=2) #plot(y) #plot(os1, add=TRUE) #plot(os2, add=TRUE, col=3:5) ### Example with (poisson) GLM y<- rpois(n,exp(mu)) o<-stepmented(glm(y~1,family=poisson)) plot(o, res=TRUE) ## Not run: ## Example using the (well-known) Nile dataset data(Nile) plot(Nile) os<- stepmented(Nile) plot(os, add=TRUE) ### Example with (binary) GLM (example from the package stepR) set.seed(1234) y <- rbinom(200, 1, rep(c(0.1, 0.7, 0.3, 0.9), each=50)) o<-stepmented(glm(y~1,family=binomial), npsi=3) plot(o, res=TRUE) ### Two stepmented covariates (with 1 and 2 breakpoints); z has also an additional linear effect n=100 x<-1:n/n z<-runif(n,2,5) mu<- 2+ 1*(x>.6)-2*(z>3)+3*(z>4)+z y<- mu + rnorm(n)*.8 os <-stepmented(lm(y~z),~x+z, npsi=c(x=1,z=2)) os summary(os) ## see ?plot.stepmented ## End(Not run)
n=20 x<-1:n/n mu<- 2+ 1*(x>.6) y<- mu + rnorm(n)*.8 #fitting via regression model os <-stepmented(lm(y~1),~x) y<-ts(y) os1<- stepmented(y) #the 'ts' method os2<- stepmented(y, npsi=2) #plot(y) #plot(os1, add=TRUE) #plot(os2, add=TRUE, col=3:5) ### Example with (poisson) GLM y<- rpois(n,exp(mu)) o<-stepmented(glm(y~1,family=poisson)) plot(o, res=TRUE) ## Not run: ## Example using the (well-known) Nile dataset data(Nile) plot(Nile) os<- stepmented(Nile) plot(os, add=TRUE) ### Example with (binary) GLM (example from the package stepR) set.seed(1234) y <- rbinom(200, 1, rep(c(0.1, 0.7, 0.3, 0.9), each=50)) o<-stepmented(glm(y~1,family=binomial), npsi=3) plot(o, res=TRUE) ### Two stepmented covariates (with 1 and 2 breakpoints); z has also an additional linear effect n=100 x<-1:n/n z<-runif(n,2,5) mu<- 2+ 1*(x>.6)-2*(z>3)+3*(z>4)+z y<- mu + rnorm(n)*.8 os <-stepmented(lm(y~z),~x+z, npsi=c(x=1,z=2)) os summary(os) ## see ?plot.stepmented ## End(Not run)
summary method for class segmented
.
## S3 method for class 'segmented' summary(object, short = FALSE, var.diff = FALSE, p.df="p", .vcov=NULL, ...) ## S3 method for class 'summary.segmented' print(x, short=x$short, var.diff=x$var.diff, digits = max(3, getOption("digits") - 3), signif.stars = getOption("show.signif.stars"),...)
## S3 method for class 'segmented' summary(object, short = FALSE, var.diff = FALSE, p.df="p", .vcov=NULL, ...) ## S3 method for class 'summary.segmented' print(x, short=x$short, var.diff=x$var.diff, digits = max(3, getOption("digits") - 3), signif.stars = getOption("show.signif.stars"),...)
object |
Object of class "segmented". |
short |
logical indicating if the ‘short’ summary should be printed. |
var.diff |
logical indicating if different error variances should be computed
in each interval of the segmented variable, see Details. If |
p.df |
A character as a function of |
.vcov |
Optional. The full covariance matrix for the parameter estimates. If provided, standard errors are computed (and displayed) according to this matrix. |
x |
a |
digits |
controls number of digits printed in output. |
signif.stars |
logical, should stars be printed on summary tables of coefficients? |
... |
further arguments. |
If short=TRUE
only coefficients of the segmented relationships are printed.
If var.diff=TRUE
and there is only one segmented variable, different error variances are
computed in the intervals defined by the estimated breakpoints of the segmented variable.
For the jth interval with observations, the error variance is estimated via
,
where
is the residual sum of squares in interval j, and
is the number of model parameters. This number to be subtracted from
can be changed via argument
p.df
. For instance p.df="0"
uses , and
p.df="p/K"
leads to , where
is the number of groups (segments), and
can be interpreted as the average number of model parameter in that group.
Note var.diff=TRUE
only affects the estimates covariance matrix. It does not affect the parameter estimates, neither the log likelihood and relevant measures, such as AIC or BIC. In other words, var.diff=TRUE
just provides 'alternative' standard errors, probably appropriate when the error variances are different before/after the estimated breakpoints. Also are computed using the t-distribution with 'naive' degrees of freedom (as reported in
object$df.residual
).
If var.diff=TRUE
the variance-covariance matrix of the estimates is computed via the
sandwich formula,
where V is the diagonal matrix including the different group-specific error variance estimates. Standard errors are the square root of the main diagonal of this matrix.
A list (similar to one returned by segmented.lm
or segmented.glm
) with additional components:
psi |
estimated break-points and relevant (approximate) standard errors |
Ttable |
estimates and standard errors of the model parameters. This is similar
to the matrix |
gap |
estimated coefficients, standard errors and t-values for the ‘gap’ variables |
cov.var.diff |
if |
sigma.new |
if |
df.new |
if |
Vito M.R. Muggeo
##continues example from segmented() # summary(segmented.model,short=TRUE) ## an heteroscedastic example.. # set.seed(123) # n<-100 # x<-1:n/n # y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) # o<-lm(y~x) # oseg<-segmented(o,seg.Z=~x,psi=.6) # summary(oseg,var.diff=TRUE)$sigma.new
##continues example from segmented() # summary(segmented.model,short=TRUE) ## an heteroscedastic example.. # set.seed(123) # n<-100 # x<-1:n/n # y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) # o<-lm(y~x) # oseg<-segmented(o,seg.Z=~x,psi=.6) # summary(oseg,var.diff=TRUE)$sigma.new
summary method for class segmented.lme
.
## S3 method for class 'segmented.lme' summary(object, .vcov=NULL, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'segmented.lme' summary(object, .vcov=NULL, digits = max(3, getOption("digits") - 3), ...)
object |
Object of class "segmented.lme". |
.vcov |
Optional. The full covariance matrix for the parameter estimates. If provided, standard errors are computed (and displayed) according to this matrix. |
digits |
controls number of digits printed in output. |
... |
further arguments. |
The function summarizes and prints the most relevant information on the segmented mixed fit. The output is similar to that returned by print.summary.lme
A list (similar to one returned by segmented.lm
) with estimates of the variance components, and point estimates, standard errors, DF, t-value and p-value for the fixed effects. p-values for the variables U
and G0
are omitted as pointless.
Vito M.R. Muggeo
##continues example from segmented.lme() # summary(os)
##continues example from segmented.lme() # summary(os)
summary/print method for class stepmented
.
## S3 method for class 'stepmented' summary(object, short = FALSE, var.diff = FALSE, p.df="p", .vcov=NULL, ...) ## S3 method for class 'summary.stepmented' print(x, short=x$short, var.diff=x$var.diff, digits = max(3, getOption("digits") - 3), signif.stars = getOption("show.signif.stars"),...) ## S3 method for class 'stepmented' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'stepmented' summary(object, short = FALSE, var.diff = FALSE, p.df="p", .vcov=NULL, ...) ## S3 method for class 'summary.stepmented' print(x, short=x$short, var.diff=x$var.diff, digits = max(3, getOption("digits") - 3), signif.stars = getOption("show.signif.stars"),...) ## S3 method for class 'stepmented' print(x, digits = max(3, getOption("digits") - 3), ...)
object , x
|
Object of class "stepmented" or a |
short |
logical indicating if the ‘short’ summary should be printed. |
var.diff |
logical indicating if different error variances should be computed
in each interval of the stepmented variable, see Details. If |
p.df |
A character as a function of |
.vcov |
Optional. The full covariance matrix for the parameter estimates. If provided, standard errors are computed (and displayed) according to this matrix. |
digits |
controls number of digits printed in output. |
signif.stars |
logical, should stars be printed on summary tables of coefficients? |
... |
further arguments, notably |
If short=TRUE
only coefficients of the stepmented relationships are printed.
If var.diff=TRUE
and there is only one stepmented variable, different error variances are
computed in the intervals defined by the estimated breakpoints of the stepmented variable.
For the jth interval with observations, the error variance is estimated via
,
where
is the residual sum of squares in interval j, and
is the number of model parameters. This number to be subtracted from
can be changed via argument
p.df
. For instance p.df="0"
uses , and
p.df="p/K"
leads to , where
is the number of groups (segments), and
can be interpreted as the average number of model parameter in that group.
Note var.diff=TRUE
only affects the estimates covariance matrix. It does not affect the parameter estimates, neither the log likelihood and relevant measures, such as AIC or BIC. In other words, var.diff=TRUE
just provides 'alternative' standard errors, probably appropriate when the error variances are different before/after the estimated breakpoints. Also are computed using the t-distribution with 'naive' degrees of freedom (as reported in
object$df.residual
).
If var.diff=TRUE
the variance-covariance matrix of the estimates is computed via the
sandwich formula,
where V is the diagonal matrix including the different group-specific error variance estimates. Standard errors are the square root of the main diagonal of this matrix.
A list (similar to one returned by stepmented.lm
or stepmented.glm
) with additional components:
psi |
estimated break-points and relevant (approximate) standard errors |
Ttable |
estimates and standard errors of the model parameters. This is similar
to the matrix |
cov.var.diff |
if |
sigma.new |
if |
df.new |
if |
If type
is not specified in ...
(which means type="standard"
), no standard error will be computed (and returned) for the jumpoint.
Vito M.R. Muggeo
##continues example from stepmented() # summary(stepmented.model,short=TRUE) ## an heteroscedastic example.. # set.seed(123) # n<-100 # x<-1:n/n # y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) # o<-lm(y~x) # oseg<-stepmented(o,seg.Z=~x,psi=.6) # summary(oseg,var.diff=TRUE)$sigma.new
##continues example from stepmented() # summary(stepmented.model,short=TRUE) ## an heteroscedastic example.. # set.seed(123) # n<-100 # x<-1:n/n # y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) # o<-lm(y~x) # oseg<-stepmented(o,seg.Z=~x,psi=.6) # summary(oseg,var.diff=TRUE)$sigma.new
Returns the variance-covariance matrix of the parameters (including breakpoints) of a fitted segmented model object.
## S3 method for class 'segmented' vcov(object, var.diff = FALSE, is = FALSE, ...)
## S3 method for class 'segmented' vcov(object, var.diff = FALSE, is = FALSE, ...)
object |
a fitted model object of class "segmented", returned by any |
var.diff |
logical. If |
is |
logical. If |
... |
additional arguments. |
The returned covariance matrix is based on an approximation of the nonlinear segmented term. Therefore
covariances corresponding to breakpoints are reliable only in large samples and/or clear cut segmented
relationships. If is=TRUE
, the returned covariance matrix depends on the design matrix having the term replaced by its smooth counterpart.
The full matrix of the estimated covariances between the parameter estimates, including the breakpoints.
var.diff=TRUE
works when there is a single segmented variable.
Vito M. R. Muggeo, [email protected]
##continues example from summary.segmented() # vcov(oseg) # vcov(oseg, var.diff=TRUE) # vcov(oseg, is=TRUE)
##continues example from summary.segmented() # vcov(oseg) # vcov(oseg, var.diff=TRUE) # vcov(oseg, is=TRUE)
Returns the variance-covariance matrix of the parameters (including breakpoints) of a fitted segmented mixed model object.
## S3 method for class 'segmented.lme' vcov(object, B=0, ret.b=FALSE, ...)
## S3 method for class 'segmented.lme' vcov(object, B=0, ret.b=FALSE, ...)
object |
a fitted model object of class "segmented.lme", returned by |
B |
number of bootstrap replicates, if a bootstrap-based covariance matrix is requested. |
ret.b |
logical. If |
... |
optional arguments, i.e. |
The returned covariance matrix is based on an approximation of the nonlinear segmented term. Therefore
covariances corresponding to breakpoints are reliable only in large samples and/or clear cut segmented
relationships. If B>0
is set, case resampling bootstrap (on the outermost nesting level) is carried out. Moreover, if ret.b=TRUE
, the bootstrap distributions are returned, rather than the covariance matrix.
The full matrix of the estimated covariances of the fixed effects estimates, including the breakpoint.
All the functions for segmented mixed models (*.segmented.lme) are still at an experimental stage
Vito M. R. Muggeo, [email protected]
##continues example from segmented.lme() # vcov(os) # vcov(os, B=50) # vcov(os, B=50, ret.b=TRUE)
##continues example from segmented.lme() # vcov(os) # vcov(os, B=50) # vcov(os, B=50, ret.b=TRUE)
Returns the variance-covariance matrix of the parameters estimates (including breakpoints) of a fitted stepmented model object.
## S3 method for class 'stepmented' vcov(object, k=NULL, zero.cor=TRUE, type=c("cdf", "none", "abs"), ...)
## S3 method for class 'stepmented' vcov(object, k=NULL, zero.cor=TRUE, type=c("cdf", "none", "abs"), ...)
object |
a fitted model object of class "stepmented", returned by any |
k |
The power of |
zero.cor |
If |
type |
How the covariance matrix should be computed. If |
... |
additional arguments. |
The full covariance matrix is based on the smooth approximation
via the sandwich formula using the empirical information matrix and assuming .
is the standard Normal cdf, and
is the argument
k
. When k=NULL
(default), it is computed via
where is the signal-to-noise ratio corresponding to the estimated changepoint
(in the range (0,1)). The above formula comes from extensive simulation studies under different scenarios: Seo and Linton (2007) discuss using the normal cdf to smooth out the indicator function by suggesting
as bandwidth; we found such suggestion does not perform well in practice.
The full matrix of the estimated covariances between the parameter estimates, including the breakpoints.
The function, including the value of , must be considered at preliminary stage. Currently the value of
appears to overestimate slightly the true
variability.
If the fit object
has been called by stepmented(.., var.psi=TRUE)
, then vcov.stepmented
will return object$vcov
, unless the power k
differs from -2/3
.
Vito Muggeo
Seo MH, Linton O (2007) A smoothed least squares estimator for threshold regression models, J of Econometrics, 141: 704-735
##see ?stepmented
##see ?stepmented