Title: | Discrete Prolate Spheroidal (Slepian) Sequence Regression Smoothers |
---|---|
Description: | Interface for creation of 'slp' class smoother objects for use in Generalized Additive Models (as implemented by packages 'gam' and 'mgcv'). |
Authors: | Wesley Burr, with contributions from Karim Rahim |
Maintainer: | Wesley Burr <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0-5 |
Built: | 2024-11-22 06:39:23 UTC |
Source: | CRAN |
Interface for creation of 'slp' class smoother objects for use in Generalized Additive Models as implemented by packages gam and mgcv.
Package: | slp |
Type: | Package |
Version: | 1.0-5 |
Date: | 2016-08-28 |
License: | GPL-2 |
Wesley Burr, with contributions from Karim Rahim
Maintainer: Wesley Burr <[email protected]>
Thomson, D.J. (1982) Spectrum estimation and harmonic analysis. Proceedings of the IEEE Volume 70, number 9, pp. 1055–1096.
Thomson, D.J. (2001) Inverse Constrained Projection Filters. Proc. SPIE 4478, Wavelets: Applications in Signal and Image Processing IX, 172 (December 5, 2001); doi:10.1117/12.449708
Hastie, T. J. (1991) Generalized additive models. Chapter 7 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth \& Brooks/Cole.
Hastie, T. and Tibshirani, R. (1990) Generalized Additive Models. London: Chapman and Hall.
Wood, S.N. (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36
Wood, S.N. (2004) Stable and efficient multiple smoothing parameter estimation for generalized additive models. J. Amer. Statist. Ass. 99:673-686.
Wood, S.N. (2006) Generalized Additive Models: an introduction with R, CRC
Compute Discrete Prolate Spheroidal (Slepian) Sequences for use as time-based smoother. This approach uses the tridiagonal method and exploits symmetry. Note the odd order tapers are normalized so that the slope at the centre is positive, in accordance with Slepian (1978) and Thomson (1982). This differs from Percival and Walden (1993). This code follows Chapter 8.3 of Percival and Walden (1993) using LAPACK function calls, Anderson (1999).
.dpss(n,k,nw)
.dpss(n,k,nw)
n |
A positive integer, typically the length of the time series-regression data set. |
k |
A positive integer, the number of basis vectors for the smoother, often 2*nw - 2. |
nw |
A positive double-precision number, the time-bandwidth parameter. The frequency domain analogue of the maximum period of interest. |
v |
A 'n' by 'k' matrix of basis vectors of class 'dpss'. Each column is the appropriate Slepian sequence of order 'k-1'. |
Anderson, E. (1999). LAPACK Users' guide (Vol. 9). SIAM.
Percival, D.B. and Walden, A.T. (1993) Spectral analysis for physical applications. Cambridge University Press.
Slepian, D. (1978) Prolate spheroidal wave functions, Fourier analysis, and uncertainty. V–The discrete case. Bell System Technical Journal Volume 57, pp. 1371–1430
Thomson, D.J (1982) Spectrum estimation and harmonic analysis. Proceedings of the IEEE Volume 70, number 9, pp. 1055–1096.
Shared name for a number of pre-computed basis sets included with the
slp
package. For large N
, significant speed-ups can be
obtained by pre-computing the basis set and simply loading it from disk.
Each file named basis_N_XXX_W_X_K_XX.RData
,
with the X
entries integers indicating the N
, W
(in
df/
year) and K
parameters, can be loaded via data(...)
as a basis
object in the environment of your choice.
Currently, these pre-computed bases are used as speed-up aids within
{slp(...)}
and {s(..., bs='slp', ...)}
.
A full list of available bases can be obtained by examining
slpSavedObjects
, an additional data(...)
object included
with the package.
Given values for N
, W
and K
, determine if the combination is
available on disk as a data()
load. If so, significant time savings can be
obtained, especially for large N
.
checkSaved(N, W, K)
checkSaved(N, W, K)
N |
the length of the time index array, in units of |
W |
the time bandwidth, in units of |
K |
the number of basis vectors requested. |
Does a lookup against data(slpSavedObjects)
to determine whether the
combination of N
, W
and K
is saved on disk as part of the
package, and can be loaded. It is possible to create your own basis objects for
particular choices of N
, W
and K
and save them as part
of the library directory, updating slpSavedObjects
as you do so.
Logical (TRUE
or FALSE
), indicating availability or lack thereof.
# Examples using pkg:gam library("slp") checkSaved(N = 730, W = 6, K = 24) checkSaved(N = 365, W = 6, K = 13)
# Examples using pkg:gam library("slp") checkSaved(N = 730, W = 6, K = 24) checkSaved(N = 365, W = 6, K = 13)
Re-generate the basis matrix for a particular N, W
Slepian sequence
family member, with the additional property that the smoother captures/passes constants
without distortion. Simply re-arranges object
. Not intended to be used
directly by user.
## S3 method for class 'slp.smooth' Predict.matrix(object,data)
## S3 method for class 'slp.smooth' Predict.matrix(object,data)
object |
a smooth specification object, usually generated by a model term |
data |
a list containing just the data required by this term, with names corresponding to
|
Presumably because most basis sets are larger in size than their computational
burden, mgcv
passes objects around without including the actual basis
vectors. For example, if using basis cr
, the parameters are included in
object
, and then the bases re-computed as needed.
As the slp
basis is significantly more computational in nature, the
basis vectors are saved as part of the object. While mgcv
deletes
the main set of time-aligned vectors, this routine restores such vectors
so that predict
and plot
work correctly.
A corrected (re-assembled) version of object
, which contains the
X
basis vectors in a format that can be used by predict
or
plot
.
smooth.construct
,
Predict.matrix
Generate the basis matrix for a particular N, W
Slepian sequence
family member, with the additional property that the smoother passes constants
without distortion. Can be quite slow execution due to the latter property.
Based on ns
for implementation with gam
.
Parallel implementation for mgcv
included in package as
slp.mgcv
.
slp(x, W = NA, K = NA, deltat = 1, naive = FALSE, intercept = FALSE, customSVD = TRUE, forceC = FALSE, returnS = FALSE)
slp(x, W = NA, K = NA, deltat = 1, naive = FALSE, intercept = FALSE, customSVD = TRUE, forceC = FALSE, returnS = FALSE)
x |
the predictor variable. Missing values are allowed. Assumed to be contiguous;
if not, then converted to a contiguous series to determine appropriate |
W |
the time bandwidth. Computed as the frequency domain analogue of the maximum period of
interest for a time series-regression problem using “smooth functions of time”. For example,
a period choice of 2 months converts to 60 days and |
K |
the number of basis vectors requested. If not provided, then |
deltat |
the time step for the input |
naive |
a flag for returning the naive (default) Slepian basis vectors |
intercept |
a flag for choosing between a SLP2 or SLP3 basis. Type-2 bases capture (absorb) means of target series, while Type-3 bases ignore (pass) means. |
customSVD |
a flag for using the built-in |
forceC |
a flag for forced computation of the basis vectors. Several combinations of commonly
used |
returnS |
a flag for returning the projection matrix |
slp
is based around the routine .dpss
, which generates a family of Discrete
Prolate Spheroidal (Slepian) Sequences. These vectors are orthonormal, have alternating
even/odd parity, and form the optimally concentrated basis set for the subspace of
R^N
corresponding to the bandwidth W
. Full details are given
in Slepian (1978). These basis functions have natural boundary conditions, and lack any form of
knot structure. This version is returned for naive = TRUE
.
The dpss
basis vectors can be adapted to provide the additional
useful property of capturing or passing constants perfectly. That is, the smoother matrix
S
formed from the returned rectangular matrix will either reproduce constants
at near round-off precision, i.e., S %*% rep(1, N) = rep(1, N)
,
for naive = FALSE
with intercept = TRUE
, or will pass constants,
i.e., S %*% rep(1, N) = rep(0, N)
, for naive = FALSE
with intercept = FALSE
.
The primary use is in modeling formula to directly specify a Slepian time-based smoothing term in a model: see the examples.
For large N
this routine can be very slow. If you are computing models with
large N
, we highly recommend pre-computing the basis object, then using it
in your models without recomputation. The third example below demonstrates this approach.
A matrix of dimension length(x) * K
or length(x) * (K-1)
where
either K
was supplied, or W
was supplied and K
converted. Note that the
basis vectors are computed on a contiguous grid based on x
, and then
back-converted to the time structure of x
.
Attributes are returned that correspond to the arguments to ns
,
and explicitly give K
, W
, etc.
Thomson, D.J (1982) Spectrum estimation and harmonic analysis. Proceedings of the IEEE. Volume 70, number 9, pp. 1055-1096.
Slepian, David (1978) Prolate Spheroidal Wave Functions, Fourier Analysis, and Uncertainty V: the Discrete Case. Bell System Technical Journal. Volume 57, pp. 1371-1429.
# Examples using pkg:gam library("gam") library("slp") N <- 730 W <- 14 / N K <- 28 # will actually use 27 df when intercept = FALSE x <- rnorm(n = N, sd = 1) y <- x + rnorm(n = N, sd = 2) + 5.0 t <- seq(1, N) # note: all three examples share identical results # example with in-call computation, using K (df) fit1 <- gam(y ~ x + slp(t, K = K, forceC = TRUE), family = gaussian) # example with in-call computation, using W fit2 <- gam(y ~ x + slp(t, W = W, forceC = TRUE), family = gaussian) # example with out-of-call computation, using K timeBasis <- slp(t, K = K, forceC = TRUE) fit3 <- gam(y ~ x + timeBasis, family = gaussian) # the same computations can be done using pre-computed basis vectors # for significant speed-ups, especially for large N - see `checkSaved' # for more details fit4 <- gam(y ~ x + slp(t, W = W, forceC = FALSE))
# Examples using pkg:gam library("gam") library("slp") N <- 730 W <- 14 / N K <- 28 # will actually use 27 df when intercept = FALSE x <- rnorm(n = N, sd = 1) y <- x + rnorm(n = N, sd = 2) + 5.0 t <- seq(1, N) # note: all three examples share identical results # example with in-call computation, using K (df) fit1 <- gam(y ~ x + slp(t, K = K, forceC = TRUE), family = gaussian) # example with in-call computation, using W fit2 <- gam(y ~ x + slp(t, W = W, forceC = TRUE), family = gaussian) # example with out-of-call computation, using K timeBasis <- slp(t, K = K, forceC = TRUE) fit3 <- gam(y ~ x + timeBasis, family = gaussian) # the same computations can be done using pre-computed basis vectors # for significant speed-ups, especially for large N - see `checkSaved' # for more details fit4 <- gam(y ~ x + slp(t, W = W, forceC = FALSE))
Generate the basis matrix for a particular N, W
Slepian sequence
family member, with the additional property that the smoother captures/passes constants
without distortion. Can be quite slow in execution due to the latter property.
Based on smooth.construct.cr.smooth.spec
for implementation
with mgcv
.
## S3 method for class 'slp.smooth.spec' smooth.construct(object,data,knots)
## S3 method for class 'slp.smooth.spec' smooth.construct(object,data,knots)
object |
a smooth specification object, usually generated by a model term |
data |
a list containing just the data required by this term, with names corresponding to
|
knots |
a list containing any knots supplied for basis setup – should be |
slp
is based on .dpss
, which generates a family of Discrete
Prolate Spheroidal (Slepian) Sequences. These vectors are orthonormal, have alternating
even/odd parity, and form the optimally concentrated basis set for the subspace of
R^N
corresponding to the bandwidth W
. Full details are given
in Slepian (1978). These basis functions have natural boundary conditions, and lack any form of
knot structure. This version is returned for naive = TRUE
.
The dpss
basis vectors can be adapted to provide the additional
useful property of capturing or passing constants perfectly. That is, the smoother matrix
S
formed from the returned rectangular matrix will either reproduce constants
at near round-off precision, i.e., S %*% rep(1, N) = rep(1, N)
,
for naive = FALSE
with intercept = TRUE
, or will pass constants,
i.e., S %*% rep(1, N) = rep(0, N)
, for naive = FALSE
with intercept = FALSE
.
The primary use is in modeling formula to directly specify a Slepian time-based smoothing term in a model: see the examples.
For large N
this routine can be very slow. If you are computing models with
large N
, we highly recommend pre-computing the basis object, then using it
in your models without recomputation. The third example below demonstrates this approach.
An object of class slp.smooth
. In addition to the usual elements of a smooth
class (see smooth.construct
), this object will
contain:
C |
a constraint matrix which restricts |
K |
the user-specified number of basis vectors, or the computed |
W |
the user-specified bandwidth |
fullBasis |
the full-span computed, normalized basis set, before contiguity is
taken into account. Used by |
contiguous |
a logical variable declaring whether or not the input time array was considered to be contiguous by the basis computation procedure. |
wx |
the “corrected” input time array; if |
Wood S.N. (2006) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press.
Hastie T.J. & Tibshirani, R.J. (1990) Generalized Additive Models. Chapman and Hall.
Thomson, D.J (1982) Spectrum estimation and harmonic analysis. Proceedings of the IEEE. Volume 70, number 9, pp. 1055-1096.
Slepian, David (1978) Prolate Spheroidal Wave Functions, Fourier Analysis, and Uncertainty V: the Discrete Case. Bell System Technical Journal. Volume 57, pp. 1371-1429.
# Examples using pkg:mgcv library("mgcv") library("slp") N <- 730 W <- 8 / N K <- 16 # will actually use 15 df as intercept = FALSE x <- rnorm(n = N, sd = 1) y <- x + rnorm(n = N, sd = 2) + 5.0 t <- seq(1, N) # note: all three examples share identical results # example with in-call computation, using K (df) fit1 <- gam(y ~ x + s(t, bs = 'slp', xt = list(K = K)), family = gaussian) # example with in-call computation, using W fit2 <- gam(y ~ x + s(t, bs = 'slp', xt = list(W = W)), family = gaussian)
# Examples using pkg:mgcv library("mgcv") library("slp") N <- 730 W <- 8 / N K <- 16 # will actually use 15 df as intercept = FALSE x <- rnorm(n = N, sd = 1) y <- x + rnorm(n = N, sd = 2) + 5.0 t <- seq(1, N) # note: all three examples share identical results # example with in-call computation, using K (df) fit1 <- gam(y ~ x + s(t, bs = 'slp', xt = list(K = K)), family = gaussian) # example with in-call computation, using W fit2 <- gam(y ~ x + s(t, bs = 'slp', xt = list(W = W)), family = gaussian)
List of pre-computed basis sets included with the slp
package.
For large N
, significant speed-ups can be obtained by pre-computing
the basis set and simply loading it from disk.
slpSavedObjects
slpSavedObjects
A list containing the N
, W
(in df/
year) and
K
of the pre-computed basis sets. Each combination is included as a
data(...)
loadable .RData
file.