Title: | Blind Source Separation and Supervised Dimension Reduction for Time Series |
---|---|
Description: | Different estimators are provided to solve the blind source separation problem for multivariate time series with stochastic volatility and supervised dimension reduction problem for multivariate time series. Different functions based on AMUSE and SOBI are also provided for estimating the dimension of the white noise subspace. The package is fully described in Nordhausen, Matilainen, Miettinen, Virta and Taskinen (2021) <doi:10.18637/jss.v098.i15>. |
Authors: | Markus Matilainen [cre, aut] , Christophe Croux [aut], Jari Miettinen [aut] , Klaus Nordhausen [aut] , Hannu Oja [aut], Sara Taskinen [aut] , Joni Virta [aut] |
Maintainer: | Markus Matilainen <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.0 |
Built: | 2024-12-06 06:56:26 UTC |
Source: | CRAN |
Different estimators are provided to solve the blind source separation problem for multivariate time series with stochastic volatility and supervised dimension reduction problem for multivariate time series. Different functions based on AMUSE and SOBI are also provided for estimating the dimension of the white noise subspace. The package is fully described in Nordhausen, Matilainen, Miettinen, Virta and Taskinen (2021) <doi:10.18637/jss.v098.i15>.
Package: | tsBSS |
Type: | Package |
Version: | 1.0.0 |
Date: | 2021-07-09 |
License: | GPL (>= 2) |
This package contains functions for the blind source separation (BSS) problem for multivariate time series. The methods are designed for time series with stochastic volatility, such as GARCH and SV models. The main functions of the package for the BSS problem are
FixNA
Function to solve the BSS problem. Algorithm is an alternative to vSOBI
algorithm to acommodate stochastic volatility.
gFOBI
Function to solve the BSS problem. Algorithm is a generalization of FOBI
designed for time series with stochastic volatility.
gJADE
Function to solve the BSS problem. Algorithm is a generalization of JADE
designed for time series with stochastic volatility.
vSOBI
Function to solve the BSS problem. Algorithm is a variant of SOBI
algorithm and an alternative to FixNA
to acommodate stochastic volatility.
gSOBI
Function to solve the BSS problem. Algorithm is a combination of SOBI
and vSOBI
algorithms.
PVC
Function to solve the BSS problem. Algorithm is a modified version of Principal Component Volatility Analysis by Hu and Tsay (2011).
The input data can be a numeric matrix or a multivariate time series object of class ts
, xts
or zoo
. For other classes, the tsbox
package provides appropriate conversions to and from these classes.
The main function of the package for the supervised dimension reduction is
tssdr
Function for supervised dimension reduction for multivariate time series. Includes methods TSIR, TSAVE and TSSH.
Methods for ARMA models, such as AMUSE and SOBI, and some non-stationary BSS methods for time series are implemented in the JADE
package. See function dr
for methods for supervised dimension reduction for iid observations.
Several functions in this package utilize "rjd"
(real joint diagonalization) and "frjd"
(fast rjd) from the JADE
package for joint diagonalization of k real-valued square matrices. For whitening the time series this package uses function "BSSprep"
from package BSSprep
.
There are several functions for estimating the number of white noise latent series in second-order source separation (SOS) models. The functions are
AMUSEboot
, AMUSEladle
and AMUSEasymp
which are based on AMUSE
.
Additionally, there is function lbtest
for a modified Ljung-Box test and a volatility clustering test for univariate and multivariate time series.
The package also contains a dataset WeeklyReturnsData
, which has logarithmic returns of exchange rates of 7 currencies against US Dollar.
Markus Matilainen, Christophe Croux, Jari Miettinen, Klaus Nordhausen, Hannu Oja, Sara Taskinen, Joni Virta
Maintainer: Markus Matilainen <[email protected]>
Nordhausen, K., Matilainen, M., Miettinen, J., Virta, J. and Taskinen, S. (2021) Dimension Reduction for Time Series in a Blind Source Separation Context Using R, Journal of Statistical Software, 98(15), 1–30. <doi:10.18637/jss.v098.i15>
Matilainen, M., Nordhausen, K. and Oja, H. (2015), New Independent Component Analysis Tools for Time Series, Statistics & Probability Letters, 105, 80–87.
Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.
Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2017), Supervised Dimension Reduction for Multivariate Time Series, Econometrics and Statistics, 4, 57–69.
Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2019), Sliced Average Variance Estimation for Multivariate Time Series. Statistics: A Journal of Theoretical and Applied Statistics, 53, 630–655.
Shi, Z., Jiang, Z. and Zhou, F. (2009), Blind Source Separation with Nonlinear Autocorrelation and Non-Gaussianity, Journal of Computational and Applied Mathematics, 223(1): 908–915.
Matilainen, M., Nordhausen, K. and Virta, J. (2018), On the Number of Signals in Multivariate Time Series. In Deville, Y., Gannot, S., Mason, R., Plumbley, M.D. and Ward, D. (editors) "International Conference on Latent Variable Analysis and Signal Separation", LNCS 10891, 248–258. Springer, Cham., <doi:10.1007/978-3-319-93764-9_24>.
Nordhausen, K. and Virta, J.(2018), Ladle Estimator for Time Series Signal Dimension. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 428–432, <doi:10.1109/SSP.2018.8450695>.
Virta, J. and Nordhausen, K. (2021), Determining the Signal Dimension in Second Order Source Separation. Statistica Sinica, 31, 135–156.
Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis, 41, 293–311.
Hu, Y.-P. and Tsay, R. S. (2014), Principal Volatility Component Analysis, Journal of Business & Economic Statistics, 32(2), 153–164.
The function uses AMUSE (Algorithm for Multiple Unknown Signals Extraction) to test whether the last p-k
latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model. The test is asymptotic.
AMUSEasymp(X, ...) ## Default S3 method: AMUSEasymp(X, k, tau = 1, ...) ## S3 method for class 'ts' AMUSEasymp(X, ...) ## S3 method for class 'xts' AMUSEasymp(X, ...) ## S3 method for class 'zoo' AMUSEasymp(X, ...)
AMUSEasymp(X, ...) ## Default S3 method: AMUSEasymp(X, k, tau = 1, ...) ## S3 method for class 'ts' AMUSEasymp(X, ...) ## S3 method for class 'xts' AMUSEasymp(X, ...) ## S3 method for class 'zoo' AMUSEasymp(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
The number of latent series that are not white noise. Can be between |
tau |
The lag for the AMUSE autocovariance matrix. |
... |
Further arguments to be passed to or from methods. |
AMUSE standardizes X
with samples and computes the eigenedcomposition of the autocovariance matrix of the standardized data for a chosen lag
tau
, yielding a transformation giving the latent variables as
. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the squares of the corresponding eigenvalues of the autocovariance matrix. Under the null hypothesis the final
eigenvalues equal zero,
, and their mean square
can be used as a test statistic in inference on the true number of latent white noise series.
This function conducts the hypothesis test using the asymptotic null distribution of , a chi-squared distribution with
degrees of freedom.
A list of class ictest, inheriting from class htest, containing:
statistic |
The value of the test statistic. |
p.value |
The p-value of the test. |
parameter |
The degrees of freedom of the asymptotic null distribution. |
method |
Character string indicating which test was performed. |
data.name |
Character string giving the name of the data. |
alternative |
Character string specifying the alternative hypothesis. |
k |
The number of latent series that are not white noise used in the testing problem. |
W |
The transformation matrix to the latent series. |
S |
Multivariate time series with the centered source components. |
D |
The underlying eigenvalues of the autocovariance matrix. |
MU |
The location of the data which was subtracted before calculating AMUSE. |
tau |
The used lag. |
... |
Further arguments to be passed to or from methods. |
Klaus Nordhausen, Joni Virta
Virta, J. and Nordhausen, K. (2021), Determining the Signal Dimension in Second Order Source Separation. Statistica Sinica, 31, 135–156.
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0.3, 0.6)), n) s2 <- arima.sim(list(ma = c(-0.3, 0.3)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) asymp_res_1 <- AMUSEasymp(X, k = 1) asymp_res_1 asymp_res_2 <- AMUSEasymp(X, k = 2) asymp_res_2 # Plots of the estimated sources, the last two are white noise plot(asymp_res_2)
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0.3, 0.6)), n) s2 <- arima.sim(list(ma = c(-0.3, 0.3)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) asymp_res_1 <- AMUSEasymp(X, k = 1) asymp_res_1 asymp_res_2 <- AMUSEasymp(X, k = 2) asymp_res_2 # Plots of the estimated sources, the last two are white noise plot(asymp_res_2)
The function uses AMUSE (Algorithm for Multiple Unknown Signals Extraction) to test whether the last p-k
latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model.
Four different bootstrapping strategies are available and the function can be run in parallel.
AMUSEboot(X, ...) ## Default S3 method: AMUSEboot(X, k, tau = 1, n.boot = 200, s.boot = c("p", "np1", "np2", "np3"), ncores = NULL, iseed = NULL, ...) ## S3 method for class 'ts' AMUSEboot(X, ...) ## S3 method for class 'xts' AMUSEboot(X, ...) ## S3 method for class 'zoo' AMUSEboot(X, ...)
AMUSEboot(X, ...) ## Default S3 method: AMUSEboot(X, k, tau = 1, n.boot = 200, s.boot = c("p", "np1", "np2", "np3"), ncores = NULL, iseed = NULL, ...) ## S3 method for class 'ts' AMUSEboot(X, ...) ## S3 method for class 'xts' AMUSEboot(X, ...) ## S3 method for class 'zoo' AMUSEboot(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
The number of latent series that are not white noise. Can be between |
tau |
The lag for the AMUSE autocovariance matrix. |
n.boot |
The number of bootstrapping samples. |
s.boot |
Bootstrapping strategy to be used. Possible values are |
ncores |
The number of cores to be used. If |
iseed |
If parallel computation is used, the seed passed on to |
... |
Further arguments to be passed to or from methods. |
AMUSE standardizes X
with samples and computes the eigendecomposition of the autocovariance matrix of the standardized data for a chosen lag
tau
, yielding a transformation giving the latent variables as
. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the squares of the corresponding eigenvalues of the autocovariance matrix. Under the null hypothesis the final
eigenvalues equal zero,
, and their mean square
can be used as a test statistic in bootstrap-based inference on the true number of latent white noise series.
The function offers four different bootstrapping strategies for generating samples for which the null hypothesis approximately holds, and they are all based on the following general formula:
Decompose the AMUSE-estimated latent series into the postulated signal
and white noise
.
Take bootstrap samples
of
, see the different strategies below.
Recombine and back-transform
.
Compute the test statistic based on .
Repeat the previous steps n.boot
times.
The four different bootstrapping strategies are:
s.boot = "p"
:
The first strategy is parametric and simply generates all boostrap samples independently and identically from the standard normal distribution.
s.boot = "np1"
:
The second strategy is non-parametric and pools all observed white noise observations together and draws the bootstrap samples from amongst them.
s.boot = "np2"
:
The third strategy is non-parametric and proceeds otherwise as the second strategy but acts component-wise. That is, for each of the white noise series it pools the observed
white noise observations together and draws the bootstrap samples of that particular latent series from amongst them.
s.boot = "np3"
:
The third strategy is non-parametric and instead of drawing the samples univariately as in the second and third strategies, it proceeds by resampling vectors of size
from amongst all the observed
white noise vectors.
The function can be run in parallel by setting ncores
to the desired number of cores (should be less than the number of cores available - 1). When running code in parallel the standard random seed of R is overridden and if a random seed needs to be set it should be passed via the argument iseed
. The argument iseed
has no effect in case ncores
equals 1 (the default value).
A list of class ictest, inheriting from class htest, containing:
statistic |
The value of the test statistic. |
p.value |
The p-value of the test. |
parameter |
The number of bootstrap samples. |
alternative |
Character string specifying the alternative hypothesis. |
k |
The number of latent series that are not white noise used in the testing problem. |
W |
The transformation matrix to the latent series. |
S |
Multivariate time series with the centered source components. |
D |
The underlying eigenvalues of the autocovariance matrix. |
MU |
The location of the data which was subtracted before calculating AMUSE. |
tau |
The used lag. |
method |
Character string indicating which test was performed. |
data.name |
Character string giving the name of the data. |
s.boot |
Character string denoting which bootstrapping test version was used. |
Markus Matilainen, Klaus Nordhausen, Joni Virta
Matilainen, M., Nordhausen, K. and Virta, J. (2018), On the Number of Signals in Multivariate Time Series. In Deville, Y., Gannot, S., Mason, R., Plumbley, M.D. and Ward, D. (editors) "International Conference on Latent Variable Analysis and Signal Separation", LNCS 10891, 248–258. Springer, Cham., <doi:10.1007/978-3-319-93764-9_24>.
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0.3, 0.6)), n) s2 <- arima.sim(list(ma = c(-0.3, 0.3)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) boot_res_1 <- AMUSEboot(X, k = 1) boot_res_1 boot_res_2 <- AMUSEboot(X, k = 2) boot_res_2 # Plots of the estimated sources, the last two are white noise plot(boot_res_2)
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0.3, 0.6)), n) s2 <- arima.sim(list(ma = c(-0.3, 0.3)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) boot_res_1 <- AMUSEboot(X, k = 1) boot_res_1 boot_res_2 <- AMUSEboot(X, k = 2) boot_res_2 # Plots of the estimated sources, the last two are white noise plot(boot_res_2)
The ladle estimator uses the eigenvalues and eigenvectors of an autocovariance matrix with the chosen lag to estimate the number of white noise components in second-order source separation (SOS).
AMUSEladle(X, ...) ## Default S3 method: AMUSEladle(X, tau = 1, l = 20, sim = c("geom", "fixed"), n.boot = 200, ncomp = ifelse(ncol(X) > 10, floor(ncol(X)/log(ncol(X))), ncol(X) - 1), ...) ## S3 method for class 'ts' AMUSEladle(X, ...) ## S3 method for class 'xts' AMUSEladle(X, ...) ## S3 method for class 'zoo' AMUSEladle(X, ...)
AMUSEladle(X, ...) ## Default S3 method: AMUSEladle(X, tau = 1, l = 20, sim = c("geom", "fixed"), n.boot = 200, ncomp = ifelse(ncol(X) > 10, floor(ncol(X)/log(ncol(X))), ncol(X) - 1), ...) ## S3 method for class 'ts' AMUSEladle(X, ...) ## S3 method for class 'xts' AMUSEladle(X, ...) ## S3 method for class 'zoo' AMUSEladle(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
tau |
The lag for the AMUSE autocovariance matrix. |
l |
If |
sim |
If |
n.boot |
The number of bootstrapping samples. See |
ncomp |
The number of components among which the ladle estimator is to be searched. Must be between |
... |
Further arguments to be passed to or from methods. |
AMUSE standardizes X
with samples and computes the eigenedcomposition of the autocovariance matrix of the standardized data for a chosen lag
tau
, yielding a transformation giving the latent variables as
. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the squares of the corresponding eigenvalues of the autocovariance matrix. Under the assumption that we have
non-white-noise components, the final
eigenvalues equal zero,
.
The change point from non-zero eigenvalues to zero eigenvalues is visible in the eigenvectors of the autocovariance matrix as an increase in their boostrap variablity. Similarly, before the change point, the squared eigenvalues decrease in magnitude and afterwards they stay constant. The ladle estimate combines the scaled eigenvector bootstrap variability with the scaled eigenvalues to estimate the number of non-white-noise components. The estimate is the value of
ncomp
where the combined measure achieves its minimum value.
A list of class ladle
containing:
method |
The string AMUSE. |
k |
The estimated number of non-white-noise components. |
fn |
The vector giving the measures of variation of the eigenvectors using the bootstrapped eigenvectors for the different number of components. |
phin |
Normalized eigenvalues of the AMUSE matrix. |
data.name |
The name of the data for which the ladle estimate was computed. |
gn |
The main criterion for the ladle estimate - the sum of |
lambda |
The eigenvalues of the AMUSE matrix. |
W |
The transformation matrix to the source components. Also known as the unmixing matrix. |
S |
Multivariate time series with the centered source components. |
MU |
The location of the data which was subtracted before calculating the source components. |
sim |
The used boostrapping technique, either |
lag |
The used lag. |
Klaus Nordhausen, Joni Virta
Nordhausen, K. and Virta, J.(2018), Ladle Estimator for Time Series Signal Dimension. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 428–432, <doi:10.1109/SSP.2018.8450695>.
Luo, W. and Li, B. (2016), Combining Eigenvalues and Variation of Eigenvectors for Order Determination, Biometrika, 103. 875–887. <doi:10.1093/biomet/asw051>
n <- 1000 s1 <- arima.sim(n = n, list(ar = 0.6, ma = c(0, -0.4))) s2 <- arima.sim(n = n, list(ar = c(0.4,0.1,0.3), ma = c(0.2, 0.4))) s3 <- arima.sim(n = n, list(ar = c(0.7, 0.1))) Snoise <- matrix(rnorm(5*n), ncol = 5) S <- cbind(s1, s2, s3, Snoise) A <- matrix(rnorm(64), 8, 8) X <- S %*% t(A) ladle_AMUSE <- AMUSEladle(X, l = 20, sim = "geom") # The estimated number of non-white-noise components summary(ladle_AMUSE) # The ladle plot ladleplot(ladle_AMUSE) # Using ggplot ggladleplot(ladle_AMUSE) # Time series plots of the estimated components plot(ladle_AMUSE)
n <- 1000 s1 <- arima.sim(n = n, list(ar = 0.6, ma = c(0, -0.4))) s2 <- arima.sim(n = n, list(ar = c(0.4,0.1,0.3), ma = c(0.2, 0.4))) s3 <- arima.sim(n = n, list(ar = c(0.7, 0.1))) Snoise <- matrix(rnorm(5*n), ncol = 5) S <- cbind(s1, s2, s3, Snoise) A <- matrix(rnorm(64), 8, 8) X <- S %*% t(A) ladle_AMUSE <- AMUSEladle(X, l = 20, sim = "geom") # The estimated number of non-white-noise components summary(ladle_AMUSE) # The ladle plot ladleplot(ladle_AMUSE) # Using ggplot ggladleplot(ladle_AMUSE) # Time series plots of the estimated components plot(ladle_AMUSE)
Class bssvol (blind source separation in stochastic volatility processes) with methods print.bssvol (prints an object of class bssvol) and plot.bss (plots an object of class bssvol).
Class also inherits methods from the class bss in package JADE
: for extracting the components of an object of class bssvol (bss.components
) and the coefficients of an object of class bssvol (coef.bss
).
## S3 method for class 'bssvol' print(x, ...) ## S3 method for class 'bssvol' plot(x, ...)
## S3 method for class 'bssvol' print(x, ...) ## S3 method for class 'bssvol' plot(x, ...)
x |
An object of class bssvol. |
... |
Further arguments to be passed to or from methods. |
Markus Matilainen
JADE
, bss.components
, coef.bss
The FixNA (Fixed-point algorithm for maximizing the Nonlinear Autocorrelation; Shi et al., 2009) and FixNA2 (Matilainen et al., 2017) methods for blind source separation of time series with stochastic volatility. These methods are alternatives to vSOBI method.
FixNA(X, ...) ## Default S3 method: FixNA(X, k = 1:12, eps = 1e-06, maxiter = 1000, G = c("pow", "lcosh"), method = c("FixNA", "FixNA2"), ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' FixNA(X, ...) ## S3 method for class 'xts' FixNA(X, ...) ## S3 method for class 'zoo' FixNA(X, ...)
FixNA(X, ...) ## Default S3 method: FixNA(X, k = 1:12, eps = 1e-06, maxiter = 1000, G = c("pow", "lcosh"), method = c("FixNA", "FixNA2"), ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' FixNA(X, ...) ## S3 method for class 'xts' FixNA(X, ...) ## S3 method for class 'zoo' FixNA(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is |
eps |
Convergence tolerance. |
maxiter |
The maximum number of iterations. |
G |
Function |
method |
The method to be used. The choices are |
ordered |
Whether to order components according to their volatility. Default is |
acfk |
A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if |
original |
Whether to return the original components or their residuals based on ARMA fit. Default is |
alpha |
Alpha level for linear correlation detection. Default is 0.05. |
... |
Further arguments to be passed to or from methods. |
Assume that a -variate
with
observations is whitened, i.e.
, for
,
where
is the sample covariance matrix of
. The algorithm for method FixNA finds an orthogonal matrix
by maximizing
and the algorithm for method FixNA2 finds an orthogonal matrix by maximizing
where ,
and
. For function
the choices are
and log(cosh(
)).
The algorithm works iteratively starting with diag(p)
as an initial value for an orthogonal matrix .
Matrix is a partial derivative of
, for
, with respect to
.
Then
, where
is the number of columns in
, and
.
The update for the orthogonal matrix
is calculated at each iteration step. The algorithm stops when
is less than eps
.
The final unmixing matrix is then .
For ordered = TRUE
the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima
. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest
.
A list of class 'bssvol', inheriting from class 'bss', containing the following components:
W |
The estimated unmixing matrix. If |
k |
The vector of the used lags. |
S |
The estimated sources as time series object standardized to have mean 0 and unit variances. If |
MU |
The mean vector of |
If ordered = TRUE
, then also the following components included in the list:
Sraw |
The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if |
fits |
The ARMA fits for the components with linear autocorrelation. |
armaeff |
A logical vector. Is TRUE if ARMA fit was done to the corresponding component. |
linTS |
The value of the modified Ljung-Box test statistic for each component. |
linP |
p-value based on the modified Ljung-Box test statistic for each component. |
volTS |
The value of the volatility clustering test statistic. |
volP |
p-value based on the volatility clustering test statistic. |
Markus Matilainen
Hyvärinen, A. (2001), Blind Source Separation by Nonstationarity of Variance: A Cumulant-Based Approach, IEEE Transactions on Neural Networks, 12(6): 1471–1474.
Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.
Shi, Z., Jiang, Z. and Zhou, F. (2009), Blind Source Separation with Nonlinear Autocorrelation and Non-Gaussianity, Journal of Computational and Applied Mathematics, 223(1): 908–915.
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # Simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # Create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- FixNA(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # Simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # Create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- FixNA(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
The gFOBI (generalized Fourth Order Blind Identification) method for blind source separation of time series with stochastic volatility. The method is a generalization of FOBI, which is a method designed for iid data.
gFOBI(X, ...) ## Default S3 method: gFOBI(X, k = 0:12, eps = 1e-06, maxiter = 100, method = c("frjd", "rjd"), na.action = na.fail, weight = NULL, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' gFOBI(X, ...) ## S3 method for class 'xts' gFOBI(X, ...) ## S3 method for class 'zoo' gFOBI(X, ...)
gFOBI(X, ...) ## Default S3 method: gFOBI(X, k = 0:12, eps = 1e-06, maxiter = 100, method = c("frjd", "rjd"), na.action = na.fail, weight = NULL, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' gFOBI(X, ...) ## S3 method for class 'xts' gFOBI(X, ...) ## S3 method for class 'zoo' gFOBI(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
A vector of lags. It can be any non-negative integer, or a vector consisting of them. Default is |
eps |
Convergence tolerance. |
maxiter |
The maximum number of iterations. |
method |
The method to use for the joint diagonalization. The options are |
na.action |
A function which indicates what should happen when the data contain 'NA's. Default is to fail. |
weight |
A vector of length k to give weight to the different matrices in joint diagonalization. If NULL, all matrices have equal weight. |
ordered |
Whether to order components according to their volatility. Default is |
acfk |
A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if |
original |
Whether to return the original components or their residuals based on ARMA fit. Default is |
alpha |
Alpha level for linear correlation detection. Default is 0.05. |
... |
Further arguments to be passed to or from methods. |
Assume that a -variate
with
observations is whitened, i.e.
, for
,
where
is the sample covariance matrix of
. Algorithm first calculates
where , and then
for .
The algorithm finds an orthogonal matrix by maximizing
The final unmixing matrix is then .
For ordered = TRUE
the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima
. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest
.
A list of class 'bssvol', inheriting from class 'bss', containing the following components:
W |
The estimated unmixing matrix. If |
k |
The vector of the used lags. |
S |
The estimated sources as time series object standardized to have mean 0 and unit variances. If |
MU |
The mean vector of |
If ordered = TRUE
, then also the following components included in the list:
Sraw |
The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if |
fits |
The ARMA fits for the components with linear autocorrelation. |
armaeff |
A logical vector. Is TRUE if ARMA fit was done to the corresponding component. |
linTS |
The value of the modified Ljung-Box test statistic for each component. |
linP |
p-value based on the modified Ljung-Box test statistic for each component. |
volTS |
The value of the volatility clustering test statistic. |
volP |
p-value based on the volatility clustering test statistic. |
Markus Matilainen, Klaus Nordhausen
Cardoso, J.-F. (1989), Source Separation Using Higher Order Moments, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2109–2112.
Matilainen, M., Nordhausen, K. and Oja, H. (2015), New Independent Component Analysis Tools for Time Series, Statistics & Probability Letters, 105, 80–87.
FOBI
, frjd
, lbtest
, auto.arima
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y X <- cbind(s1, s2, s3) %*% t(A) res <- gFOBI(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y X <- cbind(s1, s2, s3) %*% t(A) res <- gFOBI(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
The gJADE (generalized Joint Approximate Diagonalization of Eigenmatrices) method for blind source separation of time series with stochastic volatility. The method is a generalization of JADE, which is a method for blind source separation problem using only marginal information.
gJADE(X, ...) ## Default S3 method: gJADE(X, k = 0:12, eps = 1e-06, maxiter = 100, method = c("frjd", "rjd"), na.action = na.fail, weight = NULL, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' gJADE(X, ...) ## S3 method for class 'xts' gJADE(X, ...) ## S3 method for class 'zoo' gJADE(X, ...)
gJADE(X, ...) ## Default S3 method: gJADE(X, k = 0:12, eps = 1e-06, maxiter = 100, method = c("frjd", "rjd"), na.action = na.fail, weight = NULL, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' gJADE(X, ...) ## S3 method for class 'xts' gJADE(X, ...) ## S3 method for class 'zoo' gJADE(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
A vector of lags. It can be any non-negative integer, or a vector consisting of them. Default is |
eps |
Convergence tolerance. |
maxiter |
The maximum number of iterations. |
method |
The method to use for the joint diagonalization. The options are |
na.action |
A function which indicates what should happen when the data contain 'NA's. Default is to fail. |
weight |
A vector of length k to give weight to the different matrices in joint diagonalization. If NULL, all matrices have equal weight. |
ordered |
Whether to order components according to their volatility. Default is |
acfk |
A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if |
original |
Whether to return the original components or their residuals based on ARMA fit. Default is |
alpha |
Alpha level for linear correlation detection. Default is 0.05. |
... |
Further arguments to be passed to or from methods. |
Assume that a -variate
with
observations is whitened, i.e.
, for
,
where
is the sample covariance matrix of
. The matrix
is of the form
for , where
is the lagged sample covariance matrix of
for lag
,
is a matrix where element
equals to 1 and all other elements are 0,
is an identity matrix of order
and
is as in
gFOBI
.
The algorithm finds an orthogonal matrix by maximizing
where .
The final unmixing matrix is then
.
For ordered = TRUE
the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima
. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest
.
A list of class 'bssvol', inheriting from class 'bss', containing the following components:
W |
The estimated unmixing matrix. If |
k |
The vector of the used lags. |
S |
The estimated sources as time series object standardized to have mean 0 and unit variances. If |
MU |
The mean vector of |
If ordered = TRUE
, then also the following components included in the list:
Sraw |
The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if |
fits |
The ARMA fits for the components with linear autocorrelation. |
armaeff |
A logical vector. Is TRUE if ARMA fit was done to the corresponding component. |
linTS |
The value of the modified Ljung-Box test statistic for each component. |
linP |
p-value based on the modified Ljung-Box test statistic for each component. |
volTS |
The value of the volatility clustering test statistic. |
volP |
p-value based on the volatility clustering test statistic. |
Klaus Nordhausen, Markus Matilainen
Cardoso, J.-F., Souloumiac, A. (1993), Blind Beamforming for Non-Gaussian Signals, in: IEE-Proceedings-F, volume 140, pp. 362–370.
Matilainen, M., Nordhausen, K. and Oja, H. (2015), New Independent Component Analysis Tools for Time Series, Statistics & Probability Letters, 105, 80–87.
frjd
, JADE
, gFOBI
, lbtest
, auto.arima
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y X <- cbind(s1, s2, s3) %*% t(A) res <- gJADE(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y X <- cbind(s1, s2, s3) %*% t(A) res <- gJADE(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
The gSOBI (generalized Second Order Blind Identification) method for the blind source separation (BSS) problem. The method is designed for separating multivariate time series with or without stochastic volatility. The method is a combination of SOBI and vSOBI with as a nonlinearity function.
gSOBI(X, ...) ## Default S3 method: gSOBI(X, k1 = 1:12, k2 = 1:3, b = 0.9, eps = 1e-06, maxiter = 1000, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' gSOBI(X, ...) ## S3 method for class 'xts' gSOBI(X, ...) ## S3 method for class 'zoo' gSOBI(X, ...)
gSOBI(X, ...) ## Default S3 method: gSOBI(X, k1 = 1:12, k2 = 1:3, b = 0.9, eps = 1e-06, maxiter = 1000, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' gSOBI(X, ...) ## S3 method for class 'xts' gSOBI(X, ...) ## S3 method for class 'zoo' gSOBI(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k1 |
A vector of lags for SOBI part. It can be any non-zero positive integer, or a vector consisting of them. Default is |
k2 |
A vector of lags for vSOBI part. It can be any non-zero positive integer, or a vector consisting of them. Default is |
b |
The weight for the SOBI part, |
eps |
Convergence tolerance. |
maxiter |
The maximum number of iterations. |
ordered |
Whether to order components according to their volatility. Default is |
acfk |
A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if |
original |
Whether to return the original components or their residuals based on ARMA fit. Default is |
alpha |
Alpha level for linear correlation detection. Default is 0.05. |
... |
Further arguments to be passed to or from methods. |
Assume that a -variate
with
observations is whitened, i.e.
, for
,
where
is the sample covariance matrix of
. The algorithm finds an orthogonal matrix
by maximizing
where SOBI part
and vSOBI part
where
is a value between 0 and 1, and
,
,
and
The algorithm works iteratively starting with diag(p)
as an initial value for an orthogonal matrix .
Matrix is a partial derivative of
, where
, with respect to
.
Then
, where
is the number of columns in
, and
, for
. Finally
.
The update for the orthogonal matrix is calculated at each iteration step. The algorithm stops when
is less than eps
.
The final unmixing matrix is then .
For ordered = TRUE
the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima
. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest
.
A list of class 'bssvol', inheriting from class 'bss', containing the following components:
W |
The estimated unmixing matrix. If |
k1 |
The vector of the used lags for the SOBI part. |
k2 |
The vector of the used lags for the vSOBI part. |
S |
The estimated sources as time series object standardized to have mean 0 and unit variances. If |
MU |
The mean vector of |
If ordered = TRUE
, then also the following components included in the list:
Sraw |
The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if |
fits |
The ARMA fits for the components with linear autocorrelation. |
armaeff |
A logical vector. Is TRUE if ARMA fit was done to the corresponding component. |
linTS |
The value of the modified Ljung-Box test statistic for each component. |
linP |
p-value based on the modified Ljung-Box test statistic for each component. |
volTS |
The value of the volatility clustering test statistic. |
volP |
p-value based on the volatility clustering test statistic. |
Markus Matilainen, Jari Miettinen
Belouchrani, A., Abed-Meriam, K., Cardoso, J.F. and Moulines, R. (1997), A Blind Source Separation Technique Using Second-Order Statistics, IEEE Transactions on Signal Processing, 434–444.
Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.
Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis, 41, 293–311.
SOBI
, vSOBI
, lbtest
, auto.arima
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- gSOBI(X, 1:4, 1:2, 0.99) res$W coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero # xts series as input library("xts") data(sample_matrix) X2 <- as.xts(sample_matrix) res2 <- gSOBI(X2, 1:4, 1:2, 0.99) plot(res2, multi.panel = TRUE) # zoo series as input X3 <- as.zoo(X) res3 <- gSOBI(X3, 1:4, 1:2, 0.99) plot(res3) }
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- gSOBI(X, 1:4, 1:2, 0.99) res$W coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero # xts series as input library("xts") data(sample_matrix) X2 <- as.xts(sample_matrix) res2 <- gSOBI(X2, 1:4, 1:2, 0.99) plot(res2, multi.panel = TRUE) # zoo series as input X3 <- as.zoo(X) res3 <- gSOBI(X3, 1:4, 1:2, 0.99) plot(res3) }
Modified Ljung-Box test and volatility clustering test for time series. Time series can be univariate or multivariate. The modified Ljung-Box test checks whether there is linear autocorrelation in the time series. The volatility clustering test checks whether the time series has squared autocorrelation, which would indicate a presence of volatility clustering.
lbtest(X, k, type = c("squared", "linear")) ## S3 method for class 'lbtest' print(x, digits = 3, ...)
lbtest(X, k, type = c("squared", "linear")) ## S3 method for class 'lbtest' print(x, digits = 3, ...)
X |
A numeric vector/matrix or a univariate/multivariate time series object of class |
k |
A vector of lags. |
type |
The type of the autocorrelation test. Options are Modified Ljung-Box test ( |
In methods for class 'lbtest' only:
x |
An object of class lbtest |
digits |
The number of digits when printing an object of class lbtest. Default is 3 |
... |
Further arguments to be passed to or from methods. |
Assume all the individual time series in
with
observations are scaled to have variance 1.
Then the modified Ljung-Box test statistic for testing the existence of linear autocorrelation in (
option = "linear"
) is
Here
where ,
and
.
The volatility clustering test statistic (option = "squared"
) is
Test statistic related to each time series is then compared to
-distribution with
length(k)
degrees of freedom, and the corresponding p-values are produced. Small p-value indicates the existence of autocorrelation.
A list of class 'lbtest' containing the following components:
TS |
The values of the test statistic for each component of X as a vector. |
p_val |
The p-values based on the test statistic for each component of X as a vector. |
Xname |
The name of the data used as a character string. |
varnames |
The names of the variables used as a character string vector. |
k |
The lags used for testing the serial autocorrelation as a vector. |
K |
The total number of lags used for testing the serial autocorrelation. |
type |
The type of the autocorrelation test. |
Markus Matilainen, Jari Miettinen
Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis, 41, 293–311.
FixNA
, gFOBI
, gJADE
, vSOBI
, gSOBI
if(require("stochvol")) { n <- 10000 s1 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.1)$y s2 <- rnorm(n) S <- cbind(s1, s2) lbtest(S, 1:3, type = "squared") # First p-value should be very close to zero, as there exists stochastic volatility }
if(require("stochvol")) { n <- 10000 s1 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.1)$y s2 <- rnorm(n) S <- cbind(s1, s2) lbtest(S, 1:3, type = "squared") # First p-value should be very close to zero, as there exists stochastic volatility }
PVC (Principal Volatility Component) estimator for the blind source separation (BSS) problem. This method is a modified version of PVC by Hu and Tsay (2014).
PVC(X, ...) ## Default S3 method: PVC(X, k = 1:12, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' PVC(X, ...) ## S3 method for class 'xts' PVC(X, ...) ## S3 method for class 'zoo' PVC(X, ...)
PVC(X, ...) ## Default S3 method: PVC(X, k = 1:12, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' PVC(X, ...) ## S3 method for class 'xts' PVC(X, ...) ## S3 method for class 'zoo' PVC(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is |
ordered |
Whether to order components according to their volatility. Default is |
acfk |
A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if |
original |
Whether to return the original components or their residuals based on ARMA fit. Default is |
alpha |
Alpha level for linear correlation detection. Default is 0.05. |
... |
Further arguments to be passed to or from methods. |
Assume that a -variate
with
observations is whitened, i.e.
, for
,
where
is the sample covariance matrix of
.
Then for each lag
we calculate
where and
.
Then
where
Thus the generalized kurtosis matrix is
where is the set of chosen lags.
Then
is the matrix with eigenvectors of
as its rows.
The final unmixing matrix is then
, where the average value of each row is set to be positive.
For ordered = TRUE
the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima
. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest
.
A list of class 'bssvol', inheriting from class 'bss', containing the following components:
W |
The estimated unmixing matrix. If |
k |
The vector of the used lags. |
S |
The estimated sources as time series object standardized to have mean 0 and unit variances. If |
MU |
The mean vector of |
If ordered = TRUE
, then also the following components included in the list:
Sraw |
The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if |
fits |
The ARMA fits for the components with linear autocorrelation. |
armaeff |
A logical vector. Is TRUE if ARMA fit was done to the corresponding component. |
linTS |
The value of the modified Ljung-Box test statistic for each component. |
linP |
p-value based on the modified Ljung-Box test statistic for each component. |
volTS |
The value of the volatility clustering test statistic. |
volP |
p-value based on the volatility clustering test statistic. |
Jari Miettinen, Markus Matilainen
Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis,41, 293–311.
Hu, Y.-P. and Tsay, R. S. (2014), Principal Volatility Component Analysis, Journal of Business & Economic Statistics, 32(2), 153–164.
comVol
, gSOBI
, lbtest
, auto.arima
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # Simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # Create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- PVC(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # Simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # Create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- PVC(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
The function uses SOBI (Second Order Blind Identification) to test whether the last p-k
latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model. The test is asymptotic.
SOBIasymp(X, ...) ## Default S3 method: SOBIasymp(X, k, tau = 1:12, eps = 1e-06, maxiter = 200, ...) ## S3 method for class 'ts' SOBIasymp(X, ...) ## S3 method for class 'xts' SOBIasymp(X, ...) ## S3 method for class 'zoo' SOBIasymp(X, ...)
SOBIasymp(X, ...) ## Default S3 method: SOBIasymp(X, k, tau = 1:12, eps = 1e-06, maxiter = 200, ...) ## S3 method for class 'ts' SOBIasymp(X, ...) ## S3 method for class 'xts' SOBIasymp(X, ...) ## S3 method for class 'zoo' SOBIasymp(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
The number of latent series that are not white noise. Can be between |
tau |
The lags for the SOBI autocovariance matrices. |
eps |
The convergence tolerance for the joint diagonalization. |
maxiter |
The maximum number of iterations for the joint diagonalization. |
... |
Further arguments to be passed to or from methods. |
SOBI standardizes X
with samples and jointly diagonalizes the autocovariance matrices of the standardized data for a chosen set of lags
tau
, yielding a transformation giving the latent variables as
. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the sums of squares of the corresponding "eigenvalues" produced by the joint diagonalization. Under the null hypothesis the lower right corner
blocks of the autocovariance matrices of the sources are zero matrices and the sum
of their squared norms over all lags can be used as a test statistic in inference on the true number of latent white noise series.
This function conducts the hypothesis test using the asymptotic null distribution of , a chi-squared distribution with
degrees of freedom where
is the number of autocovariance matrices used by SOBI.
A list of class ictest, inheriting from class htest, containing:
statistic |
The value of the test statistic. |
p.value |
The p-value of the test. |
parameter |
The degrees of freedom of the asymptotic null distribution. |
method |
Character string indicating which test was performed. |
data.name |
Character string giving the name of the data. |
alternative |
Character string specifying the alternative hypothesis. |
k |
The number of latent series that are not white noise used in the testing problem. |
W |
The transformation matrix to the latent series. |
S |
Multivariate time series with the centered source components. |
D |
The underlying eigenvalues of the autocovariance matrix. |
MU |
The location of the data which was subtracted before calculating SOBI. |
tau |
The used set of lags for the SOBI autocovariance matrices. |
Klaus Nordhausen, Joni Virta
Virta, J. and Nordhausen, K. (2021), Determining the Signal Dimension in Second Order Source Separation. Statistica Sinica, 31, 135–156.
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0, 0.6)), n) s2 <- arima.sim(list(ma = c(0, -0.5)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) asymp_res_1 <- SOBIasymp(X, k = 1) asymp_res_1 asymp_res_2 <- SOBIasymp(X, k = 2) asymp_res_2 # Plots of the estimated sources, the last two are white noise plot(asymp_res_2) # Note that AMUSEasymp with lag 1 does not work due to the lack of short range dependencies AMUSEasymp(X, k = 1)
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0, 0.6)), n) s2 <- arima.sim(list(ma = c(0, -0.5)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) asymp_res_1 <- SOBIasymp(X, k = 1) asymp_res_1 asymp_res_2 <- SOBIasymp(X, k = 2) asymp_res_2 # Plots of the estimated sources, the last two are white noise plot(asymp_res_2) # Note that AMUSEasymp with lag 1 does not work due to the lack of short range dependencies AMUSEasymp(X, k = 1)
The function uses SOBI (Second Order Blind Identification) to test whether the last p-k
latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model. Four different bootstrapping strategies are available and the function can be run in parallel.
SOBIboot(X, ...) ## Default S3 method: SOBIboot(X, k, tau = 1:12, n.boot = 200, s.boot = c("p", "np1", "np2", "np3"), ncores = NULL, iseed = NULL, eps = 1e-06, maxiter = 200, ...) ## S3 method for class 'ts' SOBIboot(X, ...) ## S3 method for class 'xts' SOBIboot(X, ...) ## S3 method for class 'zoo' SOBIboot(X, ...)
SOBIboot(X, ...) ## Default S3 method: SOBIboot(X, k, tau = 1:12, n.boot = 200, s.boot = c("p", "np1", "np2", "np3"), ncores = NULL, iseed = NULL, eps = 1e-06, maxiter = 200, ...) ## S3 method for class 'ts' SOBIboot(X, ...) ## S3 method for class 'xts' SOBIboot(X, ...) ## S3 method for class 'zoo' SOBIboot(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
The number of latent series that are not white noise. Can be between |
tau |
The vector of lags for the SOBI autocovariance matrices. |
n.boot |
The number of bootstrapping samples. |
s.boot |
Bootstrapping strategy to be used. Possible values are |
ncores |
The number of cores to be used. If |
iseed |
If parallel computation is used, the seed passed on to |
eps |
The convergence tolerance for the joint diagonalization. |
maxiter |
The maximum number of iterations for the joint diagonalization. |
... |
Further arguments to be passed to or from methods. |
SOBI standardizes X
with samples and jointly diagonalizes the autocovariance matrices of the standardized data for a chosen set of lags
tau
, yielding a transformation giving the latent variables as
. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the sums of squares of the corresponding "eigenvalues" produced by the joint diagonalization. Under the null hypothesis the final
"eigenvalues" of each of the autocovariance matrices equal zero,
, and their mean square
over all lags can be used as a test statistic in bootstrap-based inference on the true number of latent white noise series.
The function offers four different bootstrapping strategies for generating samples for which the null hypothesis approximately holds, and they are all based on the following general formula:
Decompose the SOBI-estimated latent series into the postulated signal
and white noise
.
Take bootstrap samples
of
, see the different strategies below.
Recombine and back-transform
.
Compute the test statistic based on .
The four different bootstrapping strategies are:
s.boot = "p"
:
The first strategy is parametric and simply generates all boostrap samples independently and identically from the standard normal distribution.
s.boot = "np1"
:
The second strategy is non-parametric and pools all observed white noise observations together and draws the bootstrap samples from amongst them.
s.boot = "np2"
:
The third strategy is non-parametric and proceeds otherwise as the second strategy but acts component-wise. That is, separately for each of the white noise series it pools the observed
white noise observations together and draws the bootstrap samples of that particular latent series from amongst them.
s.boot = "np3"
:
The third strategy is non-parametric and instead of drawing the samples univariately as in the second and third strategies, proceeds by resampling vectors of size
from amongst all the observed
white noise vectors.
The function can be run in parallel by setting ncores
to the desired number of cores (should be less than the number of cores available - 1). When running code in parallel the standard random seed of R is overridden and if a random seed needs to be set it should be passed via the argument iseed
. The argument iseed
has no effect in case ncores
equals 1 (the default value).
This function uses for the joint diagonalization the function frjd.int
, which does not fail in case of failed convergence but returns the estimate from the final step.
A list of class ictest, inheriting from class htest, containing:
statistic |
The value of the test statistic. |
p.value |
The p-value of the test. |
parameter |
The number of bootstrap samples. |
alternative |
Character string specifying the alternative hypothesis. |
k |
The number of latent series that are not white noise used in the testing problem. |
W |
The transformation matrix to the latent series. |
S |
Multivariate time series with the centered source components. |
D |
The underlying eigenvalues of the autocovariance matrix. |
MU |
The location of the data which was subtracted before calculating SOBI. |
tau |
The used set of lags. |
method |
Character string indicating which test was performed. |
data.name |
Character string giving the name of the data. |
s.boot |
Character string denoting which bootstrapping test version was used. |
Markus Matilainen, Klaus Nordhausen, Joni Virta
Matilainen, M., Nordhausen, K. and Virta, J. (2018), On the Number of Signals in Multivariate Time Series. In Deville, Y., Gannot, S., Mason, R., Plumbley, M.D. and Ward, D. (editors) "International Conference on Latent Variable Analysis and Signal Separation", LNCS 10891, 248–258. Springer, Cham., <doi:10.1007/978-3-319-93764-9_24>.
AMUSE
, AMUSEboot
, SOBI
, tsboot
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0, 0, 0.3, 0.6)), n) s2 <- arima.sim(list(ma = c(0, 0, -0.3, 0.3)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) boot_res_1 <- SOBIboot(X, k = 1) boot_res_1 boot_res_2 <- SOBIboot(X, k = 2) boot_res_2 # Plots of the estimated sources, the last two are white noise plot(boot_res_2) # Note that AMUSEboot with lag 1 does not work due to the lack of short range dependencies AMUSEboot(X, k = 1) # xts series as input library("xts") data(sample_matrix) X2 <- as.xts(sample_matrix) boot_res_xts <- SOBIboot(X2, k = 2) plot(boot_res_xts, multi.panel = TRUE) # zoo series as input X3 <- as.zoo(X) boot_res_zoo <- SOBIboot(X3, k = 2) plot(boot_res_zoo)
n <- 1000 A <- matrix(rnorm(16), 4, 4) s1 <- arima.sim(list(ar = c(0, 0, 0.3, 0.6)), n) s2 <- arima.sim(list(ma = c(0, 0, -0.3, 0.3)), n) s3 <- rnorm(n) s4 <- rnorm(n) S <- cbind(s1, s2, s3, s4) X <- S %*% t(A) boot_res_1 <- SOBIboot(X, k = 1) boot_res_1 boot_res_2 <- SOBIboot(X, k = 2) boot_res_2 # Plots of the estimated sources, the last two are white noise plot(boot_res_2) # Note that AMUSEboot with lag 1 does not work due to the lack of short range dependencies AMUSEboot(X, k = 1) # xts series as input library("xts") data(sample_matrix) X2 <- as.xts(sample_matrix) boot_res_xts <- SOBIboot(X2, k = 2) plot(boot_res_xts, multi.panel = TRUE) # zoo series as input X3 <- as.zoo(X) boot_res_zoo <- SOBIboot(X3, k = 2) plot(boot_res_zoo)
The ladle estimator uses the joint diagonalization "eigenvalues" and "eigenvectors" of several autocovariance matrices to estimate the number of white noise components in second-order source separation (SOS).
SOBIladle(X, ...) ## Default S3 method: SOBIladle(X, tau = 1:12, l = 20, sim = c("geom", "fixed"), n.boot = 200, ncomp = ifelse(ncol(X) > 10, floor(ncol(X)/log(ncol(X))), ncol(X) - 1), maxiter = 1000, eps = 1e-06, ...) ## S3 method for class 'ts' SOBIladle(X, ...) ## S3 method for class 'xts' SOBIladle(X, ...) ## S3 method for class 'zoo' SOBIladle(X, ...)
SOBIladle(X, ...) ## Default S3 method: SOBIladle(X, tau = 1:12, l = 20, sim = c("geom", "fixed"), n.boot = 200, ncomp = ifelse(ncol(X) > 10, floor(ncol(X)/log(ncol(X))), ncol(X) - 1), maxiter = 1000, eps = 1e-06, ...) ## S3 method for class 'ts' SOBIladle(X, ...) ## S3 method for class 'xts' SOBIladle(X, ...) ## S3 method for class 'zoo' SOBIladle(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
tau |
The lags for the SOBI autocovariance matrices. |
l |
If |
sim |
If |
n.boot |
The number of bootstrapping samples. See |
ncomp |
The number of components among which the ladle estimator is to be searched. Must be between |
maxiter |
Maximum number of iterations. |
eps |
Convergence tolerance. |
... |
Further arguments to be passed to or from methods. |
SOBI standardizes X
with samples and jointly diagonalizes the autocovariance matrices of the standardized data for a chosen set of lags
tau
, yielding a transformation giving the latent variables as
. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the sums of squares of the corresponding "eigenvalues" produced by the joint diagonalization.
Under the assumption that we have
non-white-noise components, the final
"eigenvalues" of each of the autocovariance matrices equal zero,
.
The change point from non-zero eigenvalues to zero eigenvalues is visible in the joint diagonalization "eigenvectors" of the autocovariance matrices as an increase in their boostrap variablity. Similarly, before the change point, the squared eigenvalues decrease in magnitude and afterwards they stay constant. The ladle estimate combines the scaled eigenvector bootstrap variability with the scaled eigenvalues to estimate the number of non-white-noise components. The estimate is the value of
ncomp
where the combined measure achieves its minimum value.
This function uses for the joint diagonalization the function frjd.int
, which does not fail in case of failed convergence but returns the estimate from the final step.
A list of class ladle
containing:
method |
The string SOBI. |
k |
The estimated number of non-white-noise components. |
fn |
The vector giving the measures of variation of the eigenvectors using the bootstrapped eigenvectors for the different number of components. |
phin |
Normalized sums of squared eigenvalues of the SOBI matrices. |
data.name |
The name of the data for which the ladle estimate was computed. |
gn |
The main criterion for the ladle estimate - the sum of |
lambda |
The sums of squared eigenvalues of the SOBI matrices. |
W |
The transformation matrix to the source components. Also known as the unmixing matrix. |
S |
Multivariate time series with the centered source components. |
MU |
The location of the data which was subtracted before calculating the source components. |
sim |
The used boostrapping technique, either |
tau |
The used set of lags for the SOBI autocovariance matrices. |
Klaus Nordhausen, Joni Virta
Nordhausen, K. and Virta, J.(2018), Ladle Estimator for Time Series Signal Dimension. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 428–432, <doi:10.1109/SSP.2018.8450695>.
Luo, W. and Li, B. (2016), Combining Eigenvalues and Variation of Eigenvectors for Order Determination, Biometrika, 103. 875–887. <doi:10.1093/biomet/asw051>
AMUSE
, SOBI
, AMUSEladle
, frjd.int
n <- 1000 s1 <- arima.sim(n = n, list(ar = 0.6, ma = c(0, -0.4))) s2 <- arima.sim(n = n, list(ar = c(0, 0.1,0.3), ma = c(0.2, 0.4))) s3 <- arima.sim(n = n, list(ar = c(0, 0.8))) Snoise <- matrix(rnorm(5*n), ncol = 5) S <- cbind(s1, s2, s3, Snoise) A <- matrix(rnorm(64), 8, 8) X <- S %*% t(A) ladle_SOBI <- SOBIladle(X, l = 20, sim = "geom") # The estimated number of non-white-noise components summary(ladle_SOBI) # The ladle plot ladleplot(ladle_SOBI) # Time series plots of the estimated components plot(ladle_SOBI) # Note that AMUSEladle with lag 1 does not work due to the lack of short range dependencies ladle_AMUSE <- AMUSEladle(X) summary(ladle_AMUSE) ladleplot(ladle_AMUSE) # xts series as input library("xts") data(sample_matrix) X2 <- as.xts(sample_matrix) ladle_SOBI_xts <- SOBIladle(X2, l = 20, sim = "geom") plot(ladle_SOBI_xts, multi.panel = TRUE) # zoo series as input X3 <- as.zoo(X) ladle_SOBI_zoo <- SOBIladle(X3, l = 20, sim = "geom") plot(ladle_SOBI_zoo)
n <- 1000 s1 <- arima.sim(n = n, list(ar = 0.6, ma = c(0, -0.4))) s2 <- arima.sim(n = n, list(ar = c(0, 0.1,0.3), ma = c(0.2, 0.4))) s3 <- arima.sim(n = n, list(ar = c(0, 0.8))) Snoise <- matrix(rnorm(5*n), ncol = 5) S <- cbind(s1, s2, s3, Snoise) A <- matrix(rnorm(64), 8, 8) X <- S %*% t(A) ladle_SOBI <- SOBIladle(X, l = 20, sim = "geom") # The estimated number of non-white-noise components summary(ladle_SOBI) # The ladle plot ladleplot(ladle_SOBI) # Time series plots of the estimated components plot(ladle_SOBI) # Note that AMUSEladle with lag 1 does not work due to the lack of short range dependencies ladle_AMUSE <- AMUSEladle(X) summary(ladle_AMUSE) ladleplot(ladle_AMUSE) # xts series as input library("xts") data(sample_matrix) X2 <- as.xts(sample_matrix) ladle_SOBI_xts <- SOBIladle(X2, l = 20, sim = "geom") plot(ladle_SOBI_xts, multi.panel = TRUE) # zoo series as input X3 <- as.zoo(X) ladle_SOBI_zoo <- SOBIladle(X3, l = 20, sim = "geom") plot(ladle_SOBI_zoo)
Gives a summary of an object of class tssdr. It includes different types of methods to select the number of directions (sources) and lags.
## S3 method for class 'tssdr' summary(object, type = c("rectangle", "alllag", "alldir", "big"), thres = 0.8, ...) ## S3 method for class 'summary.tssdr' print(x, digits = 3, ...) ## S3 method for class 'summary.tssdr' components(x, ...) ## S3 method for class 'summary.tssdr' coef(object, ...) ## S3 method for class 'summary.tssdr' plot(x, main = "The response and the chosen directions", ...)
## S3 method for class 'tssdr' summary(object, type = c("rectangle", "alllag", "alldir", "big"), thres = 0.8, ...) ## S3 method for class 'summary.tssdr' print(x, digits = 3, ...) ## S3 method for class 'summary.tssdr' components(x, ...) ## S3 method for class 'summary.tssdr' coef(object, ...) ## S3 method for class 'summary.tssdr' plot(x, main = "The response and the chosen directions", ...)
object |
An object of class tssdr. |
type |
Method for choosing the important lags and directions. The choices are |
thres |
The threshold value for choosing the lags and directions. Default is |
... |
Further arguments to be passed to or from methods. |
In methods for class 'summary.tssdr' only:
x |
An object of class summary.tssdr |
digits |
The number of digits when printing an object of class summary.tssdr. Default is 3 |
main |
A title for a plot when printing an object of class summary.tssdr. |
The sum of values of matrix
of
object
is 1. The values of the matrix are summed together in ways detailed below, until the value is at least (
thres
). Let be the element
of the matrix
.
For alllag
: and
is the smallest value for which
where
. The chosen number of lags and directions are returned.
For alldir
: and
is the smallest value for which
where
. The chosen number of lags and directions are returned.
For rectangle
: and
are values such that their product
is the smallest for which
where
and
. The chosen number of lags and directions are returned.
For big
: is the smallest value of elements
for which
where
. Thi indices of the matrix corresponding to the chosen values are returned.
Note that when printing a summary.tssdr object, all elements except the component S, which is the matrix of the chosen directions or a vector if there is only one direction, are printed.
A list of class 'summary.tssdr' containing the following components:
W |
The estimated signal separation matrix |
L |
The Lambda matrix for choosing lags and directions. |
S |
The estimated directions as time series object standardized to have mean 0 and unit variances. |
type |
The method for choosing the important lags and directions. |
algorithm |
The used algorithm as a character string. |
yname |
The name for the response time series |
Xname |
The name for the predictor time series |
k |
The chosen number of lags (not for |
p |
The chosen number of directions (not for |
pk |
The chosen lag-direction combinations (for |
Markus Matilainen
Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2017), Supervised Dimension Reduction for Multivariate Time Series, Econometrics and Statistics, 4, 57–69.
n <- 10000 A <- matrix(rnorm(9), 3, 3) x1 <- arima.sim(n = n, list(ar = 0.2)) x2 <- arima.sim(n = n, list(ar = 0.8)) x3 <- arima.sim(n = n, list(ar = 0.3, ma = -0.4)) eps2 <- rnorm(n - 1) y <- 2*x1[1:(n - 1)] + 3*x2[1:(n - 1)] + eps2 X <- ((cbind(x1, x2, x3))[2:n, ]) %*% t(A) res2 <- tssdr(y, X, algorithm = "TSIR") res2 summ2 <- summary(res2, thres = 0.5) summ2 summary(res2) # Chooses more lags with larger threshold summary(res2, type = "alllag") # Chooses all lags summary(res2, type = "alldir", thres = 0.5) # Chooses all directions summary(res2, type = "big", thres = 0.5) # Same choices than in summ2
n <- 10000 A <- matrix(rnorm(9), 3, 3) x1 <- arima.sim(n = n, list(ar = 0.2)) x2 <- arima.sim(n = n, list(ar = 0.8)) x3 <- arima.sim(n = n, list(ar = 0.3, ma = -0.4)) eps2 <- rnorm(n - 1) y <- 2*x1[1:(n - 1)] + 3*x2[1:(n - 1)] + eps2 X <- ((cbind(x1, x2, x3))[2:n, ]) %*% t(A) res2 <- tssdr(y, X, algorithm = "TSIR") res2 summ2 <- summary(res2, thres = 0.5) summ2 summary(res2) # Chooses more lags with larger threshold summary(res2, type = "alllag") # Chooses all lags summary(res2, type = "alldir", thres = 0.5) # Chooses all directions summary(res2, type = "big", thres = 0.5) # Same choices than in summ2
Supervised dimension reduction for multivariate time series data. There are three different algorithms to choose from. TSIR is a time series version of Sliced Inverse Regression (SIR), TSAVE is a time series version of Sliced Average Variance Estimate (TSAVE) and a hybrid of TSIR and TSAVE is TSSH (Time series SIR SAVE Hybrid). For summary of an object of class tssdr, see summary.tssdr
.
tssdr(y, X, ...) ## Default S3 method: tssdr(y, X, algorithm = c("TSIR", "TSAVE", "TSSH"), k = 1:12, H = 10, weight = 0.5, method = c("frjd", "rjd"), eps = 1e-06, maxiter = 1000, ...) ## S3 method for class 'ts' tssdr(y, X, ...) ## S3 method for class 'xts' tssdr(y, X, ...) ## S3 method for class 'zoo' tssdr(y, X, ...) ## S3 method for class 'tssdr' print(x, digits = 3, ...) ## S3 method for class 'tssdr' components(x, ...) ## S3 method for class 'tssdr' plot(x, main = "The response and the directions", ...)
tssdr(y, X, ...) ## Default S3 method: tssdr(y, X, algorithm = c("TSIR", "TSAVE", "TSSH"), k = 1:12, H = 10, weight = 0.5, method = c("frjd", "rjd"), eps = 1e-06, maxiter = 1000, ...) ## S3 method for class 'ts' tssdr(y, X, ...) ## S3 method for class 'xts' tssdr(y, X, ...) ## S3 method for class 'zoo' tssdr(y, X, ...) ## S3 method for class 'tssdr' print(x, digits = 3, ...) ## S3 method for class 'tssdr' components(x, ...) ## S3 method for class 'tssdr' plot(x, main = "The response and the directions", ...)
y |
A numeric vector or a time series object of class |
X |
A numeric matrix or a multivariate time series object of class |
algorithm |
Algorithm to be used. The options are |
k |
A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is |
H |
The number of slices. If |
weight |
Weight |
method |
The method to use for the joint diagonalization. The options are |
eps |
Convergence tolerance. |
maxiter |
The maximum number of iterations. |
... |
Further arguments to be passed to or from methods. |
In methods for class 'tssdr' only:
x |
An object of class tssdr |
digits |
The number of digits when printing an object of class tssdr. Default is 3 |
main |
A title for a plot when printing an object of class tssdr. |
Assume that the -variate time series
with
observations is whitened, i.e.
, for
,
where
is a sample covariance matrix of
.
Divide
into
disjoint intervals (slices) by its empirical quantiles.
For each lag , denote
for a vector of the last
values of the sliced
. Also denote
for the first
observations of
. Then
are the disjoint slices of
according to the values of
.
Let be the number of observations in
.
Write
for
,
and
.
Then for algorithm
TSIR
matrix
Denote for a sample covariance matrix of
. Then for algorithm
TSAVE
matrix
.
For TSSH
then matrix
for a chosen . Note that the value of
can be different for TSIR and TSAVE parts.
The algorithms find an orthogonal matrix by maximizing, for
or
,
for and all lags
.
The final signal separation matrix is then
.
Write , where
is chosen in such way that
for
and all lags
.
Then the
:th element of the matrix
is
.
To make a choice on which lags and directions to keep, see summary.tssdr
. Note that when printing a tssdr object, all elements are printed, except the directions S.
A list of class 'tssdr' containing the following components:
W |
The estimated signal separation matrix. |
k |
The vector of the used lags. |
S |
The estimated directions as time series object standardized to have mean 0 and unit variances. |
MU |
The mean vector of |
L |
The Lambda matrix for choosing lags and directions. |
H |
The used number of slices. |
yname |
The name for the response time series |
Xname |
The name for the predictor time series |
algorithm |
The used algorithm as a character string. |
Markus Matilainen
Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2017), Supervised Dimension Reduction for Multivariate Time Series, Econometrics and Statistics, 4, 57–69.
Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2019), Sliced Average Variance Estimation for Multivariate Time Series. Statistics: A Journal of Theoretical and Applied Statistics, 53, 630–655.
Li, K.C. (1991), Sliced Inverse Regression for Dimension Reduction, Journal of the American Statistical Association, 86, 316–327.
Cook, R. and Weisberg, S. (1991), Sliced Inverse Regression for Dimension Reduction, Comment. Journal of the American Statistical Association, 86, 328–332.
n <- 10000 A <- matrix(rnorm(9), 3, 3) x1 <- arima.sim(n = n, list(ar = 0.2)) x2 <- arima.sim(n = n, list(ar = 0.8)) x3 <- arima.sim(n = n, list(ar = 0.3, ma = -0.4)) eps2 <- rnorm(n - 1) y <- 2*x1[1:(n - 1)] + eps2 X <- ((cbind(x1, x2, x3))[2:n, ]) %*% t(A) res1 <- tssdr(y, X, algorithm = "TSAVE") res1 summ1 <- summary(res1, type = "alllag", thres = 0.8) summ1 plot(summ1) head(components(summ1)) coef(summ1) # Hybrid of TSIR and TSAVE. For TSIR part H = 10 and for TSAVE part H = 2. tssdr(y, X, algorithm = "TSSH", weight = 0.6, H = c(10, 2))
n <- 10000 A <- matrix(rnorm(9), 3, 3) x1 <- arima.sim(n = n, list(ar = 0.2)) x2 <- arima.sim(n = n, list(ar = 0.8)) x3 <- arima.sim(n = n, list(ar = 0.3, ma = -0.4)) eps2 <- rnorm(n - 1) y <- 2*x1[1:(n - 1)] + eps2 X <- ((cbind(x1, x2, x3))[2:n, ]) %*% t(A) res1 <- tssdr(y, X, algorithm = "TSAVE") res1 summ1 <- summary(res1, type = "alllag", thres = 0.8) summ1 plot(summ1) head(components(summ1)) coef(summ1) # Hybrid of TSIR and TSAVE. For TSIR part H = 10 and for TSAVE part H = 2. tssdr(y, X, algorithm = "TSSH", weight = 0.6, H = c(10, 2))
The vSOBI (variant of Second Order Blind Identification) method for the blind source separation of time series with stochastic volatility. The method is a variant of SOBI, which is a method designed to separate ARMA sources, and an alternative to FixNA and FixNA2 methods.
vSOBI(X, ...) ## Default S3 method: vSOBI(X, k = 1:12, eps = 1e-06, maxiter = 1000, G = c("pow", "lcosh"), ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' vSOBI(X, ...) ## S3 method for class 'xts' vSOBI(X, ...) ## S3 method for class 'zoo' vSOBI(X, ...)
vSOBI(X, ...) ## Default S3 method: vSOBI(X, k = 1:12, eps = 1e-06, maxiter = 1000, G = c("pow", "lcosh"), ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...) ## S3 method for class 'ts' vSOBI(X, ...) ## S3 method for class 'xts' vSOBI(X, ...) ## S3 method for class 'zoo' vSOBI(X, ...)
X |
A numeric matrix or a multivariate time series object of class |
k |
A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is |
eps |
Convergence tolerance. |
maxiter |
The maximum number of iterations. |
G |
Function |
ordered |
Whether to order components according to their volatility. Default is |
acfk |
A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if |
original |
Whether to return the original components or their residuals based on ARMA fit. Default is |
alpha |
Alpha level for linear correlation detection. Default is 0.05. |
... |
Further arguments to be passed to or from methods. |
Assume that a -variate
with
observations is whitened, i.e.
, for
,
where
is the sample covariance matrix of
. The algorithm finds an orthogonal matrix
by maximizing
where ,
and
. For function
the choices are
and log(cosh(
)).
The algorithm works iteratively starting with diag(p)
as an initial value for an orthogonal matrix .
Matrix is a partial derivative of
with respect to
.
Then
, where
is the number of columns in
, and
.
The update for the orthogonal matrix
is calculated at each iteration step. The algorithm stops when
is less than eps
.
The final unmixing matrix is then .
For ordered = TRUE
the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima
. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest
.
A list of class 'bssvol', inheriting from class 'bss', containing the following components:
W |
The estimated unmixing matrix. If |
k |
The vector of the used lags. |
S |
The estimated sources as time series object standardized to have mean 0 and unit variances. If |
MU |
The mean vector of |
If ordered = TRUE
, then also the following components included in the list:
Sraw |
The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if |
fits |
The ARMA fits for the components with linear autocorrelation. |
armaeff |
A logical vector. Is TRUE if ARMA fit was done to the corresponding component. |
linTS |
The value of the modified Ljung-Box test statistic for each component. |
linP |
p-value based on the modified Ljung-Box test statistic for each component. |
volTS |
The value of the volatility clustering test statistic. |
volP |
p-value based on the volatility clustering test statistic. |
Markus Matilainen
Belouchrani, A., Abed-Meriam, K., Cardoso, J.F. and Moulines, R. (1997), A Blind Source Separation Technique Using Second-Order Statistics, IEEE Transactions on Signal Processing, 434–444.
Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.
FixNA
, SOBI
, lbtest
, auto.arima
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- vSOBI(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
if(require("stochvol")) { n <- 10000 A <- matrix(rnorm(9), 3, 3) # simulate SV models s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y # create a daily time series X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25) res <- vSOBI(X) res coef(res) plot(res) head(bss.components(res)) MD(res$W, A) # Minimum Distance Index, should be close to zero }
This data set has logarithmic returns of exchange rates of 7 currencies against US dollar extracted from the International Monetary Fund's (IMF) database. These currencies are Australian Dollar (AUD), Canadian Dollar (CAD), Norwegian Kroner (NOK), Singapore Dollar (SGD), Swedish Kroner (SEK), Swiss Franc (CHF) and British Pound (GBP).
data("WeeklyReturnsData")
data("WeeklyReturnsData")
An object of class ts with 605 observations on the following 7 variables.
AUD
The weekly logarithmic returns of the exchange rates of AUD against US Dollar.
CAD
The weekly logarithmic returns of the exchange rates of CAD against US Dollar.
NOK
The weekly logarithmic returns of the exchange rates of NOK against US Dollar.
SGD
The weekly logarithmic returns of the exchange rates of SGD against US Dollar.
SEK
The weekly logarithmic returns of the exchange rates of SEK against US Dollar.
CHF
The weekly logarithmic returns of the exchange rates of CHF against US Dollar.
GBP
The weekly logarithmic returns of the exchange rates of GBP against US Dollar.
The daily exhange rates of the currencies against US Dollar from March 22, 2000 to October 26, 2011 are extracted from the International Monetary Fund's (IMF) Exchange Rates database from https://www.imf.org/external/np/fin/ert/GUI/Pages/CountryDataBase.aspx. These rates are representative rates (currency units per US Dollar), which are reported daily to the IMF by the issuing central bank.
The weekly averages of these exchange rates are then calculated. The logarithmic returns of the average weekly exchange rates are calculated for the currency as follows.
Let be the exchange rates of
against US Dollar. Then
where and
.
The six missing values in
are changed to 0. The assumption used here is that there has not been any change from the previous week.
The weekly returns data is then changed to a multivariate time series object of class ts
. The resulting ts
object is then dataset WeeklyReturnsData
.
An example analysis of the data is given in Miettinen et al. (2018). Same data has also been used in Hu and Tsay (2014).
International Monetary Fund (2017), IMF Exchange Rates, https://www.imf.org/external/np/fin/ert/GUI/Pages/CountryDataBase.aspx
For IMF Copyrights and Usage, and special terms and conditions pertaining to the use of IMF data, see https://www.imf.org/external/terms.htm
Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis,41, 293–311.
Hu and Tsay (2014), Principal Volatility Component Analysis, Journal of Business & Economic Statistics, 32(2), 153–164.
plot(WeeklyReturnsData) res <- gSOBI(WeeklyReturnsData) res coef(res) plot(res) head(bss.components(res))
plot(WeeklyReturnsData) res <- gSOBI(WeeklyReturnsData) res coef(res) plot(res) head(bss.components(res))