Package 'tsBSS'

Title: Blind Source Separation and Supervised Dimension Reduction for Time Series
Description: Different estimators are provided to solve the blind source separation problem for multivariate time series with stochastic volatility and supervised dimension reduction problem for multivariate time series. Different functions based on AMUSE and SOBI are also provided for estimating the dimension of the white noise subspace. The package is fully described in Nordhausen, Matilainen, Miettinen, Virta and Taskinen (2021) <doi:10.18637/jss.v098.i15>.
Authors: Markus Matilainen [cre, aut] , Christophe Croux [aut], Jari Miettinen [aut] , Klaus Nordhausen [aut] , Hannu Oja [aut], Sara Taskinen [aut] , Joni Virta [aut]
Maintainer: Markus Matilainen <[email protected]>
License: GPL (>= 2)
Version: 1.0.0
Built: 2024-11-06 06:40:32 UTC
Source: CRAN

Help Index


Blind Source Separation and Supervised Dimension Reduction for Time Series

Description

Different estimators are provided to solve the blind source separation problem for multivariate time series with stochastic volatility and supervised dimension reduction problem for multivariate time series. Different functions based on AMUSE and SOBI are also provided for estimating the dimension of the white noise subspace. The package is fully described in Nordhausen, Matilainen, Miettinen, Virta and Taskinen (2021) <doi:10.18637/jss.v098.i15>.

Details

Package: tsBSS
Type: Package
Version: 1.0.0
Date: 2021-07-09
License: GPL (>= 2)

This package contains functions for the blind source separation (BSS) problem for multivariate time series. The methods are designed for time series with stochastic volatility, such as GARCH and SV models. The main functions of the package for the BSS problem are

  • FixNA Function to solve the BSS problem. Algorithm is an alternative to vSOBI algorithm to acommodate stochastic volatility.

  • gFOBI Function to solve the BSS problem. Algorithm is a generalization of FOBI designed for time series with stochastic volatility.

  • gJADE Function to solve the BSS problem. Algorithm is a generalization of JADE designed for time series with stochastic volatility.

  • vSOBI Function to solve the BSS problem. Algorithm is a variant of SOBI algorithm and an alternative to FixNA to acommodate stochastic volatility.

  • gSOBI Function to solve the BSS problem. Algorithm is a combination of SOBI and vSOBI algorithms.

  • PVC Function to solve the BSS problem. Algorithm is a modified version of Principal Component Volatility Analysis by Hu and Tsay (2011).

The input data can be a numeric matrix or a multivariate time series object of class ts, xts or zoo. For other classes, the tsbox package provides appropriate conversions to and from these classes.

The main function of the package for the supervised dimension reduction is

  • tssdr Function for supervised dimension reduction for multivariate time series. Includes methods TSIR, TSAVE and TSSH.

Methods for ARMA models, such as AMUSE and SOBI, and some non-stationary BSS methods for time series are implemented in the JADE package. See function dr for methods for supervised dimension reduction for iid observations.

Several functions in this package utilize "rjd" (real joint diagonalization) and "frjd" (fast rjd) from the JADE package for joint diagonalization of k real-valued square matrices. For whitening the time series this package uses function "BSSprep" from package BSSprep.

There are several functions for estimating the number of white noise latent series in second-order source separation (SOS) models. The functions are

Additionally, there is function lbtest for a modified Ljung-Box test and a volatility clustering test for univariate and multivariate time series.

The package also contains a dataset WeeklyReturnsData, which has logarithmic returns of exchange rates of 7 currencies against US Dollar.

Author(s)

Markus Matilainen, Christophe Croux, Jari Miettinen, Klaus Nordhausen, Hannu Oja, Sara Taskinen, Joni Virta

Maintainer: Markus Matilainen <[email protected]>

References

Nordhausen, K., Matilainen, M., Miettinen, J., Virta, J. and Taskinen, S. (2021) Dimension Reduction for Time Series in a Blind Source Separation Context Using R, Journal of Statistical Software, 98(15), 1–30. <doi:10.18637/jss.v098.i15>

Matilainen, M., Nordhausen, K. and Oja, H. (2015), New Independent Component Analysis Tools for Time Series, Statistics & Probability Letters, 105, 80–87.

Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.

Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2017), Supervised Dimension Reduction for Multivariate Time Series, Econometrics and Statistics, 4, 57–69.

Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2019), Sliced Average Variance Estimation for Multivariate Time Series. Statistics: A Journal of Theoretical and Applied Statistics, 53, 630–655.

Shi, Z., Jiang, Z. and Zhou, F. (2009), Blind Source Separation with Nonlinear Autocorrelation and Non-Gaussianity, Journal of Computational and Applied Mathematics, 223(1): 908–915.

Matilainen, M., Nordhausen, K. and Virta, J. (2018), On the Number of Signals in Multivariate Time Series. In Deville, Y., Gannot, S., Mason, R., Plumbley, M.D. and Ward, D. (editors) "International Conference on Latent Variable Analysis and Signal Separation", LNCS 10891, 248–258. Springer, Cham., <doi:10.1007/978-3-319-93764-9_24>.

Nordhausen, K. and Virta, J.(2018), Ladle Estimator for Time Series Signal Dimension. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 428–432, <doi:10.1109/SSP.2018.8450695>.

Virta, J. and Nordhausen, K. (2021), Determining the Signal Dimension in Second Order Source Separation. Statistica Sinica, 31, 135–156.

Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis, 41, 293–311.

Hu, Y.-P. and Tsay, R. S. (2014), Principal Volatility Component Analysis, Journal of Business & Economic Statistics, 32(2), 153–164.


Second-order Separation Sub-White-Noise Asymptotic Testing with AMUSE

Description

The function uses AMUSE (Algorithm for Multiple Unknown Signals Extraction) to test whether the last p-k latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model. The test is asymptotic.

Usage

AMUSEasymp(X, ...)

## Default S3 method:
AMUSEasymp(X, k, tau = 1, ...)
## S3 method for class 'ts'
AMUSEasymp(X, ...)
## S3 method for class 'xts'
AMUSEasymp(X, ...)
## S3 method for class 'zoo'
AMUSEasymp(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

The number of latent series that are not white noise. Can be between 00 and p1p-1.

tau

The lag for the AMUSE autocovariance matrix.

...

Further arguments to be passed to or from methods.

Details

AMUSE standardizes X with nn samples and computes the eigenedcomposition of the autocovariance matrix of the standardized data for a chosen lag tau, yielding a transformation W\bf W giving the latent variables as S=XW{\bf S} = {\bf X} {\bf W}. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the squares of the corresponding eigenvalues of the autocovariance matrix. Under the null hypothesis the final pkp - k eigenvalues equal zero, λpk==λp=0\lambda_{p-k} = \cdots = \lambda_{p} = 0, and their mean square mm can be used as a test statistic in inference on the true number of latent white noise series.

This function conducts the hypothesis test using the asymptotic null distribution of mm, a chi-squared distribution with (pk)(pk+1)/2(p - k)(p - k + 1)/2 degrees of freedom.

Value

A list of class ictest, inheriting from class htest, containing:

statistic

The value of the test statistic.

p.value

The p-value of the test.

parameter

The degrees of freedom of the asymptotic null distribution.

method

Character string indicating which test was performed.

data.name

Character string giving the name of the data.

alternative

Character string specifying the alternative hypothesis.

k

The number of latent series that are not white noise used in the testing problem.

W

The transformation matrix to the latent series.

S

Multivariate time series with the centered source components.

D

The underlying eigenvalues of the autocovariance matrix.

MU

The location of the data which was subtracted before calculating AMUSE.

tau

The used lag.

...

Further arguments to be passed to or from methods.

Author(s)

Klaus Nordhausen, Joni Virta

References

Virta, J. and Nordhausen, K. (2021), Determining the Signal Dimension in Second Order Source Separation. Statistica Sinica, 31, 135–156.

See Also

AMUSE, SOBI, SOBIasymp

Examples

n <- 1000

  A <- matrix(rnorm(16), 4, 4)
  s1 <- arima.sim(list(ar = c(0.3, 0.6)), n)
  s2 <- arima.sim(list(ma = c(-0.3, 0.3)), n)
  s3 <- rnorm(n)
  s4 <- rnorm(n)

  S <- cbind(s1, s2, s3, s4)
  X <- S %*% t(A)

  asymp_res_1 <- AMUSEasymp(X, k = 1)
  asymp_res_1

  asymp_res_2 <- AMUSEasymp(X, k = 2)
  asymp_res_2

  # Plots of the estimated sources, the last two are white noise
  plot(asymp_res_2)

Second-order Separation Sub-White-Noise Bootstrap Testing with AMUSE

Description

The function uses AMUSE (Algorithm for Multiple Unknown Signals Extraction) to test whether the last p-k latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model. Four different bootstrapping strategies are available and the function can be run in parallel.

Usage

AMUSEboot(X, ...)

## Default S3 method:
AMUSEboot(X, k, tau = 1, n.boot = 200, s.boot = c("p", "np1", "np2", "np3"),
          ncores = NULL, iseed = NULL, ...)
## S3 method for class 'ts'
AMUSEboot(X, ...)
## S3 method for class 'xts'
AMUSEboot(X, ...)
## S3 method for class 'zoo'
AMUSEboot(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

The number of latent series that are not white noise. Can be between 00 and p1p-1.

tau

The lag for the AMUSE autocovariance matrix.

n.boot

The number of bootstrapping samples.

s.boot

Bootstrapping strategy to be used. Possible values are "p" (default), "np1", "np2", "np3". See details for further information.

ncores

The number of cores to be used. If NULL or 1, no parallel computing is used. Otherwise makeCluster with type = "PSOCK" is used. It is the users repsonsibilty to choose a reasonable value for ncores. The function detectCores might be useful in this context.

iseed

If parallel computation is used, the seed passed on to clusterSetRNGStream. Default is NULL which means no fixed seed is used.

...

Further arguments to be passed to or from methods.

Details

AMUSE standardizes X with nn samples and computes the eigendecomposition of the autocovariance matrix of the standardized data for a chosen lag tau, yielding a transformation W\bf W giving the latent variables as S=XW{\bf S} = {\bf X} {\bf W}. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the squares of the corresponding eigenvalues of the autocovariance matrix. Under the null hypothesis the final pkp - k eigenvalues equal zero, λpk==λp=0\lambda_{p-k} = \cdots = \lambda_{p} = 0, and their mean square mm can be used as a test statistic in bootstrap-based inference on the true number of latent white noise series.

The function offers four different bootstrapping strategies for generating samples for which the null hypothesis approximately holds, and they are all based on the following general formula:

  1. Decompose the AMUSE-estimated latent series S\bf S into the postulated signal S1{\bf S}_1 and white noise S2{\bf S}_2.

  2. Take nn bootstrap samples S2{\bf S}_2^* of S2{\bf S}_2, see the different strategies below.

  3. Recombine S=(S1,S2)\bf S^* = ({\bf S}_1, {\bf S}_2^*) and back-transform X=SW1{\bf X}^*= {\bf S}^* {\bf W}^{-1}.

  4. Compute the test statistic based on X{\bf X}^*.

  5. Repeat the previous steps n.boot times.

The four different bootstrapping strategies are:

  1. s.boot = "p": The first strategy is parametric and simply generates all boostrap samples independently and identically from the standard normal distribution.

  2. s.boot = "np1": The second strategy is non-parametric and pools all observed n(pk)n(p - k) white noise observations together and draws the bootstrap samples from amongst them.

  3. s.boot = "np2": The third strategy is non-parametric and proceeds otherwise as the second strategy but acts component-wise. That is, for each of the pkp - k white noise series it pools the observed nn white noise observations together and draws the bootstrap samples of that particular latent series from amongst them.

  4. s.boot = "np3": The third strategy is non-parametric and instead of drawing the samples univariately as in the second and third strategies, it proceeds by resampling nn vectors of size pkp - k from amongst all the observed nn white noise vectors.

The function can be run in parallel by setting ncores to the desired number of cores (should be less than the number of cores available - 1). When running code in parallel the standard random seed of R is overridden and if a random seed needs to be set it should be passed via the argument iseed. The argument iseed has no effect in case ncores equals 1 (the default value).

Value

A list of class ictest, inheriting from class htest, containing:

statistic

The value of the test statistic.

p.value

The p-value of the test.

parameter

The number of bootstrap samples.

alternative

Character string specifying the alternative hypothesis.

k

The number of latent series that are not white noise used in the testing problem.

W

The transformation matrix to the latent series.

S

Multivariate time series with the centered source components.

D

The underlying eigenvalues of the autocovariance matrix.

MU

The location of the data which was subtracted before calculating AMUSE.

tau

The used lag.

method

Character string indicating which test was performed.

data.name

Character string giving the name of the data.

s.boot

Character string denoting which bootstrapping test version was used.

Author(s)

Markus Matilainen, Klaus Nordhausen, Joni Virta

References

Matilainen, M., Nordhausen, K. and Virta, J. (2018), On the Number of Signals in Multivariate Time Series. In Deville, Y., Gannot, S., Mason, R., Plumbley, M.D. and Ward, D. (editors) "International Conference on Latent Variable Analysis and Signal Separation", LNCS 10891, 248–258. Springer, Cham., <doi:10.1007/978-3-319-93764-9_24>.

See Also

AMUSE, SOBI, SOBIboot

Examples

n <- 1000

  A <- matrix(rnorm(16), 4, 4)
  s1 <- arima.sim(list(ar = c(0.3, 0.6)), n)
  s2 <- arima.sim(list(ma = c(-0.3, 0.3)), n)
  s3 <- rnorm(n)
  s4 <- rnorm(n)

  S <- cbind(s1, s2, s3, s4)
  X <- S %*% t(A)

  boot_res_1 <- AMUSEboot(X, k = 1)
  boot_res_1

  boot_res_2 <- AMUSEboot(X, k = 2)
  boot_res_2

  # Plots of the estimated sources, the last two are white noise
  plot(boot_res_2)

Ladle Estimator to Estimate the Number of White Noise Components in SOS with AMUSE

Description

The ladle estimator uses the eigenvalues and eigenvectors of an autocovariance matrix with the chosen lag to estimate the number of white noise components in second-order source separation (SOS).

Usage

AMUSEladle(X, ...)

## Default S3 method:
AMUSEladle(X, tau = 1, l = 20, sim = c("geom", "fixed"), n.boot = 200, ncomp = 
           ifelse(ncol(X) > 10, floor(ncol(X)/log(ncol(X))), ncol(X) - 1), ...)
## S3 method for class 'ts'
AMUSEladle(X, ...)
## S3 method for class 'xts'
AMUSEladle(X, ...)
## S3 method for class 'zoo'
AMUSEladle(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

tau

The lag for the AMUSE autocovariance matrix.

l

If sim = "geom" (default) then l is the success probability of the geometric distribution from where the bootstrap block lengths for the stationary bootstrap are drawn. If sim = "fixed" then l is the fixed block length for the fixed block bootstrap.

sim

If "geom" (default) then the stationary bootstrap is used. If "fixed" then the fixed block bootstrap is used.

n.boot

The number of bootstrapping samples. See tsboot for details.

ncomp

The number of components among which the ladle estimator is to be searched. Must be between 0 and ncol(X)-1. The default follows the recommendation of Luo and Li (2016).

...

Further arguments to be passed to or from methods.

Details

AMUSE standardizes X with nn samples and computes the eigenedcomposition of the autocovariance matrix of the standardized data for a chosen lag tau, yielding a transformation W\bf W giving the latent variables as S=XW{\bf S} = {\bf X} {\bf W}. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the squares of the corresponding eigenvalues of the autocovariance matrix. Under the assumption that we have kk non-white-noise components, the final pkp - k eigenvalues equal zero, λpk==λp=0\lambda_{p-k} = \cdots = \lambda_{p} = 0.

The change point from non-zero eigenvalues to zero eigenvalues is visible in the eigenvectors of the autocovariance matrix as an increase in their boostrap variablity. Similarly, before the change point, the squared eigenvalues decrease in magnitude and afterwards they stay constant. The ladle estimate combines the scaled eigenvector bootstrap variability with the scaled eigenvalues to estimate the number of non-white-noise components. The estimate is the value of k=0,,k = 0, \ldots , ncomp where the combined measure achieves its minimum value.

Value

A list of class ladle containing:

method

The string AMUSE.

k

The estimated number of non-white-noise components.

fn

The vector giving the measures of variation of the eigenvectors using the bootstrapped eigenvectors for the different number of components.

phin

Normalized eigenvalues of the AMUSE matrix.

data.name

The name of the data for which the ladle estimate was computed.

gn

The main criterion for the ladle estimate - the sum of fn and phin. k is the value where gn takes its minimum.

lambda

The eigenvalues of the AMUSE matrix.

W

The transformation matrix to the source components. Also known as the unmixing matrix.

S

Multivariate time series with the centered source components.

MU

The location of the data which was subtracted before calculating the source components.

sim

The used boostrapping technique, either "geom" or "fixed".

lag

The used lag.

Author(s)

Klaus Nordhausen, Joni Virta

References

Nordhausen, K. and Virta, J.(2018), Ladle Estimator for Time Series Signal Dimension. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 428–432, <doi:10.1109/SSP.2018.8450695>.

Luo, W. and Li, B. (2016), Combining Eigenvalues and Variation of Eigenvectors for Order Determination, Biometrika, 103. 875–887. <doi:10.1093/biomet/asw051>

See Also

AMUSE, SOBI, SOBIladle

Examples

n <- 1000
  
  s1 <- arima.sim(n = n, list(ar = 0.6, ma = c(0, -0.4)))
  s2 <- arima.sim(n = n, list(ar = c(0.4,0.1,0.3), ma = c(0.2, 0.4)))
  s3 <- arima.sim(n = n, list(ar = c(0.7, 0.1)))
  Snoise <- matrix(rnorm(5*n), ncol = 5)
  S <- cbind(s1, s2, s3, Snoise)

  A <- matrix(rnorm(64), 8, 8)
  X <- S %*% t(A)
  
  ladle_AMUSE <- AMUSEladle(X, l = 20, sim = "geom")

  # The estimated number of non-white-noise components
  summary(ladle_AMUSE)
  
  # The ladle plot
  ladleplot(ladle_AMUSE)
  # Using ggplot
  ggladleplot(ladle_AMUSE)
  
  # Time series plots of the estimated components
  plot(ladle_AMUSE)

Class: bssvol

Description

Class bssvol (blind source separation in stochastic volatility processes) with methods print.bssvol (prints an object of class bssvol) and plot.bss (plots an object of class bssvol).

Class also inherits methods from the class bss in package JADE: for extracting the components of an object of class bssvol (bss.components) and the coefficients of an object of class bssvol (coef.bss).

Usage

## S3 method for class 'bssvol'
print(x, ...)

## S3 method for class 'bssvol'
plot(x, ...)

Arguments

x

An object of class bssvol.

...

Further arguments to be passed to or from methods.

Author(s)

Markus Matilainen

See Also

JADE, bss.components, coef.bss


The FixNA Method for Blind Source Separation

Description

The FixNA (Fixed-point algorithm for maximizing the Nonlinear Autocorrelation; Shi et al., 2009) and FixNA2 (Matilainen et al., 2017) methods for blind source separation of time series with stochastic volatility. These methods are alternatives to vSOBI method.

Usage

FixNA(X, ...)

## Default S3 method:
FixNA(X, k = 1:12, eps = 1e-06, maxiter = 1000, G = c("pow", "lcosh"),
      method = c("FixNA", "FixNA2"),
      ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...)
## S3 method for class 'ts'
FixNA(X, ...)
## S3 method for class 'xts'
FixNA(X, ...)
## S3 method for class 'zoo'
FixNA(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is 1:12.

eps

Convergence tolerance.

maxiter

The maximum number of iterations.

G

Function G(x)G(x). The choices are "pow" (default) and "lcosh".

method

The method to be used. The choices are "FixNA" (default) and "FixNA2".

ordered

Whether to order components according to their volatility. Default is FALSE.

acfk

A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if ordered = TRUE.

original

Whether to return the original components or their residuals based on ARMA fit. Default is TRUE, i.e. the original components are returned. Applicable only if ordered = TRUE.

alpha

Alpha level for linear correlation detection. Default is 0.05.

...

Further arguments to be passed to or from methods.

Details

Assume that a pp-variate Y{\bf Y} with TT observations is whitened, i.e. Y=S1/2(Xt1Tt=1TXt){\bf Y}={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t}), for t=1,,Tt = 1, \ldots, T, where S{\bf S} is the sample covariance matrix of X{\bf X}. The algorithm for method FixNA finds an orthogonal matrix U{\bf U} by maximizing

D1(U)=k=1KD1k(U)=k=1Ki=1p1Tkt=1Tk[G(uiYt)G(uiYt+k)]{\bf D}_1({\bf U}) = \sum_{k = 1}^K {\bf D}_{1k}({\bf U})= \sum_{k = 1}^K \sum_{i = 1}^p \frac{1}{T - k}\sum_{t=1}^{T - k}[G({\bf u}_i' {\bf Y}_t) G({\bf u}_i' {\bf Y}_{t + k})]

and the algorithm for method FixNA2 finds an orthogonal matrix U{\bf U} by maximizing

D2(U)=k=1KD2k(U){\bf D}_2({\bf U}) = \sum_{k = 1}^K {\bf D}_{2k}({\bf U})

=k=1Ki=1p1Tkt=1Tk[G(uiYt)G(uiYt+k)](1Tk)2t=1Tk[G(uiYt)]t=1Tk[G(uiYt+k)].= \sum_{k = 1}^K \sum_{i = 1}^p\left|\frac{1}{T - k}\sum_{t=1}^{T - k}[G({\bf u}_i' {\bf Y}_t) G({\bf u}_i' {\bf Y}_{t + k})] - \left(\frac{1}{T - k}\right)^2\sum_{t=1}^{T - k}[G({\bf u}_i' {\bf Y}_t)]\sum_{t=1}^{T - k}[G({\bf u}_i' {\bf Y}_{t + k})]\right|.

where i=1,,pi = 1, \ldots, p, k=1,,Kk = 1, \ldots, K and t=1,,Tt = 1, \ldots, T. For function G(x)G(x) the choices are x2x^2 and log(cosh(xx)).

The algorithm works iteratively starting with diag(p) as an initial value for an orthogonal matrix U=(u1,u2,,up){\bf U} = ({\bf u}_1, {\bf u}_2, \ldots, {\bf u}_p)'.

Matrix Tmik{\bf T}_{mik} is a partial derivative of Dmk(U){\bf D}_{mk}({\bf U}), for m=1,2m = 1, 2, with respect to ui{\bf u}_i. Then Tmk=(Tm1k,,Tmpk){\bf T}_{mk} = ({\bf T}_{m1k}, \ldots, {\bf T}_{mpk})', where pp is the number of columns in Y{\bf Y}, and Tm=k=1KTmk{\bf T}_m = \sum_{k = 1}^K {\bf T}_{mk}. The update for the orthogonal matrix Unew=(TmTm)1/2Tm{\bf U}_{new} = ({\bf T}_m{\bf T}_m')^{-1/2}{\bf T}_m is calculated at each iteration step. The algorithm stops when

UnewUold||{\bf U}_{new} - {\bf U}_{old}||

is less than eps. The final unmixing matrix is then W=US1/2{\bf W} = {\bf US}^{-1/2}.

For ordered = TRUE the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest.

Value

A list of class 'bssvol', inheriting from class 'bss', containing the following components:

W

The estimated unmixing matrix. If ordered = TRUE, the rows are ordered according to the order of the components.

k

The vector of the used lags.

S

The estimated sources as time series object standardized to have mean 0 and unit variances. If ordered = TRUE, then components are ordered according to their volatility. If original = FALSE, the sources with linear autocorrelation are replaced by their ARMA residuals.

MU

The mean vector of X.

If ordered = TRUE, then also the following components included in the list:

Sraw

The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if original = FALSE.

fits

The ARMA fits for the components with linear autocorrelation.

armaeff

A logical vector. Is TRUE if ARMA fit was done to the corresponding component.

linTS

The value of the modified Ljung-Box test statistic for each component.

linP

p-value based on the modified Ljung-Box test statistic for each component.

volTS

The value of the volatility clustering test statistic.

volP

p-value based on the volatility clustering test statistic.

Author(s)

Markus Matilainen

References

Hyvärinen, A. (2001), Blind Source Separation by Nonstationarity of Variance: A Cumulant-Based Approach, IEEE Transactions on Neural Networks, 12(6): 1471–1474.

Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.

Shi, Z., Jiang, Z. and Zhou, F. (2009), Blind Source Separation with Nonlinear Autocorrelation and Non-Gaussianity, Journal of Computational and Applied Mathematics, 223(1): 908–915.

See Also

vSOBI, lbtest, auto.arima

Examples

if(require("stochvol")) {
n <- 10000
A <- matrix(rnorm(9), 3, 3)

# Simulate SV models
s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y
s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y
s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y

# Create a daily time series
X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25)

res <- FixNA(X)
res
coef(res)
plot(res)
head(bss.components(res))

MD(res$W, A) # Minimum Distance Index, should be close to zero
}

Generalized FOBI

Description

The gFOBI (generalized Fourth Order Blind Identification) method for blind source separation of time series with stochastic volatility. The method is a generalization of FOBI, which is a method designed for iid data.

Usage

gFOBI(X, ...)

## Default S3 method:
gFOBI(X, k = 0:12, eps = 1e-06, maxiter = 100, method = c("frjd", "rjd"),
      na.action = na.fail, weight = NULL, ordered = FALSE,
      acfk = NULL, original = TRUE, alpha = 0.05, ...)
## S3 method for class 'ts'
gFOBI(X, ...)
## S3 method for class 'xts'
gFOBI(X, ...)
## S3 method for class 'zoo'
gFOBI(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

A vector of lags. It can be any non-negative integer, or a vector consisting of them. Default is 0:12. If k=0k = 0, this method reduces to FOBI.

eps

Convergence tolerance.

maxiter

The maximum number of iterations.

method

The method to use for the joint diagonalization. The options are "rjd" and "frjd". Default is "frjd".

na.action

A function which indicates what should happen when the data contain 'NA's. Default is to fail.

weight

A vector of length k to give weight to the different matrices in joint diagonalization. If NULL, all matrices have equal weight.

ordered

Whether to order components according to their volatility. Default is FALSE.

acfk

A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if ordered = TRUE.

original

Whether to return the original components or their residuals based on ARMA fit. Default is TRUE, i.e. the original components are returned. Applicable only if ordered = TRUE.

alpha

Alpha level for linear correlation detection. Default is 0.05.

...

Further arguments to be passed to or from methods.

Details

Assume that a pp-variate Y{\bf Y} with TT observations is whitened, i.e. Y=S1/2(Xt1Tt=1TXt){\bf Y}={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t}), for t=1,,Tt = 1, \ldots, T, where S{\bf S} is the sample covariance matrix of X{\bf X}. Algorithm first calculates

B^kij(Y)=1Tkt=1T[Yt+kYtEijYtYt+k]{\bf \widehat{B}}^{ij}_k({\bf Y}) = \frac{1}{T - k} \sum_{t = 1}^T [{\bf Y}_{t + k} {\bf Y}_t' {\bf E}^{ij} {\bf Y}_t {\bf Y}_{t + k}']

where t=1,,Tt = 1, \ldots, T, and then

B^k(Y)=i=1pB^kii(Y).{\bf \widehat{B}}_k({\bf Y}) = \sum_{i = 1}^p {\bf \widehat{B}}^{ii}_k({\bf Y}).

for i=1,,pi = 1, \ldots, p.

The algorithm finds an orthogonal matrix U{\bf U} by maximizing

k=0Kdiag(UB^k(Y)U)2.\sum_{k = 0}^K ||\textrm{diag}({\bf U \widehat{B}}_k({\bf Y}) {\bf U}')||^2.

The final unmixing matrix is then W=US1/2{\bf W} = {\bf US}^{-1/2}.

For ordered = TRUE the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest.

Value

A list of class 'bssvol', inheriting from class 'bss', containing the following components:

W

The estimated unmixing matrix. If ordered = TRUE, the rows are ordered according to the order of the components.

k

The vector of the used lags.

S

The estimated sources as time series object standardized to have mean 0 and unit variances. If ordered = TRUE, then components are ordered according to their volatility. If original = FALSE, the sources with linear autocorrelation are replaced by their ARMA residuals.

MU

The mean vector of X.

If ordered = TRUE, then also the following components included in the list:

Sraw

The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if original = FALSE.

fits

The ARMA fits for the components with linear autocorrelation.

armaeff

A logical vector. Is TRUE if ARMA fit was done to the corresponding component.

linTS

The value of the modified Ljung-Box test statistic for each component.

linP

p-value based on the modified Ljung-Box test statistic for each component.

volTS

The value of the volatility clustering test statistic.

volP

p-value based on the volatility clustering test statistic.

Author(s)

Markus Matilainen, Klaus Nordhausen

References

Cardoso, J.-F. (1989), Source Separation Using Higher Order Moments, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2109–2112.

Matilainen, M., Nordhausen, K. and Oja, H. (2015), New Independent Component Analysis Tools for Time Series, Statistics & Probability Letters, 105, 80–87.

See Also

FOBI, frjd, lbtest, auto.arima

Examples

if(require("stochvol")) {
n <- 10000
A <- matrix(rnorm(9), 3, 3)

# simulate SV models
s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y
s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y
s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y

X <- cbind(s1, s2, s3) %*% t(A)

res <- gFOBI(X)
res
coef(res)
plot(res)
head(bss.components(res))

MD(res$W, A) # Minimum Distance Index, should be close to zero
}

Generalized JADE

Description

The gJADE (generalized Joint Approximate Diagonalization of Eigenmatrices) method for blind source separation of time series with stochastic volatility. The method is a generalization of JADE, which is a method for blind source separation problem using only marginal information.

Usage

gJADE(X, ...)

## Default S3 method:
gJADE(X, k = 0:12, eps = 1e-06, maxiter = 100, method = c("frjd", "rjd"),
      na.action = na.fail, weight = NULL, ordered = FALSE,
      acfk = NULL, original = TRUE, alpha = 0.05, ...)
## S3 method for class 'ts'
gJADE(X, ...)
## S3 method for class 'xts'
gJADE(X, ...)
## S3 method for class 'zoo'
gJADE(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

A vector of lags. It can be any non-negative integer, or a vector consisting of them. Default is 0:12. If k=0k = 0, this method reduces to JADE.

eps

Convergence tolerance.

maxiter

The maximum number of iterations.

method

The method to use for the joint diagonalization. The options are "rjd" and "frjd". Default is "frjd".

na.action

A function which indicates what should happen when the data contain 'NA's. Default is to fail.

weight

A vector of length k to give weight to the different matrices in joint diagonalization. If NULL, all matrices have equal weight.

ordered

Whether to order components according to their volatility. Default is FALSE.

acfk

A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if ordered = TRUE.

original

Whether to return the original components or their residuals based on ARMA fit. Default is TRUE, i.e. the original components are returned. Applicable only if ordered = TRUE.

alpha

Alpha level for linear correlation detection. Default is 0.05.

...

Further arguments to be passed to or from methods.

Details

Assume that a pp-variate Y{\bf Y} with TT observations is whitened, i.e. Y=S1/2(Xt1Tt=1TXt){\bf Y}={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t}), for t=1,,Tt = 1, \ldots, T, where S{\bf S} is the sample covariance matrix of X{\bf X}. The matrix C^kij(Y){\bf \widehat{C}}^{ij}_k({\bf Y}) is of the form

C^kij(Y)=B^kij(Y)Sk(Y)(Eij+Eji)Sk(Y)trace(Eij)Ip,{\bf \widehat{C}}^{ij}_k({\bf Y}) = {\bf \widehat{B}}^{ij}_k({\bf Y}) - {\bf S}_k({\bf Y}) ({\bf E}^{ij} + {\bf E}^{ji}) {\bf S}_k({\bf Y})' - \textrm{trace}({\bf E}^{ij}) {\bf I}_p,

for i,j=1,,pi, j = 1, \ldots, p, where Sk(Y){\bf S}_k({\bf Y}) is the lagged sample covariance matrix of Y{\bf Y} for lag k=1,,Kk = 1, \ldots, K, Eij{\bf E}^{ij} is a matrix where element (i,j)(i,j) equals to 1 and all other elements are 0, Ip{\bf I}_p is an identity matrix of order pp and B^kij(Y){\bf \widehat{B}}^{ij}_k({\bf Y}) is as in gFOBI.

The algorithm finds an orthogonal matrix U{\bf U} by maximizing

i=1pj=1pk=0Kdiag(UC^kij(Y)U)2.\sum_{i = 1}^p \sum_{j = 1}^p \sum_{k = 0}^K ||diag({\bf U \widehat{C}}^{ij}_k({\bf Y}) {\bf U}')||^2.

where k=1,,Kk = 1, \ldots, K. The final unmixing matrix is then W=US1/2{\bf W} = {\bf US}^{-1/2}.

For ordered = TRUE the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest.

Value

A list of class 'bssvol', inheriting from class 'bss', containing the following components:

W

The estimated unmixing matrix. If ordered = TRUE, the rows are ordered according to the order of the components.

k

The vector of the used lags.

S

The estimated sources as time series object standardized to have mean 0 and unit variances. If ordered = TRUE, then components are ordered according to their volatility. If original = FALSE, the sources with linear autocorrelation are replaced by their ARMA residuals.

MU

The mean vector of X.

If ordered = TRUE, then also the following components included in the list:

Sraw

The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if original = FALSE.

fits

The ARMA fits for the components with linear autocorrelation.

armaeff

A logical vector. Is TRUE if ARMA fit was done to the corresponding component.

linTS

The value of the modified Ljung-Box test statistic for each component.

linP

p-value based on the modified Ljung-Box test statistic for each component.

volTS

The value of the volatility clustering test statistic.

volP

p-value based on the volatility clustering test statistic.

Author(s)

Klaus Nordhausen, Markus Matilainen

References

Cardoso, J.-F., Souloumiac, A. (1993), Blind Beamforming for Non-Gaussian Signals, in: IEE-Proceedings-F, volume 140, pp. 362–370.

Matilainen, M., Nordhausen, K. and Oja, H. (2015), New Independent Component Analysis Tools for Time Series, Statistics & Probability Letters, 105, 80–87.

See Also

frjd, JADE, gFOBI, lbtest, auto.arima

Examples

if(require("stochvol")) {
n <- 10000
A <- matrix(rnorm(9), 3, 3)

# simulate SV models
s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y
s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y
s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y

X <- cbind(s1, s2, s3) %*% t(A)

res <- gJADE(X)
res
coef(res)
plot(res)
head(bss.components(res))

MD(res$W, A) # Minimum Distance Index, should be close to zero
}

Generalized SOBI

Description

The gSOBI (generalized Second Order Blind Identification) method for the blind source separation (BSS) problem. The method is designed for separating multivariate time series with or without stochastic volatility. The method is a combination of SOBI and vSOBI with G(x)=x2G(x) = x^2 as a nonlinearity function.

Usage

gSOBI(X, ...)

## Default S3 method:
gSOBI(X, k1 = 1:12, k2 = 1:3, b = 0.9, eps = 1e-06, maxiter = 1000, ordered = FALSE,
      acfk = NULL, original = TRUE, alpha = 0.05, ...)
## S3 method for class 'ts'
gSOBI(X, ...)
## S3 method for class 'xts'
gSOBI(X, ...)
## S3 method for class 'zoo'
gSOBI(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k1

A vector of lags for SOBI part. It can be any non-zero positive integer, or a vector consisting of them. Default is 1:12.

k2

A vector of lags for vSOBI part. It can be any non-zero positive integer, or a vector consisting of them. Default is 1:3.

b

The weight for the SOBI part, 1b1-b for the vSOBI part. Default is 0.9.

eps

Convergence tolerance.

maxiter

The maximum number of iterations.

ordered

Whether to order components according to their volatility. Default is FALSE.

acfk

A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if ordered = TRUE.

original

Whether to return the original components or their residuals based on ARMA fit. Default is TRUE, i.e. the original components are returned. Applicable only if ordered = TRUE.

alpha

Alpha level for linear correlation detection. Default is 0.05.

...

Further arguments to be passed to or from methods.

Details

Assume that a pp-variate Y{\bf Y} with TT observations is whitened, i.e. Y=S1/2(Xt1Tt=1TXt){\bf Y}={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t}), for t=1,,Tt = 1, \ldots, T, where S{\bf S} is the sample covariance matrix of X{\bf X}. The algorithm finds an orthogonal matrix U{\bf U} by maximizing

D(U)=bk1=1K1Dk1(U)+(1b)k2=1K2Dk2(U),{\bf D}({\bf U}) = b\sum_{k_1 = 1}^{K_1} {\bf D}_{k_1}({\bf U}) + (1 - b)\sum_{k_2 = 1}^{K_2} {\bf D}_{k_2}({\bf U}),

where SOBI part

Dk1=i=1p(1Tk1t=1Tk1[(uiYt)(uiYt+k1)])2.{\bf D}_{k_1} = \sum_{i = 1}^p \left(\frac{1}{T - k_1}\sum_{t=1}^{T - k_1}[({\bf u}_i' {\bf Y}_t) ({\bf u}_i' {\bf Y}_{t + k_1})]\right)^2.

and vSOBI part

Dk2=i=1p(1Tk2t=1Tk2[(uiYt)2(uiYt+k2)2](1Tk2)2t=1Tk2[(uiYt)2]t=1Tk2[(uiYt+k2)2])2{\bf D}_{k_2} = \sum_{i = 1}^p \left(\frac{1}{T - k_2}\sum_{t=1}^{T - k_2}[({\bf u}_i' {\bf Y}_t)^2 ({\bf u}_i' {\bf Y}_{t + k_2})^2] - \left(\frac{1}{T - k_2}\right)^2\sum_{t=1}^{T - k_2}[({\bf u}_i' {\bf Y}_t)^2]\sum_{t=1}^{T - k_2}[({\bf u}_i' {\bf Y}_{t + k_2})^2]\right)^2

where b[0,1].b \in [0, 1]. is a value between 0 and 1, and i=1,,pi = 1, \ldots, p, k1=1,,K1k_1 = 1, \ldots, K_1, k2=1,,K2k_2 = 1, \ldots, K_2 and t=1,,Tt = 1, \ldots, T

The algorithm works iteratively starting with diag(p) as an initial value for an orthogonal matrix U=(u1,u2,,up){\bf U} = ({\bf u}_1, {\bf u}_2, \ldots, {\bf u}_p)'.

Matrix Tikj{\bf T}_{ikj} is a partial derivative of Dkj(U){\bf D}_{kj}({\bf U}), where j=1,2j = 1, 2, with respect to ui{\bf u}_i. Then Tkj=(T1kj,,Tpkj){\bf T}_{kj} = ({\bf T}_{1kj}, \ldots, {\bf T}_{pkj})', where pp is the number of columns in Y\bf Y, and Tj=kj=1KjTkj{\bf T}_j = \sum_{k_j = 1}^{K_j} {\bf T}_{kj}, for j=1,2j = 1, 2. Finally T=bT1+(1b)T2{\bf T} = b{\bf T}_1 + (1-b){\bf T}_2.

The update for the orthogonal matrix Unew=(TT)1/2T{\bf U}_{new} = ({\bf TT}')^{-1/2}{\bf T} is calculated at each iteration step. The algorithm stops when

UnewUold||{\bf U}_{new} - {\bf U}_{old}||

is less than eps. The final unmixing matrix is then W=US1/2{\bf W} = {\bf US}^{-1/2}.

For ordered = TRUE the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest.

Value

A list of class 'bssvol', inheriting from class 'bss', containing the following components:

W

The estimated unmixing matrix. If ordered = TRUE, the rows are ordered according to the order of the components.

k1

The vector of the used lags for the SOBI part.

k2

The vector of the used lags for the vSOBI part.

S

The estimated sources as time series object standardized to have mean 0 and unit variances. If ordered = TRUE, then components are ordered according to their volatility. If original = FALSE, the sources with linear autocorrelation are replaced by their ARMA residuals.

MU

The mean vector of X.

If ordered = TRUE, then also the following components included in the list:

Sraw

The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if original = FALSE.

fits

The ARMA fits for the components with linear autocorrelation.

armaeff

A logical vector. Is TRUE if ARMA fit was done to the corresponding component.

linTS

The value of the modified Ljung-Box test statistic for each component.

linP

p-value based on the modified Ljung-Box test statistic for each component.

volTS

The value of the volatility clustering test statistic.

volP

p-value based on the volatility clustering test statistic.

Author(s)

Markus Matilainen, Jari Miettinen

References

Belouchrani, A., Abed-Meriam, K., Cardoso, J.F. and Moulines, R. (1997), A Blind Source Separation Technique Using Second-Order Statistics, IEEE Transactions on Signal Processing, 434–444.

Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.

Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis, 41, 293–311.

See Also

SOBI, vSOBI, lbtest, auto.arima

Examples

if(require("stochvol")) {
n <- 10000
A <- matrix(rnorm(9), 3, 3)

# simulate SV models
s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y
s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y
s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y

# create a daily time series
X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25)

res <- gSOBI(X, 1:4, 1:2, 0.99)
res$W
coef(res)
plot(res)
head(bss.components(res))

MD(res$W, A) # Minimum Distance Index, should be close to zero

# xts series as input
library("xts")
data(sample_matrix)
X2 <- as.xts(sample_matrix)
res2 <- gSOBI(X2, 1:4, 1:2, 0.99)
plot(res2, multi.panel = TRUE)

# zoo series as input
X3 <- as.zoo(X)
res3 <- gSOBI(X3, 1:4, 1:2, 0.99)
plot(res3)
}

Modified Ljung-Box Test and Volatility Clustering Test for Time Series.

Description

Modified Ljung-Box test and volatility clustering test for time series. Time series can be univariate or multivariate. The modified Ljung-Box test checks whether there is linear autocorrelation in the time series. The volatility clustering test checks whether the time series has squared autocorrelation, which would indicate a presence of volatility clustering.

Usage

lbtest(X, k, type = c("squared", "linear"))

## S3 method for class 'lbtest'
print(x, digits = 3, ...)

Arguments

X

A numeric vector/matrix or a univariate/multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

A vector of lags.

type

The type of the autocorrelation test. Options are Modified Ljung-Box test ("linear") or volatility clustering test ("squared") autocorrelation. Default is "squared".

In methods for class 'lbtest' only:

x

An object of class lbtest

digits

The number of digits when printing an object of class lbtest. Default is 3

...

Further arguments to be passed to or from methods.

Details

Assume all the individual time series XiX_i in X\bf X with TT observations are scaled to have variance 1.

Then the modified Ljung-Box test statistic for testing the existence of linear autocorrelation in XiX_i (option = "linear") is

Tjk(t=1T(XitXi,t+j)/(Tj))2/Vj.T \sum_{j \in k} \left(\sum_{t=1}^T (X_{it} X_{i, t + j})/(T - j)\right)^2/V_{j}.

Here

Vj=t=1njxt2xt+j2nj+2k=1nj1nkns=1nkjxsxs+jxs+kxs+k+jnkj.V_{j} = \sum_{t=1}^{n-j}\frac{x_t^2 x_{t+j}^2}{n-j} + 2 \sum_{k=1}^{n-j-1} \frac{n-k}{n} \sum_{s=1}^{n-k-j}\frac{x_s x_{s+j }x_{s+k} x_{s+k+j}}{n-k-j}.

where t=1,,njt = 1, \ldots, n - j, k=1,,nj1k = 1, \ldots, n - j - 1 and s=1,,nkjs = 1, \ldots, n - k - j.

The volatility clustering test statistic (option = "squared") is

Tjk(t=1T(Xit2Xi,t+j2)/(Tj)1)2T \sum_{j \in k} \left(\sum_{t=1}^T (X_{it}^2 X_{i, t + j}^2)/(T - j) - 1\right)^2

Test statistic related to each time series XiX_i is then compared to χ2\chi^2-distribution with length(k) degrees of freedom, and the corresponding p-values are produced. Small p-value indicates the existence of autocorrelation.

Value

A list of class 'lbtest' containing the following components:

TS

The values of the test statistic for each component of X as a vector.

p_val

The p-values based on the test statistic for each component of X as a vector.

Xname

The name of the data used as a character string.

varnames

The names of the variables used as a character string vector.

k

The lags used for testing the serial autocorrelation as a vector.

K

The total number of lags used for testing the serial autocorrelation.

type

The type of the autocorrelation test.

Author(s)

Markus Matilainen, Jari Miettinen

References

Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis, 41, 293–311.

See Also

FixNA, gFOBI, gJADE, vSOBI, gSOBI

Examples

if(require("stochvol")) {
n <- 10000
s1 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.1)$y
s2 <- rnorm(n)
S <- cbind(s1, s2)

lbtest(S, 1:3, type = "squared")
# First p-value should be very close to zero, as there exists stochastic volatility
}

A Modified Algorithm for Principal Volatility Component Estimator

Description

PVC (Principal Volatility Component) estimator for the blind source separation (BSS) problem. This method is a modified version of PVC by Hu and Tsay (2014).

Usage

PVC(X, ...)

## Default S3 method:
PVC(X, k = 1:12, ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...)
## S3 method for class 'ts'
PVC(X, ...)
## S3 method for class 'xts'
PVC(X, ...)
## S3 method for class 'zoo'
PVC(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is 1:12.

ordered

Whether to order components according to their volatility. Default is FALSE.

acfk

A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if ordered = TRUE.

original

Whether to return the original components or their residuals based on ARMA fit. Default is TRUE, i.e. the original components are returned. Applicable only if ordered = TRUE.

alpha

Alpha level for linear correlation detection. Default is 0.05.

...

Further arguments to be passed to or from methods.

Details

Assume that a pp-variate Y{\bf Y} with TT observations is whitened, i.e. Y=S1/2(Xt1Tt=1TXt){\bf Y}={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t}), for t=1,,Tt = 1, \ldots, T, where S\bf S is the sample covariance matrix of X\bf X. Then for each lag kk we calculate

Cov^(YtYt,Yij,tk)=1Tt=k+1T(YtYt1Tkt=k+1TYtYt)(Yij,tk1Tkt=k+1TYij,tk),\widehat{Cov}({\bf Y}_t {\bf Y}_t', Y_{ij, t-k}) = \frac{1}{T}\sum_{t = k + 1}^T \left({\bf Y}_t {\bf Y}_t' - \frac{1}{T-k}\sum_{t = k+1}^T {\bf Y}_t {\bf Y}_t' \right)\left(Y_{ij, t-k} - \frac{1}{T-k}\sum_{t = k+1}^T {Y}_{ij, t-k}\right),

where t=k+1,,Tt = k + 1, \ldots, T and Yij,tk=Yi,tkYj,tk,i,j=1,,pY_{ij, t-k} = Y_{i, t-k} Y_{j, t-k}, i, j = 1, \ldots, p. Then

gk(Y)=i=1pj=1p(Cov^(YtYt,Yij,tk))2.{\bf g}_k({\bf Y}) = \sum_{i = 1}^p \sum_{j=1}^p (\widehat{Cov}({\bf Y}_t {\bf Y}_t', Y_{ij, t-k}))^2.

where i,j=1,,p.i,j = 1, \ldots, p. Thus the generalized kurtosis matrix is

GK(Y)=k=1Kgk(Y),{\bf G}_K({\bf Y}) = \sum_{k = 1}^K {\bf g}_k({\bf Y}),

where k=1,,Kk = 1, \ldots, K is the set of chosen lags. Then U\bf U is the matrix with eigenvectors of GK(Y){\bf G}_K({\bf Y}) as its rows. The final unmixing matrix is then W=US1/2{\bf W} = {\bf US}^{-1/2}, where the average value of each row is set to be positive.

For ordered = TRUE the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest.

Value

A list of class 'bssvol', inheriting from class 'bss', containing the following components:

W

The estimated unmixing matrix. If ordered = TRUE, the rows are ordered according to the order of the components.

k

The vector of the used lags.

S

The estimated sources as time series object standardized to have mean 0 and unit variances. If ordered = TRUE, then components are ordered according to their volatility. If original = FALSE, the sources with linear autocorrelation are replaced by their ARMA residuals.

MU

The mean vector of X.

If ordered = TRUE, then also the following components included in the list:

Sraw

The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if original = FALSE.

fits

The ARMA fits for the components with linear autocorrelation.

armaeff

A logical vector. Is TRUE if ARMA fit was done to the corresponding component.

linTS

The value of the modified Ljung-Box test statistic for each component.

linP

p-value based on the modified Ljung-Box test statistic for each component.

volTS

The value of the volatility clustering test statistic.

volP

p-value based on the volatility clustering test statistic.

Author(s)

Jari Miettinen, Markus Matilainen

References

Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis,41, 293–311.

Hu, Y.-P. and Tsay, R. S. (2014), Principal Volatility Component Analysis, Journal of Business & Economic Statistics, 32(2), 153–164.

See Also

comVol, gSOBI, lbtest, auto.arima

Examples

if(require("stochvol")) {
n <- 10000
A <- matrix(rnorm(9), 3, 3)

# Simulate SV models
s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y
s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y
s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y

# Create a daily time series
X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25)

res <- PVC(X)
res
coef(res)
plot(res)
head(bss.components(res))

MD(res$W, A) # Minimum Distance Index, should be close to zero
}

Second-order Separation Sub-White-Noise Asymptotic Testing with SOBI

Description

The function uses SOBI (Second Order Blind Identification) to test whether the last p-k latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model. The test is asymptotic.

Usage

SOBIasymp(X, ...)

## Default S3 method:
SOBIasymp(X, k, tau = 1:12, eps = 1e-06, maxiter = 200, ...)
## S3 method for class 'ts'
SOBIasymp(X, ...)
## S3 method for class 'xts'
SOBIasymp(X, ...)
## S3 method for class 'zoo'
SOBIasymp(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

The number of latent series that are not white noise. Can be between 00 and p1p-1.

tau

The lags for the SOBI autocovariance matrices.

eps

The convergence tolerance for the joint diagonalization.

maxiter

The maximum number of iterations for the joint diagonalization.

...

Further arguments to be passed to or from methods.

Details

SOBI standardizes X with nn samples and jointly diagonalizes the autocovariance matrices of the standardized data for a chosen set of lags tau, yielding a transformation W\bf W giving the latent variables as S=XW{\bf S} = {\bf X} {\bf W}. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the sums of squares of the corresponding "eigenvalues" produced by the joint diagonalization. Under the null hypothesis the lower right corner (pk)×(pk)(p - k) \times (p - k) blocks of the autocovariance matrices of the sources are zero matrices and the sum mm of their squared norms over all lags can be used as a test statistic in inference on the true number of latent white noise series.

This function conducts the hypothesis test using the asymptotic null distribution of mm, a chi-squared distribution with T(pk)(pk+1)/2T(p - k)(p - k + 1)/2 degrees of freedom where TT is the number of autocovariance matrices used by SOBI.

Value

A list of class ictest, inheriting from class htest, containing:

statistic

The value of the test statistic.

p.value

The p-value of the test.

parameter

The degrees of freedom of the asymptotic null distribution.

method

Character string indicating which test was performed.

data.name

Character string giving the name of the data.

alternative

Character string specifying the alternative hypothesis.

k

The number of latent series that are not white noise used in the testing problem.

W

The transformation matrix to the latent series.

S

Multivariate time series with the centered source components.

D

The underlying eigenvalues of the autocovariance matrix.

MU

The location of the data which was subtracted before calculating SOBI.

tau

The used set of lags for the SOBI autocovariance matrices.

Author(s)

Klaus Nordhausen, Joni Virta

References

Virta, J. and Nordhausen, K. (2021), Determining the Signal Dimension in Second Order Source Separation. Statistica Sinica, 31, 135–156.

See Also

AMUSE, SOBI, AMUSEasymp

Examples

n <- 1000

  A <- matrix(rnorm(16), 4, 4)
  s1 <- arima.sim(list(ar = c(0, 0.6)), n)
  s2 <- arima.sim(list(ma = c(0, -0.5)), n)
  s3 <- rnorm(n)
  s4 <- rnorm(n)

  S <- cbind(s1, s2, s3, s4)
  X <- S %*% t(A)

  asymp_res_1 <- SOBIasymp(X, k = 1)
  asymp_res_1

  asymp_res_2 <- SOBIasymp(X, k = 2)
  asymp_res_2

  # Plots of the estimated sources, the last two are white noise
  plot(asymp_res_2)
  
  # Note that AMUSEasymp with lag 1 does not work due to the lack of short range dependencies
  AMUSEasymp(X, k = 1)

Second-order Separation Sub-White-Noise Bootstrap Testing with SOBI

Description

The function uses SOBI (Second Order Blind Identification) to test whether the last p-k latent series are pure white noise, assuming a p-variate second-order stationary blind source separation (BSS) model. Four different bootstrapping strategies are available and the function can be run in parallel.

Usage

SOBIboot(X, ...)

## Default S3 method:
SOBIboot(X, k, tau = 1:12, n.boot = 200, s.boot = c("p", "np1", "np2", "np3"),
         ncores = NULL, iseed = NULL, eps = 1e-06, maxiter = 200, ...)
## S3 method for class 'ts'
SOBIboot(X, ...)
## S3 method for class 'xts'
SOBIboot(X, ...)
## S3 method for class 'zoo'
SOBIboot(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

The number of latent series that are not white noise. Can be between 00 and p1p-1.

tau

The vector of lags for the SOBI autocovariance matrices.

n.boot

The number of bootstrapping samples.

s.boot

Bootstrapping strategy to be used. Possible values are "p" (default), "np1", "np2", "np3". See details for further information.

ncores

The number of cores to be used. If NULL or 1, no parallel computing is used. Otherwise makeCluster with type = "PSOCK" is used. It is the users repsonsibilty to choose a reasonable value for ncores. The function detectCores might be useful in this context.

iseed

If parallel computation is used, the seed passed on to clusterSetRNGStream. Default is NULL which means no fixed seed is used.

eps

The convergence tolerance for the joint diagonalization.

maxiter

The maximum number of iterations for the joint diagonalization.

...

Further arguments to be passed to or from methods.

Details

SOBI standardizes X with nn samples and jointly diagonalizes the autocovariance matrices of the standardized data for a chosen set of lags tau, yielding a transformation W\bf W giving the latent variables as S=XW{\bf S} = {\bf X} {\bf W}. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the sums of squares of the corresponding "eigenvalues" produced by the joint diagonalization. Under the null hypothesis the final pkp - k "eigenvalues" of each of the autocovariance matrices equal zero, λpkτ==λpτ=0\lambda^\tau_{p-k} = \cdots = \lambda^\tau_{p} = 0, and their mean square mm over all lags can be used as a test statistic in bootstrap-based inference on the true number of latent white noise series.

The function offers four different bootstrapping strategies for generating samples for which the null hypothesis approximately holds, and they are all based on the following general formula:

  1. Decompose the SOBI-estimated latent series S\bf S into the postulated signal S1{\bf S}_1 and white noise S2{\bf S}_2.

  2. Take nn bootstrap samples S2{\bf S}_2^* of S2{\bf S}_2, see the different strategies below.

  3. Recombine S=(S1,S2)\bf S^* = ({\bf S}_1, {\bf S}_2^*) and back-transform X=SW1{\bf X}^*= {\bf S}^* {\bf W}^{-1}.

  4. Compute the test statistic based on X{\bf X}^*.

The four different bootstrapping strategies are:

  1. s.boot = "p": The first strategy is parametric and simply generates all boostrap samples independently and identically from the standard normal distribution.

  2. s.boot = "np1": The second strategy is non-parametric and pools all observed n(pk)n(p - k) white noise observations together and draws the bootstrap samples from amongst them.

  3. s.boot = "np2": The third strategy is non-parametric and proceeds otherwise as the second strategy but acts component-wise. That is, separately for each of the pkp - k white noise series it pools the observed nn white noise observations together and draws the bootstrap samples of that particular latent series from amongst them.

  4. s.boot = "np3": The third strategy is non-parametric and instead of drawing the samples univariately as in the second and third strategies, proceeds by resampling nn vectors of size pkp - k from amongst all the observed nn white noise vectors.

The function can be run in parallel by setting ncores to the desired number of cores (should be less than the number of cores available - 1). When running code in parallel the standard random seed of R is overridden and if a random seed needs to be set it should be passed via the argument iseed. The argument iseed has no effect in case ncores equals 1 (the default value).

This function uses for the joint diagonalization the function frjd.int, which does not fail in case of failed convergence but returns the estimate from the final step.

Value

A list of class ictest, inheriting from class htest, containing:

statistic

The value of the test statistic.

p.value

The p-value of the test.

parameter

The number of bootstrap samples.

alternative

Character string specifying the alternative hypothesis.

k

The number of latent series that are not white noise used in the testing problem.

W

The transformation matrix to the latent series.

S

Multivariate time series with the centered source components.

D

The underlying eigenvalues of the autocovariance matrix.

MU

The location of the data which was subtracted before calculating SOBI.

tau

The used set of lags.

method

Character string indicating which test was performed.

data.name

Character string giving the name of the data.

s.boot

Character string denoting which bootstrapping test version was used.

Author(s)

Markus Matilainen, Klaus Nordhausen, Joni Virta

References

Matilainen, M., Nordhausen, K. and Virta, J. (2018), On the Number of Signals in Multivariate Time Series. In Deville, Y., Gannot, S., Mason, R., Plumbley, M.D. and Ward, D. (editors) "International Conference on Latent Variable Analysis and Signal Separation", LNCS 10891, 248–258. Springer, Cham., <doi:10.1007/978-3-319-93764-9_24>.

See Also

AMUSE, AMUSEboot, SOBI, tsboot

Examples

n <- 1000
  
  A <- matrix(rnorm(16), 4, 4)
  s1 <- arima.sim(list(ar = c(0, 0, 0.3, 0.6)), n)
  s2 <- arima.sim(list(ma = c(0, 0, -0.3, 0.3)), n)
  s3 <- rnorm(n)
  s4 <- rnorm(n)
  
  S <- cbind(s1, s2, s3, s4)
  X <- S %*% t(A)
  
  boot_res_1 <- SOBIboot(X, k = 1)
  boot_res_1
  
  boot_res_2 <- SOBIboot(X, k = 2)
  boot_res_2

  # Plots of the estimated sources, the last two are white noise
  plot(boot_res_2)

  # Note that AMUSEboot with lag 1 does not work due to the lack of short range dependencies
  AMUSEboot(X, k = 1)
  
  # xts series as input
  library("xts")
  data(sample_matrix)
  X2 <- as.xts(sample_matrix)
  boot_res_xts <- SOBIboot(X2, k = 2)
  plot(boot_res_xts, multi.panel = TRUE)

  # zoo series as input
  X3 <- as.zoo(X)
  boot_res_zoo <- SOBIboot(X3, k = 2)
  plot(boot_res_zoo)

Ladle Estimator to Estimate the Number of White Noise Components in SOS with SOBI

Description

The ladle estimator uses the joint diagonalization "eigenvalues" and "eigenvectors" of several autocovariance matrices to estimate the number of white noise components in second-order source separation (SOS).

Usage

SOBIladle(X, ...)

## Default S3 method:
SOBIladle(X, tau = 1:12, l = 20, sim = c("geom", "fixed"), n.boot = 200, ncomp =
          ifelse(ncol(X) > 10, floor(ncol(X)/log(ncol(X))), ncol(X) - 1),
          maxiter = 1000, eps = 1e-06, ...)
## S3 method for class 'ts'
SOBIladle(X, ...)
## S3 method for class 'xts'
SOBIladle(X, ...)
## S3 method for class 'zoo'
SOBIladle(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

tau

The lags for the SOBI autocovariance matrices.

l

If sim = "geom" then l is the success probability of the geometric distribution from where the bootstrap block lengths for the stationary bootstrap are drawn. If sim = "fixed" then l is the fixed block length for the fixed block bootstrap.

sim

If "geom" (default) then the stationary bootstrap is used. If "fixed" then the fixed block bootstrap is used.

n.boot

The number of bootstrapping samples. See tsboot for details.

ncomp

The number of components among which the ladle estimator is to be searched. Must be between 0 and ncol(X)-1. The default follows the recommendation of Luo and Li (2016).

maxiter

Maximum number of iterations.

eps

Convergence tolerance.

...

Further arguments to be passed to or from methods.

Details

SOBI standardizes X with nn samples and jointly diagonalizes the autocovariance matrices of the standardized data for a chosen set of lags tau, yielding a transformation W\bf W giving the latent variables as S=XW{\bf S} = {\bf X} {\bf W}. Assume, without loss of generality, that the latent components are ordered in decreasing order with respect to the sums of squares of the corresponding "eigenvalues" produced by the joint diagonalization. Under the assumption that we have kk non-white-noise components, the final pkp - k "eigenvalues" of each of the autocovariance matrices equal zero, λpkτ==λpτ=0\lambda^\tau_{p-k} = \cdots = \lambda^\tau_{p} = 0.

The change point from non-zero eigenvalues to zero eigenvalues is visible in the joint diagonalization "eigenvectors" of the autocovariance matrices as an increase in their boostrap variablity. Similarly, before the change point, the squared eigenvalues decrease in magnitude and afterwards they stay constant. The ladle estimate combines the scaled eigenvector bootstrap variability with the scaled eigenvalues to estimate the number of non-white-noise components. The estimate is the value of k=0,,k = 0, \ldots , ncomp where the combined measure achieves its minimum value.

This function uses for the joint diagonalization the function frjd.int, which does not fail in case of failed convergence but returns the estimate from the final step.

Value

A list of class ladle containing:

method

The string SOBI.

k

The estimated number of non-white-noise components.

fn

The vector giving the measures of variation of the eigenvectors using the bootstrapped eigenvectors for the different number of components.

phin

Normalized sums of squared eigenvalues of the SOBI matrices.

data.name

The name of the data for which the ladle estimate was computed.

gn

The main criterion for the ladle estimate - the sum of fn and phin. k is the value where gn takes its minimum.

lambda

The sums of squared eigenvalues of the SOBI matrices.

W

The transformation matrix to the source components. Also known as the unmixing matrix.

S

Multivariate time series with the centered source components.

MU

The location of the data which was subtracted before calculating the source components.

sim

The used boostrapping technique, either "geom" or "fixed".

tau

The used set of lags for the SOBI autocovariance matrices.

Author(s)

Klaus Nordhausen, Joni Virta

References

Nordhausen, K. and Virta, J.(2018), Ladle Estimator for Time Series Signal Dimension. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 428–432, <doi:10.1109/SSP.2018.8450695>.

Luo, W. and Li, B. (2016), Combining Eigenvalues and Variation of Eigenvectors for Order Determination, Biometrika, 103. 875–887. <doi:10.1093/biomet/asw051>

See Also

AMUSE, SOBI, AMUSEladle, frjd.int

Examples

n <- 1000
  
  s1 <- arima.sim(n = n, list(ar = 0.6, ma = c(0, -0.4)))
  s2 <- arima.sim(n = n, list(ar = c(0, 0.1,0.3), ma = c(0.2, 0.4)))
  s3 <- arima.sim(n = n, list(ar = c(0, 0.8)))
  Snoise <- matrix(rnorm(5*n), ncol = 5)
  S <- cbind(s1, s2, s3, Snoise)

  A <- matrix(rnorm(64), 8, 8)
  X <- S %*% t(A)
  
  ladle_SOBI <- SOBIladle(X, l = 20, sim = "geom")

  # The estimated number of non-white-noise components
  summary(ladle_SOBI)
  
  # The ladle plot
  ladleplot(ladle_SOBI)
  
  # Time series plots of the estimated components
  plot(ladle_SOBI)
  
  # Note that AMUSEladle with lag 1 does not work due to the lack of short range dependencies
  ladle_AMUSE <- AMUSEladle(X)

  summary(ladle_AMUSE)
  ladleplot(ladle_AMUSE)
  
  # xts series as input
  library("xts")
  data(sample_matrix)
  X2 <- as.xts(sample_matrix)
  ladle_SOBI_xts <- SOBIladle(X2, l = 20, sim = "geom")
  plot(ladle_SOBI_xts, multi.panel = TRUE)

  # zoo series as input
  X3 <- as.zoo(X)
  ladle_SOBI_zoo <- SOBIladle(X3, l = 20, sim = "geom")
  plot(ladle_SOBI_zoo)

Summary of an Object of Class tssdr

Description

Gives a summary of an object of class tssdr. It includes different types of methods to select the number of directions (sources) and lags.

Usage

## S3 method for class 'tssdr'
summary(object, type = c("rectangle", "alllag", "alldir", "big"), thres = 0.8, ...)

## S3 method for class 'summary.tssdr'
print(x, digits = 3, ...)
## S3 method for class 'summary.tssdr'
components(x, ...)
## S3 method for class 'summary.tssdr'
coef(object, ...)
## S3 method for class 'summary.tssdr'
plot(x, main = "The response and the chosen directions", ...)

Arguments

object

An object of class tssdr.

type

Method for choosing the important lags and directions. The choices are "rectangle", "alllag", "alldir" and "big". Default is "rectangle".

thres

The threshold value for choosing the lags and directions. Default is 0.8.

...

Further arguments to be passed to or from methods.

In methods for class 'summary.tssdr' only:

x

An object of class summary.tssdr

digits

The number of digits when printing an object of class summary.tssdr. Default is 3

main

A title for a plot when printing an object of class summary.tssdr.

Details

The sum of values of k0×p0k_0 \times p_0 matrix L\bf L of object is 1. The values of the matrix are summed together in ways detailed below, until the value is at least π\pi (thres). Let λij\lambda_{ij} be the element (i,j)(i, j) of the matrix L\bf L.

For alllag: k=k0k = k_0 and pp is the smallest value for which i=1pλijπ.\sum_{i = 1}^p \lambda_{ij} \ge \pi. where i=1,,pi = 1, \ldots, p. The chosen number of lags and directions are returned.

For alldir: p=p0p = p_0 and kk is the smallest value for which j=1kλijπ\sum_{j = 1}^k \lambda_{ij} \ge \pi where j=1,,kj = 1, \ldots, k. The chosen number of lags and directions are returned.

For rectangle: kk and pp are values such that their product kpk p is the smallest for which i=1pj=1kλijπ\sum_{i = 1}^p \sum_{j = 1}^k \lambda_{ij} \ge \pi where i=1,,pi = 1, \ldots, p and j=1,,kj = 1, \ldots, k. The chosen number of lags and directions are returned.

For big: rr is the smallest value of elements (i1,j1),,(ir,jr)(i_1, j_1), \ldots, (i_r, j_r) for which k=1rλik,jkπ\sum_{k = 1}^r \lambda_{i_k,j_k} \ge \pi where k=1,,rk = 1, \ldots, r. Thi indices of the matrix corresponding to the chosen values are returned.

Note that when printing a summary.tssdr object, all elements except the component S, which is the matrix of the chosen directions or a vector if there is only one direction, are printed.

Value

A list of class 'summary.tssdr' containing the following components:

W

The estimated signal separation matrix

L

The Lambda matrix for choosing lags and directions.

S

The estimated directions as time series object standardized to have mean 0 and unit variances.

type

The method for choosing the important lags and directions.

algorithm

The used algorithm as a character string.

yname

The name for the response time series yy.

Xname

The name for the predictor time series X\bf X.

k

The chosen number of lags (not for type = "big" ).

p

The chosen number of directions (not for type = "big").

pk

The chosen lag-direction combinations (for type = "big" only).

Author(s)

Markus Matilainen

References

Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2017), Supervised Dimension Reduction for Multivariate Time Series, Econometrics and Statistics, 4, 57–69.

See Also

tssdr

Examples

n <- 10000
A <- matrix(rnorm(9), 3, 3)

x1 <- arima.sim(n = n, list(ar = 0.2))
x2 <- arima.sim(n = n, list(ar = 0.8))
x3 <- arima.sim(n = n, list(ar = 0.3, ma = -0.4))
eps2 <- rnorm(n - 1)
y <- 2*x1[1:(n - 1)] + 3*x2[1:(n - 1)] + eps2
X <- ((cbind(x1, x2, x3))[2:n, ]) %*% t(A)

res2 <- tssdr(y, X, algorithm = "TSIR")
res2
summ2 <- summary(res2, thres = 0.5)
summ2
summary(res2) # Chooses more lags with larger threshold
summary(res2, type = "alllag") # Chooses all lags
summary(res2, type = "alldir", thres = 0.5) # Chooses all directions
summary(res2, type = "big", thres = 0.5) # Same choices than in summ2

Supervised Dimension Reduction for Multivariate Time Series

Description

Supervised dimension reduction for multivariate time series data. There are three different algorithms to choose from. TSIR is a time series version of Sliced Inverse Regression (SIR), TSAVE is a time series version of Sliced Average Variance Estimate (TSAVE) and a hybrid of TSIR and TSAVE is TSSH (Time series SIR SAVE Hybrid). For summary of an object of class tssdr, see summary.tssdr.

Usage

tssdr(y, X, ...)

## Default S3 method:
tssdr(y, X, algorithm = c("TSIR", "TSAVE", "TSSH"), k = 1:12, H = 10, weight = 0.5,
      method = c("frjd", "rjd"), eps = 1e-06, maxiter = 1000, ...)
## S3 method for class 'ts'
tssdr(y, X, ...)
## S3 method for class 'xts'
tssdr(y, X, ...)
## S3 method for class 'zoo'
tssdr(y, X, ...)

## S3 method for class 'tssdr'
print(x, digits = 3, ...)
## S3 method for class 'tssdr'
components(x, ...)
## S3 method for class 'tssdr'
plot(x, main = "The response and the directions", ...)

Arguments

y

A numeric vector or a time series object of class ts, xts or zoo (same type as X). Missing values are not allowed.

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo (same type as y). Missing values are not allowed.

algorithm

Algorithm to be used. The options are "TSIR", "TSAVE" and "TSSH". Default is "TSIR".

k

A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is 1:12.

H

The number of slices. If "TSSH" is used, HH is a 2-vector; the first element is used for TSIR part and the second for TSAVE part. Default is H=10H = 10.

weight

Weight 0a10 \le a \le 1 for the hybrid method TSSH only. With a=1a = 1 it reduces to TSAVE and with a=0a = 0 to TSIR. Default is a=0.5a = 0.5.

method

The method to use for the joint diagonalization. The options are "rjd" and "frjd". Default is "frjd".

eps

Convergence tolerance.

maxiter

The maximum number of iterations.

...

Further arguments to be passed to or from methods.

In methods for class 'tssdr' only:

x

An object of class tssdr

digits

The number of digits when printing an object of class tssdr. Default is 3

main

A title for a plot when printing an object of class tssdr.

Details

Assume that the pp-variate time series Z{\bf Z} with TT observations is whitened, i.e. Z=S1/2(Xt1Tt=1TXt){\bf Z}={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t}), for t=1,,Tt = 1, \ldots, T, where S{\bf S} is a sample covariance matrix of X{\bf X}. Divide yy into HH disjoint intervals (slices) by its empirical quantiles.

For each lag jj, denote yjy_{j} for a vector of the last njn - j values of the sliced yy. Also denote Zj{\bf Z}_j for the first njn - j observations of Z{\bf Z}. Then Zjh{\bf Z}_{jh} are the disjoint slices of Zj{\bf Z}_j according to the values of yjy_{j}.

Let TjhT_{jh} be the number of observations in Zjh{\bf Z}_{jh}. Write A^jh=1Tjht=1Tjh(Zjh)t\bf \widehat{A}_{jh} = \frac{1}{T_{jh}}\sum_{t = 1}^{T_{jh}}({\bf Z}_{jh})_{t} for t=1,,Tjht = 1, \ldots, T_jh, and A^j=(A^j1,,A^jH){\bf \widehat A}_j = ({\bf \widehat{A}}_{j1}, \ldots, {\bf \widehat{A}}_{jH})'. Then for algorithm TSIR matrix

M^0j=Cov^Aj.{\bf \widehat{M}}_{0j} = {\bf \widehat{Cov}}_{A_j}.

Denote Cov^jh\bf \widehat{Cov}_{jh} for a sample covariance matrix of Zjh{\bf Z}_{jh}. Then for algorithm TSAVE matrix

M^0j=1Hh=1H(IpCov^jh)2.{\bf \widehat{M}}_{0j} = \frac{1}{H}\sum_{h = 1}^H({\bf I}_p - {\bf \widehat{Cov}_{jh}})^2.

h=1,,Hh = 1, \ldots, H.

For TSSH then matrix

M^2j=aM^1j+(1a)M^0j,{\bf \widehat{M}}_{2j} = a{\bf \widehat{M}_{1j}} + (1-a){\bf \widehat{M}_{0j}},

for a chosen 0a10 \le a \le 1. Note that the value of HH can be different for TSIR and TSAVE parts.

The algorithms find an orthogonal matrix U=(u1,,up){\bf U} = (\bf u_1, \ldots, \bf u_p)' by maximizing, for b=0,1b = 0, 1 or 22,

ikdiag(UM^bjU)2=i1pjk(uiM^bjui)2.\sum_{i \in k} ||diag({\bf U} {\bf \widehat{M}}_{bj} {\bf U}')||^2 = \sum_{i \in 1}^p \sum_{j \in k} ({\bf u}_i' {\bf \widehat{M}}_{bj} {\bf u}_i)^2.

for i=1,,pi = 1, \ldots, p and all lags jj. The final signal separation matrix is then W=US1/2{\bf W} = {\bf US}^{-1/2}.

Write λij=c(uiM^bjui)2\lambda_{ij} = c({\bf u}_i' {\bf \widehat{M}}_{bj} {\bf u}_i)^2, where cc is chosen in such way that i=1pjkλij=1.\sum_{i = 1}^p \sum_{j \in k} \lambda_{ij}= 1. for i=1,,pi = 1, \ldots, p and all lags jj. Then the (i,j)(i, j):th element of the matrix L\bf L is λij\lambda_{ij}.

To make a choice on which lags and directions to keep, see summary.tssdr. Note that when printing a tssdr object, all elements are printed, except the directions S.

Value

A list of class 'tssdr' containing the following components:

W

The estimated signal separation matrix.

k

The vector of the used lags.

S

The estimated directions as time series object standardized to have mean 0 and unit variances.

MU

The mean vector of X.

L

The Lambda matrix for choosing lags and directions.

H

The used number of slices.

yname

The name for the response time series yy.

Xname

The name for the predictor time series X\bf X.

algorithm

The used algorithm as a character string.

Author(s)

Markus Matilainen

References

Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2017), Supervised Dimension Reduction for Multivariate Time Series, Econometrics and Statistics, 4, 57–69.

Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2019), Sliced Average Variance Estimation for Multivariate Time Series. Statistics: A Journal of Theoretical and Applied Statistics, 53, 630–655.

Li, K.C. (1991), Sliced Inverse Regression for Dimension Reduction, Journal of the American Statistical Association, 86, 316–327.

Cook, R. and Weisberg, S. (1991), Sliced Inverse Regression for Dimension Reduction, Comment. Journal of the American Statistical Association, 86, 328–332.

See Also

summary.tssdr, dr

Examples

n <- 10000
A <- matrix(rnorm(9), 3, 3)

x1 <- arima.sim(n = n, list(ar = 0.2))
x2 <- arima.sim(n = n, list(ar = 0.8))
x3 <- arima.sim(n = n, list(ar = 0.3, ma = -0.4))
eps2 <- rnorm(n - 1)
y <- 2*x1[1:(n - 1)] + eps2
X <- ((cbind(x1, x2, x3))[2:n, ]) %*% t(A)

res1 <- tssdr(y, X, algorithm = "TSAVE")
res1
summ1 <- summary(res1, type = "alllag", thres = 0.8)
summ1
plot(summ1)
head(components(summ1))
coef(summ1)

# Hybrid of TSIR and TSAVE. For TSIR part H = 10 and for TSAVE part H = 2.
tssdr(y, X, algorithm = "TSSH", weight = 0.6, H = c(10, 2))

A Variant of SOBI for Blind Source Separation

Description

The vSOBI (variant of Second Order Blind Identification) method for the blind source separation of time series with stochastic volatility. The method is a variant of SOBI, which is a method designed to separate ARMA sources, and an alternative to FixNA and FixNA2 methods.

Usage

vSOBI(X, ...)

## Default S3 method:
vSOBI(X, k = 1:12, eps = 1e-06, maxiter = 1000, G = c("pow", "lcosh"), 
      ordered = FALSE, acfk = NULL, original = TRUE, alpha = 0.05, ...)
## S3 method for class 'ts'
vSOBI(X, ...)
## S3 method for class 'xts'
vSOBI(X, ...)
## S3 method for class 'zoo'
vSOBI(X, ...)

Arguments

X

A numeric matrix or a multivariate time series object of class ts, xts or zoo. Missing values are not allowed.

k

A vector of lags. It can be any non-zero positive integer, or a vector consisting of them. Default is 1:12.

eps

Convergence tolerance.

maxiter

The maximum number of iterations.

G

Function G(x)G(x). The choices are "pow" (default) and "lcosh".

ordered

Whether to order components according to their volatility. Default is FALSE.

acfk

A vector of lags to be used in testing the presence of serial autocorrelation. Applicable only if ordered = TRUE.

original

Whether to return the original components or their residuals based on ARMA fit. Default is TRUE, i.e. the original components are returned. Applicable only if ordered = TRUE.

alpha

Alpha level for linear correlation detection. Default is 0.05.

...

Further arguments to be passed to or from methods.

Details

Assume that a pp-variate Y{\bf Y} with TT observations is whitened, i.e. Y=S1/2(Xt1Tt=1TXt){\bf Y}={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t}), for t=1,,Tt = 1, \ldots, T, where S{\bf S} is the sample covariance matrix of X{\bf X}. The algorithm finds an orthogonal matrix U{\bf U} by maximizing

D(U)=k=1KDk(U){\bf D}({\bf U}) = \sum_{k = 1}^K {\bf D}_k({\bf U})

=k=1Ki=1p(1Tkt=1Tk[G(uiYt)G(uiYt+k)](1Tk)2t=1Tk[G(uiYt)]t=1Tk[G(uiYt+k)])2.= \sum_{k = 1}^K \sum_{i = 1}^p \left(\frac{1}{T - k}\sum_{t=1}^{T - k}[G({\bf u}_i' {\bf Y}_t) G({\bf u}_i' {\bf Y}_{t + k})] - \left(\frac{1}{T - k}\right)^2\sum_{t=1}^{T - k}[G({\bf u}_i' {\bf Y}_t)]\sum_{t=1}^{T - k}[G({\bf u}_i' {\bf Y}_{t + k})]\right)^2.

where i=1,,pi = 1, \ldots, p, k=1,,Kk = 1, \ldots, K and t=1,,Tt = 1, \ldots, T. For function G(x)G(x) the choices are x2x^2 and log(cosh(xx)).

The algorithm works iteratively starting with diag(p) as an initial value for an orthogonal matrix U=(u1,u2,,up){\bf U} = ({\bf u}_1, {\bf u}_2, \ldots, {\bf u}_p)'.

Matrix Tik{\bf T}_{ik} is a partial derivative of Dk(U){\bf D}_k({\bf U}) with respect to ui{\bf u}_i. Then Tk=(T1k,,Tpk){\bf T}_k = ({\bf T}_{1k}, \ldots, {\bf T}_{pk})', where pp is the number of columns in Y{\bf Y}, and T=k=1KTk{\bf T} = \sum_{k = 1}^K {\bf T}_k. The update for the orthogonal matrix Unew=(TT)1/2T{\bf U}_{new} = ({\bf TT}')^{-1/2}{\bf T} is calculated at each iteration step. The algorithm stops when

UnewUold||{\bf U}_{new} - {\bf U}_{old}||

is less than eps. The final unmixing matrix is then W=US1/2{\bf W} = {\bf US}^{-1/2}.

For ordered = TRUE the function orders the sources according to their volatility. First a possible linear autocorrelation is removed using auto.arima. Then a squared autocorrelation test is performed for the sources (or for their residuals, when linear correlation is present). The sources are then put in a decreasing order according to the value of the test statistic of the squared autocorrelation test. For more information, see lbtest.

Value

A list of class 'bssvol', inheriting from class 'bss', containing the following components:

W

The estimated unmixing matrix. If ordered = TRUE, the rows are ordered according to the order of the components.

k

The vector of the used lags.

S

The estimated sources as time series object standardized to have mean 0 and unit variances. If ordered = TRUE, then components are ordered according to their volatility. If original = FALSE, the sources with linear autocorrelation are replaced by their ARMA residuals.

MU

The mean vector of X.

If ordered = TRUE, then also the following components included in the list:

Sraw

The ordered original estimated sources as time series object standardized to have mean 0 and unit variances. Returned only if original = FALSE.

fits

The ARMA fits for the components with linear autocorrelation.

armaeff

A logical vector. Is TRUE if ARMA fit was done to the corresponding component.

linTS

The value of the modified Ljung-Box test statistic for each component.

linP

p-value based on the modified Ljung-Box test statistic for each component.

volTS

The value of the volatility clustering test statistic.

volP

p-value based on the volatility clustering test statistic.

Author(s)

Markus Matilainen

References

Belouchrani, A., Abed-Meriam, K., Cardoso, J.F. and Moulines, R. (1997), A Blind Source Separation Technique Using Second-Order Statistics, IEEE Transactions on Signal Processing, 434–444.

Matilainen, M., Miettinen, J., Nordhausen, K., Oja, H. and Taskinen, S. (2017), On Independent Component Analysis with Stochastic Volatility Models, Austrian Journal of Statistics, 46(3–4), 57–66.

See Also

FixNA, SOBI, lbtest, auto.arima

Examples

if(require("stochvol")) {
n <- 10000
A <- matrix(rnorm(9), 3, 3)

# simulate SV models
s1 <- svsim(n, mu = -10, phi = 0.8, sigma = 0.1)$y
s2 <- svsim(n, mu = -10, phi = 0.9, sigma = 0.2)$y
s3 <- svsim(n, mu = -10, phi = 0.95, sigma = 0.4)$y

# create a daily time series
X <- ts(cbind(s1, s2, s3) %*% t(A), end = c(2015, 338), frequency = 365.25)

res <- vSOBI(X)
res
coef(res)
plot(res)
head(bss.components(res))

MD(res$W, A) # Minimum Distance Index, should be close to zero
}

Logarithmic Returns of Exchange Rates of 7 Currencies Against US Dollar

Description

This data set has logarithmic returns of exchange rates of 7 currencies against US dollar extracted from the International Monetary Fund's (IMF) database. These currencies are Australian Dollar (AUD), Canadian Dollar (CAD), Norwegian Kroner (NOK), Singapore Dollar (SGD), Swedish Kroner (SEK), Swiss Franc (CHF) and British Pound (GBP).

Usage

data("WeeklyReturnsData")

Format

An object of class ts with 605 observations on the following 7 variables.

AUD

The weekly logarithmic returns rAUD,t\mathbf r_{AUD, t} of the exchange rates of AUD against US Dollar.

CAD

The weekly logarithmic returns rCAD,t\mathbf r_{CAD, t} of the exchange rates of CAD against US Dollar.

NOK

The weekly logarithmic returns rNOK,t\mathbf r_{NOK, t} of the exchange rates of NOK against US Dollar.

SGD

The weekly logarithmic returns rSGD,t\mathbf r_{SGD, t} of the exchange rates of SGD against US Dollar.

SEK

The weekly logarithmic returns rSEK,t\mathbf r_{SEK, t} of the exchange rates of SEK against US Dollar.

CHF

The weekly logarithmic returns rCHF,t\mathbf r_{CHF, t} of the exchange rates of CHF against US Dollar.

GBP

The weekly logarithmic returns rGBP,t\mathbf r_{GBP, t} of the exchange rates of GBP against US Dollar.

Details

The daily exhange rates of the currencies against US Dollar from March 22, 2000 to October 26, 2011 are extracted from the International Monetary Fund's (IMF) Exchange Rates database from https://www.imf.org/external/np/fin/ert/GUI/Pages/CountryDataBase.aspx. These rates are representative rates (currency units per US Dollar), which are reported daily to the IMF by the issuing central bank.

The weekly averages of these exchange rates are then calculated. The logarithmic returns of the average weekly exchange rates are calculated for the currency jj as follows.

Let xj,t\mathbf x_{j, t} be the exchange rates of jj against US Dollar. Then

rj,t= log (xj,t) log (xj,t1),\mathbf r_{j, t} = \textrm{ log }(\mathbf x_{j, t}) - \textrm{ log } (\mathbf x_{j, t-1}),

where t=1,,605t = 1, \ldots, 605 and j=AUD,CAD,NOK,SGD,SEK,CHF,GBPj = AUD, CAD, NOK, SGD, SEK, CHF, GBP. The six missing values in rSEK,t\mathbf r_{SEK, t} are changed to 0. The assumption used here is that there has not been any change from the previous week.

The weekly returns data is then changed to a multivariate time series object of class ts. The resulting ts object is then dataset WeeklyReturnsData.

An example analysis of the data is given in Miettinen et al. (2018). Same data has also been used in Hu and Tsay (2014).

Source

International Monetary Fund (2017), IMF Exchange Rates, https://www.imf.org/external/np/fin/ert/GUI/Pages/CountryDataBase.aspx

For IMF Copyrights and Usage, and special terms and conditions pertaining to the use of IMF data, see https://www.imf.org/external/terms.htm

References

Miettinen, M., Matilainen, M., Nordhausen, K. and Taskinen, S. (2020), Extracting Conditionally Heteroskedastic Components Using Independent Component Analysis, Journal of Time Series Analysis,41, 293–311.

Hu and Tsay (2014), Principal Volatility Component Analysis, Journal of Business & Economic Statistics, 32(2), 153–164.

Examples

plot(WeeklyReturnsData)

res <- gSOBI(WeeklyReturnsData)
res

coef(res)
plot(res)
head(bss.components(res))