Title: | Nonparametric Estimation of Toeplitz Covariance Matrices |
---|---|
Description: | A nonparametric method to estimate Toeplitz covariance matrices from a sample of n independently and identically distributed p-dimensional vectors with mean zero. The data is preprocessed with the discrete cosine matrix and a variance stabilization transformation to obtain an approximate Gaussian regression setting for the log-spectral density function. Estimates of the spectral density function and the inverse of the covariance matrix are provided as well. Functions for simulating data and a protein data example are included. For details see (Klockmann, Krivobokova; 2023), <arXiv:2303.10018>. |
Authors: | Karolina Klockmann [aut, cre], Tatyana Krivobokova [aut] |
Maintainer: | Karolina Klockmann <[email protected]> |
License: | GPL-2 |
Version: | 0.2 |
Built: | 2024-11-24 06:43:31 UTC |
Source: | CRAN |
Dataset with molecular dynamics simulations for the yeast aquaporin (Aqy1), the gated water channel of the yeast Pichia pastoris. The dataset contains only the diameter Y of the channel, which is used in the data analysis in (Klockmann and Krivobokova, 2023). The diameter Y is measured as the distance between the centers of mass of certain residues of the protein. The dataset covers a 100 nanosecond time frame, split into 20000 equidistant observations. The full dataset, including the Euclidean coordinates of all 783 atoms, is available from the authors. For more details see (Klockmann, Krivobokova; 2023).
aquaporin
A data frame with 20000 rows and 1 variable:
Y: the diameter of the channel
see (Klockmann, Krivobokova; 2023).
data(aquaporin)
example1, example2 and example3 generate i.i.d. vectors from a given distribution with different Toeplitz covariance matrices.
The covariance function of the Toeplitz covariance matrix of
example1: has a polynomial decay,
example2: follows an [...] model with coefficients [...] and innovations variance [...],
example3: yields a Lipschitz continuous spectral density that is not differentiable.
example1(p, n, sd, gamma, family = "Gaussian")
example2(p, n, sd, family = "Gaussian")
example3(p, n, sd, gamma, family = "Gaussian")
p | vector length |
n | sample size |
sd | standard deviation |
gamma | polynomial decay of covariance function for |
family | distribution of the simulated data. Available distributions are " |
A list containing the following elements:
Y: p x n dimensional data matrix
sdf: true spectral density function
acf: true covariance function
example1(p=10, n=1, sd=1, gamma=1.2, family="Gaussian")
example2(p=10, n=1, sd=1, family="Gaussian")
example3(p=10, n=1, sd=1, gamma=2, family="Gaussian")
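To make the simulation setup concrete, the following base R sketch draws Gaussian vectors whose Toeplitz covariance matrix has a polynomially decaying covariance function. It does not call the package: the helper `simulate_toeplitz` is hypothetical, and the decay formula `sd^2 * (1 + k)^(-gamma)` is an assumption chosen for illustration, since the exact formula used by example1 is not given here.

```r
# Hypothetical helper, not part of the package.
simulate_toeplitz <- function(p, n, sd = 1, gamma = 1.2) {
  # Assumed covariance function at lags k = 0, ..., p - 1 (illustrative only)
  acf_true <- sd^2 * (1 + 0:(p - 1))^(-gamma)
  Sigma <- toeplitz(acf_true)           # p x p symmetric Toeplitz matrix
  R <- chol(Sigma)                      # upper triangular, Sigma = t(R) %*% R
  Z <- matrix(rnorm(p * n), nrow = p)   # p x n standard normal draws
  Y <- t(R) %*% Z                       # each column has covariance Sigma
  list(Y = Y, acf = acf_true, Sigma = Sigma)
}

set.seed(1)
sim <- simulate_toeplitz(p = 10, n = 5, sd = 1, gamma = 1.2)
dim(sim$Y)   # p x n, matching the layout of the Y element described above
```

The Cholesky factor is used so that the sketch needs no additional packages; any other matrix square root of `Sigma` would serve the same purpose.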
Applies the type-I discrete cosine transform (DCT-I), data binning and the variance stabilizing transform to the data.
Data.trafo(y, Te, dct.out = FALSE)
y | |
Te | number of bins for data binning. |
dct.out | logical. If |
A list containing the following elements:
m: number of data points per bin, that is m=n*round(p/Te). If p/Te is not an integer, the first/last bin may contain more than m data points.
y.star: 2Te-2 dimensional vector with binned, variance stabilized and mirrored data. The bin number Te may be modified to guarantee at least two data points per bin. If p/Te is not an integer, the vector dimension is 2*floor(p/round(p/Te))-2.
dct.matrix: p-dim. DCT-I matrix (if dct.out=TRUE)
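The preprocessing steps can be sketched in base R as follows. This is a simplified illustration and not the package's implementation: `dct1_matrix` is a hypothetical helper that builds an orthonormal DCT-I matrix, the log of the bin means stands in for the variance stabilizing transform (the exact transform and scaling are assumptions), and p/Te is taken to be an integer. Only the mirroring to a vector of length 2Te-2 follows directly from the description above.

```r
# Hypothetical helper: orthonormal (and symmetric) DCT-I matrix of dimension p.
dct1_matrix <- function(p) {
  idx <- 0:(p - 1)
  w <- rep(1, p)
  w[c(1, p)] <- 1 / sqrt(2)             # end points get weight 1/sqrt(2)
  sqrt(2 / (p - 1)) * outer(w, w) * cos(pi * outer(idx, idx) / (p - 1))
}

set.seed(2)
p <- 40
Te <- 10
y <- rnorm(p)                            # one p-dimensional observation (n = 1)

D <- dct1_matrix(p)
w_sq <- as.vector(D %*% y)^2             # squared DCT-I coefficients

# Consecutive bins of equal size p / Te (assumes p / Te is an integer)
bins <- rep(seq_len(Te), each = p / Te)
b <- as.vector(log(tapply(w_sq, bins, mean)))  # assumed stand-in for the transform

# Mirror the interior points to obtain a periodic vector of length 2 * Te - 2
y_star <- c(b, rev(b[2:(Te - 1)]))
length(y_star)
```

With this normalization `D %*% D` equals the identity up to rounding error, so the transform is its own inverse.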
Calculates the periodic Demmler-Reinsch basis for a given smoothness and a given vector of grid points. For details see (Schwarz, Krivobokova; 2016).
DR.basis(x, n, q)
x | |
n | dimension of the basis |
q | penalization order, |
m x n dimensional matrix with the n DR basis functions evaluated at grid points x
DR.basis(seq(1,10)/10,5,2)
Estimates the Toeplitz covariance matrix, the inverse matrix and the spectral density from a sample of n i.i.d. p-dimensional vectors with mean zero.
Toep.estimator(y, Te, q, method, f.true = NULL)
y | |
Te | number of bins for data binning. |
q | penalization order, |
method | to select the smoothing parameter of the smoothing spline. Available methods are restricted maximum likelihood " |
f.true | Te-dimensional vector with the true spectral density function evaluated at equispaced points in [0, |
A list containing the following elements:
toep: p-dim. Toeplitz covariance matrix
toep.inv: p-dim. precision matrix
acf: p-dim. vector with the covariance function
sdf: p-dim. vector with the spectral density in the interval [0,1]
#EXAMPLE 1: Simulate Gaussian ARMA(2,2)
library(nlme)
library(MASS)
p=100
n=1
Sigma=1.44*corMatrix(Initialize(corARMA(c(0.7, -0.4,-0.2, 0.2),p=2,q=2),data=diag(1:p)))
Y=matrix(mvrnorm(n, mu=numeric(p), Sigma=Sigma),n,p)
fit.toep=Toep.estimator(y=Y,Te=10,q=2,method="GCV")$toep

#EXAMPLE 2: AQUAPORIN DATA
data(aquaporin)
n=length(aquaporin$Y)
y.train=aquaporin$Y[1:(0.01*n)]
y.train=y.train-mean(y.train)
fit.toep=Toep.estimator(y=y.train,Te=10,q=1,method="ML")$toep
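The four returned quantities describe the same object from different angles. The base R sketch below shows how they relate for a known covariance function; it does not call the package. The AR(1)-type covariance `rho^k` is an illustrative choice, and the spectral density convention (a cosine series evaluated at frequencies pi * x for x in [0,1], without a 1/(2*pi) factor) is an assumption that may differ from the package's scaling.

```r
p <- 50
rho <- 0.5
acf_vec <- rho^(0:(p - 1))               # illustrative covariance function, lags 0..p-1

toep <- toeplitz(acf_vec)                # Toeplitz covariance matrix built from acf
toep_inv <- solve(toep)                  # precision matrix

# Spectral density on an equispaced grid in [0, 1] (assumed convention):
# f(x) = acf_0 + 2 * sum_k acf_k * cos(k * pi * x)
x <- seq(0, 1, length.out = p)
sdf <- sapply(x, function(u) {
  acf_vec[1] + 2 * sum(acf_vec[-1] * cos(pi * u * (1:(p - 1))))
})

# Closed form of the same series for this covariance function, for comparison
sdf_closed <- (1 - rho^2) / (1 - 2 * rho * cos(pi * x) + rho^2)
max(abs(sdf - sdf_closed))               # small truncation and rounding error
```

For this particular covariance function the precision matrix is tridiagonal, which makes it a convenient check when comparing an estimated `toep.inv` against a known truth.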