| Title: | Fast Computation of some Matrices Useful in Statistics |
|---|---|
| Description: | Small set of functions designed to speed up the computation of certain matrix operations that are commonly used in statistics and econometrics. It provides efficient implementations for the computation of several structured matrices, matrix decompositions and statistical procedures, many of which have minimal memory overhead. Furthermore, the package provides interfaces to C code callable by another C code from other R packages. |
| Authors: | Felipe Osorio [aut, cre] (ORCID: <https://orcid.org/0000-0002-4675-5201>), Alonso Ogueda [aut] (ORCID: <https://orcid.org/0000-0001-5411-1484>) |
| Maintainer: | Felipe Osorio <[email protected]> |
| License: | GPL-3 |
| Version: | 0.6-6 |
| Built: | 2026-06-13 08:30:39 UTC |
| Source: | https://github.com/cran/fastmatrix |
Small set of functions designed to speed up the computation of certain matrix operations that are commonly used in statistics and econometrics. It provides efficient implementations for the computation of several structured matrices, matrix decompositions and statistical procedures, many of which have minimal memory overhead. Furthermore, the package provides interfaces to C code callable by another C code from other R packages.
The fastmatrix package provides functions to the efficient construction of duplication, commutation and symmetrizer matrices with minimal storage requeriments. Common matrix decompositions (e.g. LU, LDL), rank-1 updates (e.g. Cholesky update), iterative solvers for linear systems and other linear algebra utilities, as well as basic matrix operations, such as Hadamard (elementwise) and Kronecker products, the Sherman-Morrison formula and the power method. The package also offers several statistical procedures, namely: the sweep operator, weighted mean and covariance (using online algorithms), ordinary least squares via multiple strategies (Cholesky, QR, SVD, sweep and conjugate gradients), ridge regression (with procedures for selecting the ridge parameter), omnibus tests for univariate normality, multivariate skewness and kurtosis measures, Mahalanobis distance (with positive-definiteness checks), Wilson-Hilferty transformation of gamma variables (useful, for instance, for goodness-of-fit of multivariate normal data), and some random number generators. Finally, the package provides interfaces for code written in C, enabling other R packages (or user-written C code) to access the C routines in the fastmatrix package.
Felipe Osorio [email protected], Alonso Ogueda [email protected]
Multiplication of 3-dimensional arrays was first introduced by Bates and Watts (1980). More extensions and technical details can be found in Wei (1998).
array.mult(a, b, x)array.mult(a, b, x)
a |
a numeric matrix. |
b |
a numeric matrix. |
x |
a three-dimensional array. |
Let be a 3-dimensional where
indices and indicate face, row and column, respectively. The
product is an array, with
and are and matrices
respectively. The elements of are defined as:
array.mult returns a 3-dimensional array of dimension .
Bates, D.M., Watts, D.G. (1980). Relative curvature measures of nonlinearity. Journal of the Royal Statistical Society, Series B 42, 1-25.
Wei, B.C. (1998). Exponential Family Nonlinear Models. Springer, New York.
x <- array(0, dim = c(2,3,3)) # 2 x 3 x 3 array x[,,1] <- c(1,2,2,4,3,6) x[,,2] <- c(2,4,4,8,6,12) x[,,3] <- c(3,6,6,12,9,18) a <- matrix(1, nrow = 2, ncol = 3) b <- matrix(1, nrow = 3, ncol = 2) y <- array.mult(a, b, x) # a 2 x 2 x 2 array yx <- array(0, dim = c(2,3,3)) # 2 x 3 x 3 array x[,,1] <- c(1,2,2,4,3,6) x[,,2] <- c(2,4,4,8,6,12) x[,,3] <- c(3,6,6,12,9,18) a <- matrix(1, nrow = 2, ncol = 3) b <- matrix(1, nrow = 3, ncol = 2) y <- array.mult(a, b, x) # a 2 x 2 x 2 array y
Force a square matrix to be symmetric
asSymmetric(x, lower = TRUE)asSymmetric(x, lower = TRUE)
x |
a square matrix to be forced to be symmetric. |
lower |
logical, should the upper (lower) triangle be replaced with the lower (upper) triangle? |
a square symmetric matrix.
a <- matrix(1:16, ncol = 4) isSymmetric(a) # FALSE a <- asSymmetric(a) # copy lower triangle into upper trianglea <- matrix(1:16, ncol = 4) isSymmetric(a) # FALSE a <- asSymmetric(a) # copy lower triangle into upper triangle
Computes the Bezier curve based on control points using the De Casteljau's method.
bezier(x, y, ngrid = 200)bezier(x, y, ngrid = 200)
x, y
|
vector giving the coordinates of the control points. Missing values are deleted. |
ngrid |
number of elements in the grid used to compute the smoother. |
Given control points the Bezier curve is given by
defined as
where . To evaluate the Bezier curve the De Casteljau's method is used.
A list containing xgrid and ygrid elements used to plot the Bezier curve.
# a tiny example x <- c(1.0, 0.25, 1.25, 2.5, 4.00, 5.0) y <- c(0.5, 2.00, 3.75, 4.0, 3.25, 1.0) plot(x, y, type = "o") z <- bezier(x, y, ngrid = 50) lines(z$xgrid, z$ygrid, lwd = 2, lty = 2, col = "red") # other simple example x <- c(4,6,4,5,6,7) y <- 1:6 plot(x, y, type = "o") z <- bezier(x, y, ngrid = 50) lines(z$xgrid, z$ygrid, lwd = 2, lty = 2, col = "red")# a tiny example x <- c(1.0, 0.25, 1.25, 2.5, 4.00, 5.0) y <- c(0.5, 2.00, 3.75, 4.0, 3.25, 1.0) plot(x, y, type = "o") z <- bezier(x, y, ngrid = 50) lines(z$xgrid, z$ygrid, lwd = 2, lty = 2, col = "red") # other simple example x <- c(4,6,4,5,6,7) y <- 1:6 plot(x, y, type = "o") z <- bezier(x, y, ngrid = 50) lines(z$xgrid, z$ygrid, lwd = 2, lty = 2, col = "red")
Bracket product of a matrix and a 3-dimensional array.
bracket.prod(a, x)bracket.prod(a, x)
a |
a numeric matrix. |
x |
a three-dimensional array. |
Let be a 3-dimensional array and
an matrix, then
is called the bracket product of and , that is an with elements
bracket.prod returns a 3-dimensional array of dimension .
Wei, B.C. (1998). Exponential Family Nonlinear Models. Springer, New York.
x <- array(0, dim = c(2,3,3)) # 2 x 3 x 3 array x[,,1] <- c(1,2,2,4,3,6) x[,,2] <- c(2,4,4,8,6,12) x[,,3] <- c(3,6,6,12,9,18) a <- matrix(1, nrow = 3, ncol = 2) y <- bracket.prod(a, x) # a 3 x 3 x 3 array yx <- array(0, dim = c(2,3,3)) # 2 x 3 x 3 array x[,,1] <- c(1,2,2,4,3,6) x[,,2] <- c(2,4,4,8,6,12) x[,,3] <- c(3,6,6,12,9,18) a <- matrix(1, nrow = 3, ncol = 2) y <- bracket.prod(a, x) # a 3 x 3 x 3 array y
Calculates Lin's concordance correlation coefficient for evaluating the degree of agreement between measurements generated by two different methods.
ccc(x, data, method = "z-transform", level = 0.95, equal.means = FALSE, ustat = TRUE, subset, na.action)ccc(x, data, method = "z-transform", level = 0.95, equal.means = FALSE, ustat = TRUE, subset, na.action)
x |
a formula or a numeric matrix or an object that can be coerced to a numeric matrix. |
data |
an optional data frame (or similar: see |
method |
a character string, indicating the method for the computation of the required
confidence interval. Options available are |
level |
the confidence level required, must be a single number between 0 and 1 (by default 95%). |
equal.means |
logical, should the means of the measuring devices be considered equal? In which case the restricted estimation is carried out under this assumption. |
ustat |
logical, should the concordance correlation coefficient be estimated using U-statistics? |
subset |
an optional expression indicating the subset of the rows of data that should be used in the fitting process. |
na.action |
a function that indicates what should happen when the data contain NAs. |
A list with class 'ccc' containing the following named components:
call |
a list containing an image of the |
x |
|
ccc |
estimate of the concordance correlation coefficient. |
var.ccc |
asymptotic variance of the concordance correlation coefficient estimate. |
accuracy |
estimate of the accuracy (or bias) coefficient that measures how far the
best-fit line deviates from a line at 45 degrees. No deviation from the 45 degree line
occurs when |
precision |
estimate of the precision (or Pearson correlation) coefficient. |
shifts |
list with the location and scale shifts. |
z |
Z-transformation parameter estimate. |
var.z |
asymptotic variance of the Z-transformation parameter estimate. |
confint |
confidence interval for the Lin's concordance correlation coefficient. |
bland |
a data frame with two columns containing the |
center |
the estimated mean vector. |
cov |
the estimated covariance matrix. |
ustat |
available only if |
Restricted |
available only if |
Bland, J., Altman, D. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet 327, 307-310.
King, T.S., Chinchilli, V.M. (2001). A generalized concordance correlation coefficient for continuous and categorical data. Statistics in Medicine 20, 2131-2147.
King, T.S., Chinchilli, V.M. (2001). Robust estimators of the concordance correlation coefficient. Journal of Biopharmaceutical Statistics 11, 83-105.
Lin, L. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, 255-268.
Lin, L. (2000). A note on the concordance correlation coefficient. Biometrics 56, 324-325.
Vallejos, R., Osorio, F., Ferrer, C. (2025). A new coefficient to measure agreement between two continuous variables. doi: 10.48550/arXiv.2507.07913.
## data in Fig.1 from Bland and Altman (1986). x <- list(Large = c(494,395,516,434,476,557,413,442,650,433, 417,656,267,478,178,423,427), Mini = c(512,430,520,428,500,600,364,380,658,445, 432,626,260,477,259,350,451)) x <- as.data.frame(x) plot(Mini ~ Large, data = x, xlim = c(100,800), ylim = c(100,800), xlab = "PERF by Large meter", ylab = "PERF by Mini meter") abline(c(0,1), col = "gray", lwd = 2) ## estimating CCC z <- ccc(~ Mini + Large, data = x, method = "asymp") z ## output: # Call: # ccc(x = ~ Mini + Large, data = x, method = "asymp") # # Coefficients: # estimate variance accuracy precision # 0.9427 0.0008 0.9994 0.9433 # # Asymptotic 95% confidence interval: # CCC SE lower upper # 0.9427 0.0286 0.8867 0.9988## data in Fig.1 from Bland and Altman (1986). x <- list(Large = c(494,395,516,434,476,557,413,442,650,433, 417,656,267,478,178,423,427), Mini = c(512,430,520,428,500,600,364,380,658,445, 432,626,260,477,259,350,451)) x <- as.data.frame(x) plot(Mini ~ Large, data = x, xlim = c(100,800), ylim = c(100,800), xlab = "PERF by Large meter", ylab = "PERF by Mini meter") abline(c(0,1), col = "gray", lwd = 2) ## estimating CCC z <- ccc(~ Mini + Large, data = x, method = "asymp") z ## output: # Call: # ccc(x = ~ Mini + Large, data = x, method = "asymp") # # Coefficients: # estimate variance accuracy precision # 0.9427 0.0008 0.9994 0.9433 # # Asymptotic 95% confidence interval: # CCC SE lower upper # 0.9427 0.0286 0.8867 0.9988
Conjugate gradients (CG) method is an iterative algorithm for solving linear systems with positive definite coefficient matrices.
cg(a, b, maxiter = 200, tol = 1e-7)cg(a, b, maxiter = 200, tol = 1e-7)
a |
a symmetric positive definite matrix containing the coefficients of the linear system. |
b |
a vector of right-hand sides of the linear system. |
maxiter |
the maximum number of iterations. Defaults to |
tol |
tolerance level for stopping iterations. |
a vector with the approximate solution, the iterations performed are returned as the attribute 'iterations'.
The underlying C code does not check for symmetry nor positive definitiveness.
Golub, G.H., Van Loan, C.F. (1996). Matrix Computations, 3rd Edition. John Hopkins University Press.
Hestenes, M.R., Stiefel, E. (1952). Methods of conjugate gradients for solving linear equations. Journal of Research of the National Bureau of Standards 49, 409-436.
a <- matrix(c(4,3,0,3,4,-1,0,-1,4), ncol = 3) b <- c(24,30,-24) z <- cg(a, b) z # converged in 3 iterationsa <- matrix(c(4,3,0,3,4,-1,0,-1,4), ncol = 3) b <- c(24,30,-24) z <- cg(a, b) z # converged in 3 iterations
Density, distribution function, quantile function and random generation for the
chi distribution with df degrees of freedom.
dchi(x, df = 1, log = FALSE) pchi(q, df = 1, lower.tail = TRUE, log.p = FALSE) qchi(p, df = 1, lower.tail = TRUE, log.p = FALSE) rchi(n, df = 1)dchi(x, df = 1, log = FALSE) pchi(q, df = 1, lower.tail = TRUE, log.p = FALSE) qchi(p, df = 1, lower.tail = TRUE, log.p = FALSE) rchi(n, df = 1)
x, q
|
vector of quantiles. |
p |
vector of probabilities. |
n |
number of observations. |
df |
degrees of freedom (non-negative, but can be non-integer). |
log, log.p
|
logical; if TRUE, probabilities |
lower.tail |
logical; if TRUE (default), probabilities are |
If df is not specified, they assume the default value of 1.
The chi distribution with degrees of freedom (also known as shape parameter) has
density
where , and .
rchi implements the routine proposed by Monahan (1987) for generating chi-distributed
random variables.
dchi, pchi, and qchi are respectively the density, distribution
function and quantile function of the chi distribution. rchi generates random
deviates drawn from the chi distribution, the length of the result is determined by n.
Felipe Osorio
Forbes, C., Evans, M., Hastings, N., Peacock, B. (2010). Statistical Distributions, 4th Ed. Wiley, New York.
Johnson, N.L., Kotz, S., Balakrishnan, N. (1994). Continuous Univariate Distributions, Volume 1, 2nd Ed. Wiley, New York.
Monahan, J.F. (1987). An algorithm for generating chi random variables. ACM Transactions on Mathematical Software 13, 168-172.
Distributions for other standard distributions.
x <- rchi(1000, df = 2) ## QQ-plot for chi data against true theoretical distribution: qqplot(qchi(ppoints(1000), df = 2), x, main = "chi QQ-plot", xlab = "Theoretical quantiles", ylab = "Sample quantiles") abline(c(0,1), lwd = 2, lty = 2, col = "red")x <- rchi(1000, df = 2) ## QQ-plot for chi data against true theoretical distribution: qqplot(qchi(ppoints(1000), df = 2), x, main = "chi QQ-plot", xlab = "Theoretical quantiles", ylab = "Sample quantiles") abline(c(0,1), lwd = 2, lty = 2, col = "red")
function cholupdate returns the upper triangular Cholesky factor of
, where is the original Cholesky
factor of , with a column
vector of appropriate dimension.
cholupdate(r, x)cholupdate(r, x)
r |
a upper triangular matrix, the Cholesky factor of matrix a. |
x |
vector defining the rank one update. |
Golub, G.H., Van Loan, C.F. (2013). Matrix Computations, 4th Edition. John Hopkins University Press.
a <- matrix(c(1,1,1,1,2,3,1,3,6), ncol = 3) r <- chol(a) x <- c(0,0,1) b <- a + outer(x,x) r1 <- cholupdate(r, x) r1 all(r1 == chol(b)) # TRUEa <- matrix(c(1,1,1,1,2,3,1,3,6), ncol = 3) r <- chol(a) x <- c(0,0,1) b <- a + outer(x,x) r1 <- cholupdate(r, x) r1 all(r1 == chol(b)) # TRUE
Forms a symmetric circulant matrix using a backwards shift of its first column
circulant(x)circulant(x)
x |
the first column to form the circulant matrix. |
A symmetric circulant matrix.
x <- c(2,3,5,7,11,13) circulant(x)x <- c(2,3,5,7,11,13) circulant(x)
This function provides the minimum information required to create the commutation matrix.
The commutation matrix is a square matrix of order that, for an
matrix , transform vec) to vec.
comm.info(m = 1, n = m, condensed = TRUE)comm.info(m = 1, n = m, condensed = TRUE)
m |
a positive integer row dimension. |
n |
a positive integer column dimension. |
condensed |
logical. Information should be returned in compact form? |
This function returns a list containing two vectors that represent an element of
the commutation matrix and is accesed by the indexes in vectors row and col.
This information is used by function comm.prod to do some operations
involving the commutation matrix without forming it. This information also can be
obtained using function commutation.
A list containing the following elements:
row |
vector of indexes, each entry represents the row index of the commutation matrix. |
col |
vector of indexes, each entry represents the column index of the commutation
matrix. Only present if |
m |
positive integer, row dimension. |
n |
positive integer, column dimension. |
Magnus, J.R., Neudecker, H. (1979). The commutation matrix: some properties and applications. The Annals of Statistics 7, 381-394.
z <- comm.info(m = 3, n = 2, condensed = FALSE) z # where are the ones in commutation matrix of order '3,2'? K32 <- commutation(m = 3, n = 2, matrix = TRUE) K32 # only recommended if m and n are very smallz <- comm.info(m = 3, n = 2, condensed = FALSE) z # where are the ones in commutation matrix of order '3,2'? K32 <- commutation(m = 3, n = 2, matrix = TRUE) K32 # only recommended if m and n are very small
Given the row and column dimensions of a commutation matrix of order
and a conformable matrix , performs one of the matrix-matrix
operations:
, if side = "left" and transposed = FALSE, or
, if side = "left" and transposed = TRUE, or
, if side = "right" and transposed = FALSE, or
, if side = "right" and transposed = TRUE.
The main aim of comm.prod is to do this matrix multiplication without forming
the commutation matrix.
comm.prod(m = 1, n = m, x = NULL, transposed = FALSE, side = "left")comm.prod(m = 1, n = m, x = NULL, transposed = FALSE, side = "left")
m |
a positive integer row dimension. |
n |
a positive integer column dimension. |
x |
numeric matrix (or vector). |
transposed |
logical. Commutation matrix should be transposed? |
side |
a string selecting if commutation matrix is pre-multiplying |
Underlying Fortran code only uses information provided by comm.info
to performs the matrix multiplication. The commutation matrix is never created.
K42 <- commutation(m = 4, n = 2, matrix = TRUE) x <- matrix(1:24, ncol = 3) y <- K42 %*% x z <- comm.prod(m = 4, n = 2, x) # K42 is not stored all(z == y) # matrices y and z are equal!K42 <- commutation(m = 4, n = 2, matrix = TRUE) x <- matrix(1:24, ncol = 3) y <- K42 %*% x z <- comm.prod(m = 4, n = 2, x) # K42 is not stored all(z == y) # matrices y and z are equal!
This function returns the commutation matrix of order which transforms,
for an matrix , vec to
vec.
commutation(m = 1, n = m, matrix = FALSE, condensed = FALSE)commutation(m = 1, n = m, matrix = FALSE, condensed = FALSE)
m |
a positive integer row dimension. |
n |
a positive integer column dimension. |
matrix |
a logical indicating whether the commutation matrix will be returned. |
condensed |
logical. Information should be returned in compact form? |
This function is a wrapper function for the function comm.info. This function
provides the minimum information required to create the commutation matrix. If option
matrix = FALSE the commutation matrix is stored in two vectors containing the
coordinate list of indexes for rows and columns. Option condensed = TRUE only
returns vector of indexes for the rows of commutation matrix.
Warning: matrix = TRUE is not recommended, unless the order m
and n be small. This matrix can require a huge amount of storage.
Returns an by matrix (if requested).
Magnus, J.R., Neudecker, H. (1979). The commutation matrix: some properties and applications. The Annals of Statistics 7, 381-394.
Magnus, J.R., Neudecker, H. (2007). Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd Edition. Wiley, New York.
z <- commutation(m = 100, condensed = TRUE) object.size(z) # 40.6 Kb of storage z <- commutation(m = 100, condensed = FALSE) object.size(z) # 80.7 Kb of storage K100 <- commutation(m = 100, matrix = TRUE) # time: < 2 secs object.size(K100) # 400 Mb of storage, do not request this matrix! # a small example K32 <- commutation(m = 3, n = 2, matrix = TRUE) a <- matrix(1:6, ncol = 2) v <- K32 %*% vec(a) all(vec(t(a)) == as.vector(v)) # vectors are equal!z <- commutation(m = 100, condensed = TRUE) object.size(z) # 40.6 Kb of storage z <- commutation(m = 100, condensed = FALSE) object.size(z) # 80.7 Kb of storage K100 <- commutation(m = 100, matrix = TRUE) # time: < 2 secs object.size(K100) # 400 Mb of storage, do not request this matrix! # a small example K32 <- commutation(m = 3, n = 2, matrix = TRUE) a <- matrix(1:6, ncol = 2) v <- K32 %*% vec(a) all(vec(t(a)) == as.vector(v)) # vectors are equal!
This function is a constructor for the corAR1 correlation matrix representing
an autocorrelation structure of order 1.
corAR1(rho, p = 2)corAR1(rho, p = 2)
rho |
the value of the lag 1 autocorrelation, which must be between -1 and 1. |
p |
dimension of the requested correlation matrix. |
Returns a by matrix.
R <- corAR1(rho = 0.8, p = 5)R <- corAR1(rho = 0.8, p = 5)
This function is a constructor for the corCS correlation matrix representing
a compound symmetry structure corresponding to uniform correlation.
corCS(rho, p = 2)corCS(rho, p = 2)
rho |
the value of the correlation between any two correlated observations, which must be between -1 and 1. |
p |
dimension of the requested correlation matrix. |
Returns a by matrix.
R <- corCS(rho = 0.8, p = 5)R <- corCS(rho = 0.8, p = 5)
Returns a list containing the mean and covariance matrix of the data.
cov.MSSD(x)cov.MSSD(x)
x |
a matrix or data frame. As usual, rows are observations and columns are variables. |
This procedure uses the Holmes-Mergen method using the difference between each successive pairs of observations also known as Mean Square Successive Method (MSSD) to estimate the covariance matrix, which is given by
A list containing the following named components:
mean |
an estimate for the center (mean) of the data. |
cov |
the estimated covariance matrix. |
Holmes, D.S., Mergen, A.E. (1993).
Improving the performance of the control chart.
Quality Engineering 5, 619-625.
x <- cbind(1:10, c(1:3, 8:5, 8:10)) z0 <- cov(x) z0 z1 <- cov.MSSD(x) z1x <- cbind(1:10, c(1:3, 8:5, 8:10)) z0 <- cov(x) z0 z1 <- cov.MSSD(x) z1
Returns a list containing estimates of the weighted mean and covariance matrix of the data.
cov.weighted(x, weights = rep(1, nrow(x)))cov.weighted(x, weights = rep(1, nrow(x)))
x |
a matrix or data frame. As usual, rows are observations and columns are variables. |
weights |
a non-negative and non-zero vector of weights for each
observation. Its length must equal the number of rows of |
The covariance matrix is divided by the number of observations, which arise for instance, when we use the class of elliptical contoured distributions. Thus,
This differs from the behaviour of function cov.wt.
A list containing the following named components:
mean |
an estimate for the center (mean) of the data. |
cov |
the estimated (weighted) covariance matrix. |
Clarke, M.R.B. (1971). Algorithm AS 41: Updating the sample mean and dispersion matrix. Applied Statistics 20, 206-209.
x <- cbind(1:10, c(1:3, 8:5, 8:10)) z0 <- cov.weighted(x) # all weights are 1 D2 <- Mahalanobis(x, center = z0$mean, cov = z0$cov) p <- ncol(x) wts <- (p + 1) / (1 + D2) # nice weights! z1 <- cov.weighted(x, weights = wts) z1x <- cbind(1:10, c(1:3, 8:5, 8:10)) z0 <- cov.weighted(x) # all weights are 1 D2 <- Mahalanobis(x, center = z0$mean, cov = z0$cov) p <- ncol(x) wts <- (p + 1) / (1 + D2) # nice weights! z1 <- cov.weighted(x, weights = wts) z1
Given the order of two duplication matrices and a conformable matrix ,
this function performs the operation: ,
where and are duplication matrices of order
and , respectively.
dupl.cross(n = 1, k = n, x = NULL)dupl.cross(n = 1, k = n, x = NULL)
n |
order of the duplication matrix used pre-multiplying |
k |
order of the duplication matrix used post-multiplying |
x |
numeric matrix, this argument is required. |
This function calls dupl.prod to performs the matrix multiplications required
but without forming any duplication matrices.
D2 <- duplication(n = 2, matrix = TRUE) D3 <- duplication(n = 3, matrix = TRUE) x <- matrix(1, nrow = 9, ncol = 4) y <- t(D3) %*% x %*% D2 z <- dupl.cross(n = 3, k = 2, x) # D2 and D3 are not stored all(z == y) # matrices y and z are equal! x <- matrix(1, nrow = 9, ncol = 9) z <- dupl.cross(n = 3, x = x) # same matrix is used to pre- and post-multiplying x z # print resultD2 <- duplication(n = 2, matrix = TRUE) D3 <- duplication(n = 3, matrix = TRUE) x <- matrix(1, nrow = 9, ncol = 4) y <- t(D3) %*% x %*% D2 z <- dupl.cross(n = 3, k = 2, x) # D2 and D3 are not stored all(z == y) # matrices y and z are equal! x <- matrix(1, nrow = 9, ncol = 9) z <- dupl.cross(n = 3, x = x) # same matrix is used to pre- and post-multiplying x z # print result
This function provides the minimum information required to create the duplication matrix.
dupl.info(n = 1, condensed = TRUE)dupl.info(n = 1, condensed = TRUE)
n |
order of the duplication matrix. |
condensed |
logical. Information should be returned in compact form? |
This function returns a list containing two vectors that represent an element of
the duplication matrix and is accesed by the indexes in vectors row and col.
This information is used by function dupl.prod to do some operations
involving the duplication matrix without forming it. This information also can be
obtained using function duplication
A list containing the following elements:
row |
vector of indexes, each entry represents the row index of the duplication
matrix. Only present if |
col |
vector of indexes, each entry represents the column index of the duplication matrix. |
order |
order of the duplication matrix. |
z <- dupl.info(n = 3, condensed = FALSE) z # where are the ones in duplication of order 3? D3 <- duplication(n = 3, matrix = TRUE) D3 # only recommended if n is very smallz <- dupl.info(n = 3, condensed = FALSE) z # where are the ones in duplication of order 3? D3 <- duplication(n = 3, matrix = TRUE) D3 # only recommended if n is very small
Given the order of a duplication and a conformable matrix , performs
one of the matrix-matrix operations:
, if side = "left" and transposed = FALSE, or
, if side = "left" and transposed = TRUE, or
, if side = "right" and transposed = FALSE, or
, if side = "right" and transposed = TRUE,
where is the duplication matrix of order . The main aim of
dupl.prod is to do this matrix multiplication without forming the
duplication matrix.
dupl.prod(n = 1, x, transposed = FALSE, side = "left")dupl.prod(n = 1, x, transposed = FALSE, side = "left")
n |
order of the duplication matrix. |
x |
numeric matrix (or vector). |
transposed |
logical. Duplication matrix should be transposed? |
side |
a string selecting if duplication matrix is pre-multiplying |
Underlying C code only uses information provided by dupl.info to
performs the matrix multiplication. The duplication matrix is never created.
D4 <- duplication(n = 4, matrix = TRUE) x <- matrix(1, nrow = 16, ncol = 2) y <- crossprod(D4, x) z <- dupl.prod(n = 4, x, transposed = TRUE) # D4 is not stored all(z == y) # matrices y and z are equal!D4 <- duplication(n = 4, matrix = TRUE) x <- matrix(1, nrow = 16, ncol = 2) y <- crossprod(D4, x) z <- dupl.prod(n = 4, x, transposed = TRUE) # D4 is not stored all(z == y) # matrices y and z are equal!
This function returns the duplication matrix of order which transforms,
for a symmetric matrix , vech into vec.
duplication(n = 1, matrix = FALSE, condensed = FALSE)duplication(n = 1, matrix = FALSE, condensed = FALSE)
n |
order of the duplication matrix. |
matrix |
a logical indicating whether the duplication matrix will be returned. |
condensed |
logical. Information should be returned in compact form?. |
This function is a wrapper function for the function dupl.info. This function
provides the minimum information required to create the duplication matrix. If option
matrix = FALSE the duplication matrix is stored in two vectors containing the
coordinate list of indexes for rows and columns. Option condensed = TRUE only
returns vector of indexes for the columns of duplication matrix.
Warning: matrix = TRUE is not recommended, unless the order n
be small. This matrix can require a huge amount of storage.
Returns an by matrix (if requested).
Magnus, J.R., Neudecker, H. (1980). The elimination matrix, some lemmas and applications. SIAM Journal on Algebraic Discrete Methods 1, 422-449.
Magnus, J.R., Neudecker, H. (2007). Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd Edition. Wiley, New York.
z <- duplication(n = 100, condensed = TRUE) object.size(z) # 40.5 Kb of storage z <- duplication(n = 100, condensed = FALSE) object.size(z) # 80.6 Kb of storage D100 <- duplication(n = 100, matrix = TRUE) object.size(D100) # 202 Mb of storage, do not request this matrix! # a small example D3 <- duplication(n = 3, matrix = TRUE) a <- matrix(c( 1, 2, 3, 2, 3, 4, 3, 4, 5), nrow = 3) upper <- vech(a) v <- D3 %*% upper all(vec(a) == as.vector(v)) # vectors are equal!z <- duplication(n = 100, condensed = TRUE) object.size(z) # 40.5 Kb of storage z <- duplication(n = 100, condensed = FALSE) object.size(z) # 80.6 Kb of storage D100 <- duplication(n = 100, matrix = TRUE) object.size(D100) # 202 Mb of storage, do not request this matrix! # a small example D3 <- duplication(n = 3, matrix = TRUE) a <- matrix(c( 1, 2, 3, 2, 3, 4, 3, 4, 5), nrow = 3) upper <- vech(a) v <- D3 %*% upper all(vec(a) == as.vector(v)) # vectors are equal!
Equilibrate a rectangular or symmetric matrix using 2-norm.
equilibrate(x, scale = TRUE)equilibrate(x, scale = TRUE)
x |
a numeric matrix. |
scale |
a logical value, |
For scale = TRUE, the equilibrated matrix. The scalings and an approximation
of the condition number, are returned as attributes "scales" and "condition".
If is a rectangular matrix, only the columns are equilibrated.
x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) z <- equilibrate(x) apply(z, 2, function(x) sum(x^2)) # all 1 xx <- crossprod(x) equilibrate(xx)x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) z <- equilibrate(x) apply(z, 2, function(x) sum(x^2)) # all 1 xx <- crossprod(x) equilibrate(xx)
The Floyd-Warshall algorithm finds all shortest paths (if exist) in a directed graph.
floyd(x)floyd(x)
x |
adjacency matrix of a directed graph. |
Returns a list with final costs and the shortest path between two nodes.
Floyd, R.W. (1962). Algorithm 97: Shortest Path. Communications of the ACM 5 (6), 345.
x <- matrix(c(0,3,Inf,5,2,0,Inf,4,Inf,1,0,Inf,Inf,Inf,2,0), nrow = 4, ncol = 4, byrow = TRUE) z <- floyd(x) zx <- matrix(c(0,3,Inf,5,2,0,Inf,4,Inf,1,0,Inf,Inf,Inf,2,0), nrow = 4, ncol = 4, byrow = TRUE) z <- floyd(x) z
This function returns the Frank matrix of order .
frank(n = 1)frank(n = 1)
n |
order of the Frank matrix. |
A Frank matrix of order is a square matrix defined as
Returns an by matrix.
Frank, W.L. (1958). Computing eigenvalues of complex matrices by determinant evaluation and by methods of Danilewski and Wielandt. Journal of the Society for Industrial and Applied Mathematics 6, 378-392.
F5 <- frank(n = 5) det(F5) # equals 1F5 <- frank(n = 5) det(F5) # equals 1
It calculates the geometric mean using a Fused-Multiply-and-Add (FMA) compensated scheme for accurate computation of floating-point product.
geomean(x)geomean(x)
x |
a numeric vector containing the sample observations. |
If x contains any non-positive values, geomean returns NA and
a warning message is displayed.
The geometric mean is a measure of central tendency, which is defined as
This procedure calculates the product required in the geometric mean safely using a compensated scheme as proposed by Graillat (2009).
The geometric mean of the sample, a non-negative number.
Graillat, S. (2009). Accurate floating-point product and exponentiation. IEEE Transactions on Computers 58, 994-1000.
Oguita, T., Rump, S.M., Oishi, S. (2005). Accurate sum and dot product. SIAM Journal on Scientific Computing 26, 1955-1988.
set.seed(149) x <- rlnorm(1000) mean(x) # 1.68169 median(x) # 0.99663 geomean(x) # 1.01688set.seed(149) x <- rlnorm(1000) mean(x) # 1.68169 median(x) # 0.99663 geomean(x) # 1.01688
This function returns the Hadamard or element-wise product of two matrices
and , that have the same dimensions.
hadamard(x, y = x)hadamard(x, y = x)
x |
a numeric matrix or vector. |
y |
a numeric matrix or vector. |
A matrix with the same dimension of (and ) which corresponds
to the element-by-element product of the two matrices.
Styan, G.P.H. (1973). Hadamard products and multivariate statistical analysis, Linear Algebra and Its Applications 6, 217-240.
x <- matrix(rep(1:10, times = 5), ncol = 5) y <- matrix(rep(1:5, each = 10), ncol = 5) z <- hadamard(x, y) zx <- matrix(rep(1:10, times = 5), ncol = 5) y <- matrix(rep(1:5, each = 10), ncol = 5) z <- hadamard(x, y) z
Forms a symmetric Hankel matrix of order from the values in vector
and optionally the vector .
hankel(x, y = NULL)hankel(x, y = NULL)
x |
the first column to form the Hankel matrix. |
y |
the last column of the Hankel matrix. If |
A symmetric Hankel matrix of order .
x <- 1:4 y <- c(4,6,8,10) # H4 hankel(x) # H({1,2,3,4},{4,6,8,10}) hankel(x, y)x <- 1:4 y <- c(4,6,8,10) # H4 hankel(x) # H({1,2,3,4},{4,6,8,10}) hankel(x, y)
Performs large-sample methods for testing equality of correlated variables.
harris.test(x, test = "Wald")harris.test(x, test = "Wald")
x |
a matrix or data frame. As usual, rows are observations and columns are variables. |
test |
test statistic to be used. One of "Wald" (default), "log", "robust" or "log-robust". |
A list of class 'harris.test' with the following elements:
statistic |
value of the statistic, i.e. the value of either Wald test, using the log-transformation, or distribution-robust versions of the test (robust and log-robust). |
parameter |
the degrees of freedom for the test statistic, which is chi-square distributed. |
p.value |
the p-value for the test. |
estimate |
the estimated covariance matrix. |
method |
a character string indicating what type of test was performed. |
Harris, P. (1985). Testing the variance homogeneity of correlated variables. Biometrika 72, 103-107.
x <- iris[,1:4] z <- harris.test(x, test = "robust") zx <- iris[,1:4] z <- harris.test(x, test = "robust") z
This function returns the Helmert matrix of order .
helmert(n = 1)helmert(n = 1)
n |
order of the Helmert matrix. |
A Helmert matrix of order is a square matrix defined as
Helmert matrix is orthogonal and is frequently used in the analysis of variance (ANOVA).
Returns an by matrix.
Lancaster, H.O. (1965). The Helmert matrices. The American Mathematical Monthly 72, 4-12.
Gentle, J.E. (2007). Matrix Algebra: Theory, Computations, and Applications in Statistics. Springer, New York.
n <- 1000 set.seed(149) x <- rnorm(n) H <- helmert(n) object.size(H) # 7.63 Mb of storage K <- H[2:n,] z <- c(K %*% x) sum(z^2) # 933.1736 # same that (n - 1) * var(x)n <- 1000 set.seed(149) x <- rnorm(n) H <- helmert(n) object.size(H) # 7.63 Mb of storage K <- H[2:n,] z <- c(K %*% x) sum(z^2) # 933.1736 # same that (n - 1) * var(x)
This function returns the Householder matrix (also called Householder reflection)
or the Householder vector, which are constructed based on a nonzero vector .
house(x, matrix = FALSE)house(x, matrix = FALSE)
x |
an |
matrix |
a logical indicating whether the commutation matrix will be returned. |
A Householder transformation is a rank-1 modification of the identity matrix which can
be used to zero out selected elements of a vector. A Householder matrix
adopts the form
where is called Householder vector, and
If matrix = TRUE an by matrix is returned, otherwise house
function return a list containing the following components:
u |
the Householder vector. |
tau |
the scalar |
Golub, G.H., van Loan, C.F. (1996). Matrix Computations, 3rd Edition. The Johns Hopkins University Press, Baltimore.
x <- c(3,1,5,1) z <- house(x) z mat <- house(x, matrix = TRUE) v <- mat %*% x v[1,1] == minkowski(x) # TRUEx <- c(3,1,5,1) z <- house(x) z mat <- house(x, matrix = TRUE) v <- mat %*% x v[1,1] == minkowski(x) # TRUE
This function applies the Householder matrix defined by a nonzero vector to
a matrix . Thus, the output of this function is the matrix-matrix operation:
, if side = "left", or
, if side = "right".
house.prod(a, x, side = "left")house.prod(a, x, side = "left")
a |
a numeric rectangular matrix. |
x |
an |
side |
a string selecting if Householder matrix is pre-multiplying |
Underlying code never entail the explicit formation of the Householder matrix.
Golub, G.H., van Loan, C.F. (1996). Matrix Computations, 3rd Edition. The Johns Hopkins University Press, Baltimore.
x <- c(3,1,5,1) u <- house(x)$u house.prod(u, x) y <- as.matrix(x) house.prod(y, x)x <- c(3,1,5,1) u <- house(x)$u house.prod(u, x) y <- as.matrix(x) house.prod(y, x)
Returns TRUE if the given matrix is lower or upper triangular matrix.
is.lower.tri(x, diag = FALSE) is.upper.tri(x, diag = FALSE)is.lower.tri(x, diag = FALSE) is.upper.tri(x, diag = FALSE)
x |
a matrix of other R object with |
diag |
logical. Should the diagonal be included? |
Check if a matrix is lower or upper triangular. You can also include diagonal to the check.
x <- matrix(rnorm(10 * 3), ncol = 3) R <- chol(crossprod(x)) is.lower.tri(R) is.upper.tri(R)x <- matrix(rnorm(10 * 3), ncol = 3) R <- chol(crossprod(x)) is.lower.tri(R) is.upper.tri(R)
Jacobi method is an iterative algorithm for solving a system of linear equations.
jacobi(a, b, start, maxiter = 200, tol = 1e-7)jacobi(a, b, start, maxiter = 200, tol = 1e-7)
a |
a square numeric matrix containing the coefficients of the linear system. |
b |
a vector of right-hand sides of the linear system. |
start |
a vector for initial starting point. |
maxiter |
the maximum number of iterations. Defaults to |
tol |
tolerance level for stopping iterations. |
Let , , and denote the diagonal, lower
triangular and upper triangular parts of a matrix . Jacobi's method
solve the equation , iteratively by rewriting . Assuming that is nonsingular
leads to the iteration formula
a vector with the approximate solution, the iterations performed are returned as the attribute 'iterations'.
Golub, G.H., Van Loan, C.F. (1996). Matrix Computations, 3rd Edition. John Hopkins University Press.
a <- matrix(c(5,-3,2,-2,9,-1,3,1,-7), ncol = 3) b <- c(-1,2,3) start <- c(1,1,1) z <- jacobi(a, b, start) z # converged in 15 iterationsa <- matrix(c(5,-3,2,-2,9,-1,3,1,-7), ncol = 3) b <- c(-1,2,3) start <- c(1,1,1) z <- jacobi(a, b, start) z # converged in 15 iterations
Performs an omnibus test for univariate normality.
JarqueBera.test(x, test = "DH")JarqueBera.test(x, test = "DH")
x |
a numeric vector containing the sample observations. |
test |
test statistic to be used. One of "DH" (Doornik-Hansen, the default), "JB" (Jarque-Bera), "robust" (robust modification by Gel and Gastwirth), "ALM" (Adjusted Lagrange multiplier). |
A list of class 'JarqueBera.test' with the following elements:
statistic |
value of the statistic, i.e. the value of either Doornik-Hansen, Jarque-Bera, or Adjusted Lagrange multiplier test. |
parameter |
the degrees of freedom for the test statistic, which is chi-square distributed. |
p.value |
the p-value for the test. |
skewness |
the estimated skewness coefficient. |
kurtosis |
the estimated kurtosis coefficient. |
method |
a character string indicating what type of test was performed. |
Doornik, J.A., Hansen, H. (2008). An omnibus test for univariate and multivariate normality. Oxford Bulletin of Economics and Statistics 70, 927-939.
Gel, Y.R., Gastwirth, J.L. (2008). A robust modification of the Jarque-Bera test of normality. Economics Letters 99, 30-32.
Jarque, C.M., Bera, A.K. (1980). Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Economics Letters 6, 255-259.
Urzua, C.M. (1996). On the correct use of omnibus tests for normality. Economics Letters 53, 247-251.
set.seed(149) x <- rnorm(100) z <- JarqueBera.test(x, test = "DH") z set.seed(173) x <- runif(100) z <- JarqueBera.test(x, test = "DH") zset.seed(149) x <- rnorm(100) z <- JarqueBera.test(x, test = "DH") z set.seed(173) x <- runif(100) z <- JarqueBera.test(x, test = "DH") z
Computes the kronecker product of two matrices, and .
kronecker.prod(x, y = x)kronecker.prod(x, y = x)
x |
a numeric matrix or vector. |
y |
a numeric matrix or vector. |
Let be an and a matrix.
The matrix defined by
is called the Kronecker product of and .
An array with dimensions dim(x) * dim(y).
Magnus, J.R., Neudecker, H. (2007). Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd Edition. Wiley, New York.
kronecker function from base package is based on outer.
Our C version is slightly faster.
# block diagonal matrix: a <- diag(1:3) b <- matrix(1:4, ncol = 2) kronecker.prod(a, b) # examples with vectors ones <- rep(1, 4) y <- 1:3 kronecker.prod(ones, y) # 12-dimensional vector kronecker.prod(ones, t(y)) # 3 x 3 matrix# block diagonal matrix: a <- diag(1:3) b <- matrix(1:4, ncol = 2) kronecker.prod(a, b) # examples with vectors ones <- rep(1, 4) y <- 1:3 kronecker.prod(ones, y) # 12-dimensional vector kronecker.prod(ones, t(y)) # 3 x 3 matrix
Given an by real matrix and an -vector ,
this function constructs the Krylov matrix , where
krylov(a, b, m = ncol(a))krylov(a, b, m = ncol(a))
a |
a numeric square matrix of order |
b |
a numeric vector of length |
m |
length of the Krylov sequence. |
Returns an by matrix.
a <- matrix(c(1, 3, 2, -5, 1, 7, 1, 5, -4), ncol = 3, byrow = TRUE) b <- c(1, 1, 1) k <- krylov(a, b, m = 4) ka <- matrix(c(1, 3, 2, -5, 1, 7, 1, 5, -4), ncol = 3, byrow = TRUE) b <- c(1, 1, 1) k <- krylov(a, b, m = 4) k
Functions to compute measures of multivariate skewness and kurtosis
proposed by Mardia (1970),
and
kurtosis(x) skewness(x)kurtosis(x) skewness(x)
x |
matrix of data with, say, |
Mardia, K.V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika 57, 519-530.
Mardia, K.V., Zemroch, P.J. (1975). Algorithm AS 84: Measures of multivariate skewness and kurtosis. Applied Statistics 24, 262-265.
setosa <- iris[1:50,1:4] kurtosis(setosa) skewness(setosa)setosa <- iris[1:50,1:4] kurtosis(setosa) skewness(setosa)
Compute the LDL decomposition of a real symmetric matrix.
ldl(x)ldl(x)
x |
a symmetric numeric matrix whose LDL decomposition is to be computed. |
The factorization has the form , where
is a diagonal matrix and is unitary lower triangular.
The LDL decomposition of is returned as a list with components:
lower |
the unitary lower triangular factor |
d |
a vector containing the diagonal elements of |
Golub, G.H., Van Loan, C.F. (1996). Matrix Computations, 3rd Edition. John Hopkins University Press.
a <- matrix(c(2,-1,0,-1,2,-1,0,-1,1), ncol = 3) z <- ldl(a) z # information of LDL factorization # computing det(a) prod(z$d) # product of diagonal elements of D # a non-positive-definite matrix m <- matrix(c(5,-5,-5,3), ncol = 2) try(chol(m)) # fails ldl(m)a <- matrix(c(2,-1,0,-1,2,-1,0,-1,1), ncol = 3) z <- ldl(a) z # information of LDL factorization # computing det(a) prod(z$d) # product of diagonal elements of D # a non-positive-definite matrix m <- matrix(c(5,-5,-5,3), ncol = 2) try(chol(m)) # fails ldl(m)
lu computes the LU factorization of a matrix.
lu(x) ## Default S3 method: lu(x) ## S3 method for class 'lu' solve(a, b, ...) is.lu(x)lu(x) ## Default S3 method: lu(x) ## S3 method for class 'lu' solve(a, b, ...) is.lu(x)
x |
a square numeric matrix whose LU factorization is to be computed. |
a |
an LU factorization of a square matrix. |
b |
a vector or matrix of right-hand sides of equations. |
... |
further arguments passed to or from other methods |
The LU factorization plays an important role in many numerical procedures. In
particular it is the basic method to solve the equation
for given matrix , and vector .
solve.lu is the method for solve for lu objects.
is.lu returns TRUE if x is a list
and inherits from "lu".
Unsuccessful results from the underlying LAPACK code will result in an
error giving a positive error code: these can only be interpreted by
detailed study of the Fortran code.
The LU factorization of the matrix as computed by LAPACK. The components in
the returned value correspond directly to the values returned by DGETRF.
lu |
a matrix with the same dimensions as |
pivot |
information on the pivoting strategy used during the factorization. |
To compute the determinant of a matrix (do you really need it?),
the LU factorization is much more efficient than using eigenvalues
(eigen). See det.
LAPACK uses column pivoting and does not attempt to detect rank-deficient matrices.
Anderson. E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A. Sorensen, D. (1999). LAPACK Users' Guide, 3rd Edition. SIAM.
Golub, G.H., Van Loan, C.F. (1996). Matrix Computations, 3rd Edition. John Hopkins University Press.
extractL, extractU, constructX for
reconstruction of the matrices,
lu2inv
a <- matrix(c(3,2,6,17,4,18,10,-2,-12), ncol = 3) z <- lu(a) z # information of LU factorization # computing det(a) prod(diag(z$lu)) # product of diagonal elements of U # solve linear equations b <- matrix(1:6, ncol = 2) solve(z, b)a <- matrix(c(3,2,6,17,4,18,10,-2,-12), ncol = 3) z <- lu(a) z # information of LU factorization # computing det(a) prod(diag(z$lu)) # product of diagonal elements of U # solve linear equations b <- matrix(1:6, ncol = 2) solve(z, b)
Returns the original matrix from which the object was constructed or the components of the factorization.
constructX(x) extractL(x) extractU(x)constructX(x) extractL(x) extractU(x)
x |
object representing an LU factorization. This will typically have
come from a previous call to |
constructX returns , the original matrix from which the lu
object was constructed (because of the pivoting the matrix is not exactly
the product between and ).
extractL returns . This may be pivoted.
extractU returns .
lu.
a <- matrix(c(10,-3,5,-7,2,-1,0,6,5), ncol = 3) z <- lu(a) L <- extractL(z) L U <- extractU(z) U X <- constructX(z) all(a == X)a <- matrix(c(10,-3,5,-7,2,-1,0,6,5), ncol = 3) z <- lu(a) L <- extractL(z) L U <- extractU(z) U X <- constructX(z) all(a == X)
Invert a square matrix from its LU factorization.
lu2inv(x)lu2inv(x)
x |
object representing an LU factorization. This will typically have
come from a previous call to |
The inverse of the matrix whose LU factorization was given.
Unsuccessful results from the underlying LAPACK code will result in an
error giving a positive error code: these can only be interpreted by
detailed study of the Fortran code.
This is an interface to the LAPACK routine DGETRI. LAPACK is from
https://netlib.org/lapack/ and its guide is listed in the references.
Anderson. E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A. Sorensen, D. (1999). LAPACK Users' Guide, 3rd Edition. SIAM.
Golub, G.H., Van Loan, C.F. (1996). Matrix Computations, 3rd Edition. John Hopkins University Press.
a <- matrix(c(3,2,6,17,4,18,10,-2,-12), ncol = 3) z <- lu(a) a %*% lu2inv(z)a <- matrix(c(3,2,6,17,4,18,10,-2,-12), ncol = 3) z <- lu(a) a %*% lu2inv(z)
Returns the squared Mahalanobis distance of all rows in and the
vector = center with respect to = cov.
This is (for vector ) defined as
Mahalanobis(x, center, cov, inverted = FALSE)Mahalanobis(x, center, cov, inverted = FALSE)
x |
vector or matrix of data. As usual, rows are observations and columns are variables. |
center |
mean vector of the distribution. |
cov |
covariance matrix ( |
inverted |
logical. If |
Unlike function mahalanobis, the covariance matrix is factorized using the
Cholesky decomposition, which allows to assess if cov is positive definite.
Unsuccessful results from the underlying LAPACK code will result in an error message.
x <- cbind(1:6, 1:3) xbar <- colMeans(x) S <- matrix(c(1,4,4,1), ncol = 2) # is negative definite D2 <- mahalanobis(x, center = xbar, S) all(D2 >= 0) # several distances are negative ## next command produces the following error: ## Covariance matrix is possibly not positive-definite ## Not run: D2 <- Mahalanobis(x, center = xbar, S)x <- cbind(1:6, 1:3) xbar <- colMeans(x) S <- matrix(c(1,4,4,1), ncol = 2) # is negative definite D2 <- mahalanobis(x, center = xbar, S) all(D2 >= 0) # several distances are negative ## next command produces the following error: ## Covariance matrix is possibly not positive-definite ## Not run: D2 <- Mahalanobis(x, center = xbar, S)
Performs Mardia's tests to assess multivariate normality based on the multivariate skewness and kurtosis coefficients.
mardia.test(x)mardia.test(x)
x |
matrix of data with, say, |
A list of class 'Mardia.test' with the following elements:
skewness |
a list containig the |
kurtosis |
a list containig the |
Mardia, K.V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika 57, 519-530.
Mardia, K.V. (1974). Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies. Sankhya 36, 115-128.
setosa <- iris[1:50,1:4] z <- mardia.test(setosa) z set.seed(149) Sigma <- matrix(c(10,3,3,2), ncol = 2) x <- rmnorm(n = 300, Sigma = Sigma) z <- mardia.test(x) zsetosa <- iris[1:50,1:4] z <- mardia.test(setosa) z set.seed(149) Sigma <- matrix(c(10,3,3,2), ncol = 2) x <- rmnorm(n = 300, Sigma = Sigma) z <- mardia.test(x) z
This function computes the matrix function where
is upper triangular by applying a Parlett recurrence.
matrix.fun(a, FUN = "log")matrix.fun(a, FUN = "log")
a |
an upper triangular matrix. |
FUN |
the function to be applied, by default |
The used-defined function FUN is evaluated at the triangular matrix
argument. This function can be used in conjunction with Schur decomposition
to evaluate the function of a matrix.
Higham, N.J. (1986). Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia.
a <- matrix(c(1,2,3,0,3,4,0,0,5), ncol = 3, byrow = TRUE) fnc <- function(x) (1 + x) / x f <- matrix.fun(a, FUN = fnc) f a <- matrix(c(-49,24,-64,31), ncol = 2, byrow = TRUE) z <- schur(a) m <- z$m u <- z$vectors m <- matrix.fun(m, FUN = exp) u %*% m %*% t(u) # exp(a)a <- matrix(c(1,2,3,0,3,4,0,0,5), ncol = 3, byrow = TRUE) fnc <- function(x) (1 + x) / x f <- matrix.fun(a, FUN = fnc) f a <- matrix(c(-49,24,-64,31), ncol = 2, byrow = TRUE) z <- schur(a) m <- z$m u <- z$vectors m <- matrix.fun(m, FUN = exp) u %*% m %*% t(u) # exp(a)
Computes the inner product between two rectangular matrices calling BLAS.
matrix.inner(x, y = x)matrix.inner(x, y = x)
x |
a numeric matrix. |
y |
a numeric matrix. |
a real value, indicating the inner product between two matrices.
x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) y <- matrix(1, nrow = 6, ncol = 3) matrix.inner(x, y) # must be equal matrix.norm(x, type = "Frobenius")^2 matrix.inner(x)x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) y <- matrix(1, nrow = 6, ncol = 3) matrix.inner(x, y) # must be equal matrix.norm(x, type = "Frobenius")^2 matrix.inner(x)
Computes a matrix norm of x using LAPACK. The norm can be the one ("1")
norm, the infinity ("inf") norm, the Frobenius norm, the maximum modulus
("maximum") among elements of a matrix, as determined by the value of type.
matrix.norm(x, type = "Frobenius")matrix.norm(x, type = "Frobenius")
x |
a numeric matrix. |
type |
character string, specifying the type of matrix norm to be computed. A character indicating the type of norm desired.
|
As function norm in package base, method of matrix.norm calls
the LAPACK function DLANGE.
Note that the 1-, Inf- and maximum norm is faster to calculate than the Frobenius one.
The matrix norm, a non-negative number.
# a tiny example x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) matrix.norm(x, type = "Frobenius") matrix.norm(x, type = "1") matrix.norm(x, type = "Inf") # an example not that small n <- 1000 x <- .5 * diag(n) + 0.5 * matrix(1, nrow = n, ncol = n) matrix.norm(x, type = "Frobenius") matrix.norm(x, type = "1") matrix.norm(x, type = "Inf") matrix.norm(x, type = "maximum") # equal to 1# a tiny example x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) matrix.norm(x, type = "Frobenius") matrix.norm(x, type = "1") matrix.norm(x, type = "Inf") # an example not that small n <- 1000 x <- .5 * diag(n) + 0.5 * matrix(1, nrow = n, ncol = n) matrix.norm(x, type = "Frobenius") matrix.norm(x, type = "1") matrix.norm(x, type = "Inf") matrix.norm(x, type = "maximum") # equal to 1
Given coefficients of the polynomial and
an by matrix. This function computes the matrix polynomial
using Horner's scheme, where is the by identity matrix.
matrix.polynomial(a, coef = rep(1, power + 1), power = length(coef))matrix.polynomial(a, coef = rep(1, power + 1), power = length(coef))
a |
a numeric square matrix of order |
coef |
numeric vector containing the coefficients of the polinomial in order of increasing power. |
power |
a numeric exponent (which is forced to be an integer). If provided, |
Returns an by matrix.
a <- matrix(c(1, 3, 2, -5, 1, 7, 1, 5, -4), ncol = 3, byrow = TRUE) cf <- c(3, 1, 2) b <- matrix.polynomial(a, cf) b # 3 * diag(3) + a + 2 * a %*% a b <- matrix.polynomial(a, power = 2) b # diag(3) + a + a %*% aa <- matrix(c(1, 3, 2, -5, 1, 7, 1, 5, -4), ncol = 3, byrow = TRUE) cf <- c(3, 1, 2) b <- matrix.polynomial(a, cf) b # 3 * diag(3) + a + 2 * a %*% a b <- matrix.polynomial(a, power = 2) b # diag(3) + a + a %*% a
This function computes a square root of an matrix .
matrix.sqrt(a, method = "DB", maxiter = 50, tol = 1e-8)matrix.sqrt(a, method = "DB", maxiter = 50, tol = 1e-8)
a |
a square matrix. |
method |
the procedure used to obtain the square root. If |
maxiter |
the maximum number of iterations. Defaults to |
tol |
a numeric tolerance. |
A square root of a square matrix is obtained by solving the
equation , considering the Newton iteration proposed
by Denman and Beavers (1976), or alternatively is based on the Schur decomposition.
Denman, E.D., Beavers, A.N. (1976). The matrix sign function and computations in systems. Applied Mathematics and Computation 2, 63-94.
Higham, N.J. (1986). Newton's method for the matrix square root. Mathematics of Computation 46, 537-549.
Higham, N.J. (1986). Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia.
a <- matrix(c(35,17,3,17,46,11,3,11,12), ncol = 3) root <- matrix.sqrt(a) # 8 iterations # just checking root %*% root root <- matrix.sqrt(a, method = "schur")a <- matrix(c(35,17,3,17,46,11,3,11,12), ncol = 3) root <- matrix.sqrt(a) # 8 iterations # just checking root %*% root root <- matrix.sqrt(a, method = "schur")
Compute the Cholesky factorization of a real symmetric but not necessarily positive definite matrix.
mchol(x)mchol(x)
x |
a symmetric but not necessarily positive definite matrix to be factored. |
The lower triangular factor of modified Cholesky decomposition, i.e., the matrix
such that , where
is a nonnegative diagonal matrix that is zero if es sufficiently positive
definite.
Gill, P.E., Murray, W., Wright, M.H. (1981). Practical Optimization. Academic Press, London.
Nocedal, J., Wright, S.J. (1999). Numerical Optimization. Springer, New York.
# a non-positive-definite matrix a <- matrix(c(4,2,1,2,6,3,1,3,-.004), ncol = 3) try(chol(a)) # fails z <- mchol(a) z # triangular factor # modified 'a' matrix tcrossprod(z)# a non-positive-definite matrix a <- matrix(c(4,2,1,2,6,3,1,3,-.004), ncol = 3) try(chol(a)) # fails z <- mchol(a) z # triangular factor # modified 'a' matrix tcrossprod(z)
It calculates the mediancenter (or geometric median) of multivariate data.
mediancenter(x)mediancenter(x)
x |
a matrix or data frame. As usual, rows are observations and columns are variables. |
The mediancenter for a sample of multivariate observations is computed using a steepest descend method combined with bisection. The mediancenter invariant to rotations of axes and is useful as a multivariate generalization of the median of univariate sample.
A list containing the following named components:
median |
an estimate for the mediancenter of the data. |
iter |
the number of iterations performed, it is negative if a degenerate solution is found. |
Gower, J.C. (1974). Algorithm AS 78: The mediancentre. Applied Statistics 23, 466-470.
x <- cbind(1:10, c(1:3, 8:5, 8:10)) z <- mediancenter(x)$median # degenerate solution xbar <- colMeans(x) plot(x, xlab = "", ylab = "") points(x = xbar[1], y = xbar[2], pch = 16, col = "red") points(x = z[1], y = z[2], pch = 3, col = "blue", lwd = 2)x <- cbind(1:10, c(1:3, 8:5, 8:10)) z <- mediancenter(x)$median # degenerate solution xbar <- colMeans(x) plot(x, xlab = "", ylab = "") points(x = xbar[1], y = xbar[2], pch = 16, col = "red") points(x = z[1], y = z[2], pch = 3, col = "blue", lwd = 2)
Computes a -norm of vector . The norm can be the one ()
norm, Euclidean () norm, the infinity ( = Inf) norm. The underlying
C or Fortran code is inspired on ideas of BLAS Level 1.
minkowski(x, p = 2)minkowski(x, p = 2)
x |
a numeric vector. |
p |
a number, specifying the type of norm desired. Possible values include
real number greater or equal to 1, or Inf, Default value is |
Method of minkowski for = Inf calls idamax BLAS function.
For other values, C or Fortran subroutines using unrolled cycles are
called.
The vector -norm, a non-negative number.
# a tiny example x <- rnorm(1000) minkowski(x, p = 1) minkowski(x, p = 1.5) minkowski(x, p = 2) minkowski(x, p = Inf) x <- x / minkowski(x) minkowski(x, p = 2) # equal to 1# a tiny example x <- rnorm(1000) minkowski(x, p = 1) minkowski(x, p = 1.5) minkowski(x, p = 2) minkowski(x, p = Inf) x <- x / minkowski(x) minkowski(x, p = 2) # equal to 1
It calculates up to fourth central moments (or moments about the mean), and the skewness and kurtosis coefficients using an online algorithm.
moments(x)moments(x)
x |
a numeric vector containing the sample observations. |
The -th central moment is defined as
In particular, the second central moment is the variance of the sample. The sample skewness and kurtosis are defined, respectively, as
A list containing second, third and fourth central moments,
and skewness and kurtosis coefficients.
Spicer, C.C. (1972). Algorithm AS 52: Calculation of power sums of deviations about the mean. Applied Statistics 21, 226-227.
var.
set.seed(149) x <- rnorm(1000) z <- moments(x) zset.seed(149) x <- rnorm(1000) z <- moments(x) z
Returns an object of class "ols" that represents a linear model fit.
ols(formula, data, subset, na.action, method = "qr", tol = 1e-7, maxiter = 100, x = FALSE, y = FALSE, contrasts = NULL, ...)ols(formula, data, subset, na.action, method = "qr", tol = 1e-7, maxiter = 100, x = FALSE, y = FALSE, contrasts = NULL, ...)
formula |
an object of class |
data |
an optional data frame, list or environment (or object coercible
by |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain |
method |
the least squares fitting method to be used; the options are |
tol |
tolerance for the conjugate gradients ( |
maxiter |
The maximum number of iterations for the conjugate gradients ( |
x, y
|
logicals. If |
contrasts |
an optional list. See the |
... |
additional arguments (currently disregarded). |
ols returns an object of class "ols".
The function summary is used to obtain and print a summary of the
results. The generic accessor functions coefficients, fitted.values
and residuals extract various useful features of the value returned by ols.
An object of class "ols" is a list containing at least the
following components:
coefficients |
a named vector of coefficients |
residuals |
the residuals, that is response minus fitted values. |
fitted.values |
the fitted mean values. |
RSS |
the residual sum of squares. |
cov.unscaled |
a |
call |
the matched call. |
terms |
the |
contrasts |
(only where relevant) the contrasts used. |
y |
if requested, the response used. |
x |
if requested, the model matrix used. |
model |
if requested (the default), the model frame used. |
# tiny example of regression y <- c(1, 3, 3, 2, 2, 1) x <- matrix(c(1, 1, 2, 1, 3, 1, 1,-1, 2,-1, 3,-1), ncol = 2, byrow = TRUE) f0 <- ols(y ~ x) # intercept is included by default f0 # printing results (QR method was used) f1 <- ols(y ~ x, method = "svd") # using SVD method instead f1# tiny example of regression y <- c(1, 3, 3, 2, 2, 1) x <- matrix(c(1, 1, 2, 1, 3, 1, 1,-1, 2,-1, 3,-1), ncol = 2, byrow = TRUE) f0 <- ols(y ~ x) # intercept is included by default f0 # printing results (QR method was used) f1 <- ols(y ~ x, method = "svd") # using SVD method instead f1
This function is a switcher among various numerical fitting functions
(ols.fit.cg, ols.fit.chol, ols.fit.qr,
ols.fit.svd and ols.fit.sweep). The argument method
does the switching: "qr" for ols.fit.qr, etc. This should usually
not be used directly unless by experienced users.
ols.fit(x, y, method = "qr", tol = 1e-7, maxiter = 100)ols.fit(x, y, method = "qr", tol = 1e-7, maxiter = 100)
x |
design matrix of dimension |
y |
vector of observations of length |
method |
currently, methods |
tol |
tolerance for the conjugate gradients ( |
maxiter |
The maximum number of iterations for the conjugate gradients ( |
a list with components:
coefficients |
a named vector of coefficients |
residuals |
the residuals, that is response minus fitted values. |
fitted.values |
the fitted mean values. |
RSS |
the residual sum of squares. |
cov.unscaled |
a |
ols.fit.cg, ols.fit.chol, ols.fit.qr,
ols.fit.svd, ols.fit.sweep.
set.seed(151) n <- 100 p <- 2 x <- matrix(rnorm(n * p), n, p) # no intercept! y <- rnorm(n) fm <- ols.fit(x = x, y = y, method = "chol") fmset.seed(151) n <- 100 p <- 2 x <- matrix(rnorm(n * p), n, p) # no intercept! y <- rnorm(n) fm <- ols.fit(x = x, y = y, method = "chol") fm
Fits a linear model, returning the bare minimum computations.
ols.fit.cg(x, y, tol = 1e-7, maxiter = 100) ols.fit.chol(x, y) ols.fit.qr(x, y) ols.fit.svd(x, y) ols.fit.sweep(x, y)ols.fit.cg(x, y, tol = 1e-7, maxiter = 100) ols.fit.chol(x, y) ols.fit.qr(x, y) ols.fit.svd(x, y) ols.fit.sweep(x, y)
x, y
|
numeric vectors or matrices for the predictors and the response in
a linear model. Typically, but not necessarily, |
tol |
tolerance for the conjugate gradients ( |
maxiter |
The maximum number of iterations for the conjugate gradients ( |
The bare bones of an ols object: the coefficients, residuals, fitted values,
and some information used by summary.ols.
set.seed(151) n <- 100 p <- 2 x <- matrix(rnorm(n * p), n, p) # no intercept! y <- rnorm(n) z <- ols.fit.chol(x, y) zset.seed(151) n <- 100 p <- 2 x <- matrix(rnorm(n * p), n, p) # no intercept! y <- rnorm(n) z <- ols.fit.chol(x, y) z
The power method seeks to determine the eigenvalue of maximum modulus, and a corresponding eigenvector.
power.method(x, only.value = FALSE, maxiter = 100, tol = 1e-8)power.method(x, only.value = FALSE, maxiter = 100, tol = 1e-8)
x |
a symmetric matrix. |
only.value |
if |
maxiter |
the maximum number of iterations. Defaults to |
tol |
a numeric tolerance. |
When only.value is not true, as by default, the result is a list with components
"value" and "vector". Otherwise only the dominan eigenvalue is returned.
The performed number of iterations to reach convergence is returned as attribute "iterations".
eigen for eigenvalues and eigenvectors computation.
n <- 1000 x <- .5 * diag(n) + 0.5 * matrix(1, nrow = n, ncol = n) # dominant eigenvalue must be (n + 1) / 2 z <- power.method(x, only.value = TRUE)n <- 1000 x <- .5 * diag(n) + 0.5 * matrix(1, nrow = n, ncol = n) # dominant eigenvalue must be (n + 1) / 2 z <- power.method(x, only.value = TRUE)
Let be a matrix and an -dimensional
vector. Thus, the transformation
is called a rank-one update to , where is an scalar.
rank1.update(a, alpha, u, diag = FALSE)rank1.update(a, alpha, u, diag = FALSE)
a |
a numeric matrix. |
alpha |
a numeric scalar. |
u |
a numeric vector. |
diag |
logical. If |
If diag = FALSE, method of rank1.update calls BLAS level 2 subroutine DGER
for computational efficiency, otherwise the special structure of the problem is used.
a square matrix of the same order as a.
n <- 10 ones <- rep(1, n) z <- rank1.update(0.5 * ones, 0.5, ones, diag = TRUE) zn <- 10 ones <- rep(1, n) z <- rank1.update(0.5 * ones, 0.5, ones, diag = TRUE) z
Random vector generation uniformly in the unitary ball.
rball(n = 1, p = 2)rball(n = 1, p = 2)
n |
the number of samples requested |
p |
dimension of the unitary sphere |
The function rball is an interface to C routines, which make calls to
subroutines from BLAS.
If n = 1 a p-dimensional vector, otherwise a matrix of n
rows of random vectors.
Hormann, W., Leydold, J., Derflinger, G. (2004). Automatic Nonuniform Random Variate Generation. Springer, New York.
# generate the sample z <- rball(n = 500) # scatterplot of a random sample of 500 points uniformly distributed # in the unitary ball par(pty = "s") plot(z, xlab = "x", ylab = "y") title("500 points in the ball", font.main = 1)# generate the sample z <- rball(n = 500) # scatterplot of a random sample of 500 points uniformly distributed # in the unitary ball par(pty = "s") plot(z, xlab = "x", ylab = "y") title("500 points in the ball", font.main = 1)
Fit a linear model by ridge regression, returning an object of class "ridge".
ridge(formula, data, subset, lambda = 1.0, method = "GCV", ngrid = 200, tol = 1e-07, maxiter = 50, na.action, x = FALSE, y = FALSE, contrasts = NULL, ...)ridge(formula, data, subset, lambda = 1.0, method = "GCV", ngrid = 200, tol = 1e-07, maxiter = 50, na.action, x = FALSE, y = FALSE, contrasts = NULL, ...)
formula |
an object of class |
data |
an optional data frame, list or environment (or object coercible
by |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain |
lambda |
a scalar or vector of ridge constants. A value of 0 corresponds to ordinary least squares. |
method |
the method for choosing the ridge parameter lambda. If |
ngrid |
number of elements in the grid used to compute the GCV criterion.
Only required if |
tol |
tolerance for the optimization of the GCV criterion. Default is |
maxiter |
maximum number of iterations. The default is 50. |
x, y
|
logicals. If |
contrasts |
an optional list. See the |
... |
additional arguments to be passed to the low level regression fitting functions (not implemented). |
ridge function fits in linear ridge regression without scaling or centering
the regressors and the response. In addition, If an intercept is present in the model, its
coefficient is penalized.
A list with the following components:
dims |
dimensions of model matrix. |
coefficients |
a named vector of coefficients. |
scale |
a named vector of coefficients. |
fitted.values |
the fitted mean values. |
residuals |
the residuals, that is response minus fitted values. |
RSS |
the residual sum of squares. |
edf |
the effective number of parameters. |
GCV |
vector (if |
HKB |
HKB estimate of the ridge constant. |
LW |
LW estimate of the ridge constant. |
lambda |
vector (if |
optimal |
value of lambda with the minimum GCV (only relevant if |
iterations |
number of iterations performed by the algorithm (only relevant if |
call |
the matched call. |
terms |
the |
contrasts |
(only where relevant) the contrasts used. |
y |
if requested, the response used. |
x |
if requested, the model matrix used. |
model |
if requested, the model frame used. |
Golub, G.H., Heath, M., Wahba, G. (1979). Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics 21, 215-223.
Hoerl, A.E., Kennard, R.W., Baldwin, K.F. (1975). Ridge regression: Some simulations. Communication in Statistics 4, 105-123.
Hoerl, A.E., Kennard, R.W. (1970). Ridge regression: Biased estimation of nonorthogonal problems. Technometrics 12, 55-67.
Lawless, J.F., Wang, P. (1976). A simulation study of ridge and other regression estimators. Communications in Statistics 5, 307-323.
Lee, T.S (1987). Algorithm AS 223: Optimum ridge parameter selection. Applied Statistics 36, 112-118.
z <- ridge(GNP.deflator ~ ., data = longley, lambda = 4, method = "grid") z # ridge regression on a grid over seq(0, 4, length = 200) z <- ridge(GNP.deflator ~ ., data = longley) z # ridge parameter selected using GCV (default)z <- ridge(GNP.deflator ~ ., data = longley, lambda = 4, method = "grid") z # ridge regression on a grid over seq(0, 4, length = 200) z <- ridge(GNP.deflator ~ ., data = longley) z # ridge parameter selected using GCV (default)
Random number generation from the multivariate normal (Gaussian) distribution.
rmnorm(n = 1, mean = rep(0, nrow(Sigma)), Sigma = diag(length(mean)))rmnorm(n = 1, mean = rep(0, nrow(Sigma)), Sigma = diag(length(mean)))
n |
the number of samples requested |
mean |
a vector giving the means of each variable |
Sigma |
a positive-definite covariance matrix |
The function rmnorm is an interface to C routines, which make calls
to subroutines from LAPACK. The matrix decomposition is internally done using the
Cholesky decomposition. If Sigma is not non-negative definite then there
will be a warning message.
If a vector of the same length as mean, otherwise a
matrix of rows of random vectors.
Devroye, L. (1986). Non-Uniform Random Variate Generation. Springer-Verlag, New York.
# covariance parameters Sigma <- matrix(c(10,3,3,2), ncol = 2) Sigma # generate the sample y <- rmnorm(n = 1000, Sigma = Sigma) var(y) # scatterplot of a random bivariate normal sample with mean # vector zero and covariance matrix 'Sigma' par(pty = "s") plot(y, xlab = "", ylab = "") title("bivariate normal sample", font.main = 1) # QQ-plot of transformed distances z <- WH.normal(y) par(pty = "s") qqnorm(z, main = "Transformed distances QQ-plot") abline(c(0,1), col = "red", lwd = 2, lty = 2)# covariance parameters Sigma <- matrix(c(10,3,3,2), ncol = 2) Sigma # generate the sample y <- rmnorm(n = 1000, Sigma = Sigma) var(y) # scatterplot of a random bivariate normal sample with mean # vector zero and covariance matrix 'Sigma' par(pty = "s") plot(y, xlab = "", ylab = "") title("bivariate normal sample", font.main = 1) # QQ-plot of transformed distances z <- WH.normal(y) par(pty = "s") qqnorm(z, main = "Transformed distances QQ-plot") abline(c(0,1), col = "red", lwd = 2, lty = 2)
Random vector generation uniformly on the sphere.
rsphere(n = 1, p = 2)rsphere(n = 1, p = 2)
n |
the number of samples requested |
p |
dimension of the unitary sphere |
The function rsphere is an interface to C routines, which make calls to
subroutines from BLAS.
If n = 1 a p-dimensional vector, otherwise a matrix of n
rows of random vectors.
Devroye, L. (1986). Non-Uniform Random Variate Generation. Springer-Verlag, New York.
# generate the sample z <- rsphere(n = 200) # scatterplot of a random sample of 200 points uniformly distributed # on the unit circle par(pty = "s") plot(z, xlab = "x", ylab = "y") title("200 points on the circle", font.main = 1)# generate the sample z <- rsphere(n = 200) # scatterplot of a random sample of 200 points uniformly distributed # on the unit circle par(pty = "s") plot(z, xlab = "x", ylab = "y") title("200 points on the circle", font.main = 1)
Compute the scaled condition number of a rectangular matrix.
scaled.condition(x, scales = FALSE)scaled.condition(x, scales = FALSE)
x |
a numeric rectangular matrix. |
scales |
a logical value indicating whether the scaling factors that allow balancing
the columns of |
The columns of a rectangular matrix are equilibrated (but not centered),
then the scaled condition number is computed following the guidelines of Belsley (1990).
If requested, the column scalings are returned as the attribute 'scales'.
Belsley, D.A. (1990). Conditioning Diagnostics: Collinearity and Weak Data in Regression. Wiley, New York.
x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) scaled.condition(x)x <- matrix(c(1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1,-1, 1, 2,-1, 1, 3,-1), ncol = 3, byrow = TRUE) scaled.condition(x)
schur computes the Schur decomposition of an real matrix
.
schur(x, vectors = TRUE)schur(x, vectors = TRUE)
x |
a square numeric matrix to be decomposed. |
vectors |
logical, if |
For an real matrix , the Schur decomposition
is given by,
where is an orthogonal matrix and is a quasi-upper triangular
matrix. The column vectors (if requested) are the Schur vectors of
, and is the Schur form of .
Unsuccessful results from the underlying LAPACK code will result in an error giving
a error code: these can only be interpreted by detailed study of the Fortran code.
The Schur decomposition of the matrix as computed by LAPACK. The components in
the returned value correspond directly to the values returned by DGEES.
m |
a matrix with the same dimensions as |
values |
a vector containing the |
vectors |
an |
Anderson. E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A. Sorensen, D. (1999). LAPACK Users' Guide, 3rd Edition. SIAM.
Golub, G.H., Van Loan, C.F. (1996). Matrix Computations, 3rd Edition. John Hopkins University Press.
a <- matrix(c(7,12,-2,-3), ncol = 2) z <- schur(a) z # information of Schur decomposition x <- matrix(c(0,0,1,2,1,0,2,2,1), ncol = 3) z <- schur(x) z # complex eigenvaluesa <- matrix(c(7,12,-2,-3), ncol = 2) z <- schur(a) z # information of Schur decomposition x <- matrix(c(0,0,1,2,1,0,2,2,1), ncol = 3) z <- schur(x) z # complex eigenvalues
Gauss-Seidel method is an iterative algorithm for solving a system of linear equations.
seidel(a, b, start, maxiter = 200, tol = 1e-7)seidel(a, b, start, maxiter = 200, tol = 1e-7)
a |
a square numeric matrix containing the coefficients of the linear system. |
b |
a vector of right-hand sides of the linear system. |
start |
a vector for initial starting point. |
maxiter |
the maximum number of iterations. Defaults to |
tol |
tolerance level for stopping iterations. |
Let , , and denote the diagonal, lower
triangular and upper triangular parts of a matrix . Gauss-Seidel method
solve the equation , iteratively by rewriting . Assuming that is
nonsingular leads to the iteration formula
a vector with the approximate solution, the iterations performed are returned as the attribute 'iterations'.
Golub, G.H., Van Loan, C.F. (1996). Matrix Computations, 3rd Edition. John Hopkins University Press.
a <- matrix(c(5,-3,2,-2,9,-1,3,1,-7), ncol = 3) b <- c(-1,2,3) start <- c(1,1,1) z <- seidel(a, b, start) z # converged in 10 iterationsa <- matrix(c(5,-3,2,-2,9,-1,3,1,-7), ncol = 3) b <- c(-1,2,3) start <- c(1,1,1) z <- seidel(a, b, start) z # converged in 10 iterations
The Sherman-Morrison formula gives a convenient expression for the inverse of the
rank 1 update where is a
matrix and , are -dimensional vectors. Thus
sherman.morrison(a, b, d = b, inverted = FALSE)sherman.morrison(a, b, d = b, inverted = FALSE)
a |
a numeric matrix. |
b |
a numeric vector. |
d |
a numeric vector. |
inverted |
logical. If |
Method of sherman.morrison calls BLAS level 2 subroutines DGEMV and
DGER for computational efficiency.
a square matrix of the same order as a.
n <- 10 ones <- rep(1, n) a <- 0.5 * diag(n) z <- sherman.morrison(a, ones, 0.5 * ones) zn <- 10 ones <- rep(1, n) a <- 0.5 * diag(n) z <- sherman.morrison(a, ones, 0.5 * ones) z
Perform the sweep operation (or reverse sweep) on the diagonal elements of a symmetric matrix.
sweep.operator(x, k = 1, reverse = FALSE)sweep.operator(x, k = 1, reverse = FALSE)
x |
a symmetric matrix. |
k |
elements (if |
reverse |
logical. If |
The symmetric sweep operator is a powerful tool in computational statistics with uses in stepwise regression, conditional multivariate normal distributions, MANOVA, and more.
a square matrix of the same order as x.
Goodnight, J.H. (1979). A tutorial on the SWEEP operator. The American Statistician 33, 149-158.
# tiny example of regression, last column contains 'y' xy <- matrix(c(1, 1, 1, 1, 1, 2, 1, 3, 1, 3, 1, 3, 1, 1,-1, 2, 1, 2,-1, 2, 1, 3,-1, 1), ncol = 4, byrow = TRUE) z <- crossprod(xy) z <- sweep.operator(z, k = 1:3) cf <- z[1:3,4] # regression coefficients RSS <- z[4,4] # residual sum of squares # an example not that small x <- matrix(rnorm(1000 * 100), ncol = 100) xx <- crossprod(x) z <- sweep.operator(xx, k = 1)# tiny example of regression, last column contains 'y' xy <- matrix(c(1, 1, 1, 1, 1, 2, 1, 3, 1, 3, 1, 3, 1, 1,-1, 2, 1, 2,-1, 2, 1, 3,-1, 1), ncol = 4, byrow = TRUE) z <- crossprod(xy) z <- sweep.operator(z, k = 1:3) cf <- z[1:3,4] # regression coefficients RSS <- z[4,4] # residual sum of squares # an example not that small x <- matrix(rnorm(1000 * 100), ncol = 100) xx <- crossprod(x) z <- sweep.operator(xx, k = 1)
This function provides the information required to create the symmetrizer matrix.
symm.info(n = 1)symm.info(n = 1)
n |
order of the symmetrizer matrix. |
This function returns a list containing vectors that represent an element of the
symmetrizer matrix and is accesed by the indexes in vectors row, col
and values contained in val. This information is used by function symm.prod
to do some operations involving the symmetrizer matrix without forming it. This
information also can be obtained using function symmetrizer.
A list containing the following elements:
row |
vector of indexes, each entry represents the row index of the symmetrizer matrix. |
col |
vector of indexes, each entry represents the column index of the symmetrizer matrix. |
val |
vector of values, each entry represents the value of the symmetrizer matrix
at element given by |
order |
order of the symmetrizer matrix. |
z <- symm.info(n = 3) z # elements in symmetrizer matrix of order 3 N3 <- symmetrizer(n = 3, matrix = TRUE) N3 # only recommended if n is very smallz <- symm.info(n = 3) z # elements in symmetrizer matrix of order 3 N3 <- symmetrizer(n = 3, matrix = TRUE) N3 # only recommended if n is very small
Given the order of a symmetrizer matrix of order and a
conformable matrix , performs one of the matrix-matrix operations:
, if side = "left", or
, if side = "right",
The main aim of symm.prod is to do this matrix multiplication without forming
the symmetrizer matrix.
symm.prod(n = 1, x = NULL, side = "left")symm.prod(n = 1, x = NULL, side = "left")
n |
order of the symmetrizer matrix. |
x |
numeric matrix (or vector). |
side |
a string selecting if symmetrizer matrix is pre-multiplying |
Underlying C code only uses information provided by symm.info to
performs the matrix multiplication. The symmetrizer matrix is never created.
N4 <- symmetrizer(n = 4, matrix = TRUE) x <- matrix(1:32, ncol = 2) y <- N4 %*% x z <- symm.prod(n = 4, x) # N4 is not stored all(z == y) # matrices y and z are equal!N4 <- symmetrizer(n = 4, matrix = TRUE) x <- matrix(1:32, ncol = 2) y <- N4 %*% x z <- symm.prod(n = 4, x) # N4 is not stored all(z == y) # matrices y and z are equal!
This function returns the symmetrizer matrix of order which transforms,
for every matrix , vec into
vec.
symmetrizer(n = 1, matrix = FALSE)symmetrizer(n = 1, matrix = FALSE)
n |
order of the symmetrizer matrix. |
matrix |
a logical indicating whether the symmetrizer matrix will be returned. |
This function is a wrapper function for the function symm.info. This function
provides the information required to create the symmetrizer matrix. If option matrix = FALSE
the symmetrizer matrix is stored in three vectors containing the coordinate list of
indexes for rows, columns and the values.
Warning: matrix = TRUE is not recommended, unless the order n
be small. This matrix can require a huge amount of storage.
Returns an by matrix (if requested).
Abadir, K.M., Magnus, J.R. (2005). Matrix Algebra. Cambridge University Press.
Magnus, J.R., Neudecker, H. (2007). Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd Edition. Wiley, New York.
z <- symmetrizer(n = 100) object.size(z) # 319 Kb of storage N100 <- symmetrizer(n = 100, matrix = TRUE) # time: < 2 secs object.size(N100) # 800 Mb of storage, do not request this matrix! # a small example N3 <- symmetrizer(n = 3, matrix = TRUE) a <- matrix(rep(c(2,4,6), each = 3), ncol = 3) a b <- 0.5 * (a + t(a)) b v <- N3 %*% vec(a) all(vec(b) == as.vector(v)) # vectors are equal!z <- symmetrizer(n = 100) object.size(z) # 319 Kb of storage N100 <- symmetrizer(n = 100, matrix = TRUE) # time: < 2 secs object.size(N100) # 800 Mb of storage, do not request this matrix! # a small example N3 <- symmetrizer(n = 3, matrix = TRUE) a <- matrix(rep(c(2,4,6), each = 3), ncol = 3) a b <- 0.5 * (a + t(a)) b v <- N3 %*% vec(a) all(vec(b) == as.vector(v)) # vectors are equal!
This function returns a vector obtained by stacking the columns of .
vec(x)vec(x)
x |
a numeric matrix. |
Let be a by matrix, then vec()
is a -dimensional vector.
x <- matrix(rep(1:10, each = 10), ncol = 10) x y <- vec(x) yx <- matrix(rep(1:10, each = 10), ncol = 10) x y <- vec(x) y
This function returns a vector obtained by stacking the lower triangular part of a square matrix.
vech(x)vech(x)
x |
a square matrix. |
Let be a by matrix, then vech()
is a -dimensional vector.
x <- matrix(rep(1:10, each = 10), ncol = 10) x y <- vech(x) yx <- matrix(rep(1:10, each = 10), ncol = 10) x y <- vech(x) y
Returns the Wilson-Hilferty transformation of random variables with chi-squared distribution.
WH.normal(x)WH.normal(x)
x |
vector or matrix of data with, say, |
Let be a random variable, where denotes the squared Mahalanobis
distance defined as
Thus the Wilson-Hilferty transformation is given by
and is approximately distributed as a standard normal distribution. This
is useful, for instance, in the construction of QQ-plots.
Wilson, E.B., and Hilferty, M.M. (1931). The distribution of chi-square. Proceedings of the National Academy of Sciences of the United States of America 17, 684-688.
x <- iris[,1:4] z <- WH.normal(x) par(pty = "s") qqnorm(z, main = "Transformed distances QQ-plot") abline(c(0,1), col = "red", lwd = 2, lty = 2)x <- iris[,1:4] z <- WH.normal(x) par(pty = "s") qqnorm(z, main = "Transformed distances QQ-plot") abline(c(0,1), col = "red", lwd = 2, lty = 2)
Applies the whitening transformation to a data matrix based on the Cholesky decomposition of the empirical covariance matrix.
whitening(x, Scatter = NULL)whitening(x, Scatter = NULL)
x |
vector or matrix of data with, say, |
Scatter |
covariance (or scatter) matrix ( |
Returns the whitened data matrix , where
with the empirical covariance matrix.
Kessy, A., Lewin, A., Strimmer, K. (2018). Optimal whitening and decorrelation. The American Statistician 72, 309-314.
x <- iris[,1:4] species <- iris[,5] pairs(x, col = species) # plot of Iris # whitened data z <- whitening(x) pairs(z, col = species) # plot ofx <- iris[,1:4] species <- iris[,5] pairs(x, col = species) # plot of Iris # whitened data z <- whitening(x) pairs(z, col = species) # plot of
Returns the Wilson-Hilferty transformation of random variables with Gamma distribution.
wilson.hilferty(x, shape, rate = 1)wilson.hilferty(x, shape, rate = 1)
x |
a numeric vector containing Gamma distributed deviates. |
shape, rate
|
shape and rate parameters. Must be positive. |
Let be a random variable following a Gamma distribution with parameters = shape
and = rate with density
where , , and consider the random variable
. Thus, the Wilson-Hilferty transformation
is approximately distributed as a standard normal distribution. This is useful, for instance, in the construction of QQ-plots.
Terrell, G.R. (2003). The Wilson-Hilferty transformation is locally saddlepoint. Biometrika 90, 445-453.
Wilson, E.B., and Hilferty, M.M. (1931). The distribution of chi-square. Proceedings of the National Academy of Sciences of the United States of America 17, 684-688.
x <- rgamma(n = 300, shape = 2, rate = 1) z <- wilson.hilferty(x, shape = 2, rate = 1) par(pty = "s") qqnorm(z, main = "Transformed Gamma QQ-plot") abline(c(0,1), col = "red", lwd = 2, lty = 2)x <- rgamma(n = 300, shape = 2, rate = 1) z <- wilson.hilferty(x, shape = 2, rate = 1) par(pty = "s") qqnorm(z, main = "Transformed Gamma QQ-plot") abline(c(0,1), col = "red", lwd = 2, lty = 2)