Package 'lori' reference manual

Title:	Imputation of High-Dimensional Count Data using Side Information
Description:	Analysis, imputation, and multiple imputation of count data using covariates. LORI uses a log-linear Poisson model where main row and column effects, as well as effects of known covariates and interaction terms can be fitted. The estimation procedure is based on the convex optimization of the Poisson loss penalized by a Lasso type penalty and a nuclear norm. LORI returns estimates of main effects, covariate effects and interactions, as well as an imputed count table. The package also contains a multiple imputation procedure. The methods are described in Robin, Josse, Moulines and Sardy (2019) <arXiv:1703.02296v4>.
Authors:	Genevieve Robin [aut, cre]
Maintainer:	Genevieve Robin <[email protected]>
License:	GPL-3
Version:	2.2.2
Built:	2024-11-06 06:40:33 UTC
Source:	CRAN

Alpine plant communities in Aravo, France: Abundance data and covariates

Description

Originally published in Choler, P. 2005. Consistent shifts in Alpine plant traits along a mesotopographical gradient. Arctic, Antarctic, and Alpine Research 37: 444–453.

Usage

data(aravo)

Format

A list with 4 elements:

spe: abundance table of 82 species in 75 environments
env: a matrix of 6 covariates for the 75 environments
traits: a matrix of 8 covariates for the 82 species
spe.names: a vector of 82 species names

Details

Analysed in Dray, S., Choler, P., Dolédec, S., Peres-Neto, P.R., Thuiller, W., Pavoine, S. & ter Braak, C.J.F. 2014. Combining the fourth-corner and the RLQ methods for assessing trait responses to environmental variation. Ecology 95: 14–21.

Description from Dray et al. (2014): Community composition of vascular plants was determined in 75 5 × 5 m plots. Each site was described by six environmental variables: mean snowmelt date over the period 1997–1999, slope inclination, aspect, index of microscale landform, index of physical disturbance due to cryoturbation and solifluction, and an index of zoogenic disturbance due to trampling and burrowing activities of the Alpine marmot. All variables are quantitative except the landform and zoogenic disturbance indices that are categorical variables with five and three categories, respectively. Eight quantitative functional traits (i.e., vegetative height, lateral spread, leaf elevation angle, leaf area, leaf thickness, specific leaf area, mass-based leaf nitrogen content, and seed mass) were measured on the 82 most abundant plant species (out of a total of 132 recorded species).

Source

http://pbil.univ-lyon1.fr/ade4/ade4-html/aravo.html

covmat

Description

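Builds the joint covariate (design) matrix from row covariates, column covariates and row-column covariates.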

Usage

covmat(n, p, R = NULL, C = NULL, E = NULL, center = F)

Arguments

`n`	number of rows
`p`	number of columns
`R`	nxK1 matrix of row covariates
`C`	pxK2 matrix of column covariates
`E`	(np)xK3 matrix of row-column covariates
`center`	boolean indicating whether the returned covariate matrix should be centered (for identifiability)

Value

the joint product of R and C column-bound with E: a (np)x(K1+K2+K3) matrix whose rows are ordered row1col1, row2col1, ..., rowncol1, row1col2, row2col2, ..., rowncolp

Examples

R <- matrix(rnorm(10), 5)
C <- matrix(rnorm(9), 3)
covs <- covmat(5,3,R,C)
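
The row ordering described under Value can be sketched in base R. This is only an illustration, not the package code: it assumes that the "joint product" gives each row-column pair the covariates of its row followed by the covariates of its column, and the names row_id, col_id and covs_sketch are hypothetical.

```r
# Base-R sketch of the row ordering described for covmat(); not the package code.
n <- 5; p <- 3
R <- matrix(rnorm(10), n)   # n x K1 row covariates (K1 = 2)
C <- matrix(rnorm(9), p)    # p x K2 column covariates (K2 = 3)

# The row index varies fastest: row1col1, row2col1, ..., rowncol1, row1col2, ...
row_id <- rep(seq_len(n), times = p)
col_id <- rep(seq_len(p), each = n)

# Assumption: each (row, column) pair carries its row's covariates,
# then its column's covariates.
covs_sketch <- cbind(R[row_id, , drop = FALSE], C[col_id, , drop = FALSE])

dim(covs_sketch)  # (np) x (K1 + K2)
```

With this ordering, line (j - 1) * n + i of the result corresponds to cell (i, j) of the count table, which is the same order R uses when it fills an n x p matrix column by column.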

cv.lori

Description

The cv.lori method performs automatic selection of the regularization parameters (lambda1 and lambda2) used in the lori function. These parameters are selected by cross-validation. The classical procedure is to apply cv.lori to the data to select the regularization parameters, and to then impute and analyze the data using the lori function (or mi.lori for multiple imputation).

Usage

cv.lori(
  Y,
  cov = NULL,
  intercept = T,
  reff = T,
  ceff = T,
  rank.max = 5,
  N = 5,
  len = 20,
  prob = 0.2,
  algo = c("alt", "mcgd"),
  thresh = 1e-05,
  maxit = 10,
  trace.it = F,
  parallel = F
)

Arguments

`Y`	[matrix, data.frame] abundance table (nxp)
`cov`	[matrix, data.frame] design matrix (npxq)
`intercept`	[boolean] whether an intercept should be fitted, default value is TRUE
`reff`	[boolean] whether row effects should be fitted, default value is TRUE
`ceff`	[boolean] whether column effects should be fitted, default value is TRUE
`rank.max`	[integer] maximum rank of interaction matrix, default is 5
`N`	[integer] number of cross-validation folds
`len`	[integer] the size of the grid
`prob`	[numeric in (0,1)] the proportion of entries to remove for cross-validation
`algo`	type of algorithm to use, either one of "mcgd" (mixed coordinate gradient descent, adapted to large dimensions) or "alt" (alternating minimization, adapted to small dimensions)
`thresh`	[positive number] convergence threshold, default is 1e-5
`maxit`	[integer] maximum number of iterations, default is 10
`trace.it`	[boolean] whether information about convergence should be printed
`parallel`	[boolean] whether computations should be performed in parallel on multiple cores

Value

A list with the following elements

`lambda1`	regularization parameter estimated by cross-validation for nuclear norm penalty (interaction matrix)
`lambda2`	regularization parameter estimated by cross-validation for l1 norm penalty (main effects)
`errors`	a table containing the prediction errors for all pairs of parameters

Examples

X <- matrix(rnorm(20), 10)
Y <- matrix(rpois(10, 1:10), 5)
res <- cv.lori(Y, X, N=2, len=2)

lori

Description

The lori function analyzes and imputes incomplete count tables. It can account for main effects of rows and columns, effects of continuous or categorical covariates, and interactions. Estimation minimizes a Poisson loss penalized by a Lasso-type penalty (yielding a sparse vector of covariate effects) and a nuclear norm penalty that induces a low-rank interaction matrix (a few latent factors summarize the interactions).
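
The model and penalized objective described above can be written out as follows. This is a sketch in notation matching the returned elements (X, alpha, beta, epsilon, theta); the intercept symbol \mu and the covariate notation c_{(i,j),k} are assumptions introduced here, and the exact scaling of the loss is not specified in this manual. The Lasso term is shown on the covariate effects only, as in the description above; the cv.lori page describes the same penalty as acting on "main effects", so whether alpha and beta are also penalized is not settled by this text.

```latex
\begin{aligned}
Y_{ij} &\sim \mathrm{Poisson}\big(\exp(X_{ij})\big), \qquad 1 \le i \le n,\ 1 \le j \le p, \\
X_{ij} &= \mu + \alpha_i + \beta_j + \sum_{k=1}^{q} \epsilon_k \, c_{(i,j),k} + \theta_{ij}, \\
(\hat\mu, \hat\alpha, \hat\beta, \hat\epsilon, \hat\theta)
  &\in \arg\min \sum_{(i,j)\ \text{observed}} \big( \exp(X_{ij}) - Y_{ij} X_{ij} \big)
  + \lambda_1 \lVert \theta \rVert_{*} + \lambda_2 \lVert \epsilon \rVert_{1}.
\end{aligned}
```

The sum is the Poisson negative log-likelihood up to a term that does not depend on the parameters, \lVert \theta \rVert_{*} is the nuclear norm of the interaction matrix, and \lambda_1, \lambda_2 correspond to the arguments lambda1 and lambda2.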

Usage

lori(
  Y,
  cov = NULL,
  lambda1 = NULL,
  lambda2 = NULL,
  intercept = T,
  reff = T,
  ceff = T,
  rank.max = 2,
  algo = c("alt", "mcgd"),
  thresh = 1e-05,
  maxit = 100,
  trace.it = F,
  parallel = F
)

Arguments

`Y`	[matrix, data.frame] count table (nxp).
`cov`	[matrix, data.frame] design matrix (npxq) in order row1xcol1,row2xcol1,...,rownxcol1,row1xcol2,row2xcol2,...,rownxcolp
`lambda1`	[positive number] the regularization parameter for the interaction matrix.
`lambda2`	[positive number] the regularization parameter for the covariate effects.
`intercept`	[boolean] whether an intercept should be fitted, default value is TRUE
`reff`	[boolean] whether row effects should be fitted, default value is TRUE
`ceff`	[boolean] whether column effects should be fitted, default value is TRUE
`rank.max`	[integer] maximum rank of interaction matrix (smaller than min(n-1,p-1))
`algo`	type of algorithm to use, either one of "mcgd" (mixed coordinate gradient descent, adapted to large dimensions) or "alt" (alternating minimization, adapted to small dimensions)
`thresh`	[positive number] convergence tolerance of algorithm, by default `1e-5`.
`maxit`	[integer] maximum allowed number of iterations, by default 100.
`trace.it`	[boolean] whether convergence information should be printed
`parallel`	[boolean] whether computations should be performed in parallel on multiple cores

Value

A list with the following elements

`X`	nxp matrix of log of expected counts
`alpha`	row effects
`beta`	column effects
`epsilon`	covariate effects
`theta`	nxp matrix of row-column interactions
`imputed`	nxp matrix of imputed counts
`means`	nxp matrix of expected counts (exp(X))
`cov`	npxK matrix of covariates
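
How these elements relate to one another can be sketched in base R. The snippet below does not call lori: it uses made-up stand-ins for the returned elements and assumes an additive log-linear predictor with a hypothetical intercept mu, so it illustrates the structure rather than the package's internals.

```r
# Toy stand-ins for the elements of a lori() result; all values are made up.
set.seed(1)
n <- 4; p <- 3; K <- 2
alpha   <- rnorm(n)                               # row effects
beta    <- rnorm(p)                               # column effects
epsilon <- rnorm(K)                               # covariate effects
theta   <- matrix(rnorm(n * p, sd = 0.1), n, p)   # row-column interactions
cov     <- matrix(rnorm(n * p * K), n * p, K)     # (np) x K covariates
mu      <- 0                                      # hypothetical intercept term

# cov rows are ordered row1col1, row2col1, ..., so filling an n x p matrix
# column by column puts each covariate contribution in the right cell.
cov_part <- matrix(cov %*% epsilon, n, p)

X     <- mu + outer(alpha, beta, "+") + cov_part + theta  # log expected counts
means <- exp(X)                                           # expected counts
```

Under these assumptions, means plays the role of the `means` element and X the role of the `X` element.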

Examples
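
No example is given for lori on this page. The following is a minimal sketch modeled on the cv.lori and mi.lori examples, showing the usual two-step procedure (select the regularization parameters with cv.lori, then fit with lori). The toy data, the missing entries and the choice rank.max = 2 are assumptions made for illustration; the calls are guarded so the snippet does nothing if the package is not installed.

```r
set.seed(123)
X <- matrix(rnorm(50), 25)        # (np) x q design matrix, with n = p = 5 and q = 2
Y <- matrix(rpois(25, 1:25), 5)   # n x p count table
Y[sample(length(Y), 5)] <- NA     # hide a few entries so there is something to impute

if (requireNamespace("lori", quietly = TRUE)) {
  # Step 1: choose lambda1 and lambda2 by cross-validation (small grid for speed).
  cv <- lori::cv.lori(Y, X, rank.max = 2, N = 2, len = 2)
  # Step 2: fit the model and impute with the selected parameters.
  fit <- lori::lori(Y, X, lambda1 = cv$lambda1, lambda2 = cv$lambda2, rank.max = 2)
  fit$imputed
}
```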

mi.lori

Description

The mi.lori function performs M multiple imputations using the lori method. Multiple imputation produces estimates of the missing values together with intervals of variability. The usual procedure is to perform M imputations with mi.lori and to aggregate them with pool.lori.

Usage

mi.lori(
  Y,
  cov = NULL,
  lambda1 = NULL,
  lambda2 = NULL,
  M = 25,
  intercept = T,
  reff = T,
  ceff = T,
  rank.max = 5,
  algo = c("alt", "mcgd"),
  thresh = 1e-05,
  maxit = 1000,
  trace.it = F
)

Arguments

`Y`	[matrix, data.frame] count table (nxp).
`cov`	[matrix, data.frame] design matrix (npxq) in order row1xcol1,row2xcol1,...,rownxcol1,row1xcol2,row2xcol2,...,rownxcolp
`lambda1`	[positive number] the regularization parameter for the interaction matrix.
`lambda2`	[positive number] the regularization parameter for the covariate effects.
`M`	[integer] the number of multiple imputations to perform
`intercept`	[boolean] whether an intercept should be fitted, default value is TRUE
`reff`	[boolean] whether row effects should be fitted, default value is TRUE
`ceff`	[boolean] whether column effects should be fitted, default value is TRUE
`rank.max`	[integer] maximum rank of interaction matrix (smaller than min(n-1,p-1))
`algo`	type of algorithm to use, either one of "mcgd" (mixed coordinate gradient descent, adapted to large dimensions) or "alt" (alternating minimization, adapted to small dimensions)
`thresh`	[positive number] convergence tolerance of algorithm, by default `1e-5`.
`maxit`	[integer] maximum allowed number of iterations, by default 1000.
`trace.it`	[boolean] whether convergence information should be printed

Value

`mi.imputed`	a list of length M containing the imputed count tables
`mi.alpha`	a (Mxn) matrix containing in rows the estimated row effects (one row corresponds to one single imputation)
`mi.beta`	a (Mxp) matrix containing in rows the estimated column effects (one row corresponds to one single imputation)
`mi.epsilon`	a (Mxq) matrix containing in rows the estimated effects of covariates (one row corresponds to one single imputation)
`mi.theta`	a list of length M containing the estimated interaction matrices
`mi.mu`	a list of length M containing the estimated Poisson means
`mi.y`	list of bootstrapped count tables used for multiple imputation
`Y`	original incomplete count table

Examples

X <- matrix(rnorm(50), 25)
Y <- matrix(rpois(25, 1:25), 5)
res <- mi.lori(Y, X, 10, 10, 2)

pool.lori

Description

The pool.lori function aggregates the results of lori multiple imputation. Multiple imputation produces estimates of the missing values together with intervals of variability. The usual procedure is to perform multiple imputation with mi.lori and to aggregate the results with pool.lori.

Usage

pool.lori(res.mi)

Arguments

`res.mi`	a multiple imputation result from the function mi.lori

Value

`pool.impute`	a list containing the pooled means (mean) and variance (var) of the imputed values
`pool.alpha`	a list containing the pooled means (mean) and variance (var) of the row effects
`pool.beta`	a list containing the pooled means (mean) and variance (var) of the column effects
`pool.epsilon`	a list containing the pooled means (mean) and variance (var) of the covariate effects
`pool.theta`	a list containing the pooled means (mean) and variance (var) of the interactions

Examples

X <- matrix(rnorm(50), 25)
Y <- matrix(rpois(25, 1:25), 5)
res <- mi.lori(Y, X, 10, 10, 2)
poolres <- pool.lori(res)
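
To illustrate what a pooled mean and variance over M imputations look like, here is a base-R sketch that summarizes a list of imputed tables cell by cell. It is only an assumption about the general idea: the manual does not state how pool.lori combines the imputations (for example, whether a within-imputation variance is added), and the list mi_imputed is a made-up stand-in for the `mi.imputed` element returned by mi.lori.

```r
# Stand-in for res$mi.imputed: M = 3 imputed 2 x 2 tables (values made up).
mi_imputed <- list(
  matrix(c(1, 2, 3, 4), 2),
  matrix(c(1, 4, 3, 6), 2),
  matrix(c(1, 3, 3, 5), 2)
)
M <- length(mi_imputed)

# Stack the tables into a 2 x 2 x M array, then summarize each cell
# across the M imputations.
arr <- simplify2array(mi_imputed)
pooled_mean <- apply(arr, c(1, 2), mean)  # cell-wise mean over imputations
pooled_var  <- apply(arr, c(1, 2), var)   # cell-wise between-imputation variance
```

Cells that agree across all imputations (typically the observed ones) get zero between-imputation variance, while cells that differ between imputations get a positive one.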

automatic selection of nuclear norm regularization parameter

Description

Automatic selection of the nuclear norm regularization parameter (the parameter called lambda1 in lori).

Usage

qut(Y, cov, lambda2 = 0, q = 0.95, N = 100, reff = T, ceff = T)

Arguments

`Y`	A matrix of counts (contingency table).
`cov`	A (np)xK matrix of K covariates about rows and columns
`lambda2`	A positive number, the regularization parameter for covariates main effects
`q`	A number between `0` and `1`. The quantile of the distribution of $\lambda_{QUT}$ to take.
`N`	An integer. The number of parametric bootstrap samples to draw.
`reff`	[boolean] whether row effects should be fitted, default value is TRUE
`ceff`	[boolean] whether column effects should be fitted, default value is TRUE

Value

the value of $\lambda_{QUT}$ to use in LoRI.

Examples

X = matrix(rnorm(30), 15)
Y = matrix(rpois(15, 1:15), 5)
lambda = qut(Y,X, 10, N=10)
