Title: | Reconstruction Set Test |
---|---|
Description: | Contains logic for sample-level variable set scoring using randomized reduced rank reconstruction error. Frost, H. Robert (2023) "Reconstruction Set Test (RESET): a computationally efficient method for single sample gene set testing based on randomized reduced rank reconstruction error" <doi:10.1101/2023.04.03.535366>. |
Authors: | H. Robert Frost [aut, cre] |
Maintainer: | H. Robert Frost <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.0 |
Built: | 2024-12-02 06:47:33 UTC |
Source: | CRAN |
Implementation of Reconstruction Set Test (RESET), a method for single sample variable set testing based on randomized reduced rank reconstruction error.
Package: | RESET |
Type: | Package |
Version: | 0.2.2 |
Date: | 2023 |
License: | GPL-2 |
This work was supported by the National Institutes of Health grants R35GM146586, R21CA253408, P20GM130454 and P30CA023108.
H. Robert Frost
Frost, H. R. (2023). Reconstruction Set Test (RESET): a computationally efficient method for single sample gene set testing based on randomized reduced rank reconstruction error. biorXiv e-prints. doi: https://doi.org/10.1101/2023.04.03.535366
Utility function that converts the scores generated by reset or resetViaPCA to or from per-variable scores.
This has the same effect as the per.var
argument to reset or resetViaPCA. Conversion to per-variable scores will divide each overall score and each sample-level score by the scaled size of the associated variable set. The variable set sizes are scaled by dividing by the mean size among all tested sets (this will result in scores not changing in magnitude for sets of mean size). Conversion from per-variable scores will instead multiply scored by the scaled variable set size.
convertToPerVarScores(reset.out, var.sets, to.per.var=TRUE)
convertToPerVarScores(reset.out, var.sets, to.per.var=TRUE)
reset.out |
The list returned from a call to reset or resetViaPCA. |
var.sets |
List of variable sets where each element in the list corresponds to a set and the list element is a vector variable names. List names are variable set names. This must the same variable set structure used to create the |
to.per.var |
If true, the scores are converted to a per-variable format by dividing each score by the size of the associated variable set size. If false, scores are converted from a per-variable format by multiplying by the associated variable set size. |
Version of the input list where scores have been appropriately converted to or from a per-variable format.
# Create a collection of 3 overlapping variable sets of different sizes var.sets = list(set1=1:10, set2=1:20, set3=1:30) # Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=1), nrow=100) # Inflate first 10 rows for first 10 variables, i.e., the first # 10 samples should have elevated scores for the first variable set X[1:10,1:10] = rpois(100, lambda=5) # Execute RESET using non-randomized basis computation reset.out = reset(X, var.sets=var.sets, k=2, random.threshold=20) # Display the overall scores reset.out$v # Convert to per-variable scores reset.out.2 = convertToPerVarScores(reset.out=reset.out, var.sets=var.sets, to.per.var=TRUE) # Display the overall scores in per-variable format reset.out.2$v # Convert from per-variable scores reset.out.3 = convertToPerVarScores(reset.out=reset.out.2, var.sets=var.sets, to.per.var=FALSE) # Display the overall scores reset.out.3$v
# Create a collection of 3 overlapping variable sets of different sizes var.sets = list(set1=1:10, set2=1:20, set3=1:30) # Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=1), nrow=100) # Inflate first 10 rows for first 10 variables, i.e., the first # 10 samples should have elevated scores for the first variable set X[1:10,1:10] = rpois(100, lambda=5) # Execute RESET using non-randomized basis computation reset.out = reset(X, var.sets=var.sets, k=2, random.threshold=20) # Display the overall scores reset.out$v # Convert to per-variable scores reset.out.2 = convertToPerVarScores(reset.out=reset.out, var.sets=var.sets, to.per.var=TRUE) # Display the overall scores in per-variable format reset.out.2$v # Convert from per-variable scores reset.out.3 = convertToPerVarScores(reset.out=reset.out.2, var.sets=var.sets, to.per.var=FALSE) # Display the overall scores reset.out.3$v
Utility function that converts the scores generated by resetForSeurat to or from per-variable scores.
This has the same effect as the per.var
argument to resetForSeurat.
Conversion to per-variable scores will divide each overall score and each sample-level score by the scaled size of the associated gene set.
The gene set sizes are scaled by dividing by the mean size among all tested sets (this will reset in scores not changing in magnitude for sets of mean size). Conversion from per-variable scores will instead multiply scored by the scaled gene set size.
convertToPerVarScoresForSeurat(seurat.data, gene.set.collection, to.per.var=TRUE)
convertToPerVarScoresForSeurat(seurat.data, gene.set.collection, to.per.var=TRUE)
seurat.data |
The Seurat object returned from a call to resetForSeurat. |
gene.set.collection |
List of m gene sets for which scores are computed. Each element in the list corresponds to a gene set and the list element is a vector of indices for the genes in the set. The index value is defined relative to the order of genes in the Seurat active assay. This needs to be the same gene set collection used in the call to resetForSeurat. |
to.per.var |
If true, the scores are converted to a per-variable format by dividing each score by the size of the associated variable set size. If false, scores are converted from a per-variable format by multiplying by the associated variable set size. |
Version of the Seurat object where the RESET scores have been appropriately converted to or from a per-variable format.
# Only run example code if Seurat package is available if (requireNamespace("Seurat", quietly=TRUE) & requireNamespace("SeuratObject", quietly=TRUE)) { # Define a collection with three overlapping gene sets of different sizes collection=list(set1=1:10, set2=1:20, set3=6:10) # Execute on the pbmc_small scRNA-seq data set included with SeuratObject # See vignettes for more detailed Seurat examples reset.out = resetForSeurat(seurat.data=SeuratObject::pbmc_small, num.pcs=2, k=2, gene.set.collection=collection) # Retrieve the overall scores reset.out@[email protected] # Convert the scores to a per-variable format reset.out.2 = convertToPerVarScoresForSeurat(seurat.data=reset.out, gene.set.collection=collection) # Retrieve the overall scores reset.out.2@[email protected] }
# Only run example code if Seurat package is available if (requireNamespace("Seurat", quietly=TRUE) & requireNamespace("SeuratObject", quietly=TRUE)) { # Define a collection with three overlapping gene sets of different sizes collection=list(set1=1:10, set2=1:20, set3=6:10) # Execute on the pbmc_small scRNA-seq data set included with SeuratObject # See vignettes for more detailed Seurat examples reset.out = resetForSeurat(seurat.data=SeuratObject::pbmc_small, num.pcs=2, k=2, gene.set.collection=collection) # Retrieve the overall scores reset.out@assays$RESET@meta.features # Convert the scores to a per-variable format reset.out.2 = convertToPerVarScoresForSeurat(seurat.data=reset.out, gene.set.collection=collection) # Retrieve the overall scores reset.out.2@assays$RESET@meta.features }
Utility function that creates a variable set list in the format required by reset (i.e., list of variable indices) given the variable names and an ordered vector of variable names.
createVarSetCollection(var.names, var.sets, min.size=1, max.size)
createVarSetCollection(var.names, var.sets, min.size=1, max.size)
var.names |
Vector of variable names. This should correspond to the order of variables as represented by the columns of the target matrix X. |
var.sets |
List of m variable sets where each element in the list corresponds to a set and the list element is a vector variable names. List names are variable set names. |
min.size |
Minimum set size after filtering out variable not in the var.names vector. Sets whose post-filtering size is below this are removed from the final collection list. Default is 1 and cannot be set to less than 1. |
max.size |
Maximum variable set size after filtering out variables not in the var.names vector. Sets whose post-filtering size is above this are removed from the final collection list. If not specified, no filtering is performed. |
Version of the input variable set collection list where variable names have been replaced by position indices, variables not present in the var.names vector have been removed and sets failing the min/max size constraints have been removed.
# Create a collection with two sets defined over 3 variables createVarSetCollection(var.names=c("A", "B", "C"), var.sets = list(set1=c("A", "B"), set2=c("B", "C")), min.size=2, max.size=3)
# Create a collection with two sets defined over 3 variables createVarSetCollection(var.names=c("A", "B", "C"), var.sets = list(set1=c("A", "B"), set2=c("B", "C")), min.size=2, max.size=3)
Computes a rank k
approximation of the column space of an n-by-p input matrix X
using a sparse randomized embedding with optional subspace power iterations.
Specifically, a p-by-k random text matrix O
is created where all elements are generated as independent N(0,1) or U(0,1) random variables except for elements designated as sparse via the specified sparsity.structure
, which are set to 0. If a sparse structure is used, the non-zero elements can alternatively be set to the constant value of 1 for a non-random embedding. The test matrix is used to create an n-by-k sketch matrix Y
as Y=XO
. If q>0
, subspace power iterations are performed on Y
via algorithm 2 in the paper by Erichson, et al. associated with the rsvd
R package (https://doi.org/10.18637/jss.v089.i11). The returned rank k
column space approximation of X
is then generated via a column-pivoted QR decomposition of Y
.
randomColumnSpace(X, k=2, q=0, sparsity.structure=NULL, test.dist="normal")
randomColumnSpace(X, k=2, q=0, sparsity.structure=NULL, test.dist="normal")
X |
An n-by-p target matrix. |
k |
Target rank. Defaults to 2. |
q |
Number of power iterations. Defaults to 0. |
sparsity.structure |
Optional sparsity structure. Should be specified as a vector whose elements are
the indices (in column-oriented format) of the non-sparse elements in the p x k random test matrix |
test.dist |
Type of random variable used to populate non-sparse elements of random test matrix |
A n-by-k estimate of the column space of X
.
# Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=2), nrow=100) # Create a random sparsity structure for 100-by-5 random test matrix; half elements will be 0 sparsity.structure = sample(1:500, 250, replace=TRUE) # Compute rank 5 estimate of column space of X using a sparse test matrix Q = randomColumnSpace(X,k=5,sparsity.structure=sparsity.structure) # Compute using a dense test matrix with U(0,1) RVs Q = randomColumnSpace(X,k=5,test.dist="uniform")
# Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=2), nrow=100) # Create a random sparsity structure for 100-by-5 random test matrix; half elements will be 0 sparsity.structure = sample(1:500, 250, replace=TRUE) # Compute rank 5 estimate of column space of X using a sparse test matrix Q = randomColumnSpace(X,k=5,sparsity.structure=sparsity.structure) # Compute using a dense test matrix with U(0,1) RVs Q = randomColumnSpace(X,k=5,test.dist="uniform")
Computes an approximate rank k
singular value decomposition (SVD) of an n-by-p input matrix X
using a sparse randomized embedding with optional subspace power iterations. The randomColumnSpace
method is used to generate an rank k
approximation of the column space of X
. This n-by-k approximation Y
is then used to create a k-by-p projection B
of X
onto this rank k
subspace via B=Y^TX
. A non-random SVD is computed for B
and this SVD solution is used to generate an approximate rank k
SVD of X
.
randomSVD(X, k=2, q=0, sparsity.structure=NULL, test.dist="normal")
randomSVD(X, k=2, q=0, sparsity.structure=NULL, test.dist="normal")
X |
An n-by-p target matrix. |
k |
Target rank. Defaults to 2. See description in |
q |
Number of power iterations. Defaults to 0. See description in |
sparsity.structure |
Optional sparsity structure. See description in |
test.dist |
Type of random variable used to populate non-sparse elements of random test matrix.
See description in |
List with the following elements:
u
a matrix whose columns are the top k approximate left singular vectors of X
.
d
a vector containing the top k approximate singular values of X
.
v
a matrix whose columns are the top k approximate right singular vectors of X
.
# Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=2), nrow=100) # Create a random sparsity structure for 100-by-5 random test matrix; half elements will be 0 sparsity.structure = sample(1:500, 250, replace=TRUE) # Compute rank 5 SVD of X using a sparse test matrix svd.out = randomSVD(X,k=5,sparsity.structure=sparsity.structure) # Compute using a dense test matrix with U(0,1) RVs svd.out = randomSVD(X,k=5,test.dist="uniform")
# Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=2), nrow=100) # Create a random sparsity structure for 100-by-5 random test matrix; half elements will be 0 sparsity.structure = sample(1:500, 250, replace=TRUE) # Compute rank 5 SVD of X using a sparse test matrix svd.out = randomSVD(X,k=5,sparsity.structure=sparsity.structure) # Compute using a dense test matrix with U(0,1) RVs svd.out = randomSVD(X,k=5,test.dist="uniform")
Implementation of the Reconstruction Set Test (RESET) method, which transforms an n-by-p input matrix X
into an n-by-m matrix of sample-level variable set scores and a length m vector of overall variable set scores. Execution of RESET involves the following sequence of steps:
If center.X=TRUE
, mean center the columns of X
. If X.test
is specified, the centering is instead
performed on just the columns of X
corresponding to each variable set.
See documentation for the X
and center.X
parameters for more details.
If scale.X=TRUE
, scale the columns of X
to have variance 1. If X.test
is specified, the scaling is instead
performed on just the columns of X
corresponding to each variable set.
See documentation for the X
and scale.X
parameters for more details.
If center.X.test=TRUE
, mean center the columns of X.test
.
See documentation for the X.test
and center.X.test
parameters for more details.
If scale.X.test=TRUE
, scale the columns of X.test
.
See documentation for the X.test
and scale.X.test
parameters for more details.
Set the reconstruction target matrix T
to X
or, if X.test
is specified, to X.test
.
Compute the norm of T
and norm of each row of T
. By default, these are the Frobenius and Euclidean norms respectively.
For each set in var.sets
, sample-level and matrix level scores are generated as follows:
Create a subset of X
called X.var.set
that only includes the columns of X
correponding to the variables
in the set.
Compute a rank k
orthonormal basis Q
for the column space of X.var.set
.
If the size of the set is less then or equal to random.threshold
, then this is computed as the top k
columns
of the Q
matrix from a column-pivoted QR decomposition of X.var.set
, otherwise, it is approximated using
a randomized algorithm implemented by randomColumnSpace
.
The reduced rank reconstruction of T
is then created as Q Q^T T
.
The original T
is subtracted from the reconstruction to represent the reconstruction error and the appropriate norm
is computed on each row and the entire error matrix.
The overall score is the log2 ratio of the norm of the original T
to the norm of the reconstruction error matrix.
The score for each sample is the log2 ratio of the norm of the corresponding row of the original T
to the norm of the same row of the reconstruction error matrix.
If per.var=TRUE
, then the overall and sample-level scores are divided by the variable set size.
reset(X, X.test, center.X=TRUE, scale.X=FALSE, center.X.test=TRUE, scale.X.test=FALSE, var.sets, k=2, random.threshold, k.buff=0, q=0, test.dist="normal", norm.type="2", per.var=FALSE)
reset(X, X.test, center.X=TRUE, scale.X=FALSE, center.X.test=TRUE, scale.X.test=FALSE, var.sets, k=2, random.threshold, k.buff=0, q=0, test.dist="normal", norm.type="2", per.var=FALSE)
X |
The n-by-p target matrix; columns represent variables and rows represent samples. |
X.test |
Matrix that will be combined with the |
center.X |
Flag which controls whether the values in |
scale.X |
Flag which controls whether the values in |
center.X.test |
Flag which controls whether the values in |
scale.X.test |
Flag which controls whether the values in |
var.sets |
List of m variable sets, each element is a vector of indices of variables in the set that correspond to columns in |
k |
Rank of reconstruction. Default to 2. Cannot be larger than the minimum variable set size. |
random.threshold |
If specified, indicates the variable set size above which a randomized reduced-rank reconstruction is used. If the variable set size is less or equal to random.threshold, then a non-random reconstruction is computed. Defaults to k and cannot be less than k. |
k.buff |
Additional dimensions used in randomized reduced-rank construction algorithm. Defaults to 0.
Values above 0 can improve the accuracy of the
randomized reconstruction at the expense of additional computational complexity. If |
q |
Number of power iterations for randomized SVD (see |
test.dist |
Distribution for non-zero elements of random test matrix used in randomized SVD algorithm. See description for |
norm.type |
The type of norm to use for computing reconstruction error. Defaults to "2" for Euclidean/Frobenius norm. Other supported option is "1" for L1 norm. |
per.var |
If true, the computed scores for each variable set are divided by the scaled variable set size to generate per-variable scores. Variable set size scaling is performed by dividing all sizes by the mean size (this will generate per-variable scores of approximately the same magnitude as the non-per-variable scores). |
A list with the following elements:
S
an n-by-m matrix of sample-level variable set scores.
v
a length m vector of overall variable set scores.
createVarSetCollection
,randomColumnSpace
# Create a collection of 5 variable sets each of size 10 var.sets = list(set1=1:10, set2=11:20, set3=21:30, set4=31:40, set5=41:50) # Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=1), nrow=100) # Inflate first 10 rows for first 10 variables, i.e., the first # 10 samples should have elevated scores for the first variable set X[1:10,1:10] = rpois(100, lambda=5) # Execute RESET using non-randomized basis computation reset(X, var.sets=var.sets, k=2, random.threshold=10) # Execute RESET with randomized basis computation # (random.threshold will default to k value which is less # than the size of all variable sets) reset(X, var.sets=var.sets, k=2, k.buff=2) # Execute RESET with non-zero k.buff reset(X, var.sets=var.sets, k=2, k.buff=2) # Execute RESET with non-zero q reset(X, var.sets=var.sets, k=2, q=1) # Execute RESET with L1 vs L2 norm reset(X, var.sets=var.sets, k=2, norm.type="1") # Project the X matrix onto the first 5 PCs and use that as X.test # Scale X before calling prcomp() so that no centering or scaling # is needed within reset() X = scale(X) X.test = prcomp(X,center=FALSE,scale=FALSE,retx=TRUE)$x[,1:5] reset(X, X.test=X.test, center.X=FALSE, scale.X=FALSE, center.X.test=FALSE, scale.X.test=FALSE, var.sets=var.sets, k=2)
# Create a collection of 5 variable sets each of size 10 var.sets = list(set1=1:10, set2=11:20, set3=21:30, set4=31:40, set5=41:50) # Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=1), nrow=100) # Inflate first 10 rows for first 10 variables, i.e., the first # 10 samples should have elevated scores for the first variable set X[1:10,1:10] = rpois(100, lambda=5) # Execute RESET using non-randomized basis computation reset(X, var.sets=var.sets, k=2, random.threshold=10) # Execute RESET with randomized basis computation # (random.threshold will default to k value which is less # than the size of all variable sets) reset(X, var.sets=var.sets, k=2, k.buff=2) # Execute RESET with non-zero k.buff reset(X, var.sets=var.sets, k=2, k.buff=2) # Execute RESET with non-zero q reset(X, var.sets=var.sets, k=2, q=1) # Execute RESET with L1 vs L2 norm reset(X, var.sets=var.sets, k=2, norm.type="1") # Project the X matrix onto the first 5 PCs and use that as X.test # Scale X before calling prcomp() so that no centering or scaling # is needed within reset() X = scale(X) X.test = prcomp(X,center=FALSE,scale=FALSE,retx=TRUE)$x[,1:5] reset(X, X.test=X.test, center.X=FALSE, scale.X=FALSE, center.X.test=FALSE, scale.X.test=FALSE, var.sets=var.sets, k=2)
Executes the Reconstruction Set Test (RESET) method (reset
) on
normalized scRNA-seq data stored in a Seurat object. Will analyze the
normalized counts in the active assay (i.e., either the log-normalized or SCTransformed counts).
resetForSeurat(seurat.data, num.pcs, gene.set.collection, k=10, random.threshold, k.buff=0, q=0, test.dist="normal", norm.type="2", per.var=FALSE)
resetForSeurat(seurat.data, num.pcs, gene.set.collection, k=10, random.threshold, k.buff=0, q=0, test.dist="normal", norm.type="2", per.var=FALSE)
seurat.data |
The Seurat object that holds the scRNA-seq data. Assumes PCA has already been performed on a centered and scaled version of the normalized counts. |
num.pcs |
Number of PCs to use in the RESET method. If not specified, will use all computed PCs.
The projection of the scRNA-seq data onto these PCs will be used as the |
gene.set.collection |
List of m gene sets for which scores are computed.
Each element in the list corresponds to a gene set and the list element is a vector
of indices for the genes in the set. The index value is defined relative to the
order of genes in the Seurat active assay. Gene set names should be specified as list names.
See |
k |
See description in |
random.threshold |
See description in |
k.buff |
See description in |
q |
See description in |
test.dist |
See description in |
norm.type |
See description in |
per.var |
See description in |
An updated Seurat object with the following modifications:
RESET sample-level gene set score matrix S stored in the "data" slot of a new "RESET" Assay object.
RESET overall gene set scores stored in the feature metadata column "RESET".
# Only run example code if Seurat package is available if (requireNamespace("Seurat", quietly=TRUE) & requireNamespace("SeuratObject", quietly=TRUE)) { # Define a collection with three gene sets collection=list(set1=1:10, set2=11:20, set3=21:30) # Execute on the pbmc_small scRNA-seq data set included with SeuratObject # See vignettes for more detailed Seurat examples reset.out = resetForSeurat(seurat.data=SeuratObject::pbmc_small, num.pcs=5, gene.set.collection=collection) # Retrieve the scores for the first 10 cells reset.out@assays$RESET[,1:10] # Retrieve the overall scores reset.out@[email protected] }
# Only run example code if Seurat package is available if (requireNamespace("Seurat", quietly=TRUE) & requireNamespace("SeuratObject", quietly=TRUE)) { # Define a collection with three gene sets collection=list(set1=1:10, set2=11:20, set3=21:30) # Execute on the pbmc_small scRNA-seq data set included with SeuratObject # See vignettes for more detailed Seurat examples reset.out = resetForSeurat(seurat.data=SeuratObject::pbmc_small, num.pcs=5, gene.set.collection=collection) # Retrieve the scores for the first 10 cells reset.out@assays$RESET[,1:10] # Retrieve the overall scores reset.out@assays$RESET@meta.features }
Wrapper around the reset
method that uses the projection of X
onto the top num.pcs
principal components as X.test
.
This PC projection is computed using a randomized reduced rank SVD as implemented by randomSVD
.
resetViaPCA(X, center=TRUE, scale=FALSE, num.pcs=2, pca.buff=2, pca.q=1, var.sets, k=2, random.threshold, k.buff=0, q=0, test.dist="normal", norm.type="2", per.var=FALSE)
resetViaPCA(X, center=TRUE, scale=FALSE, num.pcs=2, pca.buff=2, pca.q=1, var.sets, k=2, random.threshold, k.buff=0, q=0, test.dist="normal", norm.type="2", per.var=FALSE)
X |
See description in |
center |
Flag which controls whether the values in |
scale |
Flag which controls whether the values in |
num.pcs |
Number of principal components used for computing the projection of |
pca.buff |
Number of extra dimensions used when calling |
pca.q |
Number of power iterations used when calling |
var.sets |
See description in |
k |
See description in |
random.threshold |
See description in |
k.buff |
See description in |
q |
See description in |
test.dist |
See description in |
norm.type |
See description in |
per.var |
See description in |
A list with the following elements:
S
an n-by-m matrix of sample-level variable set scores.
v
a length m vector of overall variable set scores.
reset
,createVarSetCollection
,randomColumnSpace
# Create a collection of 5 variable sets each of size 10 var.sets = list(set1=1:10, set2=11:20, set3=21:30, set4=31:40, set5=41:50) # Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=1), nrow=100) # Inflate first 10 rows for first 10 variables, i.e., the first # 10 samples should have elevated scores for the first variable set X[1:10,1:10] = rpois(100, lambda=5) # Execute RESET when reconstruction measured on top 10 PCs # with mean centering performed before computing PCs resetViaPCA(X, num.pcs=10, var.sets=var.sets, k=2, random.threshold=10) # Execute RESET when reconstruction measured on top 10 # uncentered PCs with centering performed as needed inside reset() resetViaPCA(X, center=FALSE, num.pcs=10, var.sets=var.sets, k=2, random.threshold=10)
# Create a collection of 5 variable sets each of size 10 var.sets = list(set1=1:10, set2=11:20, set3=21:30, set4=31:40, set5=41:50) # Simulate a 100-by-100 matrix of random Poisson data X = matrix(rpois(10000, lambda=1), nrow=100) # Inflate first 10 rows for first 10 variables, i.e., the first # 10 samples should have elevated scores for the first variable set X[1:10,1:10] = rpois(100, lambda=5) # Execute RESET when reconstruction measured on top 10 PCs # with mean centering performed before computing PCs resetViaPCA(X, num.pcs=10, var.sets=var.sets, k=2, random.threshold=10) # Execute RESET when reconstruction measured on top 10 # uncentered PCs with centering performed as needed inside reset() resetViaPCA(X, center=FALSE, num.pcs=10, var.sets=var.sets, k=2, random.threshold=10)