Title: | L1-Norm PCA Methods |
---|---|
Description: | Implementations of several methods for principal component analysis using the L1 norm. The package depends on COIN-OR Clp version >= 1.17.4. The methods implemented are PCA-L1 (Kwak 2008) <DOI:10.1109/TPAMI.2008.114>, L1-PCA (Ke and Kanade 2003, 2005) <DOI:10.1109/CVPR.2005.309>, L1-PCA* (Brooks, Dula, and Boone 2013) <DOI:10.1016/j.csda.2012.11.007>, L1-PCAhp (Visentin, Prestwich and Armagan 2016) <DOI:10.1007/978-3-319-46227-1_37>, wPCA (Park and Klabjan 2016) <DOI: 10.1109/ICDM.2016.0054>, awPCA (Park and Klabjan 2016) <DOI: 10.1109/ICDM.2016.0054>, PCA-Lp (Kwak 2014) <DOI:10.1109/TCYB.2013.2262936>, and SharpEl1-PCA (Brooks and Dula, submitted). |
Authors: | Sapan Jot <[email protected]>, Paul Brooks <[email protected]>, Andrea Visentin <[email protected]>, Young Woong Park <[email protected]>, and Yi-Hui Zhou <[email protected]> |
Maintainer: | Paul Brooks <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.5.7 |
Built: | 2024-12-30 08:59:36 UTC |
Source: | CRAN |
This package contains implementations of several principal component analysis methods using the L1 norm. The package depends on COIN-OR Clp version >= 1.17.4. The methods implemented are PCA-L1 (Kwak 2008), L1-PCA (Ke and Kanade 2003, 2005), L1-PCA* (Brooks, Dula, and Boone 2013), L1-PCAhp (Visentin, Prestwich and Armagan 2016), wPCA (Park and Klabjan 2016), awPCA (Park and Klabjan 2016), PCA-Lp (Kwak 2014), and SharpEl1-PCA (Brooks and Dula, submitted).
Package: | pcaL1 |
Version: | 1.5.7 |
Date: | 2023-01-16 |
License: | GPL (>=3) |
URL: | http://www.optimization-online.org/DB_HTML/2012/04/3436.html, http://www.coin-or.org |
SystemRequirements: | COIN-OR Clp (>= 1.17.4) |
Index:
awl1pca | awPCA |
l1pca | L1-PCA |
l1pcahp | L1-PCAhp |
l1pcastar | L1-PCA* |
l1projection | L1-Norm Projection on a Subspace |
L2PCA_approx | Subroutine for awl1pca |
l2projection | L2-Norm Projection on a Subspace |
pcal1 | PCA-L1 |
pcalp | PCA-Lp |
pcaL1-package | pcaL1: L1-Norm PCA Methods |
plot.awl1pca | Plot an awl1pca Object |
plot.l1pca | Plot an l1pca Object |
plot.l1pcahp | Plot an l1pcahp Object |
plot.l1pcastar | Plot an l1pcastar Object |
plot.pcal1 | Plot a pcal1 Object |
plot.pcalp | Plot a pcalp Object |
plot.wl1pca | Plot a wl1pca Object |
plot.sharpel1pca | Plot a sharpel1pca Object |
sharpel1pca | SharpEl1-PCA |
sharpel1rs | SharpEl1-RS |
sparsel1pca | SparseEl1-PCA |
wl1pca | wPCA |
Sapan Jot <[email protected]>, Paul Brooks <[email protected]>, Andrea Visentin <[email protected]>,Young Woong Park <[email protected]>, and Yi-Hui Zhou <[email protected]>
Maintainer: Paul Brooks <[email protected]>
Brooks and Dula (2017) Estimating L1-Norm Best-Fit Lines, submitted
Brooks J.P., Dula J.H., and Boone E.L. (2013) A Pure L1-Norm Principal Component Analysis, Computational Statistics & Data Analysis, 61:83-98. DOI:10.1016/j.csda.2012.11.007
Ke Q. and Kanade T. (2005) Robust L1 Norm Factorization in the Presence of Outliers and Missing Data by Alternative Convex Programming, IEEE Conference on Computer Vision and Pattern Recognition. DOI:10.1109/CVPR.2005.309
Kwak N. (2008) Principal Component Analysis Based on L1-Norm Maximization, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30: 1672-1680. DOI:10.1109/TPAMI.2008.114
Kwak N. (2014) Principal Component Analysis by Lp-Norm Maximization, IEEE Transactions on Cybernetics, 44:594-609. DOI:10.1109/TCYB.2013.2262936
Park, Y.W. and Klabjan, D. (2016) Iteratively Reweighted Least Squares Algorithms for L1-Norm Principal Component Analysis, IEEE International Conference on Data Mining (ICDM). DOI: 10.1109/ICDM.2016.0054
Visentin A., Prestwich S., and Armagan S. T. (2016) Robust Principal Component Analysis by Reverse Iterative Linear Programming, Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 593-605. DOI:10.1007/978-3-319-46227-1_37
Zhou, Y.-H. and Marron, J.S. (2016) Visualization of Robust L1PCA, Stat, 5:173-184. DOI:10.1002/sta4.113
Performs a principal component analysis using the algorithm awPCA described by Park and Klabjan (2016).
awl1pca(X, projDim=1, center=TRUE, projections="l2", tolerance=0.001, iterations=200, beta=0.99, gamma=0.1)
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the mean, default is TRUE. |
projections |
whether to calculate projections (reconstructions and scores) using the L2 norm ("l2", default) or the L1 norm ("l1"). |
tolerance |
convergence tolerance; the algorithm terminates when the sum of absolute values of the loadings vectors is smaller than this value, default is 0.001. |
iterations |
maximum number of iterations in optimization routine. |
beta |
algorithm parameter to set up bound for weights. |
gamma |
algorithm parameter to determine whether to use approximation formula or prcomp function. |
The calculation is performed according to the algorithm described by Park and Klabjan (2016). The method is an iteratively reweighted least squares algorithm for L1-norm principal component analysis.
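To give a feel for what "iteratively reweighted least squares" means in this setting, the base-R sketch below alternates between a weighted L2 PCA and a weight update. It is an illustration under stated assumptions only: the function name `irls_l1pca_sketch`, the weight formula, and the stopping test are hypothetical choices made for this example, and the `beta` and `gamma` mechanisms of awPCA are left out, so it should not be read as the algorithm of Park and Klabjan (2016) or as the package's implementation.

```r
# Hypothetical sketch of a row-reweighted least squares loop for L1-norm PCA.
# Not the pcaL1 implementation: the weight rule and stopping test are
# assumptions, and the data are not centered here.
irls_l1pca_sketch <- function(X, projDim = 1, iterations = 200,
                              tolerance = 0.001, eps = 1e-8) {
  X <- as.matrix(X)
  w <- rep(1, nrow(X))                  # equal weights: first pass is plain PCA
  P_old <- matrix(0, ncol(X), ncol(X))  # previous projection matrix
  for (it in seq_len(iterations)) {
    # Weighted L2 PCA: right singular vectors of the row-scaled data
    V <- svd(sqrt(w) * X, nu = 0, nv = projDim)$v
    P <- V %*% t(V)
    R <- X - X %*% P                    # reconstruction errors, one row per point
    # Pick w_i so that w_i * ||r_i||_2^2 matches ||r_i||_1 at this iterate;
    # floor the weights so that no point is dropped entirely
    w <- pmax(rowSums(abs(R)) / pmax(rowSums(R^2), eps), eps)
    if (sum(abs(P - P_old)) < tolerance) break
    P_old <- P
  }
  list(loadings = V, scores = X %*% V, L1error = sum(abs(R)), nIter = it)
}

set.seed(1)
X <- matrix(c(runif(100*2, -10, 10), rep(0, 100*8)), nrow = 100) +
  matrix(c(rep(0, 100*2), rnorm(100*8, 0, 0.1)), ncol = 10)
X[1, ] <- 50                            # one gross outlier
fit <- irls_l1pca_sketch(X, projDim = 2)
fit$L1error
```

Comparing projection matrices rather than loadings avoids false non-convergence caused by sign flips of the singular vectors.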
'awl1pca' returns a list with class "awl1pca" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
projPoints |
the matrix of L2-norm projections of points on the fitted subspace in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
L1error |
sum of the L1 norm of reconstruction errors. |
nIter |
number of iterations. |
ElapsedTime |
elapsed time. |
Park, Y.W. and Klabjan, D. (2016) Iteratively Reweighted Least Squares Algorithms for L1-Norm Principal Component Analysis, IEEE International Conference on Data Mining (ICDM), 2016. DOI: 10.1109/ICDM.2016.0054
## for 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myawl1pca <- awl1pca(X)
## projects data into 2 dimensions.
myawl1pca <- awl1pca(X, projDim=2, center=FALSE)
## plot first two scores
plot(myawl1pca$scores)
Performs a principal component analysis using the algorithm L1-PCA given by Ke and Kanade (2005).
l1pca(X, projDim=1, center=TRUE, projections="l1", initialize="l2pca", tolerance=0.0001, iterations=10)
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the median, default is TRUE. |
projections |
Whether to calculate reconstructions and scores using the L1 ("l1", default) or L2 ("l2") norm. |
initialize |
initial guess for loadings matrix. Options are: "l2pca" - use traditional PCA/SVD, "random" - use a randomly-generated matrix. The user can also provide a matrix as an initial guess. |
tolerance |
sets the convergence tolerance for the algorithm, default is 0.0001. |
iterations |
sets the number of iterations to run before returning the result, default is 10. |
The calculation is performed according to the linear programming-based algorithm described by Ke and Kanade (2005). The method is a locally-convergent algorithm for finding the L1-norm best-fit subspace by alternately optimizing the scores and the loadings matrix at each iteration. Linear programming instances are solved using Clp (http://www.coin-or.org).
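The alternating structure can be illustrated without a linear programming solver in the special case projDim = 1, where each subproblem separates into one-dimensional L1 problems whose minimizers are weighted medians. The sketch below is a minimal base-R illustration of that idea; the names `weighted_median` and `l1_rank1_sketch` are hypothetical, the data are not centered, and the code is not the package's Clp-based implementation.

```r
# Weighted median: a minimizer of sum(w * abs(a - t)) over t.
weighted_median <- function(a, w) {
  keep <- w > 0
  if (!any(keep)) return(0)
  a <- a[keep]; w <- w[keep]
  ord <- order(a)
  a[ord][which(cumsum(w[ord]) >= sum(w) / 2)[1]]
}

# Hypothetical rank-1 sketch of alternating L1 minimization (projDim = 1 only).
l1_rank1_sketch <- function(X, iterations = 10) {
  X <- as.matrix(X)
  v <- svd(X, nu = 0, nv = 1)$v[, 1]    # start from the first L2 PCA direction
  u <- numeric(nrow(X))
  objective <- numeric(iterations)
  for (it in seq_len(iterations)) {
    # Loadings fixed: score u_i minimizes sum_j |x_ij - u_i * v_j|
    for (i in seq_len(nrow(X))) u[i] <- weighted_median(X[i, ] / v, abs(v))
    # Scores fixed: loading v_j minimizes sum_i |x_ij - u_i * v_j|
    for (j in seq_len(ncol(X))) v[j] <- weighted_median(X[, j] / u, abs(u))
    objective[it] <- sum(abs(X - outer(u, v)))
  }
  # The scale is shared between u and v; rescale if unit-norm loadings are needed
  list(loadings = v, scores = u, objective = objective)
}

set.seed(1)
X <- matrix(c(runif(100*2, -10, 10), rep(0, 100*8)), nrow = 100) +
  matrix(c(rep(0, 100*2), rnorm(100*8, 0, 0.1)), ncol = 10)
fit1 <- l1_rank1_sketch(X)
fit1$objective
```

Because each half-step minimizes the L1 error exactly over its block of variables, the recorded objective cannot increase from one iteration to the next, which is the sense in which such a scheme is locally convergent.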
'l1pca' returns a list with class "l1pca" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates (reconstructions). The matrix has dimension nrow(X) x ncol(X). |
Ke Q. and Kanade T. (2005) Robust L1 norm factorization in the presence of outliers and missing data by alternative convex programming, IEEE Conference on Computer Vision and Pattern Recognition. DOI:10.1109/CVPR.2005.309
## for 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myl1pca <- l1pca(X)
## projects data into 2 dimensions.
myl1pca <- l1pca(X, projDim=2, center=FALSE, tolerance=0.00001, iterations=20)
## plot first two scores
plot(myl1pca$scores)
Performs a principal component analysis using the algorithm L1-PCAhp described by Visentin, Prestwich and Armagan (2016).
l1pcahp(X, projDim=1, center=TRUE, projections="none", initialize="l2pca", threshold=0.0001)
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the median, default is TRUE. |
projections |
whether to calculate reconstructions and scores using the L1 norm ("l1"), the L2 norm ("l2"), or not at all ("none", default). |
initialize |
method for initial guess for loadings matrix. Options are: "l2pca" - use traditional PCA/SVD, "random" - use a randomly-generated matrix. |
threshold |
sets the convergence threshold for the algorithm, default is 0.0001. |
The calculation is performed according to the algorithm described by Visentin, Prestwich and Armagan (2016). The algorithm computes components iteratively in reverse, using a new heuristic based on Linear Programming. Linear programming instances are solved using Clp (http://www.coin-or.org).
'l1pcahp' returns a list with class "l1pcahp" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x ncol(X). The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
Visentin A., Prestwich S., and Armagan S. T. (2016) Robust Principal Component Analysis by Reverse Iterative Linear Programming, Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 593-605. DOI:10.1007/978-3-319-46227-1_37
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myl1pcahp <- l1pcahp(X)
## projects data into 2 dimensions.
myl1pcahp <- l1pcahp(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(myl1pcahp$scores)
Performs a principal component analysis using the algorithm L1-PCA* described by Brooks, Dula, and Boone (2013).
l1pcastar(X, projDim=1, center=TRUE, projections="none")
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the median, default is TRUE. |
projections |
whether to calculate reconstructions and scores using the L1 norm ("l1"), the L2 norm ("l2"), or not at all ("none", default). |
The calculation is performed according to the algorithm described by Brooks, Dula, and Boone (2013). The algorithm finds successive directions of minimum dispersion in the data by finding the L1-norm best-fit hyperplane at each iteration. Linear programming instances are solved using Clp (http://www.coin-or.org).
'l1pcastar' returns a list with class "l1pcastar" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x ncol(X). The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
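The dispExp definition above can be made concrete with a short base-R sketch. It assumes that "L1 dispersion" means the sum of absolute values, which is a reading of the description rather than something confirmed against the package source; `l1_disp_explained` is a hypothetical helper name.

```r
# Assumption: "L1 dispersion" is taken to be the sum of absolute values.
l1_disp_explained <- function(X, scores) {
  colSums(abs(scores)) / sum(abs(X))
}

set.seed(1)
X <- matrix(c(runif(100*2, -10, 10), rep(0, 100*8)), nrow = 100) +
  matrix(c(rep(0, 100*2), rnorm(100*8, 0, 0.1)), ncol = 10)
loadings <- diag(10)[, 1:2]      # first two unit vectors as loadings
scores <- X %*% loadings         # scores on those two directions
l1_disp_explained(X, scores)     # one proportion per component
```

With unit-vector loadings each proportion is simply the share of the total absolute value carried by that column of X.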
Brooks J.P., Dula J.H., and Boone E.L. (2013) A Pure L1-Norm Principal Component Analysis, Computational Statistics & Data Analysis, 61:83-98. DOI:10.1016/j.csda.2012.11.007
Zhou, Y.-H. and Marron, J.S. (2016) Visualization of Robust L1PCA, Stat, 5:173-184. DOI:10.1002/sta4.113
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myl1pcastar <- l1pcastar(X)
## projects data into 2 dimensions.
myl1pcastar <- l1pcastar(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(myl1pcastar$scores)
Provides the L1-norm projection of points on a subspace, including both scores and reconstructions.
l1projection(X, loadings)
X |
data, in |
loadings |
an orthonormal matrix of loadings vectors |
The scores and reconstructions are calculated by solving a linear program.
'l1projection' returns a list containing the following components:
scores |
the matrix of projected points |
projPoints |
the matrix of projected points in terms of the original coordinates (reconstructions) |
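When the subspace is a single line, the L1-norm projection does not need a linear programming solver: the best score for each point is a weighted median. The base-R sketch below covers only that one-vector case, as an illustration of what the projection computes. `l1_project_on_line` is a hypothetical helper, not a pcaL1 function, and the general multi-column case still requires a linear program.

```r
# Hypothetical helper: L1-norm projection of the rows of X onto the line
# spanned by a single loadings vector v (the projDim = 1 case only).
l1_project_on_line <- function(X, v) {
  X <- as.matrix(X)
  wmed <- function(a, w) {            # minimizer of sum(w * abs(a - t)) over t
    keep <- w > 0
    a <- a[keep]; w <- w[keep]
    ord <- order(a)
    a[ord][which(cumsum(w[ord]) >= sum(w) / 2)[1]]
  }
  # Score of point x minimizes sum_j |x_j - t * v_j| = sum_j |v_j| * |x_j / v_j - t|
  scores <- apply(X, 1, function(x) wmed(x / v, abs(v)))
  list(scores = matrix(scores, ncol = 1), projPoints = outer(scores, v))
}

set.seed(1)
X <- matrix(rnorm(60), ncol = 3)
v <- c(1, 2, 2) / 3                   # a unit-length loadings vector
p1 <- l1_project_on_line(X, v)
sum(abs(X - p1$projPoints))           # L1 reconstruction error
```

The L1 error of this projection is never larger than that of the usual orthogonal projection onto the same line, because the orthogonal score is one of the candidates the weighted median is chosen against.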
Provides an approximation of traditional PCA described by Park and Klabjan (2016) as a subroutine for awl1pca.
L2PCA_approx(ev.prev, pc.prev, projDim, X.diff)
ev.prev |
matrix of principal component loadings from a previous iteration of awl1pca |
pc.prev |
vector of eigenvalues from previous iteration of awl1pca |
projDim |
number of dimensions to project data into, must be an integer |
X.diff |
The difference between the current weighted matrix estimate and the estimate from the previous iteration |
The calculation is performed according to equations (11) and (12) in Park and Klabjan (2016). The method is an approximation for traditional principal component analysis.
'L2PCA_approx' returns a list containing the following components:
eigenvalues |
Estimate of eigenvalues of the covariance matrix. |
eigenvectors |
Estimate of eigenvectors of the covariance matrix. |
Park, Y.W. and Klabjan, D. (2016) Iteratively Reweighted Least Squares Algorithms for L1-Norm Principal Component Analysis, IEEE International Conference on Data Mining (ICDM). DOI: 10.1109/ICDM.2016.0054
Provides the L2-norm projection of points on a subspace, including both scores and reconstructions.
l2projection(X, loadings)
X |
data, in |
loadings |
an orthonormal matrix of loadings vectors |
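For an orthonormal loadings matrix, the L2-norm projection is the orthogonal projection onto the span of the loadings vectors: the scores are the product of X and the loadings matrix, and the reconstructions are the product of the scores and the transpose of the loadings matrix.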
'l2projection' returns a list containing the following components:
scores |
the matrix of projected points |
projPoints |
the matrix of projected points in terms of the original coordinates (reconstructions) |
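For an orthonormal loadings matrix, the L2-norm projection can also be written in closed form with two matrix products. The base-R sketch below shows that form; `l2_projection_sketch` is a hypothetical name, the sketch does no centering, and it is not claimed to be how `l2projection` is implemented internally.

```r
# Hypothetical sketch: orthogonal (L2-norm) projection onto span(loadings),
# assuming the columns of 'loadings' are orthonormal.
l2_projection_sketch <- function(X, loadings) {
  X <- as.matrix(X)
  scores <- X %*% loadings                 # coordinates within the subspace
  projPoints <- scores %*% t(loadings)     # back in the original coordinates
  list(scores = scores, projPoints = projPoints)
}

set.seed(1)
X <- matrix(rnorm(100 * 5), ncol = 5)
V <- svd(X, nu = 0, nv = 2)$v              # an orthonormal 5 x 2 loadings matrix
proj <- l2_projection_sketch(X, V)
dim(proj$scores)                           # 100 x 2
dim(proj$projPoints)                       # 100 x 5
```

The reconstruction errors `X - proj$projPoints` are orthogonal to every loadings vector, which is what distinguishes this projection from the L1-norm one.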
Performs a principal component analysis using the algorithm PCA-L1 given by Kwak (2008).
pcal1(X, projDim=1, center=TRUE, projections="none", initialize="l2pca")
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the median, default is TRUE. |
projections |
whether to calculate reconstructions and scores using the L1 norm ("l1"), the L2 norm ("l2"), or not at all ("none", default). |
initialize |
initial guess for first component. Options are: "l2pca" - use traditional PCA/SVD, "maxx" - use the point with the largest norm, "random" - use a random vector. The user can also provide a vector as the initial guess. |
The calculation is performed according to the algorithm described by Kwak (2008). The method is a locally-convergent algorithm for finding successive directions of maximum L1 dispersion.
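The idea of finding successive directions of maximum L1 dispersion can be sketched in a few lines of base R. The code below follows the general sign-flipping update associated with PCA-L1 (recompute the polarity of each point, then re-aim the direction along the signed sum) with greedy deflation between components. It is a simplified illustration: `pcal1_sketch` is a hypothetical name, points with zero projection are given polarity +1 rather than handled by perturbation, no centering is done, and it is not the package's implementation.

```r
# Hypothetical sketch of a sign-flipping iteration for maximum L1 dispersion.
pcal1_sketch <- function(X, projDim = 1, max_iter = 100) {
  X <- as.matrix(X)
  loadings <- matrix(0, ncol(X), projDim)
  for (k in seq_len(projDim)) {
    w <- svd(X, nu = 0, nv = 1)$v[, 1]          # start from the L2 PCA direction
    for (it in seq_len(max_iter)) {
      s <- sign(drop(X %*% w)); s[s == 0] <- 1  # polarity of each point
      w_new <- drop(crossprod(X, s))            # sum_i s_i * x_i
      w_new <- w_new / sqrt(sum(w_new^2))       # back to unit length
      if (max(abs(w_new - w)) < 1e-12) break    # polarities no longer change
      w <- w_new
    }
    loadings[, k] <- w
    X <- X - (X %*% w) %*% t(w)                 # deflate: remove this direction
  }
  loadings
}

set.seed(1)
X <- matrix(c(runif(100*2, -10, 10), rep(0, 100*8)), nrow = 100) +
  matrix(c(rep(0, 100*2), rnorm(100*8, 0, 0.1)), ncol = 10)
L <- pcal1_sketch(X, projDim = 2)
colSums(abs(X %*% L))                           # L1 dispersion per component
```

Each update cannot decrease the sum of absolute projections, so the iteration stops at a local rather than a global maximum, which is why the choice of starting vector matters.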
'pcal1' returns a list with class "pcal1" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates (reconstructions). The matrix has dimension nrow(X) x ncol(X). |
Kwak N. (2008) Principal component analysis based on L1-norm maximization, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30: 1672-1680. DOI:10.1109/TPAMI.2008.114
## for 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mypcal1 <- pcal1(X)
## projects data into 2 dimensions.
mypcal1 <- pcal1(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(mypcal1$scores)
Performs a principal component analysis using the greedy algorithms PCA-Lp(G) and PCA-Lp(L) given by Kwak (2014).
pcalp(X, projDim=1, p=1.0, center=TRUE, projections="none", initialize="l2pca", solution="L", epsilon=0.0000000001, lratio=0.02)
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
p |
p-norm used to measure the distance between points, default is 1.0. |
center |
whether to center the data using the median, default is TRUE. |
projections |
whether to calculate reconstructions and scores using the L1 norm ("l1"), the L2 norm ("l2"), or not at all ("none", default). |
initialize |
method for initial guess for component. Options are: "l2pca" - use traditional PCA/SVD, "maxx" - use the point with the largest norm, "random" - use a random vector. |
solution |
method for updating the projection vector. Options are: "G" - PCA-Lp(G) implementation: gradient search, "L" - PCA-Lp(L) implementation: Lagrangian (default). |
epsilon |
tolerance for checking convergence, default is 1e-10. |
lratio |
learning ratio, default is 0.02. Suggested value is 1/nrow(X). |
The calculation is performed according to the algorithm described by Kwak (2014), an extension of the PCA-L1 algorithm of Kwak (2008). The method is a greedy locally-convergent algorithm for finding successive directions of maximum Lp dispersion.
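A direction of large Lp dispersion can be searched for with a simple fixed-point iteration: replace the current unit vector by the normalized gradient of the Lp dispersion. The base-R sketch below does this for a single component and p >= 1. It is offered only as an illustration of a locally-convergent update of this kind; whether it coincides with the PCA-Lp(L) update of Kwak (2014) is an assumption that has not been checked here, and `pcalp_direction_sketch` is a hypothetical name.

```r
# Hypothetical sketch: one direction of (locally) maximal Lp dispersion.
pcalp_direction_sketch <- function(X, p = 1.5, max_iter = 200, epsilon = 1e-10) {
  stopifnot(p >= 1)                        # the monotonicity argument needs p >= 1
  X <- as.matrix(X)
  lp_disp <- function(w) sum(abs(X %*% w)^p)
  w <- svd(X, nu = 0, nv = 1)$v[, 1]       # start from the L2 PCA direction
  objective <- lp_disp(w)
  for (it in seq_len(max_iter)) {
    a <- drop(X %*% w)
    # Gradient of sum_i |w' x_i|^p with respect to w, up to the constant factor p
    g <- drop(crossprod(X, sign(a) * abs(a)^(p - 1)))
    w_new <- g / sqrt(sum(g^2))
    obj_new <- lp_disp(w_new)
    improved <- obj_new - objective[length(objective)]
    w <- w_new
    objective <- c(objective, obj_new)
    if (improved < epsilon) break          # stop once the gain is negligible
  }
  list(loading = w, objective = objective)
}

set.seed(1)
X <- matrix(c(runif(100*2, -10, 10), rep(0, 100*8)), nrow = 100) +
  matrix(c(rep(0, 100*2), rnorm(100*8, 0, 0.1)), ncol = 10)
fitp <- pcalp_direction_sketch(X, p = 1.5)
fitp$objective
```

For p >= 1 the Lp dispersion is convex in the direction vector, so moving to the normalized gradient cannot lower it; a gradient search with a fixed learning ratio, by contrast, depends on the step size, which is presumably why a suggested value is given for lratio.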
'pcalp' returns a list with class "pcalp" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
Kwak N. (2008) Principal component analysis based on L1-norm maximization, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30: 1672-1680. DOI:10.1109/TPAMI.2008.114
Kwak N. (2014) Principal Component Analysis by Lp-Norm Maximization, IEEE Transactions on Cybernetics, 44:594-609. DOI:10.1109/TCYB.2013.2262936
## for 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mypcalp <- pcalp(X, p = 1.5)
## projects data into 2 dimensions.
mypcalp <- pcalp(X, projDim=2, p = 1.5, center=FALSE, projections="l1")
## plot first two scores
plot(mypcalp$scores)
Plots the scores on the first two principal components.
## S3 method for class 'awl1pca'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class awl1pca.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myawl1pca <- awl1pca(X)
## projects data into 2 dimensions.
myawl1pca <- awl1pca(X, projDim=2, center=FALSE)
## plot first two scores
plot(myawl1pca$scores)
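How an S3 plot method of this kind is dispatched can be shown with a toy class in base R. Everything in the sketch below is hypothetical: the class name `toyscores` and its method are invented for illustration and are not part of pcaL1, and the actual plot methods in the package may draw something different from this.

```r
# Hypothetical class "toyscores"; not part of pcaL1.
plot.toyscores <- function(x, ...) {
  stopifnot(ncol(x$scores) >= 2)
  # Scatter plot of the first two score columns
  plot(x$scores[, 1:2, drop = FALSE], xlab = "Score 1", ylab = "Score 2", ...)
  invisible(x)
}

set.seed(1)
toy <- structure(list(scores = matrix(rnorm(40), ncol = 2)), class = "toyscores")
pdf(NULL)                 # null device so the example also runs headless
plot(toy)                 # the generic plot() dispatches to plot.toyscores
invisible(dev.off())
```

Calling the generic on the fitted object, rather than on its scores component, is what triggers the class-specific method.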
Plots the scores on the first two principal components.
## S3 method for class 'l1pca'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class l1pca.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myl1pca <- l1pca(X)
## projects data into 2 dimensions.
myl1pca <- l1pca(X, projDim=2, center=FALSE)
## plot first two scores
plot(myl1pca$scores)
Plots the scores on the first two principal components.
## S3 method for class 'l1pcahp'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class l1pcahp.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myl1pcahp <- l1pcahp(X)
## projects data into 2 dimensions.
myl1pcahp <- l1pcahp(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(myl1pcahp$scores)
Plots the scores on the first two principal components.
## S3 method for class 'l1pcastar'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class l1pcastar.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
myl1pcastar <- l1pcastar(X)
## projects data into 2 dimensions.
myl1pcastar <- l1pcastar(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(myl1pcastar$scores)
Plots the scores on the first two principal components.
## S3 method for class 'pcal1'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class pcal1.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mypcal1 <- pcal1(X)
## projects data into 2 dimensions.
mypcal1 <- pcal1(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(mypcal1$scores)
Plots the scores on the first two principal components.
## S3 method for class 'pcalp'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class pcalp.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mypcalp <- pcalp(X)
## projects data into 2 dimensions.
mypcalp <- pcalp(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(mypcalp$scores)
Plots the scores on the first two principal components.
## S3 method for class 'sharpel1pca'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class sharpel1pca.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mysharpel1pca <- sharpel1pca(X)
## projects data into 2 dimensions.
mysharpel1pca <- sharpel1pca(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(mysharpel1pca$scores)
Plots the scores on the first two principal components.
## S3 method for class 'sharpel1rs'
plot(x, ...)
x |
an object of class |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot
, for objects of class sharpel1rs
.
##for a 100x10 data matrix X, ## lying (mostly) in the subspace defined by the first 2 unit vectors, ## projects data into 1 dimension. X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) + matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10) mysharpel1rs <- sharpel1rs(X) ##projects data into 2 dimensions. mysharpel1rs <- sharpel1rs(X, projDim=2, center=FALSE, projections="l1") ## plot first two scores plot(mysharpel1rs$scores)
Plots the scores on the first two principal components.
## S3 method for class 'sparsel1pca'
plot(x, ...)
x |
an object of class sparsel1pca. |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class sparsel1pca.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mysparsel1pca <- sparsel1pca(X)
## projects data into 2 dimensions.
mysparsel1pca <- sparsel1pca(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(mysparsel1pca$scores)
Plots the scores on the first two principal components.
## S3 method for class 'wl1pca'
plot(x, ...)
x |
an object of class wl1pca. |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class wl1pca.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mywl1pca <- wl1pca(X)
## projects data into 2 dimensions.
mywl1pca <- wl1pca(X, projDim=2, center=FALSE)
## plot first two scores
plot(mywl1pca$scores)
Performs a principal component analysis using the SharpEl1-PCA algorithm described by Brooks and Dula (2017, submitted).
sharpel1pca(X, projDim=1, center=TRUE, projections="none")
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the median, default is TRUE. |
projections |
whether to calculate reconstructions and scores using the L1 norm ("l1"), the L2 norm ("l2"), or not at all ("none", default). |
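As an illustration of what centering with the median amounts to, here is a minimal base-R sketch. It assumes that centering means subtracting each column's median, which the documentation does not spell out; the names col_medians and X_centered are introduced here for illustration and are not part of the package.

```r
# Sketch only: column-wise median centering in base R.
# Assumption: center=TRUE subtracts the per-column median.
set.seed(1)
X <- matrix(rnorm(50 * 3), nrow = 50)
X[1, ] <- 100                      # one gross outlier row

col_medians <- apply(X, 2, median) # per-column medians
X_centered  <- sweep(X, 2, col_medians)

# After centering, every column has median zero.
print(apply(X_centered, 2, median))

# The outlier shifts the column means far more than the medians.
print(rbind(mean = colMeans(X), median = col_medians))
```

The outlier row is included only to show why a median is a natural location estimate to pair with L1-norm fitting: a single extreme row moves the column means but barely moves the medians.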
The calculation is performed according to the algorithm described by Brooks and Dula (2017, submitted). The algorithm finds successive, orthogonal fitted lines in the data.
'sharpel1pca' returns a list with class "sharpel1pca" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
minobjectives |
the L1 distance of points to their projections in the fitted subspace. |
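The relationship between these components can be sketched in base R without calling the package. The sketch below assumes that "L1 dispersion" means the sum of absolute values, uses a hypothetical loadings matrix V (the first two unit vectors), and uses an L2 projection as a stand-in for whatever projection the package applies; none of this is taken from the package source.

```r
# Sketch only: how loadings, scores, projPoints and dispExp relate.
# Assumptions: L1 dispersion = sum of absolute values; V is a
# hypothetical orthonormal loadings matrix; projection is L2.
set.seed(1)
X <- matrix(c(runif(100*2, -10, 10), rep(0, 100*8)), nrow = 100) +
  matrix(c(rep(0, 100*2), rnorm(100*8, 0, 0.1)), ncol = 10)

V <- diag(10)[, 1:2]               # ncol(X) x projDim
scores <- X %*% V                  # nrow(X) x projDim
projPoints <- scores %*% t(V)      # nrow(X) x ncol(X), original coordinates

# Share of total L1 dispersion carried by each component.
dispExp <- colSums(abs(scores)) / sum(abs(X))

# Total L1 distance of the points to their projections.
l1_distance <- sum(abs(X - projPoints))

print(dispExp)
print(l1_distance)
```

Because the data lie almost entirely in the span of the first two unit vectors, the two dispExp entries together account for most of the dispersion and l1_distance reflects only the small noise in the remaining columns.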
Brooks, J.P. and Dula, J.H. (2017) Estimating L1-Norm Best-Fit Lines, submitted.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mysharpel1pca <- sharpel1pca(X)
## projects data into 2 dimensions.
mysharpel1pca <- sharpel1pca(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(mysharpel1pca$scores)
Fits a line in the presence of missing data based on an L1-norm criterion.
sharpel1rs(X, projDim=1, center=TRUE, projections="none")
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the median, default is TRUE. |
projections |
whether to calculate reconstructions and scores using the L1 norm ("l1"), the L2 norm ("l2"), or not at all ("none", default). |
The algorithm finds successive, orthogonal fitted lines in the data.
'sharpel1rs' returns a list with class "sharpel1rs" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
minobjectives |
the L1 distance of points to their projections in the fitted subspace. |
Valizadeh Gamchi, F. and Brooks, J.P. (2023), working paper.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mysharpel1rs <- sharpel1rs(X)
## projects data into 2 dimensions.
mysharpel1rs <- sharpel1rs(X, projDim=2, center=FALSE, projections="l1")
## plot first two scores
plot(mysharpel1rs$scores)
L1-norm line fitting with L1-regularization.
sparsel1pca(X, projDim=1, center=TRUE, projections="none", lambda=0)
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the median, default is TRUE. |
projections |
whether to calculate reconstructions and scores using the L1 norm ("l1"), the L2 norm ("l2"), or not at all ("none", default). |
lambda |
the regularization parameter, default is 0. If negative and the number of rows is at most 100, calculates all possible breakpoints for the regularization parameter. Otherwise, fits a regularized line with lambda set to that value. |
The calculation is performed according to the algorithm described by Ling and Brooks (2023, working paper). The algorithm finds successive, orthogonal fitted lines in the data.
'sparsel1pca' returns a list with class "sparsel1pca" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
dispExp |
the proportion of L1 dispersion explained by the loadings vectors. Calculated as the L1 dispersion of the score on each component divided by the L1 dispersion in the original data. |
projPoints |
the matrix of projected points in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
minobjectives |
the L1 distance of points to their projections in the fitted subspace. |
Ling, X. and Brooks, J.P. (2023) L1-Norm Regularized L1-Norm Best-Fit Lines, working paper.
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mysparsel1pca <- sparsel1pca(X, lambda=0.5)
## projects data into 2 dimensions.
mysparsel1pca <- sparsel1pca(X, projDim=2, center=FALSE, projections="l1", lambda=0.5)
## plot first two scores
plot(mysparsel1pca$scores)
Provides the (weighted) L1-norm distances and total distance of points to a subspace.
weightedL1Distance(X, loadings, weights)
X |
data, in |
loadings |
an orthonormal matrix of loadings vectors |
weights |
a list of weights for loadings vectors |
The reconstructions are calculated by solving a linear program. Then the weights are applied to the distances.
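The linear program itself is not written out in this documentation. One standard way to pose the L1-norm projection of a single point onto a subspace is given below, as an assumption about the general form rather than the package's exact formulation. Here x in R^m is one row of X, V in R^{m x q} is the loadings matrix, a in R^q holds the scores, and e^+ and e^- are the positive and negative parts of the residual; all of these symbols are introduced for illustration.

```latex
\begin{aligned}
\min_{a,\; e^{+},\; e^{-}} \quad & \sum_{j=1}^{m} \left( e^{+}_{j} + e^{-}_{j} \right) \\
\text{subject to} \quad & V a + e^{+} - e^{-} = x, \\
& e^{+} \ge 0, \quad e^{-} \ge 0 .
\end{aligned}
```

At an optimal solution the objective equals the L1 distance \lVert x - V a \rVert_1 from the point to its reconstruction. How the weights argument enters the distances is not specified here, so it is left out of the formulation.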
'weightedL1Distance' returns a list containing the following components:
wDistances |
list of weighted distances |
totalDistance |
total distance |
Performs a principal component analysis using the algorithm wPCA described by Park and Klabjan (2016).
wl1pca(X, projDim=1, center=TRUE, projections="l2", tolerance=0.001, iterations=200, beta=0.99)
X |
data, must be in |
projDim |
number of dimensions to project data into, must be an integer, default is 1. |
center |
whether to center the data using the mean, default is TRUE. |
projections |
whether to calculate projections (reconstructions and scores) using the L2 norm ("l2", default) or the L1 norm ("l1"). |
tolerance |
for testing convergence; if the sum of absolute values of loadings vectors is smaller, then the algorithm terminates. |
iterations |
maximum number of iterations in optimization routine. |
beta |
algorithm parameter to set up bound for weights. |
The calculation is performed according to the algorithm described by Park and Klabjan (2016). The method is an iteratively reweighted least squares algorithm for L1-norm principal component analysis.
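To give a feel for what an iteratively reweighted least squares loop looks like, here is a base-R sketch. It is not the wPCA algorithm of Park and Klabjan (2016) and does not call the package: it reweights whole rows by the inverse Euclidean norm of their residual, which is the usual reweighting for minimizing the sum of Euclidean residual norms rather than the entrywise L1 error, and it has no counterpart to beta. The function name irls_subspace, the eps floor, and the stopping rule are assumptions made for illustration.

```r
# Sketch only: a generic row-reweighted least-squares subspace fit.
# NOT the package's wPCA; names and stopping rule are hypothetical.
irls_subspace <- function(X, q = 1, iterations = 200,
                          tolerance = 1e-3, eps = 1e-8) {
  w <- rep(1, nrow(X))             # start from ordinary (unweighted) PCA
  P_old <- NULL
  for (it in seq_len(iterations)) {
    # Weighted least-squares step: top-q right singular vectors of
    # the row-scaled data (row i is multiplied by sqrt(w[i])).
    V <- svd(sqrt(w) * X, nu = 0, nv = q)$v
    P <- V %*% t(V)                # projector onto the fitted subspace
    R <- X - X %*% P               # residuals of the L2 projection
    # Reweight: rows with large residuals get small weights.
    w <- 1 / pmax(sqrt(rowSums(R^2)), eps)
    # Stop when the projector barely changes between iterations.
    if (!is.null(P_old) && sum(abs(P - P_old)) < tolerance) break
    P_old <- P
  }
  list(loadings = V, nIter = it, L1error = sum(abs(R)))
}

set.seed(1)
X <- matrix(c(runif(100*2, -10, 10), rep(0, 100*8)), nrow = 100) +
  matrix(c(rep(0, 100*2), rnorm(100*8, 0, 0.1)), ncol = 10)
fit <- irls_subspace(X, q = 2)
print(fit$nIter)
print(fit$L1error)
```

The convergence check compares projectors instead of loadings vectors because singular vectors are only determined up to sign, so comparing V directly could report a change when the fitted subspace is the same.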
'wl1pca' returns a list with class "wl1pca" containing the following components:
loadings |
the matrix of variable loadings. The matrix has dimension ncol(X) x projDim. The columns define the projected subspace. |
scores |
the matrix of projected points. The matrix has dimension nrow(X) x projDim. |
projPoints |
the matrix of L2 projections of the points onto the fitted subspace, in terms of the original coordinates. The matrix has dimension nrow(X) x ncol(X). |
L1error |
sum of the L1 norm of reconstruction errors. |
nIter |
number of iterations. |
ElapsedTime |
elapsed time. |
Park, Y.W. and Klabjan, D. (2016) Iteratively Reweighted Least Squares Algorithms for L1-Norm Principal Component Analysis, IEEE International Conference on Data Mining (ICDM), 2016. DOI: 10.1109/ICDM.2016.0054
## for a 100x10 data matrix X,
## lying (mostly) in the subspace defined by the first 2 unit vectors,
## projects data into 1 dimension.
X <- matrix(c(runif(100*2, -10, 10), rep(0,100*8)),nrow=100) +
  matrix(c(rep(0,100*2),rnorm(100*8,0,0.1)),ncol=10)
mywl1pca <- wl1pca(X)
## projects data into 2 dimensions.
mywl1pca <- wl1pca(X, projDim=2, center=FALSE)
## plot first two scores
plot(mywl1pca$scores)