The ‘RcppML’ package provides high-performance machine learning algorithms using Rcpp with a focus on matrix factorization.
Install the latest development version of RcppML from GitHub:
RcppML contains extremely fast NNLS solvers. Use the `nnls` function to solve systems of equations subject to non-negativity constraints.
The `RcppML::nnls` function solves the equation $ax = b$ for $x$, where $a$ is a symmetric positive definite matrix of dimensions $m \times m$ and $b$ is a vector of length $m$ or a matrix of dimensions $m \times n$.
# construct a system of equations
X <- matrix(rnorm(2000),100,20)
btrue <- runif(20)
y <- X %*% btrue + rnorm(100)
a <- crossprod(X)
b <- crossprod(X, y)
# solve the system of equations
x <- RcppML::nnls(a, b)
# use only coordinate descent
x <- RcppML::nnls(a, b, fast_nnls = FALSE, cd_maxit = 1000, cd_tol = 1e-8)
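To make the problem concrete, the following is a minimal base R sketch of coordinate descent for non-negative least squares. It is not RcppML's implementation: the helper `nnls_cd` and its arguments are hypothetical names introduced here for illustration, and the sketch only assumes that `a` is symmetric positive definite.

```r
# Hypothetical helper (not part of RcppML): coordinate descent for
# min 0.5 * x'ax - b'x subject to x >= 0, with `a` symmetric positive definite.
nnls_cd <- function(a, b, maxit = 1000, tol = 1e-8) {
  x <- numeric(length(b))
  grad <- -as.numeric(b)               # gradient a %*% x - b evaluated at x = 0
  for (iter in seq_len(maxit)) {
    max_change <- 0
    for (i in seq_along(x)) {
      # exact minimizer along coordinate i, clipped at zero
      xi_new <- max(0, x[i] - grad[i] / a[i, i])
      delta <- xi_new - x[i]
      if (delta != 0) {
        grad <- grad + a[, i] * delta  # keep the gradient in sync with x
        x[i] <- xi_new
        max_change <- max(max_change, abs(delta))
      }
    }
    if (max_change < tol) break        # stop once a full sweep barely moves x
  }
  x
}

# small test system built the same way as the example above
set.seed(1)
X <- matrix(rnorm(2000), 100, 20)
a <- crossprod(X)
b <- crossprod(X, X %*% runif(20) + rnorm(100))
x_cd <- nnls_cd(a, b)
grad_cd <- as.numeric(a %*% x_cd - b)  # gradient at the returned solution
```

At a solution, every entry of `x_cd` is non-negative, the gradient is approximately zero wherever `x_cd` is positive, and the gradient is non-negative wherever `x_cd` is zero.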
`RcppML::nnls` implements a new and fastest-in-class algorithm for non-negative least squares. Set `cd_maxit = 0` to use only the FAST solver.

Project dense linear factor models onto real-valued sparse matrices (or any matrix coercible to `Matrix::dgCMatrix`) using `RcppML::project`.
`RcppML::project` solves the equation $A = wh$ for $h$.
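As a rough illustration of what projecting a factor model involves, the base R sketch below finds $h$ given $A$ and $w$ by ordinary least squares on a small dense example. This is a deliberate simplification: it ignores sparsity and any non-negativity constraint, and the helper `project_ls` is a hypothetical name, not part of RcppML.

```r
# Hypothetical helper: least-squares solution of A = w %*% h for h,
# obtained from the normal equations (w'w) h = w'A.
project_ls <- function(A, w) {
  solve(crossprod(w), crossprod(w, A))
}

set.seed(1)
w <- matrix(runif(50 * 5), 50, 5)        # 50 x 5 factor model
h_true <- matrix(runif(5 * 30), 5, 30)   # 5 x 30 coefficients to recover
A_dense <- w %*% h_true                  # exact rank-5 matrix
h_est <- project_ls(A_dense, w)
max_abs_err <- max(abs(h_est - h_true))  # should be close to machine precision
```

Because `A_dense` is constructed exactly from `w` and `h_true`, the recovered `h_est` should match `h_true` up to floating-point error.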
`RcppML::nmf` finds a non-negative matrix factorization by alternating least squares (alternating projections of linear models $w$ and $h$).
There are several ways in which the NMF algorithm differs from other currently available methods:
The following example runs rank-10 NMF on a random 100 x 100 matrix that is 90% sparse:
library(Matrix)  # provides rsparsematrix()
A <- rsparsematrix(100, 100, 0.1)
model <- RcppML::nmf(A, 10, verbose = FALSE)
w <- model$w
d <- model$d
h <- model$h
model_tolerance <- tail(model$tol, 1)
Tolerance is simply a measure of the average correlation between $w_{i-1}$ and $w_i$, and between $h_{i-1}$ and $h_i$, for a given iteration $i$.
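One way to picture this stopping measure is sketched below in base R. It assumes that tolerance is one minus the Pearson correlation between consecutive iterates, averaged over the two factor matrices; both that exact formula and the helper name `iter_tol` are assumptions made for illustration, not taken from the package source.

```r
# Hypothetical helper: one minus the average correlation between the
# previous and current iterates of the two factor matrices.
iter_tol <- function(w_prev, w_new, h_prev, h_new) {
  cor_w <- cor(as.vector(w_prev), as.vector(w_new))
  cor_h <- cor(as.vector(h_prev), as.vector(h_new))
  1 - (cor_w + cor_h) / 2
}

set.seed(1)
w_prev <- matrix(runif(100), 20, 5)
h_prev <- matrix(runif(150), 5, 30)

# unchanged factors: the measure is (numerically) zero
tol_same <- iter_tol(w_prev, w_prev, h_prev, h_prev)

# perturbing w moves the measure away from zero
w_new <- w_prev + matrix(rnorm(100, sd = 0.1), 20, 5)
tol_moved <- iter_tol(w_prev, w_new, h_prev, h_prev)
```

Under this assumption, a value near zero means the factors have stopped changing between iterations, which is why the last element of `model$tol` is a natural convergence indicator.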
For symmetric factorizations (when $A = A^T$), tolerance becomes a measure of the correlation between $w$ and $h^T$, and diagonalization is automatically performed to enforce symmetry:
# crossprod(A) returns a symmetric dsCMatrix; coerce it to a general sparse matrix
A_sym <- as(crossprod(A), "generalMatrix")
model <- RcppML::nmf(A_sym, 10, verbose = FALSE)
Mean squared error of a factorization can be calculated for a given model using the `RcppML::mse` function.
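For reference, the quantity can also be written out by hand. The base R sketch below assumes the model reconstructs the input as $w \, \mathrm{diag}(d) \, h$ and that the error is averaged over every entry. The helper `mse_dense` and the toy matrices are hypothetical, and the example uses a small dense matrix rather than a sparse one.

```r
# Hypothetical helper: mean squared error of the reconstruction w diag(d) h.
mse_dense <- function(A, w, d, h) {
  A_hat <- w %*% (d * h)  # d * h scales row k of h by d[k], i.e. diag(d) %*% h
  mean((A - A_hat)^2)
}

set.seed(1)
w_toy <- matrix(runif(40), 10, 4)
d_toy <- c(4, 3, 2, 1)
h_toy <- matrix(runif(32), 4, 8)
A_exact <- w_toy %*% diag(d_toy) %*% h_toy

# an exact factorization has (numerically) zero error
mse_exact <- mse_dense(A_exact, w_toy, d_toy, h_toy)

# shifting every entry by 0.1 gives an error of 0.1^2
mse_shifted <- mse_dense(A_exact + 0.1, w_toy, d_toy, h_toy)
```

With the fitted model from the examples above, the same helper could be applied to a dense copy of the input, for instance `mse_dense(as.matrix(A_sym), model$w, model$d, model$h)`, assuming `model$d` holds the diagonal scaling.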