Package 'cml' reference manual

Title:	Conditional Manifold Learning
Description:	Finds a low-dimensional embedding of high-dimensional data, conditioning on available manifold information. The current version supports conditional MDS (based on either conditional SMACOF in Bui (2021) <arXiv:2111.13646> or closed-form solution in Bui (2022) <doi:10.1016/j.patrec.2022.11.007>) and conditional ISOMAP in Bui (2021) <arXiv:2111.13646>.
Authors:	Anh Tuan Bui [aut, cre]
Maintainer:	Anh Tuan Bui <[email protected]>
License:	GPL-2
Version:	0.2.2
Built:	2024-11-09 06:18:22 UTC
Source:	CRAN

Conditional Manifold Learning

Description

Finds a low-dimensional embedding of high-dimensional data, conditioning on available manifold information. The current version supports conditional MDS (based on either conditional SMACOF or closed-form solution) and conditional ISOMAP.

Please cite this package as follows:

Bui, A. T. (2021). Dimension Reduction with Prior Information for Knowledge Discovery. arXiv:2111.13646. https://arxiv.org/abs/2111.13646

Bui, A. T. (2022). A Closed-Form Solution for Conditional Multidimensional Scaling. Pattern Recognition Letters 164, 148-152. https://doi.org/10.1016/j.patrec.2022.11.007

Details

Brief descriptions of the main functions of the package are provided below:

condMDS(): the conditional MDS method, which uses conditional SMACOF to optimize its conditional stress objective function.

condMDSeigen(): the conditional MDS method, which uses a closed-form solution based on multiple linear regression and eigendecomposition.

condIsomap(): the conditional ISOMAP method, which is essentially conditional MDS applied to graph distances (i.e., estimated geodesic distances) computed from the given distances/dissimilarities.

Author(s)

Anh Tuan Bui

Maintainer: Anh Tuan Bui <[email protected]>

References

Bui, A. T. (2021). Dimension Reduction with Prior Information for Knowledge Discovery. arXiv:2111.13646. https://arxiv.org/abs/2111.13646

Bui, A. T. (2022). A Closed-Form Solution for Conditional Multidimensional Scaling. Pattern Recognition Letters 164, 148-152. https://doi.org/10.1016/j.patrec.2022.11.007

Examples

## Generate car-brand perception data
library(cml)
factor.weights <- c(90, 88, 83, 82, 81, 70, 68)/562
N <- 100
set.seed(1)
data <- matrix(runif(N*7), N, 7)
colnames(data) <- c('Quality', 'Safety', 'Value', 'Performance', 'Eco', 'Design', 'Tech')
rownames(data) <- paste('Brand', 1:N)
data.hat <- data + matrix(rnorm(N*7), N, 7)*data*.05
data.weighted <- t(apply(data, 1, function(x) x*factor.weights))
d <- dist(data.weighted)
d.hat <- d + rnorm(length(d))*d*.05

## The following examples use the first 4 factors as known features
# Conditional MDS based on conditional SMACOF
u.cmds = condMDS(d.hat, data.hat[,1:4], 3, init='none')
u.cmds$B # compare with diag(factor.weights[1:4])
ccor(data.hat[,5:7], u.cmds$U)$cancor # canonical correlations
vegan::procrustes(data.hat[,5:7], u.cmds$U, symmetric = TRUE)$ss # Procrustes statistic

# Conditional MDS based on the closed-form solution
u.cmds = condMDSeigen(d.hat, data.hat[,1:4], 3)
u.cmds$B # compare with diag(factor.weights[1:4])
ccor(data.hat[,5:7], u.cmds$U)$cancor # canonical correlations
vegan::procrustes(data.hat[,5:7], u.cmds$U, symmetric = TRUE)$ss # Procrustes statistic

# Conditional MDS based on conditional SMACOF,
# initialized by the closed-form solution
u.cmds = condMDS(d.hat, data.hat[,1:4], 3, init='eigen')
u.cmds$B # compare with diag(factor.weights[1:4])
ccor(data.hat[,5:7], u.cmds$U)$cancor # canonical correlations
vegan::procrustes(data.hat[,5:7], u.cmds$U, symmetric = TRUE)$ss # Procrustes statistic

# Conditional ISOMAP
u.cisomap = condIsomap(d.hat, data.hat[,1:4], 3, k = 20, init='eigen')
u.cisomap$B # compare with diag(factor.weights[1:4])
ccor(data.hat[,5:7], u.cisomap$U)$cancor
vegan::procrustes(data.hat[,5:7], u.cisomap$U, symmetric = TRUE)$ss

Canonical Correlations

Description

Computes canonical correlations for two sets of multivariate data x and y.

Usage

ccor(x, y)

Arguments

`x`	the first multivariate dataset.
`y`	the second multivariate dataset.

Value

a list of the following components:

`cancor`	a vector of canonical correlations.
`xcoef`	a matrix, each column of which is the vector of coefficients of x to produce the corresponding canonical covariate.
`ycoef`	a matrix, each column of which is the vector of coefficients of y to produce the corresponding canonical covariate.
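
The relationship between the coefficient matrices and the canonical correlations can be illustrated with base R's `stats::cancor()`. This is a cross-check under the usual definition of canonical correlation, not a description of how `ccor()` is implemented; the variable names `cc`, `x.cov`, and `y.cov` are chosen here for illustration only.

```r
# Canonical correlations of two column blocks of iris, using base R only
x <- as.matrix(iris[, 1:2])
y <- as.matrix(iris[, 3:4])
cc <- cancor(x, y)

# Canonical covariates: center each block, then apply the coefficients
x.cov <- scale(x, center = cc$xcenter, scale = FALSE) %*% cc$xcoef
y.cov <- scale(y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef

# The correlation of each matched pair of covariates is the
# corresponding canonical correlation
pair.cor <- as.numeric(diag(cor(x.cov, y.cov)))
print(cc$cor)
print(pair.cor)
```

If `ccor()` follows the same definition, its `cancor` component should be comparable to `cc$cor` for the same inputs.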

Author(s)

Anh Tuan Bui

Examples

ccor(iris[,1:2], iris[,3:4])

Conditional Euclidean distance

Description

Internal functions.

Usage

condDist(U, V.tilda, one_n_t=t(rep(1,nrow(U))))
condDist2(U, V.tilda2, one_n_t=t(rep(1,nrow(U))))

Arguments

`U`	the embedding `U`
`V.tilda`	`= V %*% B`
`V.tilda2`	`= V %*% b^2*t(V)`
`one_n_t`	`= t(rep(1,nrow(U)))`

Value

a dist object.
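
As a rough illustration, assume (following Bui (2021)) that the conditional Euclidean distance between two entities is the ordinary Euclidean distance between their rows in the augmented configuration formed by `U` and `V %*% B`. Under that assumption the distances can be sketched in base R as below. The function name `cond_dist_sketch` and the toy inputs are hypothetical and are not part of the package.

```r
# Sketch: conditional distances as Euclidean distances on [U, V %*% B]
cond_dist_sketch <- function(U, V, B) {
  dist(cbind(U, V %*% B))
}

set.seed(1)
U <- matrix(rnorm(10), 5, 2)   # toy embedding, 5 entities in 2 dimensions
V <- matrix(rnorm(15), 5, 3)   # toy known features, q = 3
B <- diag(c(0.5, 1, 2))        # toy B matrix
d.cond <- cond_dist_sketch(U, V, B)

# The same quantity for entities 1 and 2, written out term by term
manual.12 <- sqrt(sum((U[1, ] - U[2, ])^2) +
                  sum(((V[1, ] - V[2, ]) %*% B)^2))
```

Because the known-feature term is non-negative, these distances are never smaller than the distances computed from `U` alone.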

Author(s)

Anh Tuan Bui

References

Bui, A. T. (2021). Dimension Reduction with Prior Information for Knowledge Discovery. arXiv:2111.13646. https://arxiv.org/abs/2111.13646

Conditional ISOMAP

Description

Finds a low-dimensional manifold embedding of a given distance/dissimilarity matrix, conditioning on available manifold information. The method applies conditional MDS (see condMDS) to a graph distance matrix computed for the given distances/dissimilarities, using the isomap{vegan} function.
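
To make the notion of a graph distance concrete, the sketch below builds a k-nearest-neighbor graph and computes shortest-path lengths with the Floyd-Warshall recursion in base R. It is only an illustration of the idea: the package relies on vegan for this step, and the function name `graph_dist_sketch` and the toy arc data are assumptions made here.

```r
# Sketch: graph (estimated geodesic) distances from a k-nearest-neighbor graph
graph_dist_sketch <- function(d, k) {
  D <- as.matrix(d)
  n <- nrow(D)
  G <- matrix(Inf, n, n)
  for (i in seq_len(n)) {
    # k nearest neighbors of entity i, excluding i itself
    nn <- setdiff(order(D[i, ]), i)[seq_len(k)]
    G[i, nn] <- D[i, nn]
    G[nn, i] <- D[nn, i]   # keep the graph symmetric
  }
  diag(G) <- 0
  # Floyd-Warshall: allow paths through each intermediate node m in turn
  for (m in seq_len(n)) {
    G <- pmin(G, outer(G[, m], G[m, ], `+`))
  }
  as.dist(G)
}

# Toy data: 11 points on a half circle of radius 1
theta <- seq(0, pi, length.out = 11)
X <- cbind(cos(theta), sin(theta))
d.arc <- dist(X)
g.arc <- graph_dist_sketch(d.arc, k = 2)
```

For the two end points of the arc, the straight-line distance is 2, while the graph distance follows the arc and is therefore closer to its length, pi. Pairs that are not connected in the graph keep an infinite distance in this sketch.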

Usage

condIsomap(d, V, u.dim, epsilon = NULL, k, W,
           method = c('matrix', 'vector'), exact = TRUE,
           it.max = 1000, gamma = 1e-05,
           init = c('none', 'eigen', 'user'),
           U.start, B.start, ...)

Arguments

`d`	a distance/dissimilarity matrix of N entities (or a `dist` object).
`V`	an Nxq matrix of q manifold auxiliary parameter values of the N entities.
`u.dim`	the embedding dimension.
`epsilon`	the dissimilarity threshold: only dissimilarities no longer than `epsilon` are retained.
`k`	the number of shortest dissimilarities retained for each point. If both `epsilon` and `k` are given, `epsilon` will be used.
`W`	an NxN symmetric weight matrix. If not given, a matrix of ones will be used.
`method`	if `matrix`, there are no restrictions for the B matrix. If `vector`, the B matrix is restricted to be diagonal. The latter is more efficient for large q.
`exact`	only relevant if `W` is not given. In this case, if `exact == FALSE`, `U` is updated by the large-N approximation formula.
`it.max`	the max number of conditional SMACOF iterations.
`gamma`	conditional SMACOF stops early if the reduction of normalized conditional stress is less than `gamma`.
`init`	initialization method.
`U.start`	user-defined starting values for the embedding (when `init = 'user'`).
`B.start`	starting `B` matrix.
`...`	other arguments for the `isomap{vegan}` function.

Value

`U`	the embedding result.
`B`	the estimated `B` matrix.
`stress`	the normalized conditional stress value.
`sigma`	the conditional stress value at each iteration.
`init`	the value of the `init` argument.
`U.start`	the starting values for the embedding.
`B.start`	starting values for the `B` matrix.
`method`	the value of the `method` argument.
`exact`	the value of the `exact` argument.

Author(s)

Anh Tuan Bui

References

Bui, A. T. (2021). Dimension Reduction with Prior Information for Knowledge Discovery. arXiv:2111.13646. https://arxiv.org/abs/2111.13646

Bui, A. T. (2022). A Closed-Form Solution for Conditional Multidimensional Scaling. Pattern Recognition Letters 164, 148-152. https://doi.org/10.1016/j.patrec.2022.11.007

Examples

# see help(cml)

Conditional Multidimensional Scaling

Description

Wrapper of condSmacof, which finds a low-dimensional embedding of a given distance/dissimilarity matrix, conditioning on available manifold information.

Usage

condMDS(d, V, u.dim, W,
        method = c('matrix', 'vector'), exact = TRUE,
        it.max = 1000, gamma = 1e-05,
        init = c('none', 'eigen', 'user'),
        U.start, B.start)

Arguments

`d`	a distance/dissimilarity matrix of N entities (or a `dist` object).
`V`	an Nxq matrix of q manifold auxiliary parameter values of the N entities.
`u.dim`	the embedding dimension.
`W`	an NxN symmetric weight matrix. If not given, a matrix of ones will be used.
`method`	if `matrix`, there are no restrictions for the B matrix. If `vector`, the B matrix is restricted to be diagonal. The latter is more efficient for large q.
`exact`	only relevant if `W` is not given. In this case, if `exact == FALSE`, `U` is updated by the large-N approximation formula.
`it.max`	the max number of conditional SMACOF iterations.
`gamma`	conditional SMACOF stops early if the reduction of normalized conditional stress is less than `gamma`.
`init`	initialization method.
`U.start`	user-defined starting values for the embedding (when `init = 'user'`).
`B.start`	starting `B` matrix.
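
The difference between the two `method` settings can be illustrated with a small base-R sketch. With an unrestricted B there are q^2 entries to estimate, whereas a diagonal B has only q, and multiplying `V` by a diagonal matrix reduces to scaling its columns. The values of `V` and `b` below are toy inputs chosen for illustration and do not come from the package.

```r
set.seed(1)
V <- matrix(runif(20), 5, 4)        # toy known features: N = 5, q = 4
b <- c(0.16, 0.157, 0.148, 0.146)   # toy diagonal entries of B

# Full matrix product with B = diag(b)
VB.matrix <- V %*% diag(b)

# Equivalent column scaling, which never forms the q x q matrix
VB.vector <- sweep(V, 2, b, `*`)

same.result <- isTRUE(all.equal(VB.matrix, VB.vector))
```

Both computations give the same N x q result; the column-scaling form avoids a matrix product, which is one plausible reason the diagonal restriction scales better as q grows.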

Value

`U`	the embedding result.
`B`	the estimated `B` matrix.
`stress`	the normalized conditional stress value.
`sigma`	the conditional stress value at each iteration.
`init`	the value of the `init` argument.
`U.start`	the starting values for the embedding.
`B.start`	starting values for the `B` matrix.
`method`	the value of the `method` argument.
`exact`	the value of the `exact` argument.

Author(s)

Anh Tuan Bui

References

Bui, A. T. (2021). Dimension Reduction with Prior Information for Knowledge Discovery. arXiv:2111.13646. https://arxiv.org/abs/2111.13646

Bui, A. T. (2022). A Closed-Form Solution for Conditional Multidimensional Scaling. Pattern Recognition Letters 164, 148-152. https://doi.org/10.1016/j.patrec.2022.11.007

Examples

# see help(cml)

Conditional Multidimensional Scaling With Closed-Form Solution

Description

Provides a closed-form solution for conditional multidimensional scaling, based on multiple linear regression and eigendecomposition.
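
For readers unfamiliar with eigendecomposition-based scaling, the sketch below shows classical (unconditional) MDS in base R: squared distances are double-centered and the leading eigenvectors give the embedding. It does not include the regression step on the known features and is not the package's conditional procedure; the function name `classical_mds_sketch` is hypothetical.

```r
# Sketch: classical MDS by double centering and eigendecomposition
classical_mds_sketch <- function(d, u.dim) {
  D2 <- as.matrix(d)^2
  n <- nrow(D2)
  J <- diag(n) - matrix(1 / n, n, n)      # centering matrix
  G <- -0.5 * J %*% D2 %*% J              # Gram matrix of centered points
  e <- eigen(G, symmetric = TRUE)
  lambda <- pmax(e$values[seq_len(u.dim)], 0)
  U <- e$vectors[, seq_len(u.dim), drop = FALSE] %*%
    diag(sqrt(lambda), u.dim)
  list(U = U, eig = e$values)
}

set.seed(1)
X <- matrix(rnorm(20), 10, 2)   # toy 2-dimensional configuration
d.toy <- dist(X)
fit <- classical_mds_sketch(d.toy, 2)
```

For distances that come from a genuine two-dimensional configuration, the recovered `fit$U` reproduces the input distances up to rotation and reflection.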

Usage

condMDSeigen(d, V, u.dim, method = c('matrix', 'vector'))

Arguments

`d`	a `dist` object of N entities.
`V`	an Nxq matrix of q manifold auxiliary parameter values of the N entities.
`u.dim`	the embedding dimension.
`method`	if `matrix`, there are no restrictions for the B matrix. If `vector`, the B matrix is restricted to be diagonal.

Value

`U`	the embedding result.
`B`	the estimated `B` matrix.
`eig`	the computed eigenvalues.
`stress`	the corresponding normalized conditional stress value of the solution.

Author(s)

Anh Tuan Bui

References

Bui, A. T. (2022). A Closed-Form Solution for Conditional Multidimensional Scaling. Pattern Recognition Letters 164, 148-152. https://doi.org/10.1016/j.patrec.2022.11.007

Examples

# see help(cml)

Conditional SMACOF

Description

Conditional SMACOF algorithms. Intended for internal usage.

Usage

condSmacof(d, V, u.dim, W,
           method = c('matrix', 'vector'), exact = TRUE,
           it.max = 1000, gamma = 1e-05,
           init = c('none', 'eigen', 'user'),
           U.start, B.start)

Arguments

`d`	a `dist` object of N entities.
`V`	an Nxq matrix of q manifold auxiliary parameter values of the N entities.
`u.dim`	the embedding dimension.
`W`	an NxN symmetric weight matrix. If not given, a matrix of ones will be used.
`method`	if `matrix`, there are no restrictions for the B matrix. If `vector`, the B matrix is restricted to be diagonal. The latter is more efficient for large q.
`exact`	only relevant if `W` is not given. In this case, if `exact == FALSE`, `U` is updated by the large-N approximation formula.
`it.max`	the max number of conditional SMACOF iterations.
`gamma`	conditional SMACOF stops early if the reduction of normalized conditional stress is less than `gamma`.
`init`	initialization method.
`U.start`	user-defined starting values for the embedding (when `init = 'user'`).
`B.start`	starting `B` matrix.

Value

`U`	the embedding result.
`B`	the estimated `B` matrix.
`stress`	the normalized conditional stress value.
`sigma`	the conditional stress value at each iteration.
`init`	the value of the `init` argument.
`U.start`	the starting values for the embedding.
`B.start`	starting values for the `B` matrix.
`method`	the value of the `method` argument.
`exact`	the value of the `exact` argument.

Author(s)

Anh Tuan Bui

References

Bui, A. T. (2021). Dimension Reduction with Prior Information for Knowledge Discovery. arXiv:2111.13646. https://arxiv.org/abs/2111.13646

Bui, A. T. (2022). A Closed-Form Solution for Conditional Multidimensional Scaling. Pattern Recognition Letters 164, 148-152. https://doi.org/10.1016/j.patrec.2022.11.007

C(Z)

Description

Internal function.

Usage

cz(w, d, dz)

Arguments

`w`	the `dist` object of a weight matrix.
`d`	the `dist` object of a distance/dissimilarity matrix.
`dz`	the `dist` object of conditional distances.

Value

the matrix C(Z)

Author(s)

Anh Tuan Bui

References

Bui, A. T. (2021). Dimension Reduction with Prior Information for Knowledge Discovery. arXiv:2111.13646. https://arxiv.org/abs/2111.13646

Moore-Penrose Inverse

Description

Computes the Moore-Penrose inverse (a.k.a., generalized inverse or pseudoinverse) of a matrix based on singular-value decomposition (SVD).

Usage

mpinv(A, eps = sqrt(.Machine$double.eps))

Arguments

`A`	a matrix of real numbers.
`eps`	a threshold (to be multiplied with the largest singular value) for dropping SVD parts that correspond to small singular values.

Value

the Moore-Penrose inverse.
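
A minimal base-R sketch of an SVD-based pseudoinverse with the thresholding rule described above follows. It is written from the description only and is not the package source; the function name `mpinv_sketch` is hypothetical.

```r
# Sketch: Moore-Penrose inverse via SVD, dropping small singular values
mpinv_sketch <- function(A, eps = sqrt(.Machine$double.eps)) {
  s <- svd(A)
  # Keep singular values above eps times the largest one
  keep <- s$d > eps * max(s$d)
  if (!any(keep)) return(matrix(0, ncol(A), nrow(A)))
  # V_r %*% diag(1 / d_r) %*% t(U_r), with the division applied row-wise
  s$v[, keep, drop = FALSE] %*%
    (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

P.full <- mpinv_sketch(2 * diag(4))       # full-rank case
A.def <- matrix(c(1, 2, 2, 4), 2, 2)      # rank-deficient 2 x 2 matrix
P.def <- mpinv_sketch(A.def)
```

For a full-rank square matrix the result coincides with the ordinary inverse; for a rank-deficient matrix it satisfies the defining identity `A %*% P %*% A == A` up to rounding.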

Author(s)

Anh Tuan Bui

Examples

mpinv(2*diag(4))
