Package 'mp' reference manual

Title:	Multidimensional Projection Techniques
Description:	Multidimensional projection techniques are used to create two dimensional representations of multidimensional data sets.
Authors:	Francisco M. Fatore, Samuel G. Fadel
Maintainer:	Francisco M. Fatore
License:	GPL
Version:	0.4.1
Built:	2025-01-24 06:33:11 UTC
Source:	CRAN

Force Scheme Projection

Description

Creates a 2D representation of the data based on a dissimilarity matrix. The implementation departs from the method described in the literature in three ways: indices are shuffled to reduce the dependence on the order of the points, only a fraction of delta is applied at each step for better stability, and a tolerance factor serves as a second stop criterion.
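The three modifications can be read as follows in a minimal base-R sketch. It is not the package's source: the function name force_scheme_sketch, the update rule, and the exact form of the stop test are assumptions made for illustration. Only the argument names and defaults are taken from the Usage section.

```r
# Illustrative sketch of a Force Scheme style update (assumed, not the package code).
force_scheme_sketch <- function(D, max.iter = 50, tol = 0, fraction = 8,
                                eps = 1e-05) {
  D <- as.matrix(D)
  n <- nrow(D)
  Y <- matrix(runif(2 * n), ncol = 2)         # random initial 2D configuration
  prev.err <- Inf
  for (iter in seq_len(max.iter)) {
    err <- 0
    idx <- sample(n)                          # shuffled visiting order
    for (i in idx) {
      for (j in idx) {
        if (i == j) next
        v <- Y[j, ] - Y[i, ]
        d2 <- max(sqrt(sum(v^2)), eps)        # keep 2D distances at least eps
        delta <- (D[i, j] - d2) / fraction    # apply only a fraction of delta
        err <- err + abs(delta)
        Y[j, ] <- Y[j, ] + delta * v / d2     # move j along the i -> j direction
      }
    }
    if (abs(prev.err - err) < tol) break      # tolerance as second stop criterion
    prev.err <- err
  }
  Y
}

set.seed(42)
emb <- force_scheme_sketch(eurodist)
```

With tol = 0 the tolerance test can never succeed, so the loop runs max.iter times, which matches the documented behavior of the tol argument.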

Usage

forceScheme(D, Y = NULL, max.iter = 50, tol = 0, fraction = 8,
  eps = 1e-05)

Arguments

`D`	A dissimilarity structure such as that returned by dist or a full symmetric matrix containing the dissimilarities.
`Y`	Initial 2D configuration. A random configuration will be used when omitted.
`max.iter`	Maximum number of iterations that the algorithm will run.
`tol`	The tolerance for the accumulated error between iterations. If set to 0, the algorithm will run max.iter times.
`fraction`	Controls the point movement. Larger values mean less freedom to move.
`eps`	Minimum distance between two points.

Value

The 2D representation of the data.

References

Eduardo Tejada, Rosane Minghim, Luis Gustavo Nonato: On improved projection techniques to support visual exploration of multi-dimensional data sets. Information Visualization 2(4): 218-231 (2003)

Examples

# Eurodist example
emb <- forceScheme(eurodist)
plot(emb, type = "n", xlab ="", ylab ="", asp=1, axes=FALSE, main="")
text(emb, labels(eurodist), cex = 0.6)

# Iris example
emb <- forceScheme(dist(iris[,1:4]))
plot(emb, col=iris$Species)

Tests whether the given matrix is symmetric.

Description

Tests whether the given matrix is symmetric.

Usage

is.symmetric(mat)

Arguments

`mat`	Matrix to be tested for symmetry.

Value

Whether the matrix is symmetric.
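As an illustration of what such a test checks, the following sketch uses base R's isSymmetric as a stand-in. It is an assumption that is.symmetric behaves comparably; the objects D_full and A are hypothetical examples.

```r
# A full distance matrix is symmetric; an arbitrary square matrix usually is not.
D_full <- as.matrix(dist(iris[1:10, 1:4]))
A <- matrix(1:9, nrow = 3)

# unname() drops dimnames so that only the values are compared.
isSymmetric(unname(D_full))  # TRUE
isSymmetric(unname(A))       # FALSE
```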

Local Affine Multidimensional Projection

Description

Creates a 2D representation of the data. Requires a subsample (sample.indices) and its 2D representation (Ys).

Usage

lamp(X, sample.indices = NULL, Ys = NULL, cp = 1)

Arguments

`X`	A data frame or matrix.
`sample.indices`	The indices of data points in X used as subsamples. If not given, some points from X will be randomly selected and Ys will be generated by calling forceScheme on them.
`Ys`	Initial 2D configuration of the data subsamples (will be ignored if sample.indices is NULL). Scaling the columns to [-0.5, 0.5] is recommended to avoid scaling problems.
`cp`	Proportion of nearest control points to be used.
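The sketch below shows one way to supply sample.indices and a rescaled Ys explicitly. The subsample size of 15, the helper rescale_col, and the use of cmdscale in place of forceScheme for the initial configuration are assumptions chosen so that the preparation runs in base R. The call to lamp is only executed when the package is installed.

```r
set.seed(7)
X <- as.matrix(iris[, 1:4])
sample.indices <- sample(nrow(X), 15)             # hypothetical subsample size

# Stand-in 2D configuration of the subsample (classical MDS from base R).
Ys <- cmdscale(dist(X[sample.indices, ]), k = 2)

# Rescale each column linearly onto [-0.5, 0.5], as recommended for Ys.
rescale_col <- function(col) (col - min(col)) / (max(col) - min(col)) - 0.5
Ys <- apply(Ys, 2, rescale_col)

if (requireNamespace("mp", quietly = TRUE)) {
  emb <- mp::lamp(X, sample.indices = sample.indices, Ys = Ys)
  plot(emb, col = iris$Species)
}
```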

Value

The 2D representation of the data.

References

Joia, P.; Paulovich, F.V.; Coimbra, D.; Cuminato, J.A.; Nonato, L.G., "Local Affine Multidimensional Projection," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2563-2571, Dec. 2011.

Examples

# Iris example
emb <- lamp(iris[, 1:4])
plot(emb, col=iris$Species)

Least-Square Projection

Description

Creates a q-dimensional representation of multidimensional data. Requires a subsample (sample.indices) and its qD representation (Ys).

Usage

lsp(X, sample.indices = NULL, Ys = NULL, k = 15, q = 2)

Arguments

`X`	A data frame or matrix.
`sample.indices`	The indices of data points in X used as subsamples. If not given, some rows from X will be randomly selected and Ys will be generated by calling forceScheme on them.
`Ys`	Initial qD configuration of the data subsamples (will be ignored if sample.indices is NULL).
`k`	Number of neighbors used to build the neighborhood graph.
`q`	The target dimensionality.

Value

The qD representation of the data.

References

F. V. Paulovich, L. Nonato, R. Minghim, and H. Levkowitz, "Least-Square Projection: A fast high-precision multidimensional projection technique and its application to document mapping," IEEE Transactions on Visualization and Computer Graphics, vol. 14, no. 3, pp. 564-575, 2008.

Examples

# Iris example
emb <- lsp(iris[, 1:4])
plot(emb, col=iris$Species)

Multidimensional Projection Techniques

Description

Implementation of multidimensional projection techniques

Pekalska's approach to speeding up Sammon's mapping.

Description

Creates a k-dimensional representation of the data. As input, a subsample and its k-dimensional mapping are required. The method approximates the subsample mapping with a linear mapping based on the distance matrix of the subsample and then applies the same mapping to all instances.

Usage

pekalska(D, sample.indices = NULL, Ys = NULL)

Arguments

`D`	A dist object or distance matrix.
`sample.indices`	The indices of subsamples.
`Ys`	The subsample mapping (k-dimensional).

Value

The low-dimensional representation of the data.
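The idea in the Description can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the toy data X, the subsample size of 10, the use of cmdscale for the subsample mapping, and the exact solve of the linear system are all choices made for the example.

```r
set.seed(1)
X <- matrix(rnorm(60 * 4), ncol = 4)   # toy data: 60 distinct points in 4D
D <- as.matrix(dist(X))                # full symmetric distance matrix
s <- sample(nrow(X), 10)               # hypothetical subsample indices

Ds <- D[s, s]                          # distances within the subsample
Ys <- cmdscale(as.dist(Ds), k = 2)     # stand-in 2D mapping of the subsample

# Linear mapping V from "distances to the subsample" to coordinates:
# Ds %*% V reproduces Ys.
V <- solve(Ds, Ys)

# Apply the same mapping to every instance through its distances to the subsample.
Y <- D[, s] %*% V
```

By construction, the rows of Y that belong to the subsample coincide with Ys. With the package installed, the corresponding call would be pekalska(D, sample.indices = s, Ys = Ys).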

References

Pekalska, E., de Ridder, D., Duin, R. P., & Kraaijveld, M. A. (1999). A new method of generalizing Sammon mapping with application to algorithm speed-up (pp. 221-228).

Part-Linear Multidimensional Projection

Description

Creates a k-dimensional representation of the data. As input, a subsample and its k-dimensional mapping (control points) are required. The method approximates the subsample mapping with a linear mapping and then applies the same mapping to all instances.

Usage

plmp(X, sample.indices = NULL, Ys = NULL, k = 2)

Arguments

`X`	A data frame or matrix representing the data.
`sample.indices`	The indices of subsamples used as control points.
`Ys`	The control points.
`k`	The target dimensionality.

Value

The low-dimensional representation of the data.
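A rough base-R sketch of the linear-mapping idea is given below. It assumes an ordinary least-squares fit without an intercept, a subsample of 20 rows, and cmdscale as the source of the control points. The package may compute the mapping differently.

```r
set.seed(3)
X <- as.matrix(iris[, 1:4])
s <- sample(nrow(X), 20)               # hypothetical control point indices
Xs <- X[s, ]
Ys <- cmdscale(dist(Xs), k = 2)        # stand-in 2D control points

# Least-squares linear mapping Phi (4 x 2) such that Xs %*% Phi approximates Ys.
Phi <- qr.solve(Xs, Ys)

# Apply the same mapping to all instances.
Y <- X %*% Phi
```

Because the mapping is a single matrix product, projecting the full data set costs one multiplication once Phi is known.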

References

Paulovich, F.V.; Silva, C.T.; Nonato, L.G., "Two-Phase Mapping for Projecting Massive Data Sets," IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 6, pp. 1281-1290, Nov.-Dec. 2010.

Examples


# Iris example
emb <- plmp(iris[,1:4])
plot(emb, col=iris$Species)

t-Distributed Stochastic Neighbor Embedding

Description

Creates a k-dimensional representation of the data by modeling the probability of picking neighbors with a Gaussian distribution for the high-dimensional data and a Student's t-distribution for the low-dimensional map, and then minimizing the Kullback-Leibler (KL) divergence between them. This implementation uses the same default parameters as defined by the authors.

Usage

tSNE(X, Y = NULL, k = 2, perplexity = 30, n.iter = 1000, eta = 500,
  initial.momentum = 0.5, final.momentum = 0.8, early.exaggeration = 4,
  gain.fraction = 0.2, momentum.threshold.iter = 20,
  exaggeration.threshold.iter = 100, max.binsearch.tries = 50)

Arguments

`X`	A data frame, data matrix, dissimilarity (distance) matrix or dist object.
`Y`	Initial k-dimensional configuration. If NULL, the method uses a random initial configuration.
`k`	Target dimensionality. Avoid anything other than 2 or 3.
`perplexity`	A rough upper bound on the neighborhood size.
`n.iter`	Number of iterations to perform.
`eta`	The learning rate for the cost function minimization.
`initial.momentum`	The momentum used during the first momentum.threshold.iter iterations.
`final.momentum`	The momentum used for the remaining iterations.
`early.exaggeration`	The early exaggeration factor applied to the initial iterations.
`gain.fraction`	Undocumented.
`momentum.threshold.iter`	Number of iterations before switching to the final momentum.
`exaggeration.threshold.iter`	Number of iterations before switching from the exaggerated to the real probabilities.
`max.binsearch.tries`	Maximum number of binary search tries when looking for the parameters that achieve the target perplexity.
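To make the roles of perplexity and max.binsearch.tries more concrete, the sketch below searches for the Gaussian precision of a single point so that its neighbor distribution reaches a target perplexity. The helper names perplexity_of and find_beta, the stopping tolerance, and the doubling step are assumptions for illustration. The package's internal search may differ.

```r
# Perplexity 2^H of the Gaussian neighbor distribution of one point.
# d2: squared distances to the other points; beta: precision, 1 / (2 * sigma^2).
perplexity_of <- function(d2, beta) {
  p <- exp(-(d2 - min(d2)) * beta)   # shifting by min(d2) avoids underflow
  p <- p / sum(p)
  p <- p[p > 0]                      # treat 0 * log(0) as 0
  2^(-sum(p * log2(p)))
}

# Binary search for the precision that yields the target perplexity.
find_beta <- function(d2, perplexity = 30, max.tries = 50, tol = 1e-5) {
  lo <- 0
  hi <- Inf
  beta <- 1
  for (i in seq_len(max.tries)) {
    perp <- perplexity_of(d2, beta)
    if (abs(perp - perplexity) < tol) break
    if (perp > perplexity) {
      # Distribution too flat: raise the precision.
      lo <- beta
      beta <- if (is.finite(hi)) (lo + hi) / 2 else beta * 2
    } else {
      # Distribution too peaked: lower the precision.
      hi <- beta
      beta <- (lo + hi) / 2
    }
  }
  beta
}

# Squared distances from the first iris observation to all others.
d2 <- as.matrix(dist(iris[, 1:4]))[1, -1]^2
beta <- find_beta(d2, perplexity = 30)
```

A larger perplexity leads to a smaller precision, so each point spreads its probability over more neighbors.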

Value

The k-dimensional representation of the data.

References

L.J.P. van der Maaten and G.E. Hinton. _Visualizing High-Dimensional Data Using t-SNE._ Journal of Machine Learning Research 9(Nov): 2579-2605, 2008.

Examples

# Iris example
emb <- tSNE(iris[, 1:4])
plot(emb, col=iris$Species)
