Title: | R Interface to 'EPP-Lab', a Java Program for Exploratory Projection Pursuit |
---|---|
Description: | An R Interface to 'EPP-lab' v1.0. 'EPP-lab' is a Java program for projection pursuit using genetic algorithms written by Alain Berro and S. Larabi Marie-Sainte and is included in the package. |
Authors: | Daniel Fischer, Alain Berro, Klaus Nordhausen, Anne Ruiz-Gazen |
Maintainer: | Daniel Fischer <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.9.6 |
Built: | 2024-11-07 06:29:35 UTC |
Source: | CRAN |
An interface that gives access to the Java program 'EPP-lab', which implements several biologically inspired optimisation algorithms and several indices for Exploratory Projection Pursuit (PP). The objective of optimising PP indices and projecting the data onto the associated one-dimensional directions is to detect hidden structures such as clusters or outliers in (possibly high-dimensional) data sets. The 'EPP-lab' sources and jar-files are available at https://github.com/fischuu/EPP-lab. For a detailed description of 'EPP-lab', see Larabi Marie-Sainte (2011).
Package: | REPPlab |
Type: | Package |
Version: | 0.9.6 |
Date: | 2023-12-11 |
License: | GPL |
LazyLoad: | yes |
Daniel Fischer, Alain Berro, Klaus Nordhausen, Anne Ruiz-Gazen
Maintainer: Daniel Fischer <[email protected]>
Larabi Marie-Sainte, S., (2011), Biologically inspired algorithms for exploratory projection pursuit, PhD thesis, University of Toulouse.
Ruiz-Gazen, A., Larabi Marie-Sainte, S. and Berro, A. (2010), Detecting multivariate outliers using projection pursuit with particle swarm optimization, COMPSTAT2010, pp. 89-98.
Berro, A., Larabi Marie-Sainte, S. and Ruiz-Gazen, A. (2010). Genetic algorithms and particle swarm optimization for exploratory projection pursuit. Annals of Mathematics and Artificial Intelligence, 60, 153-178.
Larabi Marie-Sainte, S., Berro, A. and Ruiz-Gazen, A. (2010). An efficient optimization method for revealing local optima of projection pursuit indices. Swarm Intelligence, pp. 60-71.
Extracts the found directions of an epplab object.
## S3 method for class 'epplab'
coef(object, which = 1:ncol(object$PPdir), ...)
object |
Object of class 'epplab'. |
which |
Specifies which directions are extracted. |
... |
Additional parameters. |
The coef function extracts the directions found from the EPPlab call.
Daniel Fischer
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
coef(res)
REPPlab optimizes a projection pursuit (PP) index using a Genetic Algorithm (GA) or one of two Particle Swarm Optimisation (PSO) algorithms over several runs, implemented in the Java program EPP-lab. One of the PSO algorithms is a classic one while the other one is a parameter-free extension called Tribes. The parameters of the algorithms (maxiter and individuals for GA and maxiter and particles for PSO) can be modified by the user. The PP indices are the well-known Friedman and Friedman-Tukey indices together with the kurtosis and a so-called discriminant index that is devoted to the detection of groups. At each run, the function finds a local optimum of the PP index and gives the associated projection direction and criterion value.
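To give a feel for what a PP index measures, the following base-R sketch evaluates a kurtosis-type index for one fixed direction. The helper `kurtosis_index` is hypothetical and uses the plain fourth standardised moment; the exact formula used inside EPP-lab may differ.

```r
# Hypothetical helper (not part of REPPlab): sample kurtosis of the data
# projected onto the direction a.
kurtosis_index <- function(x, a) {
  a <- a / sqrt(sum(a^2))                              # unit-length direction
  z <- as.vector(scale(x, center = TRUE, scale = FALSE) %*% a)
  mean(z^4) / mean(z^2)^2                              # fourth standardised moment
}

set.seed(1)
X <- matrix(rnorm(500 * 3), ncol = 3)
X[1:5, 1] <- X[1:5, 1] + 10           # a few outliers along the first axis

k1 <- kurtosis_index(X, c(1, 0, 0))   # direction that contains the outliers
k2 <- kurtosis_index(X, c(0, 1, 0))   # direction without outliers
```

A direction with a few extreme points yields a much larger value than a direction along which the data look Gaussian, which is why maximising kurtosis is associated with outlier detection.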
EPPlab(
  x,
  PPindex = "KurtosisMax",
  PPalg = "GA",
  n.simu = 20,
  sphere = FALSE,
  maxiter = NULL,
  individuals = NULL,
  particles = NULL,
  step_iter = 10,
  eps = 10^(-6)
)
x |
Matrix where each row is an observation and each column a dimension. |
PPindex |
The used index, see details. |
PPalg |
The used algorithm, see details. |
n.simu |
Number of simulation runs. |
sphere |
Logical; if TRUE, the data are sphered. Default is FALSE. |
maxiter |
Maximum number of iterations. |
individuals |
Size of the generated population in GA. |
particles |
Number of generated particles in the standard PSO algorithm. |
step_iter |
Convergence criterion parameter, see details. (Default: 10) |
eps |
Convergence criterion parameter, see details. (Default: 10^(-6)) |
The function always centers the data using colMeans and divides by the standard deviation. Sphering the data is optional. If sphering is requested, the function WhitenSVD is used, which automatically tries to determine the rank of the data.
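The centering and scaling step can be pictured with a few lines of base R. This is only a sketch under the assumption that the usual sample standard deviation (divisor n - 1) is meant; it is not the package's internal code.

```r
set.seed(5)
x <- matrix(rnorm(60, mean = 10, sd = 4), ncol = 3)

# Subtract the column means, then divide each column by its standard deviation.
centred <- sweep(x, 2, colMeans(x), "-")
standardised <- sweep(centred, 2, apply(x, 2, sd), "/")
```

After this step every column has mean zero and unit standard deviation, but the columns may still be correlated; removing the correlation is what the optional sphering step adds.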
Currently the function provides the following projection pursuit indices: KurtosisMax, Discriminant, Friedman, FriedmanTukey, KurtosisMin.
Three algorithms can be used to find the projection directions. These are a Genetic Algorithm GA and two Particle Swarm Optimisation algorithms PSO and Tribe.
Since the algorithms might find local optima they are run several times. The function sorts the found directions according to the optimization criterion.
The algorithms have different default settings. For GA: maxiter=50 and individuals=20. For PSO: maxiter=20 and particles=50. For Tribe: maxiter=20.
For GA, the size of the generated population is fixed by the user (individuals). The algorithm is based on a tournament selection of three participants. It uses a 2-point crossover with a probability of 0.65 and the mutation operator is applied to all the individuals with a probability of 0.05. The termination criterion corresponds to the number of generations and is also fixed by the user (maxiter).
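The two GA operators named above can be sketched generically in base R. The function names are hypothetical, the crossover and mutation probabilities are left out, and nothing here is taken from the EPP-lab Java sources.

```r
# Tournament selection: draw `size` individuals at random and keep the fittest.
tournament_select <- function(fitness, size = 3) {
  candidates <- sample(seq_along(fitness), size)
  candidates[which.max(fitness[candidates])]
}

# Two-point crossover: the parents exchange the segment between two cut points.
two_point_crossover <- function(p1, p2) {
  cuts <- sort(sample(seq_along(p1), 2))
  idx <- cuts[1]:cuts[2]
  child1 <- p1
  child2 <- p2
  child1[idx] <- p2[idx]
  child2[idx] <- p1[idx]
  list(child1 = child1, child2 = child2)
}

set.seed(42)
fitness <- c(0.2, 1.5, 0.7, 3.1, 0.9)
winner <- tournament_select(fitness)                 # index of the selected individual
kids <- two_point_crossover(rep(0, 8), rep(1, 8))    # complementary children
```

In a full GA the crossover would only be applied with the stated probability, and a mutation step would follow.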
For PSO, the user can give the number of initially generated particles and the maximum number of iterations. The other parameters are fixed following Clerc (2006), using a "cosine" neighborhood adapted to PP. For Tribes, only the maximum number of iterations needs to be fixed. The algorithm used is the one proposed by Cooren, Clerc and Siarry (2009), adapted to PP using a "cosine" neighborhood.
The algorithms stop as soon as one of the following two conditions holds: the maximum number of iterations is reached, or the relative difference between the index value of the present iteration i and the value of iteration i-step_iter is less than eps. In the latter case, the algorithm is said to have converged and EPPlab will return the number of iterations needed to attain convergence. If convergence is not reached but the maximum number of iterations is attained, the function will return warnings. The default values are 10 for step_iter and 1E-06 for eps. Note that if a few runs have not converged, this might not be a problem, and even non-converged projections might reveal some structure.
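The stopping rule can be illustrated with a small base-R check. The helper `has_converged` is hypothetical, and dividing by the older index value is an assumption about how the relative difference is normalised; the sketch also does not guard against an older value of zero.

```r
# Hypothetical check: compare the current index value with the value
# recorded step_iter iterations earlier.
has_converged <- function(values, step_iter = 10, eps = 1e-6) {
  i <- length(values)
  if (i <= step_iter) return(FALSE)        # not enough history yet
  old <- values[i - step_iter]
  abs(values[i] - old) / abs(old) < eps    # relative difference below eps
}

trace_flat <- c(1, 2, 3, rep(3.5, 12))     # index stopped improving
trace_rising <- cumsum(rep(0.1, 15))       # index still improving

conv_flat <- has_converged(trace_flat)
conv_rising <- has_converged(trace_rising)
```

A run whose trace looks like `trace_rising` when maxiter is reached would be reported as not converged.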
A list with class 'epplab' containing the following components:
PPdir |
Matrix containing the PP directions as columns, see details. |
PPindexVal |
Vector containing the objective criterion value of each run. |
PPindex |
Name of the used projection index. |
PPiter |
Vector containing the number of iterations of each run. |
PPconv |
Logical vector; TRUE if the run converged and FALSE otherwise. |
PPalg |
Name of the used algorithm. |
maxiter |
Maximum number of iterations, as given in function call. |
x |
Matrix containing the data (centered!). |
sphere |
Logical |
transform |
The transformation matrix from the whitening or standardization step. |
backtransform |
The back-transformation matrix from the whitening or standardization step. |
center |
The mean vector of the data |
Daniel Fischer, Klaus Nordhausen
Larabi Marie-Sainte, S., (2011), Biologically inspired algorithms for exploratory projection pursuit, PhD thesis, University of Toulouse.
Ruiz-Gazen, A., Larabi Marie-Sainte, S. and Berro, A. (2010), Detecting multivariate outliers using projection pursuit with particle swarm optimization, COMPSTAT2010, pp. 89-98.
Berro, A., Larabi Marie-Sainte, S. and Ruiz-Gazen, A. (2010). Genetic algorithms and particle swarm optimization for exploratory projection pursuit. Annals of Mathematics and Artificial Intelligence, 60, 153-178.
Larabi Marie-Sainte, S., Berro, A. and Ruiz-Gazen, A. (2010). An efficient optimization method for revealing local optima of projection pursuit indices. Swarm Intelligence, pp. 60-71.
Clerc, M. (2006). Particle Swarm Optimization. ISTE, Wiley.
Cooren, Y., Clerc, M. and Siarry, P. (2009). Performance evaluation of TRIBES, an adaptive particle swarm optimization algorithm. Swarm Intelligence, 3(2), 149-178.
library(tourr)
data(olive)
olivePP <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMax", n.simu=5, maxiter=20)
summary(olivePP)

library(amap)
data(lubisch)
X <- lubisch[1:70,2:7]
rownames(X) <- lubisch[1:70,1]
res <- EPPlab(X, PPalg="PSO", PPindex="FriedmanTukey", n.simu=15, maxiter=20, sphere=TRUE)
print(res)
summary(res)
fitted(res)
plot(res)
pairs(res)
predict(res, data=lubisch[71:74,2:7])
Function that automatically aggregates the projection directions from one or more epplab objects. Three options are available for choosing the final projection, which can have a rank larger than one. The parameter x can be either a single epplab object or a list of epplab objects. Options for method are inverse, sq.inverse and cumulative.
EPPlabAgg(x, method = "cumulative", percentage = 0.85)
x |
An object of class 'epplab' or a list of 'epplab' objects. |
method |
The type of method, see details. Options are inverse, sq.inverse and cumulative. |
percentage |
Threshold for the relative eigenvalue sum to retain, see details. |
Denote by $x_i$, $i = 1, \ldots, m$, the projection vectors contained in the list of epplab objects and by $P_i$, $i = 1, \ldots, m$, the corresponding orthogonal projection matrices (each having rank one). The method cumulative is based on the eigenvalue decomposition of $P_w = \frac{1}{m} \sum_{i=1}^{m} P_i$ and returns as O the eigenvectors such that the corresponding relative eigenvalues sum is at least percentage. The number of eigenvectors retained corresponds to the rank k, and P is the corresponding orthogonal projection matrix.
The methods inverse and sq.inverse are automatic rules to choose the number of eigenvectors to retain, as implemented by the function AOP.
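The cumulative rule can be sketched in base R as follows. The helper `agg_cumulative` is hypothetical; it assumes that each rank-one projector is built as x x' / (x' x) and that the relative eigenvalue sum is the cumulative sum divided by the total, which may differ in detail from what EPPlabAgg does.

```r
# Hypothetical sketch of the "cumulative" rule; the columns of `dirs` are
# the direction vectors.
agg_cumulative <- function(dirs, percentage = 0.85) {
  m <- ncol(dirs)
  # Average of the rank-one orthogonal projectors x x' / (x' x)
  Pw <- Reduce(`+`, lapply(seq_len(m), function(i) {
    x <- dirs[, i]
    tcrossprod(x) / sum(x^2)
  })) / m
  eig <- eigen(Pw, symmetric = TRUE)
  rel <- cumsum(eig$values) / sum(eig$values)
  k <- which(rel >= percentage)[1]          # smallest rank reaching the threshold
  O <- eig$vectors[, seq_len(k), drop = FALSE]
  list(P = tcrossprod(O), O = O, k = k, eigenvalues = eig$values)
}

# Three directions in R^4 that span only a two-dimensional subspace
dirs <- cbind(c(1, 0, 0, 0), c(0, 1, 0, 0), c(1, 1, 0, 0))
agg <- agg_cumulative(dirs, percentage = 0.99)
```

Because the three directions lie in the plane of the first two coordinates, the retained rank is 2 and `agg$P` projects onto that plane.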
A list with class 'epplabagg' containing the following components:
P |
The estimated average orthogonal projection matrix. |
O |
An orthogonal matrix on which P is based upon. |
k |
The rank of the average orthogonal projection matrix. |
eigenvalues |
The relevant eigenvalues, see details. Only given if |
Daniel Fischer, Klaus Nordhausen, Anne Ruiz-Gazen
Liski, E., Nordhausen, K., Oja, H. and Ruiz-Gazen, A. (201?), Combining Linear Dimension Reduction Estimates, to appear in the Proceedings of ICORS 2015, pp. ??-??.
library(tourr)
data(olive)
# To keep the runtime short, maxiter and n.simu were chosen very
# small for demonstration purposes, real life applications would
# rather choose larger values, e.g. n.simu=100, maxiter=200
olivePP.kurt.max <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMax", n.simu=10, maxiter=20)
olivePP.fried <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="Friedman", n.simu=10, maxiter=20)
olivePPs <- list(olivePP.kurt.max, olivePP.fried)

EPPlabAgg(olivePP.kurt.max)$k
EPPlabAgg(olivePPs, "cum", 0.99)$k
pairs(olivePP.kurt.max$x %*% EPPlabAgg(olivePPs, "cum", 0.99)$O, col=olive[,2], pch=olive[,1])

olivAOP.sq <- EPPlabAgg(olivePPs, "inv")
oliveProj <- olivePP.kurt.max$x %*% olivAOP.sq$O
plot(density(oliveProj))
rug(oliveProj[olive$region==1], col=1)
rug(oliveProj[olive$region==2], col=2)
rug(oliveProj[olive$region==3], col=3)
Function to decide whether observations are considered outliers in specific projection directions of an epplab object.
EPPlabOutlier(x, which = 1:ncol(x$PPdir), k = 3, location = mean, scale = sd)
x |
An object of class 'epplab'. |
which |
The directions in which outliers should be searched. The default is to look at all. |
k |
Numeric value to decide when an observation is considered an outlier or not. Default is 3. See details. |
location |
A function which gives the univariate location as an output. The default is mean. |
scale |
A function which gives the univariate scale as an output. The default is sd. |
Denote by $\mathrm{location}_j$ the location of the $j$th projection direction and analogously by $\mathrm{scale}_j$ its scale. Then an observation $x$ is an outlier in the $j$th projection direction if $|x - \mathrm{location}_j| \geq k \cdot \mathrm{scale}_j$.
Naturally, it is best to use robust location and scale measures for this purpose, for example median and mad.
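The rule can be tried out on a plain matrix of projected scores. The helper `flag_outliers` below is a hypothetical base-R sketch of the criterion, not the code of EPPlabOutlier itself; the scores are simulated rather than taken from an epplab object.

```r
# Flag observations whose score lies at least k scale units away from the
# location, separately for each column (projection direction).
flag_outliers <- function(scores, k = 3, location = median, scale = mad) {
  scores <- as.matrix(scores)
  loc <- apply(scores, 2, location)
  scl <- apply(scores, 2, scale)
  centred <- sweep(scores, 2, loc, "-")
  (abs(sweep(centred, 2, scl, "/")) >= k) * 1    # 0/1 matrix
}

set.seed(7)
scores <- cbind(dir1 = rnorm(200), dir2 = rnorm(200))
scores[1, "dir1"] <- 12                          # planted outlier in direction 1
out <- flag_outliers(scores)
```

With the non-robust defaults mean and sd, a large outlier inflates the scale estimate and can mask itself, which is the motivation for median and mad.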
A list with class 'epplabOutlier' containing the following components:
outlier |
A matrix with only zeros and ones. A value of 1 classifies the observation as an outlier in this projection direction. |
k |
The factor k used. |
location |
The name of the location estimator used. |
scale |
The name of the scale estimator used. |
PPindex |
The name of the PPindex used. |
PPalg |
The name of the PPalg used. |
Klaus Nordhausen
Ruiz-Gazen, A., Larabi Marie-Sainte, S. and Berro, A. (2010), Detecting multivariate outliers using projection pursuit with particle swarm optimization, COMPSTAT2010, pp. 89-98.
# creating data with 3 outliers
n <- 300
p <- 10
X <- matrix(rnorm(n*p), ncol=p)
X[1,1] <- 9
X[2,4] <- 7
X[3,6] <- 8
# giving the data rownames, obs.1, obs.2 and obs.3 are the outliers.
rownames(X) <- paste("obs", 1:n, sep=".")
PP <- EPPlab(X, PPalg="PSO", PPindex="KurtosisMax", n.simu=20, maxiter=20)
OUT <- EPPlabOutlier(PP, k = 3, location = median, scale = mad)
OUT
Calculates the projections of the data object onto the directions from an underlying epplab object.
## S3 method for class 'epplab'
fitted(object, which = 1, ...)
object |
Object of class 'epplab'. |
which |
Onto which direction should the new data be projected. |
... |
Additional parameters |
The default projection direction is the direction with the best objective criterion.
A matrix, having in each column the projection onto the direction of a certain run, and in each row the projected value.
Daniel Fischer
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
# Projection to the best direction
fitted(res)
# Projection to the 1,2,5 best directions:
fitted(res, which=c(1,2,5))
Plots a scatterplot matrix of fitted scores of an epplab object.
## S3 method for class 'epplab'
pairs(x, which = 1:10, ...)
x |
Object of class 'epplab'. |
which |
Which simulation runs should be taken into account. |
... |
Graphical parameters, see also par(). |
The option which can restrict the output to certain simulation runs. In case of many simulations, this might improve the readability.
Daniel Fischer
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
pairs(res)
The function offers three informative plots for an epplab object.
## S3 method for class 'epplab'
plot(
  x,
  type = "kernel",
  angles = "radiants",
  kernel = "biweight",
  which = 1:10,
  as.table = TRUE,
  ...
)
x |
Object of class 'epplab'. |
type |
Type of plot, values are "kernel", "histogram" and "angles". |
angles |
Values are "degree" and "radiants", if type="angles". |
kernel |
Type of kernel, passed on to density. |
which |
Which simulation runs should be taken into account. |
as.table |
A logical flag that controls the order in which panels should be displayed. |
... |
Graphical parameters, see also |
The option which can restrict the output to certain simulation runs. In case of many simulations, this might improve the readability.
For type="kernel", the default, it plots a kernel density estimate for each of the chosen directions. In the case of type="histogram", it plots the corresponding histograms. For type="angles", it plots the angles of the first chosen direction against all others. Whether the angles are given in degrees or radians depends on the value of angles.
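The quantity shown in the angles plot can be illustrated with a small helper. The function `direction_angle` is hypothetical; in particular, treating a direction and its negative as the same projection (by taking the absolute cosine) is an assumption and may not match what the plot method computes.

```r
# Angle between two direction vectors, ignoring the sign of the directions.
direction_angle <- function(a, b, angles = c("radiants", "degree")) {
  angles <- match.arg(angles)
  cosine <- abs(sum(a * b)) / sqrt(sum(a^2) * sum(b^2))
  theta <- acos(min(1, cosine))            # guard against rounding above 1
  if (angles == "degree") theta * 180 / pi else theta
}

ang_deg <- direction_angle(c(1, 0), c(1, 1), angles = "degree")
ang_rad <- direction_angle(c(1, 0), c(1, 1))
```

Small angles between the best direction and the others suggest that several runs ended in the same local optimum.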
Daniel Fischer, Klaus Nordhausen
xyplot, densityplot, histogram, density
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
# Plot with kernel estimator
plot(res)
# Just the best 5 and then 8
plot(res, which=c(1:5,8))
# Plot as histogram
plot(res, type="histogram")
# Plot an angles plot
plot(res, type="angles")
Visualizes which observations are considered outliers, and how often, for an epplabOutlier object.
## S3 method for class 'epplabOutlier'
plot(x, col = c("white", "black"), outlier = TRUE, xlab = "", ylab = "", ...)
x |
Object of class 'epplabOutlier'. |
col |
Which colors should be used for non-outliers and outliers. Default is white and black. |
outlier |
Logical; if TRUE, only observations considered outliers at least once are plotted, otherwise all observations are plotted. Default is TRUE. |
xlab |
Sets the x-axis label. |
ylab |
Sets the y-axis label. |
... |
Graphical parameters passed on to |
Daniel Fischer and Klaus Nordhausen
# creating data with 3 outliers
n <- 300
p <- 10
X <- matrix(rnorm(n*p), ncol=p)
X[1,1] <- 9
X[2,4] <- 7
X[3,6] <- 8
# giving the data rownames, obs.1, obs.2 and obs.3 are the outliers.
rownames(X) <- paste("obs", 1:n, sep=".")
PP <- EPPlab(X, PPalg="PSO", PPindex="KurtosisMax", n.simu=20, maxiter=20)
OUT <- EPPlabOutlier(PP, k = 3, location = median, scale = mad)
plot(OUT)
Calculates the projections of a new data object onto the directions from an existing epplab object.
## S3 method for class 'epplab'
predict(object, which = 1, data = NULL, ...)
object |
Object of class 'epplab'. |
which |
Onto which direction should the new data be projected. |
data |
The new data object |
... |
Additional parameters |
The default projection direction is the direction with the best objective criterion. If no data is given to the function, the fitted scores for the original data are returned.
A matrix having in each column the projection onto the direction of a certain run and in each row the projected value.
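Conceptually, projecting new observations amounts to centering them with the mean of the original data and multiplying by the chosen directions. The helper `project_new` below is a hypothetical base-R sketch of that idea; it ignores the standardisation or whitening transformation that an epplab object also stores, so it is not a replacement for predict().

```r
# Hypothetical helper: centre new data with the stored mean, then project.
project_new <- function(newdata, center, dirs) {
  sweep(as.matrix(newdata), 2, center, "-") %*% dirs
}

set.seed(3)
train <- matrix(rnorm(100 * 4), ncol = 4)
center <- colMeans(train)                                # mean of the original data
dirs <- cbind(c(1, 0, 0, 0), c(0, 1, 1, 0) / sqrt(2))    # two unit directions
newdata <- matrix(rnorm(10 * 4), ncol = 4)

proj <- project_new(newdata, center, dirs)               # one column per direction
```

The result has one row per new observation and one column per direction, matching the layout described above.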
Daniel Fischer
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
newData <- matrix(rnorm(80), ncol=8)
# Projection on the best direction
predict(res, data=newData)
# Projection on the best 3 directions
predict(res, which=1:3, data=newData)
# Similar with function fitted() when no data is given:
predict(res)
fitted(res)
Prints an epplab object.
## S3 method for class 'epplab'
print(x, ...)
x |
Object of class 'epplab'. |
... |
Additional parameters |
The print function displays the result with the best value in the objective criterion.
Daniel Fischer
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
print(res)
Prints an epplabOutlier object.
## S3 method for class 'epplabOutlier'
print(x, ...)
x |
Object of class 'epplabOutlier'. |
... |
Additional parameters |
The print function displays only the outlier matrix from the epplabOutlier object.
Klaus Nordhausen
# creating data with 3 outliers
n <- 300
p <- 10
X <- matrix(rnorm(n*p), ncol=p)
X[1,1] <- 9
X[2,4] <- 7
X[3,6] <- 8
# giving the data rownames, obs.1, obs.2 and obs.3 are the outliers.
rownames(X) <- paste("obs", 1:n, sep=".")
PP <- EPPlab(X, PPalg="PSO", PPindex="KurtosisMax", n.simu=20, maxiter=20)
OUT <- EPPlabOutlier(PP, k = 3, location = median, scale = mad)
OUT
The data set contains 55 variables measured during the production process of 520 units. The problem for this data is to identify the produced items with a fault not detected by standard tests.
A data frame with 520 observations on 55 variables.
The data was anonymized to keep confidentiality.
data(ReliabilityData)
summary(ReliabilityData)
Plots the objective criteria of an epplab object versus the simulation runs.
## S3 method for class 'epplab'
screeplot(
  x,
  type = "lines",
  which = 1:10,
  main = "",
  ylab = "Objective criterion",
  xlab = "Simulation run",
  ...
)
x |
Object of class 'epplab'. |
type |
Type of screeplot, values are "barplot" and "lines" |
which |
Which simulation runs should be taken into account |
main |
Main title of the plot |
ylab |
Y-axis label |
xlab |
X-axis label |
... |
Graphical parameters, see also par() |
The option which can restrict the output to certain simulation runs. In case of many simulations, this might improve the readability. The barplot option is often not meaningful, because the objective criteria are usually large and bars begin by default at zero.
Daniel Fischer
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
screeplot(res)
# Pretty useless:
screeplot(res, type="barplot")
screeplot(res, which=1:5)
Summarizes and prints an epplab object in an informative way.
## S3 method for class 'epplab'
summary(object, which = 1:10, digits = 4, ...)
object |
Object of class 'epplab'. |
which |
Summary for |
digits |
Number of displayed decimal digits |
... |
Additional parameters |
The option which can restrict the output to certain simulation runs. In case of many simulations, this might improve the readability.
Daniel Fischer
library(tourr)
data(olive)
res <- EPPlab(olive[,3:10], PPalg="PSO", PPindex="KurtosisMin", n.simu=10, maxiter=20)
summary(res)
Summarizes and prints an epplabOutlier object in an informative way.
## S3 method for class 'epplabOutlier'
summary(object, ...)
object |
Object of class 'epplabOutlier'. |
... |
Additional parameters |
The main information provided here is a table with names of the observations which are considered outliers and in how many PP directions they are considered outliers. This function is useful if the data has been given row names.
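The kind of table described here can be reproduced from any 0/1 outlier matrix with row names. The matrix below is made up for illustration and the code is a base-R sketch, not the summary method itself.

```r
# Rows are observations, columns are PP directions; 1 marks an outlier.
outlier <- matrix(0, nrow = 5, ncol = 4,
                  dimnames = list(paste("obs", 1:5, sep = "."),
                                  paste0("Run", 1:4)))
outlier["obs.1", ] <- 1             # flagged in every direction
outlier["obs.3", c(1, 4)] <- 1      # flagged in two directions

counts <- rowSums(outlier)          # number of directions flagging each row
flagged <- counts[counts > 0]       # observations flagged at least once
```

Without row names the counts would be unlabeled, which is why the summary is most useful when the data carry row names.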
Klaus Nordhausen
# creating data with 3 outliers
n <- 300
p <- 10
X <- matrix(rnorm(n*p), ncol=p)
X[1,1] <- 9
X[2,4] <- 7
X[3,6] <- 8
# giving the data rownames, obs.1, obs.2 and obs.3 are the outliers.
rownames(X) <- paste("obs", 1:n, sep=".")
PP <- EPPlab(X, PPalg="PSO", PPindex="KurtosisMax", n.simu=20, maxiter=20)
OUT <- EPPlabOutlier(PP, k = 3, location = median, scale = mad)
summary(OUT)
The function whitens a data matrix using the singular value decomposition.
WhitenSVD(x, tol = 1e-06)
x |
A numeric data frame or data matrix with at least two rows. |
tol |
Tolerance value to decide the rank of the data. See details for further information. If set to NULL, the rank is set to the number of singular values. |
The function whitens the data, using the function svd, so that the result has mean zero and identity covariance matrix. The data can have fewer observations than variables, and svd will determine the rank of the data automatically as the number of singular values larger than the largest singular value times tol. If tol=NULL, the rank is set to the number of singular values, which is not advised when one or more singular values are close to zero.
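The idea behind SVD-based whitening with a rank tolerance can be sketched in a few lines of base R. The helper `whiten_svd` is hypothetical and only returns the whitened scores; unlike WhitenSVD it does not attach the singular values or the back-transformation as attributes.

```r
# Hypothetical sketch of whitening via the singular value decomposition.
whiten_svd <- function(x, tol = 1e-6) {
  x <- as.matrix(x)
  n <- nrow(x)
  xc <- sweep(x, 2, colMeans(x), "-")       # centre the columns
  s <- svd(xc)
  rank <- sum(s$d > s$d[1] * tol)           # singular values above the threshold
  # The left singular vectors are orthonormal and have zero column means,
  # so sqrt(n - 1) * U has identity sample covariance.
  sqrt(n - 1) * s$u[, seq_len(rank), drop = FALSE]
}

set.seed(11)
X <- matrix(rnorm(200), ncol = 4) %*% matrix(runif(16), 4, 4)   # correlated columns
W <- whiten_svd(X)
```

When the data have fewer observations than variables, the number of retained columns drops to the numerical rank, which is what the tolerance controls.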
The output contains among others as attributes the singular values and the matrix needed to backtransform the whitened data to its original space.
The whitened data.
Klaus Nordhausen
# more observations than variables
X <- matrix(rnorm(200), ncol=4)
A <- WhitenSVD(X)
round(colMeans(A), 4)
round(cov(A), 4)
# how to backtransform
summary(sweep(A %*% (attr(A,"backtransform")), 2, attr(A,"center"), "+") - X)

# fewer observations than variables
Y <- cbind(rexp(4), runif(4), rnorm(4), runif(4), rnorm(4), rt(4,4))
B <- WhitenSVD(Y)
round(colMeans(B), 4)
round(cov(B), 4)
# how to backtransform
summary(sweep(B %*% (attr(B,"backtransform")), 2, attr(B,"center"), "+") - Y)