Title: | Excess Mass Calculation and Plots |
---|---|
Description: | Implements a function, excessm(), that calculates the empirical excess mass for a given λ and a given maximal number of modes. It also provides plotting functions to visualize the empirical excess mass (exmplot()), including the possibility of drawing several plots (with different maximal numbers of modes / cut off values) in a single graph. |
Authors: | Marc-Daniel Mildenberger |
Maintainer: | Marc-Daniel Mildenberger |
License: | LGPL |
Version: | 1.0.1 |
Built: | 2024-12-13 06:48:36 UTC |
Source: | CRAN |
Calculates the empirical excess mass for a given λ and a given maximal number of modes.
excessm(x, lambda, M = 1, UpToM = FALSE)
x |
data in form of a vector |
lambda |
value of λ for which the empirical excess mass is calculated |
M |
maximal number of modes |
UpToM |
if true, the intervals for modes up to M are returned |
intervals |
Matrix containing the empirical λ-clusters |
excess_mass |
returns a vector with excess masses, the |
Please note that allowing for M modes does not necessarily result in M λ-clusters. Hence, the number of intervals returned can be smaller than M. In this case a warning is displayed, and the vector excess_mass has fewer than M entries.
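As a rough illustration of the quantity being computed, the following base R sketch evaluates the empirical excess mass for a single interval (M = 1) by brute force. It assumes the definition from Muller and Sawitzki (1991), in which an interval contributes its share of data points minus λ times its length. The function name excess_mass_one_mode and the toy data are hypothetical; this is not the algorithm used by excessm, which also handles M > 1.

```r
# Hypothetical helper, not part of ExcessMass: brute-force search over all
# closed intervals [x_(i), x_(j)] whose end points are data points.
excess_mass_one_mode <- function(x, lambda) {
  x <- sort(x)
  n <- length(x)
  best <- list(excess_mass = -Inf, lower = NA_real_, upper = NA_real_)
  for (i in seq_len(n)) {
    j <- i:n
    # share of points in [x[i], x[j]] minus lambda times the interval length
    vals <- (j - i + 1) / n - lambda * (x[j] - x[i])
    k <- which.max(vals)
    if (vals[k] > best$excess_mass) {
      best <- list(excess_mass = vals[k], lower = x[i], upper = x[j[k]])
    }
  }
  best
}

x <- c(0, 0.1, 0.2, 5)   # toy data, for illustration only
res <- excess_mass_one_mode(x, lambda = 1)
print(res)
```

The search is quadratic in the number of observations, so it is only meant for small samples and for checking intuition.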
Marc-Daniel Mildenberger, based on earlier code from Dr. Guenther Sawitzki
Muller, D. W. and Sawitzki, G., 09.1991, "Excess Mass Estimates and Tests for Multimodality", Journal of the American Statistical Association, Vol. 86, No. 415, pp. 738–746, http://www.jstor.org/stable/2290406
exmplot, exmsilhouette, mexmsilhouette
    library(ExcessMass)
    library(MASS)
    attach(geyser)

    ## calculating excess mass for duration of 'Old Faithful Geyser'
    ## for lambda=0.2, allowing for one mode
    excessm(duration, lambda=0.2)

    ## same as above, but allowing for up to three modes
    excessm(duration, lambda=0.2, M=3)

    ## returns the intervals for modes 1, 2 and 3
    excessm(duration, lambda=0.2, M=3, UpToM=TRUE)
Implements a function that calculates the empirical excess mass for a given λ and a given maximal number of modes (excessm). It also provides plotting functions to visualize the empirical excess mass (exmsilhouette), including the possibility of drawing several plots (with different maximal numbers of modes / cut off values) in a single graph. Furthermore, plotting the empirical excess mass against λ is implemented (exmplot).
Package: | ExcessMass |
Type: | Package |
Version: | 1.0.1 |
Date: | 2017-05-17 |
License: | GPL |
Marc-Daniel Mildenberger, based on earlier code from Dr. Guenther Sawitzki
Muller, D. W. and Sawitzki, G., 09.1991, "Excess Mass Estimates and Tests for Multimodality", Journal of the American Statistical Association, Vol. 86, No. 415, pp. 738–746, http://www.jstor.org/stable/2290406
Muller, D. W., 12.1992, "The Excess Mass Approach in Statistics", Beitraege zur Statistik – StatLab Heidelberg, http://archiv.ub.uni-heidelberg.de/volltextserver/21357/
    library(ExcessMass)
    library(MASS)
    attach(geyser)

    excessm(duration, lambda=0.2)

    x <- rnorm(1000)
    exmsilhouette(x, M=2, CutOff=0.5)

    mexmsilhouette(duration, CutOff=c(1,2), steps=60)
Produces an excess mass lambda plot and calculates the maximal excess mass difference achieved by allowing for an additional mode.
exmplot(xdata, M=1, CutOff=1, steps=50, Lambda=NULL)
xdata |
data in form of a vector |
M |
the maximal number of modes |
CutOff |
determines the cut off value and hence the level up to which the |
steps |
number of different λs |
Lambda |
allows specifying a custom vector of λs |
CutOff should not be set too small or too large, as this results in meaningless graphs.
The excess mass for several values of λ can be calculated by specifying Lambda.
An Excess Mass Lambda plot is produced. The lines in the plot are sorted by the maximal number of modes from left to right, due to the monotonicity of the excess mass in M.
max_dist |
The i-th entry is the maximal difference in excess mass gained by allowing for up to i+1 instead of i modes |
max_dist_Lambda |
Shows the λ at which the corresponding maximal distance is achieved |
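To make the two returned quantities more concrete, here is a small base R sketch that computes excess mass curves for up to one and up to two modes on a λ grid and then takes the largest gap between them. It is a brute-force illustration under the Muller and Sawitzki (1991) definition, not the package's algorithm; the helper names em_interval and em_up_to_two, the toy sample and the λ grid are all hypothetical.

```r
# Hypothetical helper: excess mass of the best single interval among the
# sorted points xs, with masses normalised by the full sample size n_total.
em_interval <- function(xs, lambda, n_total) {
  best <- -Inf
  for (i in seq_along(xs)) {
    j <- i:length(xs)
    best <- max(best, (j - i + 1) / n_total - lambda * (xs[j] - xs[i]))
  }
  best
}

# Excess mass allowing for up to two intervals: split the sorted sample at
# every position and add the best interval found on each side.
em_up_to_two <- function(x, lambda) {
  x <- sort(x)
  n <- length(x)
  one <- em_interval(x, lambda, n)
  two <- max(vapply(seq_len(n - 1), function(s) {
    em_interval(x[1:s], lambda, n) + em_interval(x[(s + 1):n], lambda, n)
  }, numeric(1)))
  c(one = one, two = max(one, two))
}

x <- c(0, 0.1, 0.2, 0.3, 2, 2.1, 2.2, 2.3)   # toy bimodal sample
lambdas <- seq(0.05, 1, by = 0.05)
em <- t(sapply(lambdas, function(l) em_up_to_two(x, l)))

diffs <- em[, "two"] - em[, "one"]
max_dist <- max(diffs)                        # analogue of max_dist[1]
max_dist_lambda <- lambdas[which.max(diffs)]  # analogue of max_dist_Lambda[1]
print(c(max_dist = max_dist, max_dist_lambda = max_dist_lambda))
```

Because the curve for two modes never lies below the curve for one mode, the differences are non-negative, which matches the monotonicity in M mentioned above.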
Marc-Daniel Mildenberger, based on earlier code from Dr. Guenther Sawitzki
Muller, D. W. and Sawitzki, G., 09.1991, "Excess Mass Estimates and Tests for Multimodality", Journal of the American Statistical Association, Vol. 86, No. 415, pp. 738–746, http://www.jstor.org/stable/2290406
excessm, exmsilhouette, mexmsilhouette
    library(ExcessMass)
    library(MASS)
    attach(geyser)

    ## calculating the maximal excess mass difference for duration
    ## of 'Old Faithful Geyser' for M=3
    exmplot(duration, M=3)

    ## plotting the excess mass against lambda for modes 1-5,
    ## increase CutOff value, double the number of steps
    exmplot(duration, M=5, CutOff=1.2, steps=100)

    ## specifying Lambda
    Lambda <- seq(0, 0.5, 0.005)
    exmplot(duration, M=7, Lambda=Lambda)
Produces an excess mass plot and the corresponding numerical values if required.
exmsilhouette(xdata, M = 1, CutOff = 1, steps = 50, rug = TRUE, Lambda = NULL, col = FALSE, rdata = FALSE, label = TRUE)
xdata |
data in form of a vector |
M |
the maximal number of modes |
CutOff |
determines the cut off value and hence the appearance of the graph |
steps |
number of different λs |
rug |
draws a rug plot at the bottom of the graph |
Lambda |
allows specifying a custom vector of λs |
col |
lines get colored in purple ( |
rdata |
a numerical output is returned |
label |
allows reducing the labeling |
CutOff should not be set too small or too large, as this results in meaningless graphs.
The excess mass for several values of λ can be calculated by specifying Lambda.
A plot is always produced. By setting rdata = TRUE, numerical results are additionally returned in the form of a two-dimensional list. The first index selects λ: the first (last) element gives access to the numerical results for the smallest (largest) λ.
If no Lambda vector is supplied, the following information is stored for each λ:
[ , 1] |
value of λ |
[ , 2] |
calculated λ-clusters |
[ , 3] |
excess mass vector |
The last two components are presented in the same way as in the output of excessm. If Lambda was set manually, the value of λ is not returned, as it is already known.
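The idea behind the silhouette can be sketched in base R: for each λ on a grid, find the best interval and draw it as a horizontal segment at height λ. The sketch below only covers M = 1, uses a brute-force search rather than the package's algorithm, and stores its results in a plain data frame instead of the list structure described above. The helper name best_interval, the toy sample and the grid are hypothetical.

```r
# Hypothetical helper: best single interval (lambda-cluster for M = 1) and
# its excess mass, found by brute force over all pairs of data points.
best_interval <- function(x, lambda) {
  x <- sort(x)
  n <- length(x)
  best <- c(lower = NA, upper = NA, excess_mass = -Inf)
  for (i in seq_len(n)) {
    j <- i:n
    vals <- (j - i + 1) / n - lambda * (x[j] - x[i])
    k <- which.max(vals)
    if (vals[k] > best[["excess_mass"]]) {
      best <- c(lower = x[i], upper = x[j[k]], excess_mass = vals[k])
    }
  }
  best
}

x <- c(0, 0.1, 0.2, 0.3, 2, 2.1, 2.2, 2.3, 2.4)   # toy sample
lambdas <- seq(0.05, 1, length.out = 20)

# One row per lambda: the lambda value, the interval and its excess mass
sil <- data.frame(lambda = lambdas,
                  t(sapply(lambdas, best_interval, x = x)))
print(head(sil))

# Draw the silhouette only in an interactive session
if (interactive()) {
  plot(range(x), range(lambdas), type = "n", xlab = "x", ylab = "lambda")
  segments(sil$lower, sil$lambda, sil$upper, sil$lambda)
  rug(x)
}
```

The rows of sil play a role similar to the numerical output requested with rdata = TRUE, although the actual return value of exmsilhouette is structured differently.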
Marc-Daniel Mildenberger, based on earlier code from Dr. Guenther Sawitzki
Muller, D. W. and Sawitzki, G., 09.1991, "Excess Mass Estimates and Tests for Multimodality", Journal of the American Statistical Association, Vol. 86, No. 415, pp. 738–746, http://www.jstor.org/stable/2290406
excessm, mexmsilhouette, exmplot
    library(ExcessMass)
    library(MASS)
    attach(geyser)

    ## Plot allowing for up to two modes and reduced CutOff value
    exmsilhouette(duration, M=2, CutOff=1.25)

    ## Plot with twice the default number of steps, omitting rug plot,
    ## colorizing the graph and asking for numerical output
    res <- exmsilhouette(duration, M=2, CutOff=1.25, steps=100,
                         rug=FALSE, col=TRUE, rdata=TRUE)

    ## Specifying Lambda and requesting numerical output
    L <- seq(0.01, 0.25, 0.005)
    res <- exmsilhouette(duration, M=3, Lambda=L, col=TRUE, rdata=TRUE)
Produces a graph with several excess mass plots, allowing for different maximal numbers of modes / cut off values.
mexmsilhouette(xdata, M = 1:3, CutOff = c(1,2,5), steps = 30, Lambda = NULL, col = FALSE, rug = TRUE, rdata = FALSE)
xdata |
data in form of a vector |
M |
vector containing the max. number of modes |
CutOff |
vector which determines the cut off values and hence the appearance of the graph |
steps |
number of different λs |
Lambda |
allows specifying a custom vector of λs |
col |
lines get colored in purple ( |
rug |
draws a rug plot at the bottom of the graph |
rdata |
a numerical output is returned |
CutOff should not be set too small or too large, as this results in meaningless graphs.
A graph with multiple plots is always produced. Each column corresponds to a different maximal number of modes and each row to a different CutOff factor.
By setting rdata=TRUE, numerical results are returned in the form of a list. If the number of modes and the CutOff parameter each contain just one element, the outputs of mexmsilhouette and exmsilhouette are equal.
Otherwise, two cases can be distinguished. In the first case, Lambda is not specified and the list is four-dimensional. The first index selects the CutOff value by its position in the sorted CutOff vector (in the plot, this is the row in which the graph is shown). The second index selects the maximal number of modes by its position in the sorted mode vector (in the plot, this is the column). The third index selects the λ of the graph. For each plot and each λ, the following information is stored: the value of λ, the λ-clusters and the excess mass vector. For example, indexing with a CutOff position, a mode position, 5 and 2, in that order, shows the λ-clusters of the fifth smallest λ of the selected plot.
In the second case, Lambda is declared manually and the list is three-dimensional. The first index denotes the maximal number of modes (the column of the graph). The second index identifies λ by its position in the Lambda vector. As in exmsilhouette, only two pieces of information are stored, the λ-clusters and the vector of excess masses, as the value of λ is already known.
Marc-Daniel Mildenberger, based on earlier code from Dr. Guenther Sawitzki
Muller, D. W. and Sawitzki, G., 09.1991, "Excess Mass Estimates and Tests for Multimodality", Journal of the American Statistical Association, Vol. 86, No. 415, pp. 738–746, http://www.jstor.org/stable/2290406
excessm, exmplot, exmsilhouette
    library(ExcessMass)
    library(MASS)
    attach(geyser)

    ## calculating excess mass plots for duration of 'Old Faithful Geyser',
    ## specifying CutOff and number of steps manually
    mexmsilhouette(duration, CutOff=c(1,2), steps=60)

    ## Allowing for three different maximal number of modes
    ## and CutOff factors as well as color.
    ## The rug plot is omitted and numerical data is requested.
    res <- mexmsilhouette(duration, M=c(2,3,7), CutOff=c(0.8,1,2),
                          col=TRUE, rug=FALSE, rdata=TRUE)

    ## Lambda is specified, color is set to true, numerical data is requested
    L <- seq(0.01, 0.25, 0.005)
    res <- mexmsilhouette(duration, M=c(2,3,4), Lambda=L, col=TRUE, rdata=TRUE)
Gives a rough approximation of the maximal λ.
searchMaxLambda(x, limcount = 4, step = 1.05, trylambda = 0.01)
x |
data in form of a vector |
limcount |
limcount is divided by the square root of the number of data points; the result determines the cut off value |
step |
determines step size |
trylambda |
initial λ |
The excess mass is first calculated for the initial λ given by trylambda. If the resulting excess mass is larger (smaller) than the cut off value, λ is increased (decreased) according to step and the excess mass is calculated again, until it is smaller (larger) than the cut off value. The corresponding λ is returned.
The approximation allows for only one λ-cluster, as scans including more λ-clusters have high computational costs due to the recursive structure of the algorithm.
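The search described above can be sketched in base R as follows. Two points are assumptions rather than documented behaviour: the step is applied multiplicatively (λ times step, respectively λ divided by step), and the excess mass for one λ-cluster is computed by brute force. The names em_one_mode and search_max_lambda_sketch, the max_iter safeguard and the toy data are hypothetical and not part of ExcessMass.

```r
# Hypothetical helper: brute-force empirical excess mass for one interval.
em_one_mode <- function(x, lambda) {
  x <- sort(x)
  n <- length(x)
  best <- -Inf
  for (i in seq_len(n)) {
    j <- i:n
    best <- max(best, (j - i + 1) / n - lambda * (x[j] - x[i]))
  }
  best
}

# Sketch of the search; assumes a multiplicative step (an assumption).
search_max_lambda_sketch <- function(x, limcount = 4, step = 1.05,
                                     trylambda = 0.01, max_iter = 1000) {
  cutoff <- limcount / sqrt(length(x))   # cut off value as described above
  lambda <- trylambda
  # Excess mass decreases in lambda, so raise lambda if we start above cutoff
  increasing <- em_one_mode(x, lambda) > cutoff
  for (iter in seq_len(max_iter)) {      # max_iter guards against endless loops
    em <- em_one_mode(x, lambda)
    if (increasing && em <= cutoff) break
    if (!increasing && em >= cutoff) break
    lambda <- if (increasing) lambda * step else lambda / step
  }
  lambda
}

x <- seq(0, 1, length.out = 101)   # toy data: evenly spaced points
lam <- search_max_lambda_sketch(x)
print(lam)
```

With a step close to 1 the returned value brackets the cut off more tightly, at the price of more excess mass evaluations.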
λ, calculated as described in Details.
Marc-Daniel Mildenberger, based on earlier code from Dr. Guenther Sawitzki
Muller, D. W. and Sawitzki, G., 09.1991, "Excess Mass Estimates and Tests for Multimodality", Journal of the American Statistical Association, Vol. 86, No. 415, pp. 738–746, http://www.jstor.org/stable/2290406
excessm, exmplot, exmsilhouette, mexmsilhouette
    library(ExcessMass)
    library(MASS)
    attach(geyser)

    # Calculating Lambda using standard settings
    searchMaxLambda(duration)

    # Calculating Lambda, reducing cut off value and step. Setting another initial lambda
    searchMaxLambda(duration, limcount = 5, step = 1.01, trylambda = 1)