Title: | (Robust) Canonical Correlation Analysis via Projection Pursuit |
---|---|
Description: | Canonical correlation analysis and maximum correlation via projection pursuit, as well as fast implementations of correlation estimators, with a focus on robust and nonparametric methods. |
Authors: | Andreas Alfons [aut, cre] , David Simcha [ctb] (O(n log(n)) implementation of Kendall correlation) |
Maintainer: | Andreas Alfons <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.3.4 |
Built: | 2024-12-04 07:22:03 UTC |
Source: | CRAN |
Canonical correlation analysis and maximum correlation via projection pursuit, as well as fast implementations of correlation estimators, with a focus on robust and nonparametric methods.
The DESCRIPTION file:
Package: | ccaPP |
Type: | Package |
Title: | (Robust) Canonical Correlation Analysis via Projection Pursuit |
Version: | 0.3.4 |
Date: | 2024-09-04 |
Depends: | R (>= 3.2.0), parallel, pcaPP (>= 1.8-1), robustbase |
Imports: | Rcpp (>= 0.11.0) |
LinkingTo: | Rcpp (>= 0.11.0), RcppArmadillo (>= 0.4.100.0) |
Suggests: | knitr, mvtnorm |
VignetteBuilder: | knitr |
Description: | Canonical correlation analysis and maximum correlation via projection pursuit, as well as fast implementations of correlation estimators, with a focus on robust and nonparametric methods. |
License: | GPL (>= 2) |
URL: | https://github.com/aalfons/ccaPP |
BugReports: | https://github.com/aalfons/ccaPP/issues |
LazyLoad: | yes |
Authors@R: | c(person("Andreas", "Alfons", email = "[email protected]", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-2513-3788")), person("David", "Simcha", role = "ctb", comment = "O(n log(n)) implementation of Kendall correlation")) |
Author: | Andreas Alfons [aut, cre] (<https://orcid.org/0000-0002-2513-3788>), David Simcha [ctb] (O(n log(n)) implementation of Kendall correlation) |
Maintainer: | Andreas Alfons <[email protected]> |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | yes |
Packaged: | 2024-09-04 18:34:57 UTC; alfons |
Repository: | CRAN |
Date/Publication: | 2024-09-04 22:20:10 UTC |
Index of help topics:
ccaGrid | (Robust) CCA via alternating series of grid searches |
ccaPP-package | (Robust) Canonical Correlation Analysis via Projection Pursuit |
ccaProj | (Robust) CCA via projections through the data points |
corFunctions | Fast implementations of (robust) correlation estimators |
diabetes | Diabetes data |
fastMAD | Fast implementation of the median absolute deviation |
fastMedian | Fast implementation of the median |
maxCorGrid | (Robust) maximum correlation via alternating series of grid searches |
maxCorProj | (Robust) maximum correlation via projections through the data points |
permTest | (Robust) permutation test for no association |
Further information is available in the following vignettes:
ccaPP-intro | Robust Maximum Association Between Data Sets: The R Package ccaPP |
A. Alfons, C. Croux and P. Filzmoser (2016) Robust maximum association between data sets: The R Package ccaPP. Austrian Journal of Statistics, 45(1), 71–79.
A. Alfons, C. Croux and P. Filzmoser (2016) Robust maximum association estimators. Journal of the American Statistical Association, 112(517), 435–445.
Perform canonical correlation analysis via projection pursuit based on alternating series of grid searches in two-dimensional subspaces of each data set, with a focus on robust and nonparametric methods.
    ccaGrid(
      x, y, k = 1,
      method = c("spearman", "kendall", "quadrant", "M", "pearson"),
      control = list(...),
      nIterations = 10, nAlternate = 10, nGrid = 25,
      select = NULL, tol = 1e-06,
      standardize = TRUE, fallback = FALSE, seed = NULL, ...
    )

    CCAgrid(
      x, y, k = 1,
      method = c("spearman", "kendall", "quadrant", "M", "pearson"),
      maxiter = 10, maxalter = 10, splitcircle = 25,
      select = NULL, zero.tol = 1e-06,
      standardize = TRUE, fallback = FALSE, seed = NULL, ...
    )
x, y | each can be a numeric vector, matrix or data frame. |
k | an integer giving the number of canonical variables to compute. |
method | a character string specifying the correlation functional to maximize. Possible values are "spearman", "kendall", "quadrant", "M" and "pearson" (see corFunctions). |
control | a list of additional arguments to be passed to the specified correlation functional. If supplied, this takes precedence over additional arguments supplied via the ... argument. |
nIterations, maxiter | an integer giving the maximum number of iterations. |
nAlternate, maxalter | an integer giving the maximum number of alternate series of grid searches in each iteration. |
nGrid, splitcircle | an integer giving the number of equally spaced grid points on the unit circle to use in each grid search. |
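select | optional; either an integer vector of length two or a list containing two index vectors. In the first case, the first integer gives the number of variables of x to be randomly selected for determining the order of the variables of y in the corresponding series of grid searches, and vice versa for the second integer. In the latter case, the first list element gives the indices of the variables of x to be used for determining the order of the variables of y, and vice versa for the second list element. |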
tol, zero.tol | a small positive numeric value to be used for determining convergence. |
standardize | a logical indicating whether the data should be (robustly) standardized. |
fallback | logical indicating whether a fallback mode for robust standardization should be used. If a correlation functional other than the Pearson correlation is maximized, the first attempt for standardizing the data is via median and MAD. In the fallback mode, variables whose MADs are zero (e.g., dummy variables) are standardized via mean and standard deviation. Note that if the Pearson correlation is maximized, standardization is always done via mean and standard deviation. |
seed | optional initial seed for the random number generator. |
... | additional arguments to be passed to the specified correlation functional. Currently, this is only relevant for the M-estimator. For Spearman, Kendall and quadrant correlation, consistency at the normal model is always forced. |
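The fallback mode described for the fallback argument can be illustrated in base R. This is a rough sketch rather than the package's implementation: the helper standardizeRobust is hypothetical, and it simply switches to mean and standard deviation for columns whose MAD is zero.

```r
# Hypothetical helper (not part of ccaPP): standardize the columns of a matrix
# via median and MAD, falling back to mean and standard deviation for columns
# whose MAD is zero when fallback = TRUE.
standardizeRobust <- function(x, fallback = FALSE) {
  x <- as.matrix(x)
  center <- apply(x, 2, median)
  scale <- apply(x, 2, mad)
  zeroMAD <- scale == 0
  if (fallback && any(zeroMAD)) {
    center[zeroMAD] <- colMeans(x[, zeroMAD, drop = FALSE])
    scale[zeroMAD] <- apply(x[, zeroMAD, drop = FALSE], 2, sd)
  }
  list(x = sweep(sweep(x, 2, center, "-"), 2, scale, "/"),
       center = center, scale = scale)
}

set.seed(1234)
dummy <- rep(c(0, 1), times = c(80, 20))  # the MAD of this dummy variable is zero
x <- cbind(continuous = rnorm(100), dummy = dummy)
fit <- standardizeRobust(x, fallback = TRUE)
fit$center
fit$scale
```

The dummy variable is centered by its mean and scaled by its standard deviation, while the continuous variable keeps its median and MAD.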
The algorithm is based on alternating series of grid searches in two-dimensional subspaces of each data set. In each grid search, nGrid grid points on the unit circle in the corresponding plane are obtained, and the directions from the center to each of the grid points are examined. In the first iteration, equispaced grid points in the interval $[-\pi/2, \pi/2)$ are used. In each subsequent iteration, the angles are halved such that the interval $[-\pi/4, \pi/4)$ is used in the second iteration and so on. If only one data set is multivariate, the algorithm simplifies to iterative grid searches in two-dimensional subspaces of the corresponding data set.
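The angle grid of a single grid search can be sketched in base R. This is only an illustration under stated assumptions, not the package's C++ code: the helper gridAngles is hypothetical, the starting interval is assumed to be $[-\pi/2, \pi/2)$ with the width halved in each later iteration, and the search is reduced to two variables of x against one fixed projection of y.

```r
# Hypothetical helper (not part of ccaPP): equispaced angles for the grid
# search in a given iteration, assuming the interval [-pi/2, pi/2) in the
# first iteration and a halved interval in each subsequent iteration.
gridAngles <- function(nGrid = 25, iteration = 1) {
  halfWidth <- pi / 2^iteration
  # nGrid equispaced points in [-halfWidth, halfWidth), right end excluded
  seq(-halfWidth, halfWidth, length.out = nGrid + 1)[-(nGrid + 1)]
}

# Each angle corresponds to a direction on the unit circle
angles <- gridAngles(nGrid = 25, iteration = 1)
directions <- cbind(cos(angles), sin(angles))

# One simplified grid search: find the direction in the plane of two
# x-variables that maximizes the absolute Spearman correlation with a
# fixed projection of y
set.seed(1234)
x <- matrix(rnorm(200), ncol = 2)
yProj <- x[, 1] + 0.5 * x[, 2] + rnorm(100)
r <- apply(directions, 1, function(a) {
  cor(drop(x %*% a), yProj, method = "spearman")
})
best <- directions[which.max(abs(r)), ]
```

Halving the interval in later iterations refines the search around the current best direction instead of scanning the whole half circle again.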
In the basic algorithm, the order of the variables in a series of grid searches for each of the data sets is determined by the average absolute correlations with the variables of the respective other data set. Since this requires computing the full $(p \times q)$ matrix of absolute correlations, where $p$ denotes the number of variables of x and $q$ the number of variables of y, a faster modification is available as well. In this modification, the average absolute correlations are computed over only a subset of the variables of the respective other data set. It is thereby possible to use randomly selected subsets of variables, or to specify the subsets of variables directly.
Note that the data sets themselves are also ordered according to the maximum average absolute correlation with the respective other data set to ensure symmetry of the algorithm.
For higher order canonical correlations, the data are first transformed into suitable subspaces. Then the alternate grid algorithm is applied to the reduced data and the results are back-transformed to the original space.
An object of class "cca" with the following components:
cor | a numeric vector giving the canonical correlation measures. |
A | a numeric matrix in which the columns contain the canonical vectors for x. |
B | a numeric matrix in which the columns contain the canonical vectors for y. |
centerX | a numeric vector giving the center estimates used in standardization of x. |
centerY | a numeric vector giving the center estimates used in standardization of y. |
scaleX | a numeric vector giving the scale estimates used in standardization of x. |
scaleY | a numeric vector giving the scale estimates used in standardization of y. |
call | the matched function call. |
CCAgrid is a simple wrapper function for ccaGrid for more compatibility with package pcaPP concerning function and argument names.
Andreas Alfons
ccaProj, maxCorGrid, corFunctions
    data("diabetes")
    x <- diabetes$x
    y <- diabetes$y

    ## Spearman correlation
    ccaGrid(x, y, method = "spearman")

    ## Pearson correlation
    ccaGrid(x, y, method = "pearson")
Perform canonical correlation analysis via projection pursuit based on projections through the data points, with a focus on robust and nonparametric methods.
    ccaProj(
      x, y, k = 1,
      method = c("spearman", "kendall", "quadrant", "M", "pearson"),
      control = list(...),
      standardize = TRUE, useL1Median = TRUE, fallback = FALSE, ...
    )

    CCAproj(
      x, y, k = 1,
      method = c("spearman", "kendall", "quadrant", "M", "pearson"),
      standardize = TRUE, useL1Median = TRUE, fallback = FALSE, ...
    )
x, y | each can be a numeric vector, matrix or data frame. |
k | an integer giving the number of canonical variables to compute. |
method | a character string specifying the correlation functional to maximize. Possible values are "spearman", "kendall", "quadrant", "M" and "pearson" (see corFunctions). |
control | a list of additional arguments to be passed to the specified correlation functional. If supplied, this takes precedence over additional arguments supplied via the ... argument. |
standardize | a logical indicating whether the data should be (robustly) standardized. |
useL1Median | a logical indicating whether the L1 medians should be used as the centers of the data sets in standardization (defaults to TRUE). |
fallback | logical indicating whether a fallback mode for robust standardization should be used. If a correlation functional other than the Pearson correlation is maximized, the first attempt for standardizing the data is via median and MAD. In the fallback mode, variables whose MADs are zero (e.g., dummy variables) are standardized via mean and standard deviation. Note that if the Pearson correlation is maximized, standardization is always done via mean and standard deviation. |
... | additional arguments to be passed to the specified correlation functional. Currently, this is only relevant for the M-estimator. For Spearman, Kendall and quadrant correlation, consistency at the normal model is always forced. |
First the candidate projection directions are defined for each data set from the respective center through each data point. Then the algorithm scans all $n^2$ possible combinations for the maximum correlation, where $n$ is the number of observations.
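The scan over pairs of candidate directions can be sketched in base R. This is a simplified illustration, not the package's implementation: the helper candidateDirections is hypothetical, columnwise medians are assumed as the centers, and the Spearman correlation from stats::cor stands in for the correlation functional.

```r
# Simplified sketch: candidate directions run from the center of each data
# set through each data point; all pairs of directions are then scanned for
# the maximum absolute Spearman correlation.
set.seed(1234)
n <- 30
x <- matrix(rnorm(n * 2), ncol = 2)
y <- cbind(x[, 1] + rnorm(n, sd = 0.5), rnorm(n))

# Hypothetical helper (not part of ccaPP)
candidateDirections <- function(z) {
  centered <- sweep(z, 2, apply(z, 2, median), "-")  # columnwise medians as center
  norms <- sqrt(rowSums(centered^2))
  keep <- norms > 0
  centered[keep, , drop = FALSE] / norms[keep]       # unit-length directions
}

A <- candidateDirections(x)
B <- candidateDirections(y)
projX <- x %*% t(A)  # column i holds x projected onto the i-th direction
projY <- y %*% t(B)
r <- abs(cor(projX, projY, method = "spearman"))  # one entry per pair of directions
best <- which(r == max(r), arr.ind = TRUE)[1, ]
a <- A[best[1], ]  # weighting vector for x
b <- B[best[2], ]  # weighting vector for y
```

Because the number of pairs grows quadratically with the number of observations, this exhaustive scan becomes expensive for large data sets.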
For higher order canonical correlations, the data are first transformed into suitable subspaces. Then the alternate grid algorithm is applied to the reduced data and the results are back-transformed to the original space.
An object of class "cca" with the following components:
cor | a numeric vector giving the canonical correlation measures. |
A | a numeric matrix in which the columns contain the canonical vectors for x. |
B | a numeric matrix in which the columns contain the canonical vectors for y. |
centerX | a numeric vector giving the center estimates used in standardization of x. |
centerY | a numeric vector giving the center estimates used in standardization of y. |
scaleX | a numeric vector giving the scale estimates used in standardization of x. |
scaleY | a numeric vector giving the scale estimates used in standardization of y. |
call | the matched function call. |
CCAproj is a simple wrapper function for ccaProj for more compatibility with package pcaPP concerning function names.
Andreas Alfons
ccaGrid, maxCorProj, corFunctions
    data("diabetes")
    x <- diabetes$x
    y <- diabetes$y

    ## Spearman correlation
    ccaProj(x, y, method = "spearman")

    ## Pearson correlation
    ccaProj(x, y, method = "pearson")
Estimate the correlation of two vectors via fast C++ implementations, with a focus on robust and nonparametric methods.
    corPearson(x, y)

    corSpearman(x, y, consistent = FALSE)

    corKendall(x, y, consistent = FALSE)

    corQuadrant(x, y, consistent = FALSE)

    corM(
      x, y, prob = 0.9,
      initial = c("quadrant", "spearman", "kendall", "pearson"),
      tol = 1e-06
    )
x, y | numeric vectors. |
consistent | a logical indicating whether a consistent estimate at the bivariate normal distribution should be returned (defaults to FALSE). |
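prob | numeric; probability for the quantile of the $\chi^2$ distribution to be used for tuning the Huber loss function (defaults to 0.9). |
initial | a character string specifying the starting values for the Huber M-estimator. Possible values are "quadrant", "spearman", "kendall" and "pearson". |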
tol | a small positive numeric value to be used for determining convergence. |
corPearson estimates the classical Pearson correlation.
corSpearman, corKendall and corQuadrant estimate the Spearman, Kendall and quadrant correlation, respectively, which are nonparametric correlation measures that are somewhat more robust.
corM estimates the correlation based on a bivariate M-estimator of location and scatter with a Huber loss function, which is sufficiently robust in the bivariate case, but loses robustness with increasing dimension.
The nonparametric correlation measures do not estimate the same population quantities as the Pearson correlation, the latter of which is consistent at the bivariate normal model. Let $\rho$ denote the population correlation at the normal model. Then the Spearman correlation estimates $(6/\pi) \arcsin(\rho/2)$, while the Kendall and quadrant correlation estimate $(2/\pi) \arcsin(\rho)$. Consistent estimates are thus easily obtained by taking the corresponding inverse expressions.
The Huber M-estimator, on the other hand, is consistent at the bivariate normal model.
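The inverse expressions can be written out in base R. The sketch below assumes the standard results at the bivariate normal model, where the Spearman correlation estimates $(6/\pi) \arcsin(\rho/2)$ and the Kendall and quadrant correlations estimate $(2/\pi) \arcsin(\rho)$. It uses stats::cor and a hand-written quadrant correlation rather than the package's C++ functions, which offer the same correction through consistent = TRUE.

```r
# Base-R illustration of the consistency corrections at the normal model
set.seed(1234)
n <- 1000
rho <- 0.6
x <- rnorm(n)
y <- rho * x + sqrt(1 - rho^2) * rnorm(n)  # bivariate normal with correlation rho

rS <- cor(x, y, method = "spearman")
rK <- cor(x, y, method = "kendall")
rQ <- mean(sign((x - median(x)) * (y - median(y))))  # quadrant correlation

# Inverse expressions yield consistent estimates of rho
consistentEstimates <- c(
  spearman = 2 * sin(pi * rS / 6),
  kendall  = sin(pi * rK / 2),
  quadrant = sin(pi * rQ / 2)
)
consistentEstimates
```

The uncorrected Kendall and quadrant estimates are noticeably smaller than $\rho$ in absolute value, which is why the correction matters when results are compared with the Pearson correlation.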
The respective correlation estimate.
The Kendall correlation uses a naive implementation for small $n$ and a fast $O(n \log(n))$ implementation for larger values, where $n$ denotes the number of observations.
Functionality for removing observations with missing values is currently not implemented.
Andreas Alfons, implementation of the Kendall correlation by David Simcha
    ## generate data
    library("mvtnorm")
    set.seed(1234)  # for reproducibility
    sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
    xy <- rmvnorm(100, sigma = sigma)
    x <- xy[, 1]
    y <- xy[, 2]

    ## compute correlations

    # Pearson correlation
    corPearson(x, y)

    # Spearman correlation
    corSpearman(x, y)
    corSpearman(x, y, consistent = TRUE)

    # Kendall correlation
    corKendall(x, y)
    corKendall(x, y, consistent = TRUE)

    # quadrant correlation
    corQuadrant(x, y)
    corQuadrant(x, y, consistent = TRUE)

    # Huber M-estimator
    corM(x, y)
Subset of the diabetes data from Andrews & Herzberg (1985).
    data(diabetes)
A list with components x and y. Both components are matrices with observations on different variables for the same persons.
Component x is a matrix containing the following variables.
RelativeWeight | relative weight. |
PlasmaGlucose | fasting plasma glucose. |
Component y is a matrix containing the following variables.
GlucoseIntolerance | glucose intolerance. |
InsulinResponse | insulin response to oral glucose. |
InsulinResistance | insulin resistance. |
Andrews, D.F. and Herzberg, A.M. (1985) Data. Springer-Verlag. Page 215.
    data("diabetes")
    x <- diabetes$x
    y <- diabetes$y

    ## Spearman correlation
    maxCorGrid(x, y, method = "spearman")
    maxCorGrid(x, y, method = "spearman", consistent = TRUE)

    ## Pearson correlation
    maxCorGrid(x, y, method = "pearson")
Compute the median absolute deviation with a fast C++ implementation. By default, a multiplication factor is applied for consistency at the normal model.
    fastMAD(x, constant = 1.4826)
x | a numeric vector. |
constant | a numeric multiplication factor. The default value yields consistency at the normal model. |
A list with the following components:
center | a numeric value giving the sample median. |
MAD | a numeric value giving the median absolute deviation. |
Functionality for removing observations with missing values is currently not implemented.
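For reference, the two returned components can be reproduced in base R. The sketch below only illustrates the definition, with median as the center and the scaled median of absolute deviations as the MAD. It is not the C++ code behind fastMAD, and the names center and MAD merely mirror the components listed above.

```r
# Base-R illustration of the quantities returned by fastMAD()
set.seed(1234)  # for reproducibility
x <- rnorm(100)

constant <- 1.4826                        # consistency factor at the normal model
center <- median(x)                       # sample median
MAD <- constant * median(abs(x - center)) # scaled median absolute deviation

list(center = center, MAD = MAD)
```

With the default constant, the result agrees with stats::mad(x), so the MAD estimates the standard deviation for normally distributed data.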
Andreas Alfons
    set.seed(1234)  # for reproducibility
    x <- rnorm(100)
    fastMAD(x)
Compute the sample median with a fast C++ implementation.
    fastMedian(x)

x | a numeric vector. |
The sample median.
Functionality for removing observations with missing values is currently not implemented.
Andreas Alfons
    set.seed(1234)  # for reproducibility
    x <- rnorm(100)
    fastMedian(x)
Compute the maximum correlation between two data sets via projection pursuit based on alternating series of grid searches in two-dimensional subspaces of each data set, with a focus on robust and nonparametric methods.
    maxCorGrid(
      x, y,
      method = c("spearman", "kendall", "quadrant", "M", "pearson"),
      control = list(...),
      nIterations = 10, nAlternate = 10, nGrid = 25,
      select = NULL, tol = 1e-06,
      standardize = TRUE, fallback = FALSE, seed = NULL, ...
    )
x, y | each can be a numeric vector, matrix or data frame. |
method | a character string specifying the correlation functional to maximize. Possible values are "spearman", "kendall", "quadrant", "M" and "pearson" (see corFunctions). |
control | a list of additional arguments to be passed to the specified correlation functional. If supplied, this takes precedence over additional arguments supplied via the ... argument. |
nIterations | an integer giving the maximum number of iterations. |
nAlternate | an integer giving the maximum number of alternate series of grid searches in each iteration. |
nGrid | an integer giving the number of equally spaced grid points on the unit circle to use in each grid search. |
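select | optional; either an integer vector of length two or a list containing two index vectors. In the first case, the first integer gives the number of variables of x to be randomly selected for determining the order of the variables of y in the corresponding series of grid searches, and vice versa for the second integer. In the latter case, the first list element gives the indices of the variables of x to be used for determining the order of the variables of y, and vice versa for the second list element. |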
tol | a small positive numeric value to be used for determining convergence. |
standardize | a logical indicating whether the data should be (robustly) standardized. |
fallback | logical indicating whether a fallback mode for robust standardization should be used. If a correlation functional other than the Pearson correlation is maximized, the first attempt for standardizing the data is via median and MAD. In the fallback mode, variables whose MADs are zero (e.g., dummy variables) are standardized via mean and standard deviation. Note that if the Pearson correlation is maximized, standardization is always done via mean and standard deviation. |
seed | optional initial seed for the random number generator. |
... | additional arguments to be passed to the specified correlation functional. |
The algorithm is based on alternating series of grid searches in two-dimensional subspaces of each data set. In each grid search, nGrid grid points on the unit circle in the corresponding plane are obtained, and the directions from the center to each of the grid points are examined. In the first iteration, equispaced grid points in the interval $[-\pi/2, \pi/2)$ are used. In each subsequent iteration, the angles are halved such that the interval $[-\pi/4, \pi/4)$ is used in the second iteration and so on. If only one data set is multivariate, the algorithm simplifies to iterative grid searches in two-dimensional subspaces of the corresponding data set.
In the basic algorithm, the order of the variables in a series of grid searches for each of the data sets is determined by the average absolute correlations with the variables of the respective other data set. Since this requires computing the full $(p \times q)$ matrix of absolute correlations, where $p$ denotes the number of variables of x and $q$ the number of variables of y, a faster modification is available as well. In this modification, the average absolute correlations are computed over only a subset of the variables of the respective other data set. It is thereby possible to use randomly selected subsets of variables, or to specify the subsets of variables directly.
Note that the data sets themselves are also ordered according to the maximum average absolute correlation with the respective other data set to ensure symmetry of the algorithm.
An object of class "maxCor" with the following components:
cor | a numeric giving the maximum correlation estimate. |
a | numeric; the weighting vector for x. |
b | numeric; the weighting vector for y. |
centerX | a numeric vector giving the center estimates used in standardization of x. |
centerY | a numeric vector giving the center estimates used in standardization of y. |
scaleX | a numeric vector giving the scale estimates used in standardization of x. |
scaleY | a numeric vector giving the scale estimates used in standardization of y. |
call | the matched function call. |
Andreas Alfons
A. Alfons, C. Croux and P. Filzmoser (2016) Robust maximum association between data sets: The R Package ccaPP. Austrian Journal of Statistics, 45(1), 71–79.
A. Alfons, C. Croux and P. Filzmoser (2016) Robust maximum association estimators. Journal of the American Statistical Association, 112(517), 435–445.
maxCorProj, ccaGrid, corFunctions
    data("diabetes")
    x <- diabetes$x
    y <- diabetes$y

    ## Spearman correlation
    maxCorGrid(x, y, method = "spearman")
    maxCorGrid(x, y, method = "spearman", consistent = TRUE)

    ## Pearson correlation
    maxCorGrid(x, y, method = "pearson")
Compute the maximum correlation between two data sets via projection pursuit based on projections through the data points, with a focus on robust and nonparametric methods.
    maxCorProj(
      x, y,
      method = c("spearman", "kendall", "quadrant", "M", "pearson"),
      control = list(...),
      standardize = TRUE, useL1Median = TRUE, fallback = FALSE, ...
    )
x, y | each can be a numeric vector, matrix or data frame. |
method | a character string specifying the correlation functional to maximize. Possible values are "spearman", "kendall", "quadrant", "M" and "pearson" (see corFunctions). |
control | a list of additional arguments to be passed to the specified correlation functional. If supplied, this takes precedence over additional arguments supplied via the ... argument. |
standardize | a logical indicating whether the data should be (robustly) standardized. |
useL1Median | a logical indicating whether the L1 medians should be used as the centers of the data sets in standardization (defaults to TRUE). |
fallback | logical indicating whether a fallback mode for robust standardization should be used. If a correlation functional other than the Pearson correlation is maximized, the first attempt for standardizing the data is via median and MAD. In the fallback mode, variables whose MADs are zero (e.g., dummy variables) are standardized via mean and standard deviation. Note that if the Pearson correlation is maximized, standardization is always done via mean and standard deviation. |
... | additional arguments to be passed to the specified correlation functional. |
First the candidate projection directions are defined for each data set from the respective center through each data point. Then the algorithm scans all $n^2$ possible combinations for the maximum correlation, where $n$ is the number of observations.
An object of class "maxCor" with the following components:
cor | a numeric giving the maximum correlation estimate. |
a | numeric; the weighting vector for x. |
b | numeric; the weighting vector for y. |
centerX | a numeric vector giving the center estimates used in standardization of x. |
centerY | a numeric vector giving the center estimates used in standardization of y. |
scaleX | a numeric vector giving the scale estimates used in standardization of x. |
scaleY | a numeric vector giving the scale estimates used in standardization of y. |
call | the matched function call. |
Andreas Alfons
maxCorGrid, ccaProj, corFunctions
    data("diabetes")
    x <- diabetes$x
    y <- diabetes$y

    ## Spearman correlation
    maxCorProj(x, y, method = "spearman")
    maxCorProj(x, y, method = "spearman", consistent = TRUE)

    ## Pearson correlation
    maxCorProj(x, y, method = "pearson")
Test whether or not there is association between two data sets, with a focus on robust and nonparametric correlation measures.
    permTest(
      x, y, R = 1000, fun = maxCorGrid,
      permutations = NULL, nCores = 1, cl = NULL, seed = NULL, ...
    )
x, y | each can be a numeric vector, matrix or data frame. |
R | an integer giving the number of random permutations to be used. |
fun | a function to compute a maximum correlation measure between two data sets, e.g., maxCorGrid (the default) or maxCorProj. |
permutations | an integer matrix in which each column contains the indices of a permutation. If supplied, this is preferred over R. |
nCores | a positive integer giving the number of processor cores to be used for parallel computing (the default is 1 for no parallelization). |
cl | a parallel cluster for parallel computing as generated by makeCluster. |
seed | optional integer giving the initial seed for the random number generator. |
... | additional arguments to be passed to fun. |
The test generates R data sets by randomly permuting the observations of x, while keeping the observations of y fixed. In each replication, a function to compute a maximum correlation measure is applied to the permuted data sets. The p-value of the test is then given by the percentage of replicates of the maximum correlation measure that are larger than the maximum correlation measure computed from the original data.
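The permutation scheme can be sketched in base R. This is a simplified stand-in, not the code of permTest: the helper stat is hypothetical and uses the absolute bivariate Spearman correlation in place of a maximum correlation measure such as maxCorGrid, and no parallel computing is involved.

```r
# Simplified base-R sketch of the permutation test
set.seed(1234)
n <- 50
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
R <- 1000

# Hypothetical test statistic standing in for a maximum correlation measure
stat <- function(x, y) abs(cor(x, y, method = "spearman"))

cor0 <- stat(x, y)                              # statistic on the original data
corPerm <- replicate(R, stat(x[sample(n)], y))  # permute x, keep y fixed
pValue <- mean(corPerm > cor0)                  # share of replicates larger than cor0
pValue
```

Permuting only one of the two data sets breaks any association between them while leaving the dependence structure within each data set intact.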
An object of class "permTest" with the following components:
pValue | the p-value for the test. |
cor0 | the value of the test statistic. |
cor | the values of the test statistic for each of the permuted data sets. |
R | the number of random permutations. |
seed | the seed of the random number generator. |
call | the matched function call. |
Andreas Alfons
A. Alfons, C. Croux and P. Filzmoser (2016) Robust maximum association between data sets: The R Package ccaPP. Austrian Journal of Statistics, 45(1), 71–79.
    data("diabetes")
    x <- diabetes$x
    y <- diabetes$y

    ## Spearman correlation
    permTest(x, y, R = 100, method = "spearman")
    permTest(x, y, R = 100, method = "spearman", consistent = TRUE)

    ## Pearson correlation
    permTest(x, y, R = 100, method = "pearson")