Title: | Generates Multivariate Nonnormal Data and Determines How Many Factors to Retain |
---|---|
Description: | The GenDataSample() and GenDataPopulation() functions create, respectively, a sample or population of multivariate nonnormal data using methods described in Ruscio and Kaczetow (2008). Both of these functions call a FactorAnalysis() function to reproduce a correlation matrix. The EFACompData() function allows users to determine how many factors to retain in an exploratory factor analysis of an empirical data set using a method described in Ruscio and Roche (2012). The latter function uses populations of comparison data created by calling the GenDataPopulation() function. <DOI: 10.1080/00273170802285693>. <DOI: 10.1037/a0025697>. |
Authors: | John Ruscio |
Maintainer: | John Ruscio <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0 |
Built: | 2024-12-16 06:32:24 UTC |
Source: | CRAN |
Comparison data
EFACompData(data, f.max, n.pop = 10000, n.samples = 500, alpha = .30, graph = FALSE, corr.type = "pearson")
EFACompData(data, f.max, n.pop = 10000, n.samples = 500, alpha = .30, graph = FALSE, corr.type = "pearson")
data |
Matrix to store the simulated data (matrix). |
f.max |
Largest number of factors to consider (scalar). |
n.pop |
Size of finite populations of comparison data (scalar, default is 10,000 cases). |
n.samples |
Number of samples drawn from each population (scalar, default is 500). |
alpha |
Alpha level when testing statistical significance of improvement with additional factor (scalar, default is .30) |
graph |
Whether to plot the fit of eigenvalues to those for comparison data (default is FALSE). |
corr.type |
Type of correlation (character, default is "pearson", user can also call "spearman"). |
Nothing, displays number of factors on screen.
John Ruscio
Ruscio & Roche (2011)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # empirically determine number of factors in data matrix x EFACompData(x, f.max = 5)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # empirically determine number of factors in data matrix x EFACompData(x, f.max = 5)
Analyzes comparison data with known factorial structures
FactorAnalysis(data, corr.matrix = FALSE, max.iteration = 50,n.factors = 0, corr.type = "pearson")
FactorAnalysis(data, corr.matrix = FALSE, max.iteration = 50,n.factors = 0, corr.type = "pearson")
data |
Matrix to store the simulated data (matrix). |
corr.matrix |
Correlation matrix (default is FALSE) |
max.iteration |
Maximum number of iterations (scalar, default is 50). |
n.factors |
Number of factors (scalar, default is 0). |
corr.type |
Type of correlation (character, default is "pearson", user can also call "spearman"). |
$loadings |
Factor loadings (vector, if one factor. matrix, if multiple factors) |
$factors |
Number of factors (scalar). |
John Ruscio
Ruscio & Roche (2011)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # perform factor analysis of data matrix x FactorAnalysis(x)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # perform factor analysis of data matrix x FactorAnalysis(x)
Simulates multivariate nonnormal data using an iterative algorithm
GenDataPopulation(supplied.data, n.factors, n.cases, max.trials = 5, initial.multiplier = 1, corr.type = "pearson", seed = 0)
GenDataPopulation(supplied.data, n.factors, n.cases, max.trials = 5, initial.multiplier = 1, corr.type = "pearson", seed = 0)
supplied.data |
Data supplied by user. |
n.factors |
Number of factors (scalar). |
n.cases |
Number of cases (scalar). |
max.trials |
Maximum number of trials (scalar, default is 5). |
initial.multiplier |
Value of initial multiplier (scalar, default is 1). |
corr.type |
Type of correlation (character, default is "pearson", user can also call "spearman"). |
seed |
seed value (scalar, default is 0). |
dataPopulation of data
John Ruscio
Ruscio & Roche (2011)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # generate (finite) population of data reproducing distributions and correlations in x GenDataPopulation(x, n.factors = 3, n.cases = 10000)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # generate (finite) population of data reproducing distributions and correlations in x GenDataPopulation(x, n.factors = 3, n.cases = 10000)
Bootstraps each variable's score distribution from a supplied data set.
GenDataSample(supplied.data, n.factors = 0, max.trials = 5, initial.multiplier = 1, corr.type = "pearson", seed = 0)
GenDataSample(supplied.data, n.factors = 0, max.trials = 5, initial.multiplier = 1, corr.type = "pearson", seed = 0)
supplied.data |
Data supplied by user. |
n.factors |
Number of factors (scalar, default is 0). |
max.trials |
Maximum number of trials (scalar, default is 5). |
initial.multiplier |
Value of initial multiplier (scalar, default is 1). |
corr.type |
Type of correlation (character, default is "pearson", user can also call "spearman"). |
seed |
seed value (scalar, default is 0). |
dataSample of data
John Ruscio
Ruscio & Kaczetow (2008)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # generate sample of data reproducing distributions and correlations in x GenDataSample(x)
# create data matrix x with n = 200 cases, k = 9 variables # 3 variables load onto each of 3 orthogonal factors # all marginal distributions are highly skewed x <- matrix(nrow = 200, ncol = 9) for (i in 1:3) { shared <- rchisq(200, 1) for (j in 1:3) { x[, (i - 1) * 3 + j] <- shared + rchisq(200, 1) } } # generate sample of data reproducing distributions and correlations in x GenDataSample(x)