Title: | Multiple Factor Analysis (MFA) |
---|---|
Description: | Performs Multiple Factor Analysis for quantitative, categorical, frequency and mixed data. The package also generates several graphics and provides other useful functions. |
Authors: | Paulo Cesar Ossani [aut, cre] , Marcelo Angelo Cirillo [aut] |
Maintainer: | Paulo Cesar Ossani <[email protected]> |
License: | GPL-3 |
Version: | 2.0 |
Built: | 2024-11-01 11:15:39 UTC |
Source: | CRAN |
Performs Multiple Factor Analysis for quantitative, categorical, frequency and mixed data.
Package: | MFAg |
Type: | Package |
Version: | 2.0 |
Date: | 2024-06-21 |
License: | GPL (>=2) |
LazyLoad: | yes |
Paulo Cesar Ossani,
Marcelo Angelo Cirillo
Maintainer: Paulo Cesar Ossani <[email protected]>
Abdessemed, L.; Escofier, B. Analyse factorielle multiple de tableaux de frequencies: comparaison avec l'analyse canonique des correspondences. Journal de la Societe de Statistique de Paris, Paris, v. 137, n. 2, p. 3-18, 1996.
Abdi, H. Singular Value Decomposition (SVD) and Generalized Singular Value Decomposition (GSVD). In: SALKIND, N. J. (Ed.). Encyclopedia of measurement and statistics. Thousand Oaks: Sage, 2007. p. 907-912.
Abdi, H.; Valentin, D. Multiple factor analysis (MFA). In: SALKIND, N. J. (Ed.). Encyclopedia of measurement and statistics. Thousand Oaks: Sage, 2007. p. 657-663.
Abdi, H.; Williams, L. Principal component analysis. WIREs Computational Statistics, New York, v. 2, n. 4, p. 433-459, July/Aug. 2010.
Abdi, H.; Williams, L.; Valentin, D. Multiple factor analysis: principal component analysis for multitable and multiblock data sets. WIREs Computational Statistics, New York, v. 5, n. 2, p. 149-179, Feb. 2013.
Becue-Bertaut, M.; Pages, J. A principal axes method for comparing contingency tables: MFACT. Computational Statistics & Data Analysis, New York, v. 45, n. 3, p. 481-503, Feb. 2004.
Becue-Bertaut, M.; Pages, J. Multiple factor analysis and clustering of a mixture of quantitative, categorical and frequency data. Computational Statistics & Data Analysis, New York, v. 52, n. 6, p. 3255-3268, Feb. 2008.
Benzecri, J. Analyse de l'inertie intraclasse par l'analyse d'un tableau de contingence: intra-class inertia analysis through the analysis of a contingency table. Les Cahiers de l'Analyse des Donnees, Paris, v. 8, n. 3, p. 351-358, 1983.
Escofier, B. Analyse factorielle en reference a un modele: application a l'analyse d'un tableau d'echanges. Revue de Statistique Appliquee, Paris, v. 32, n. 4, p. 25-36, 1984.
Escofier, B.; Drouet, D. Analyse des differences entre plusieurs tableaux de frequence. Les Cahiers de l'Analyse des Donnees, Paris, v. 8, n. 4, p. 491-499, 1983.
Escofier, B.; Pages, J. Analyses factorielles simples et multiples. Paris: Dunod, 1990. 267 p.
Escofier, B.; Pages, J. Analyses factorielles simples et multiples: objectifs, methodes et interpretation. 4th ed. Paris: Dunod, 2008. 318 p.
Escofier, B.; Pages, J. Comparaison de groupes de variables definies sur le meme ensemble d'individus: un exemple d'applications. Le Chesnay: Institut National de Recherche en Informatique et en Automatique, 1982. 121 p.
Escofier, B.; Pages, J. Multiple factor analysis (AFUMULT package). Computational Statistics & Data Analysis, New York, v. 18, n. 1, p. 121-140, Aug. 1994.
Greenacre, M.; Blasius, J. Multiple correspondence analysis and related methods. New York: Taylor and Francis, 2006. 607 p.
Ossani, P. C.; Cirillo, M. A.; Borem, F. M.; Ribeiro, D. E.; Cortez, R. M. Quality of specialty coffees: a sensory evaluation by consumers using the MFACT technique. Revista Ciencia Agronomica (UFC. Online), v. 48, p. 92-100, 2017.
Pages, J. Analyse factorielle multiple appliquee aux variables qualitatives et aux donnees mixtes. Revue de Statistique Appliquee, Paris, v. 50, n. 4, p. 5-37, 2002.
Pages, J. Multiple factor analysis: main features and application to sensory data. Revista Colombiana de Estadistica, Bogota, v. 27, n. 1, p. 1-26, 2004.
Rencher, A. C. Methods of multivariate analysis. 2nd ed. New York: J. Wiley, 2002. 708 p.
Simulated set of mixed data on consumption of coffee.
data(DataMix)
Data set with 10 rows and 7 columns, that is, 10 observations described by 7 variables: Cooperatives/Tasters, Average grades given to analyzed coffees, Years of work as a taster, Taster with technical training, Taster exclusively dedicated, Average frequency of the coffees classified as special, Average frequency of the coffees classified as commercial.
Paulo Cesar Ossani
Marcelo Angelo Cirillo
data(DataMix)
DataMix
Simulated set of qualitative data on consumption of coffee.
data(DataQuali)
Simulated data set with 12 rows and 6 columns, that is, 12 observations described by 6 variables: Sex, Age, Smoker, Marital status, Sportsman, Study.
Paulo Cesar Ossani
Marcelo Angelo Cirillo
data(DataQuali)
DataQuali
Simulated set of quantitative data on grades given to some sensory characteristics of coffees.
data(DataQuan)
Data set with 6 rows and 11 columns, that is, 6 observations described by 11 variables: Coffee, Chocolate, Caramelised, Ripe, Sweet, Delicate, Nutty, Caramelised, Chocolate, Spicy, Caramelised.
Paulo Cesar Ossani
Marcelo Angelo Cirillo
data(DataQuan)
DataQuan
Given the matrix A of order n x m, the generalized singular value decomposition (GSVD) involves the use of two sets of positive square matrices of order n x n and m x m, respectively. These two matrices express constraints imposed, respectively, on the rows and columns of A.
GSVD(data, plin = NULL, pcol = NULL)
data | Matrix used for decomposition. |
plin | Weight for rows. |
pcol | Weight for columns. |
If plin or pcol is not supplied, the decomposition is computed as the usual singular value decomposition.
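The role of the row and column weights can be illustrated with a minimal base-R sketch. It assumes diagonal weight matrices and follows the construction in Abdi (2007): rescale the matrix by the square roots of the weights, take an ordinary SVD, then undo the scaling. The names gsvd_sketch, row_w and col_w are hypothetical, and whether GSVD() in this package is implemented exactly this way is an assumption.

```r
# Base-R sketch of a GSVD with diagonal row and column weights.
# gsvd_sketch, row_w and col_w are hypothetical names, not part of MFAg.
gsvd_sketch <- function(A, row_w = rep(1, nrow(A)), col_w = rep(1, ncol(A))) {
  A <- as.matrix(A)
  # Scale row i by sqrt(row_w[i]) and column j by sqrt(col_w[j])
  A_tilde <- sweep(sqrt(row_w) * A, 2, sqrt(col_w), `*`)
  s <- svd(A_tilde)                 # ordinary SVD of the rescaled matrix
  list(d = s$d,
       u = s$u / sqrt(row_w),       # undo the row scaling
       v = s$v / sqrt(col_w))       # undo the column scaling
}

A      <- matrix(as.numeric(1:12), nrow = 4, ncol = 3)
row_w  <- c(0.1, 0.5, 2, 1.5)
col_w  <- c(1.3, 2, 0.8)
res    <- gsvd_sketch(A, row_w = row_w, col_w = col_w)

# The factors reproduce A, and u is orthonormal under the row weights
recon  <- res$u %*% diag(res$d) %*% t(res$v)
gram_u <- t(res$u) %*% diag(row_w) %*% res$u
```

With unit weights the sketch reduces to svd(A), which matches the behaviour described for the case where no weights are given.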
d | Eigenvalues, that is, a row vector with the singular values of the decomposition. |
u | Eigenvectors referring to the rows. |
v | Eigenvectors referring to the columns. |
Paulo Cesar Ossani
Marcelo Angelo Cirillo
Abdi, H. Singular Value Decomposition (SVD) and Generalized Singular Value Decomposition (GSVD). In: SALKIND, N. J. (Ed.). Encyclopedia of measurement and statistics. Thousand Oaks: Sage, 2007. p. 907-912.
data <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12), nrow = 4, ncol = 3)
svd(data) # Usual Singular Value Decomposition
GSVD(data) # GSVD with the same previous results
# GSVD with weights for rows and columns
GSVD(data, plin = c(0.1,0.5,2,1.5), pcol = c(1.3,2,0.8))
In the indicator matrix the elements are arranged as dummy variables: 1 for the category chosen as the response and 0 for the other categories of the same variable.
IM(data, names = TRUE)
data | Categorical data. |
names | Include the names of the variables in the levels of the Indicator Matrix (default = TRUE). |
mtxIndc | Data converted into the indicator matrix. |
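The dummy coding described for the indicator matrix can be sketched in base R as follows. This is only an illustration: indicator_sketch and with_names are hypothetical names, the ":" separator in the column names is an assumption, and the output of IM() may be labelled or ordered differently.

```r
# Base-R sketch of an indicator (dummy) matrix for categorical columns.
# indicator_sketch and with_names are hypothetical, not part of MFAg.
indicator_sketch <- function(df, with_names = TRUE) {
  df <- as.data.frame(df, stringsAsFactors = FALSE)
  blocks <- lapply(seq_along(df), function(j) {
    f <- factor(df[[j]])
    # 1 where the observation falls in the level, 0 otherwise
    m <- outer(as.integer(f), seq_len(nlevels(f)), `==`) * 1L
    colnames(m) <- if (with_names) {
      paste(colnames(df)[j], levels(f), sep = ":")
    } else {
      levels(f)
    }
    m
  })
  do.call(cbind, blocks)
}

df <- data.frame(Smoker = c("S", "S", "N", "N"),
                 Answer = c("N", "S", "T", "N"))
im <- indicator_sketch(df)
```

Each row of the result contains exactly one 1 per original variable, so the row sums equal the number of variables.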
Paulo Cesar Ossani
Marcelo Angelo Cirillo
Rencher, A. C. Methods of multivariate analysis. 2nd ed. New York: J. Wiley, 2002. 708 p.
data <- matrix(c("S","S","N","N",1,2,3,4,"N","S","T","N"), nrow = 4, ncol = 3)
IM(data, names = FALSE)
data(DataQuali) # qualitative data set
IM(DataQuali, names = TRUE)
Function for better positioning of the labels in the graphs.
LocLab(x, y = NULL, labels = seq(along = x), cex = 1, method = c("SANN", "GA"), allowSmallOverlap = FALSE, trace = FALSE, shadotext = FALSE, doPlot = TRUE, ...)
x | Coordinate x. |
y | Coordinate y. |
labels | The labels. |
cex | Character expansion factor for the labels (default = 1). |
method | Not used. |
allowSmallOverlap | Boolean. |
trace | Boolean. |
shadotext | Boolean. |
doPlot | Boolean. |
... | Other arguments passed to or from other methods. |
See the text of the function.
Perform Multiple Factor Analysis (MFA) on groups of variables. The groups of variables can be quantitative, qualitative, frequency (MFACT) data, or mixed data.
MFA(data, groups, typegroups = rep("n",length(groups)), namegroups = NULL)
data | Data to be analyzed. |
groups | Number of columns in each group, given in the same order as the columns of 'data'. |
typegroups | Type of group: |
namegroups | Names for each group. |
vtrG | Vector with the sizes of each group. |
vtrNG | Vector with the names of each group. |
vtrplin | Vector with the values used to balance the rows of the Z matrix. |
vtrpcol | Vector with the values used to balance the columns of the Z matrix. |
mtxZ | Concatenated and balanced matrix. |
mtxA | Matrix of the eigenvalues (variances) with the proportions and cumulative proportions. |
mtxU | Matrix U of the singular value decomposition of the matrix Z. |
mtxV | Matrix V of the singular value decomposition of the matrix Z. |
mtxF | Matrix of global factor scores, where the rows are the observations and the columns are the components. |
mtxEFG | Matrix of the factor scores by group. |
mtxCCP | Matrix of the correlations of the principal components with the original variables. |
mtxEV | Matrix of the partial inertias / scores of the variables. |
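The core idea of balancing the groups before a joint decomposition can be sketched in base R for quantitative groups only, following the outline in Abdi and Valentin (2007): standardize each group, divide it by its first singular value, concatenate the groups and take an SVD. This is a simplified illustration under those assumptions. The names mfa_sketch, eigenvalues, prop and scores are hypothetical, observation weights are ignored, and the weighting actually used by MFA() (including for categorical and frequency groups) may differ.

```r
# Base-R sketch of MFA for quantitative groups only.
# mfa_sketch and its output names are hypothetical, not part of MFAg.
mfa_sketch <- function(x, groups) {
  x <- as.matrix(x)
  stopifnot(sum(groups) == ncol(x))
  # Column indices of each group, in the order of the columns of x
  idx <- split(seq_len(ncol(x)), rep(seq_along(groups), groups))
  blocks <- lapply(idx, function(cols) {
    z <- scale(x[, cols, drop = FALSE])  # center and scale each variable
    z / svd(z)$d[1]                      # balance: first singular value becomes 1
  })
  z_all <- do.call(cbind, unname(blocks))  # concatenated, balanced matrix
  s   <- svd(z_all)
  eig <- s$d^2
  list(eigenvalues = eig,
       prop   = eig / sum(eig),                          # proportion of inertia
       scores = s$u %*% diag(s$d, nrow = length(s$d)))   # global factor scores
}

# Two groups of two variables each, using the built-in iris measurements
res <- mfa_sketch(iris[, 1:4], groups = c(2, 2))
```

Because every balanced group has a largest singular value of 1, the first global eigenvalue lies between 1 and the number of groups, so no single group can dominate the first component.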
Paulo Cesar Ossani
Marcelo Angelo Cirillo
data(DataMix) # mixed dataset
data <- DataMix[,2:ncol(DataMix)]
rownames(data) <- DataMix[1:nrow(DataMix),1]
group.names <- c("Grade Cafes/Work", "Formation/Dedication", "Coffees")
mf <- MFA(data = data, groups = c(2,2,2), typegroups = c("n","c","f"), namegroups = group.names) # performs MFA
print("Principal Component Variances:"); round(mf$mtxA,2)
print("Matrix of the Partial Inertia / Score of the Variables:"); round(mf$mtxEV,2)
Function that normalizes the data globally, or by column.
NormData(data, type = 1)
data | Data to be analyzed. |
type | 1 normalizes overall (default), |
dataNorm | Normalized data. |
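The difference between normalizing globally and by column can be sketched in base R. This is an illustration only: norm_sketch is a hypothetical name, and the use of the sample standard deviation (n - 1 denominator) is an assumption about how NormData() scales the data.

```r
# Base-R sketch of global versus per-column standardization.
# norm_sketch is hypothetical, not part of MFAg.
norm_sketch <- function(x, type = 1) {
  x <- as.matrix(x)
  if (type == 1) {
    # Global: one mean and one standard deviation for the whole matrix
    (x - mean(x)) / sd(x)
  } else {
    # Per column: each variable gets its own mean and standard deviation
    scale(x, center = TRUE, scale = TRUE)
  }
}

x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13), nrow = 4, ncol = 3)
g <- norm_sketch(x, type = 1)   # overall mean 0 and overall sd 1
k <- norm_sketch(x, type = 2)   # mean 0 and sd 1 in every column
```

Global normalization preserves the relative spread between columns, whereas per-column normalization gives every variable the same variance.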
Paulo Cesar Ossani
Marcelo Angelo Cirillo
data(DataQuan) # set of quantitative data
data <- DataQuan[,2:8]
res <- NormData(data, type = 1) # normalizes the data globally
res # globally standardized data
sd(res) # overall standard deviation
mean(res) # overall mean
res <- NormData(data, type = 2) # normalizes the data per column
res # standardized data per column
apply(res, 2, sd) # standard deviation per column
colMeans(res) # column averages
Graphics of the Multiple Factor Analysis (MFA).
Plot.MFA(MFA, titles = NA, xlabel = NA, ylabel = NA, posleg = 2, boxleg = TRUE, size = 1.1, grid = TRUE, color = TRUE, groupscolor = NA, namarr = FALSE, linlab = NA, savptc = FALSE, width = 3236, height = 2000, res = 300, casc = TRUE)
MFA | Data of the MFA function. |
titles | Titles of the graphics; if not set, the default text is used. |
xlabel | Name of the X axis; if not set, the default text is used. |
ylabel | Name of the Y axis; if not set, the default text is used. |
posleg | 1 for caption in the left upper corner, |
boxleg | Draws a frame around the legend (default = TRUE). |
size | Size of the points in the graphs. |
grid | Puts a grid on the graphs (default = TRUE). |
color | Colored graphics (default = TRUE). |
groupscolor | Vector with the colors of the groups. |
namarr | Places the point names in the cloud around the centroid in the graph of the global analysis of the individuals and variables (default = FALSE). |
linlab | Vector with the labels for the observations; if not set, the default text is used. |
savptc | Saves the graphics images to files (default = FALSE). |
width | Width of the graphics images when savptc = TRUE (default = 3236). |
height | Height of the graphics images when savptc = TRUE (default = 2000). |
res | Nominal resolution in ppi of the graphics images when savptc = TRUE (default = 300). |
casc | Cascade effect in the presentation of the graphics (default = TRUE). |
Returns several graphs.
Paulo Cesar Ossani
Marcelo Angelo Cirillo
data(DataMix) # set of mixed data
data <- DataMix[,2:ncol(DataMix)]
rownames(data) <- DataMix[1:nrow(DataMix),1]
group.names <- c("Grade Cafes/Work", "Formation/Dedication", "Coffees")
mf <- MFA(data, c(2,2,2), typegroups = c("n","c","f"), group.names) # performs MFA
tit <- c("Scree-Plot","Observations","Observations/Variables",
         "Correlation Circle","Inertia of the Variable Groups")
# plotting several graphs on the screen, with custom group colors
Plot.MFA(MFA = mf, titles = tit, xlabel = NA, ylabel = NA,
         posleg = 2, boxleg = FALSE, color = TRUE,
         groupscolor = c("blue3","red","goldenrod3"),
         namarr = FALSE, linlab = NA, savptc = FALSE,
         width = 3236, height = 2000, res = 300, casc = TRUE)
# plotting several graphs on the screen, with custom observation labels
Plot.MFA(MFA = mf, titles = tit, xlabel = NA, ylabel = NA,
         posleg = 2, boxleg = FALSE, color = TRUE,
         namarr = FALSE, linlab = rep("A?",10), savptc = FALSE,
         width = 3236, height = 2000, res = 300, casc = TRUE)