Title: | Analysis of the Interpopulation Difference in Degree of Sexual Dimorphism Using Summary Statistics |
---|---|
Description: | Offers a solution for the unavailability of raw data in most anthropological studies by facilitating the calculations of several sexual dimorphism related analyses using the published summary statistics of metric data (mean, standard deviation and sex specific sample size) as illustrated by the works of Relethford, J. H., & Hodges, D. C. (1985) <doi:10.1002/ajpa.1330660105>, Greene, D. L. (1989) <doi:10.1002/ajpa.1330790113> and Konigsberg, L. W. (1991) <doi:10.1002/ajpa.1330840110>. |
Authors: | Bassam A. Abulnoor [aut, cre] , MennattAllah H. Attia [aut] , Iain R. Konigsberg [aut] , Lyle W. Konigsberg [aut] |
Maintainer: | Bassam A. Abulnoor <[email protected]> |
License: | GPL-3 |
Version: | 0.5.8 |
Built: | 2024-12-12 07:10:59 UTC |
Source: | CRAN |
Calculates sex-specific one-way ANOVA from summary statistics.
aov_ss( x, Pop = 1, pairwise = TRUE, letters = FALSE, es_anova = "none", digits = 4, CI = 0.95 )
x |
A data frame containing summary statistics. |
Pop |
Number of the column containing populations' names, Default: 1 |
pairwise |
Logical; if TRUE runs multiple pairwise comparisons on different populations using Tukey-Kramer's post hoc test, Default: TRUE |
letters |
Logical; if TRUE returns letters for pairwise comparisons where significantly different populations are given different letters, Default: FALSE |
es_anova |
Type of effect size: "f2" for f squared, "eta2" for eta squared, "omega2" for omega squared, or "none", Default: "none". |
digits |
Number of significant digits, Default: 4 |
CI |
confidence interval coverage takes value from 0 to 1, Default: 0.95. |
Data are entered as a data frame of summary statistics. The column containing population names is chosen by position (first by default); the other columns of summary data must have specific, case-sensitive names matching those in baboon.parms_df.
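As a rough illustration of how a one-way ANOVA can be computed from summary statistics alone, the base R sketch below derives the between- and within-population sums of squares for one sex from sample sizes, means and standard deviations. The function name anova_from_summary and its output column names are hypothetical and are not part of the package; aov_ss itself may differ in detail.

```r
# One-way ANOVA for a single sex from group sizes, means and SDs.
# anova_from_summary is a hypothetical helper, not a package function.
anova_from_summary <- function(n, mu, sdev) {
  k <- length(n)                              # number of populations
  N <- sum(n)                                 # total sample size
  grand_mean <- sum(n * mu) / N               # size-weighted grand mean
  ss_between <- sum(n * (mu - grand_mean)^2)  # between-population SS
  ss_within <- sum((n - 1) * sdev^2)          # within-population SS
  df_between <- k - 1
  df_within <- N - k
  f_value <- (ss_between / df_between) / (ss_within / df_within)
  p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)
  data.frame(
    term = c("Populations", "Residuals"),
    df = c(df_between, df_within),
    sumsq = c(ss_between, ss_within),
    statistic = c(f_value, NA),
    p.value = c(p_value, NA)
  )
}

# Male femur head diameter values (m, M.mu, M.sdev) from the example data
male_aov <- anova_from_summary(
  n = c(150, 82, 36, 34),
  mu = c(49.39, 48.33, 46.99, 45.20),
  sdev = c(3.01, 2.53, 2.47, 2.00)
)
print(male_aov)
```

The same call with the f, F.mu and F.sdev columns would give the corresponding table for females.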
Sex-specific ANOVA tables and pairwise comparisons in tidy format.
# For the femur head diameter data
F. Curate, C. Umbelino, A. Perinha, C. Nogueira, A.M. Silva, E. Cunha, Sex determination from the femur in Portuguese populations with classical and machine learning classifiers, J. Forensic Leg. Med. (2017), doi:http://dx.doi.org/10.1016/j.jflm.2017.08.011.
O. Gulhan, Skeletal Sexing Standards of Human Remains in Turkey (PhD thesis), Cranfield University, 2017 [Dataset].
P. Timonov, A. Fasova, D. Radoinova, A.Alexandrov, D. Delev, A study of sexual dimorphism in the femur among contemporary Bulgarian population, Euras. J. Anthropol. 5 (2014) 46–53.
E.F. Kranioti, N. Vorniotakis, C. Galiatsou, M.Y. Iscan , M. Michalodimitrakis, Sex identification and software development using digital femoral head radiographs, Forensic Sci. Int. 189 (2009) 113.e1–7.
# Comparisons of femur head diameter in four populations
df <- data.frame(
  Pop = c("Turkish", "Bulgarian", "Greek", "Portuguese"),
  m = c(150.00, 82.00, 36.00, 34.00),
  f = c(150.00, 58.00, 34.00, 24.00),
  M.mu = c(49.39, 48.33, 46.99, 45.20),
  F.mu = c(42.91, 42.89, 42.44, 40.90),
  M.sdev = c(3.01, 2.53, 2.47, 2.00),
  F.sdev = c(2.90, 2.84, 2.26, 2.90)
)
aov_ss(x = df)
Raw data from Joseph Birdsell's 1938 survey. The data is from two regions (B1 and B19), see Gilligan and Bulbeck (2007) for a map of the regions. Data downloaded from Dr. Peter Brown's website: https://www.peterbrown-palaeoanthropology.net/resource.html
Australia
A data frame with 94 rows and 9 variables:
(Region) ("B1" = Southwest Australia, "B19" = Northeast Australia), see Gilligan and Bulbeck (2007)
Sex coded as "F" or "M"
Body weight in kilograms
Standing height in millimeters
Humeral length in millimeters
Radius length in millimeters
Femoral length in millimeters
Tibial length in millimeters
Bi-iliac breadth in millimeters
Gilligan, I., & Bulbeck, D. (2007). Environment and morphology in Australian Aborigines: A re-analysis of the Birdsell database. American Journal of Physical Anthropology, 134(1), 75-91.
A dataset containing summary statistics for low density lipoprotein (LDL) and apolipoprotein B (apo B) levels in 604 baboons measured on two different diets: a basal diet and a high cholesterol, saturated fat diet. The baboons were classified into one of two subspecies and a hybrid of the two subspecies (Papio hamadryas anubis, P.h. cynocephalus, or hybrid). Each animal was measured on each of the two diets.
baboon.parms_df
A data frame with 12 rows and 8 variables
Apolipoprotein B and LDL on two diets
Sub-species or hybrid
Means of LDL and apo B in different sub-species for males
Means of LDL and apo B in different sub-species for females
Male sample sizes
Female sample sizes
Standard deviations for males
Standard deviations for females
The baboon data collection was supported by NIH grant HL28972 and NIH contract HV53030 to the Southwest Foundation for Biomedical Research (now the Texas Biomedical Research Institute), and by funds from the Southwest Foundation for Biomedical Research.
Konigsberg LW (1991). An historical note on the t-test for differences in sexual dimorphism between populations. American journal of physical anthropology, 84(1), 93–96.
List format for the baboon.parms_df for multivariate analysis
baboon.parms_list
A list of 5 matrices (R.res, M.mu, F.mu, M.sdev, and F.sdev) and two vectors (m and f)
pooled within group correlation matrix
Means of LDL and apo B in different sub-species for males
Means of LDL and apo B in different sub-species for females
Male sample sizes
Female sample sizes
Standard deviations for males
Standard deviations for females
Pooled within group correlation matrix for baboon data
baboon.parms_R
A 4*4 numerical matrix
Part of Table 3 from Cavazzuti et al. (2019).
Cremains_measurements
A data frame with 22 rows and 8 variables:
Measured feature
Means of males
Means of females
Male sample sizes
Female sample sizes
Standard deviations for males
Standard deviations for females
Published value of Chakraborty and Majumder's (1982) measure of sexual dimorphism.
Cavazzuti, Claudio, et al. (2019) "Towards a new osteometric method for sexing ancient cremated human remains. Analysis of Late Bronze Age and Iron Age samples from Italy with gendered grave goods." PloS one 14.1: e0209423.
Chakraborty, R., & Majumder, P. P. (1982). On Bennett's measure of sex dimorphism. American journal of physical anthropology, 59(3), 295-298.
Visual and statistical computation of the area of non-overlap in the trait distribution between two sex groups.
D_index( x, plot = FALSE, fill = "female", Trait = 1, B = NULL, verbose = FALSE, CI = 0.95, rand = TRUE, digits = 4 )
x |
A data frame containing summary statistics. |
plot |
logical; if TRUE a plot of densities for both sexes is returned, Default: FALSE |
fill |
Specify which sex's density to be filled with color in the plot; either "male" in blue color, "female" in pink color or "both", Default: 'female' |
Trait |
Number of the column containing names of measured parameters, Default: 1 |
B |
number of bootstrap samples for generating confidence intervals. Higher number means greater accuracy but slower execution. If NULL bootstrap confidence intervals are not produced, Default:NULL |
verbose |
logical; if TRUE number of bootstraps is displayed, Default: FALSE |
CI |
confidence interval coverage takes value from 0 to 1, Default: 0.95. |
rand |
logical; if TRUE, uses random seed. If FALSE, then set.seed(42) for repeatability, Default: TRUE |
digits |
Number of significant digits, Default: 4 |
Computes Chakraborty and Majumder's (1982) D index. The calculations use Inman and Bradley's (1989) equations and the relationship D = 1 - OVL, where OVL is the overlap coefficient described in Inman and Bradley. Confidence intervals come from a parametric bootstrap that assumes normal distributions. The interval method is known as the "bias-corrected percentile method" (Efron, 1981) or the "bias-corrected percentile interval" (Tibshirani, 1984).
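The relationship D = 1 - OVL can be illustrated with a short base R sketch. Note the swap: instead of Inman and Bradley's closed-form equations, this version obtains the overlap of the two normal densities by numerical integration, so it is only an approximation of what D_index computes. The function name d_index_sketch is hypothetical.

```r
# D = 1 - OVL, with OVL obtained by numerically integrating the pointwise
# minimum of the male and female normal densities.
# d_index_sketch is a hypothetical helper, not a package function.
d_index_sketch <- function(M.mu, F.mu, M.sdev, F.sdev) {
  # Finite integration range wide enough to contain both densities
  lo <- min(M.mu, F.mu) - 10 * max(M.sdev, F.sdev)
  hi <- max(M.mu, F.mu) + 10 * max(M.sdev, F.sdev)
  shared_density <- function(x) {
    pmin(dnorm(x, M.mu, M.sdev), dnorm(x, F.mu, F.sdev))
  }
  ovl <- integrate(shared_density, lo, hi, subdivisions = 1000L)$value
  1 - ovl
}

# Turkish femur head diameter summary values from the aov_ss example
d_index_sketch(M.mu = 49.39, F.mu = 42.91, M.sdev = 3.01, F.sdev = 2.90)
```

When both sexes share the same standard deviation, the overlap has the closed form 2 * pnorm(-abs(M.mu - F.mu) / (2 * sdev)), which offers a quick check of the numerical result.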
A table and a graphical representation of the selected traits with their corresponding dissimilarity indices, confidence intervals and significance tests.
Chakraborty, Ranajit, and Partha P. Majumder.(1982) "On Bennett's measure of sex dimorphism." American Journal of Physical Anthropology 59.3 : 295-298.
Inman, Henry F., and Edwin L. Bradley Jr.(1989) "The overlapping coefficient as a measure of agreement between probability distributions and point estimation of the overlap of two normal densities." Communications in Statistics-Theory and Methods 18.10:3851-3874.
Efron, B. (1981). Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9(2), 139-158.
Tibshirani, R. J. (1984). Bootstrap confidence intervals. Technical Report No. 3, Laboratory for Computational Statistics, Department of Statistics, Stanford University.
# plot and calculation of D
run.D <- function() {
  print(D_index(Cremains_measurements[1, ], plot = TRUE))
  cat("Published D value: ", Cremains_measurements[1, 8], "\n")
}
run.D()
## Not run:
# confidence interval with bootstrapping
D_index(Cremains_measurements[1, ], rand = FALSE, B = 1000)
## End(Not run)
Extract summary data needed for other functions from raw data.
extract_sum(x, Sex = 1, Pop = 2, firstX = 3, test = "tg", run = TRUE, ...)
x |
Data frame of raw data. |
Sex |
Number of the column containing sex 'M' for male and 'F' for female, Default: 1 |
Pop |
Number of the column containing populations' names, Default: 2 |
firstX |
Number of column containing measured parameters (First of multiple in case of multivariate analysis), Default: 3 |
test |
'tg' for Greene's t-test t_greene, 'uni' for univariate, 'aov' for sex-specific ANOVA aov_ss, 'multi' for multivariate, and 'van' for van_vark, Default: 'tg' |
run |
Logical; if TRUE runs the corresponding test after data extraction, Default:TRUE |
... |
Additional arguments that could be passed to the test of choice |
Raw data are entered as a wide-format data frame similar to the Howells data set. The first two columns contain sex 'Sex' ('M' for male and 'F' for female) (Default: '1') and populations' names 'Pop' (Default: '2'). Starting from the 'firstX' column (Default: '3'), each measured parameter is entered in a separate column.
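To make the expected layout concrete, the following base R sketch builds a tiny made-up raw data frame in that wide format and reduces one measurement to the summary columns used elsewhere in this manual (m, f, M.mu, F.mu, M.sdev, F.sdev). The data values and the helper name summarise_by_sex are invented for illustration; extract_sum does more than this and may order or name its output differently.

```r
# Made-up raw data: sex in column 1, population in column 2, trait in column 3
raw <- data.frame(
  Sex = c("M", "M", "M", "F", "F", "F", "M", "M", "F", "F"),
  Pop = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B"),
  X = c(50, 52, 54, 44, 45, 46, 48, 50, 41, 43)
)

# Hypothetical helper: per-population, per-sex size, mean and SD of one trait
summarise_by_sex <- function(x, Sex = 1, Pop = 2, firstX = 3) {
  sex <- x[[Sex]]
  pop <- factor(x[[Pop]])
  y <- x[[firstX]]
  # Apply a summary function within populations for one sex
  by_pop <- function(fun, keep) {
    as.vector(tapply(y[sex == keep], pop[sex == keep], fun))
  }
  data.frame(
    Pop = levels(pop),
    m = by_pop(length, "M"),
    f = by_pop(length, "F"),
    M.mu = by_pop(mean, "M"),
    F.mu = by_pop(mean, "F"),
    M.sdev = by_pop(sd, "M"),
    F.sdev = by_pop(sd, "F")
  )
}

raw_summary <- summarise_by_sex(raw)
print(raw_summary)
```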
Input for other functions.
# for multivariate test
## Not run:
extract_sum(Howells, test = "multi")
# for univariate test on a specific parameter
extract_sum(Howells, test = "uni", firstX = 4)
## End(Not run)
Heuristic data from Fidler and Thompson (2001)
FT
A data frame with 24 rows and 3 variables:
'M' for male and 'F' for female
Populations' names
Dependent variable
Fidler, Fiona, and Bruce Thompson. "Computing correct confidence intervals for ANOVA fixed-and random-effects effect sizes." Educational and Psychological Measurement 61.4 (2001): 575-604.
Quantifies the size of the difference between sexes in measured traits.
Hedges_g( x, Trait = 1, CI = 0.95, B = NULL, verbose = FALSE, rand = TRUE, digits = 4 )
x |
A data frame containing summary statistics. |
Trait |
Number of the column containing names of measured parameters, Default: 1 |
CI |
confidence interval coverage takes value from 0 to 1, Default: 0.95. |
B |
number of bootstrap samples for generating confidence intervals. Higher number means greater accuracy but slower execution. If NULL bootstrap confidence intervals are not produced, Default:NULL |
verbose |
logical; if TRUE number of bootstraps is displayed, Default: FALSE |
rand |
logical; if TRUE, uses random seed. If FALSE, then set.seed(42) for repeatability, Default: TRUE |
digits |
Number of significant digits, Default: 4 |
Calculates Hedges' (1981) g and its confidence intervals using the pooled standard deviation and correcting for bias. See Goulet-Pelletier and Cousineau (2018) for details of the calculations and D_index for description of the bootstrap.
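The point estimate described above can be sketched in a few lines of base R: a standardized male-female difference using the pooled standard deviation, multiplied by the small-sample bias correction factor written in its gamma-function form. The function name hedges_g_sketch is hypothetical, and the confidence interval methods used by Hedges_g are not reproduced here.

```r
# Bias-corrected standardized difference between male and female means.
# hedges_g_sketch is a hypothetical helper, not a package function.
hedges_g_sketch <- function(m, f, M.mu, F.mu, M.sdev, F.sdev) {
  df <- m + f - 2
  # Pooled standard deviation across the two sexes
  sd_pooled <- sqrt(((m - 1) * M.sdev^2 + (f - 1) * F.sdev^2) / df)
  d <- (M.mu - F.mu) / sd_pooled
  # Correction factor J = gamma(df / 2) / (sqrt(df / 2) * gamma((df - 1) / 2)),
  # evaluated on the log scale to avoid overflow for large df
  J <- exp(lgamma(df / 2) - lgamma((df - 1) / 2)) / sqrt(df / 2)
  d * J
}

# Turkish femur head diameter summary values from the aov_ss example
hedges_g_sketch(m = 150, f = 150, M.mu = 49.39, F.mu = 42.91,
                M.sdev = 3.01, F.sdev = 2.90)
```

The correction factor is always below 1 and approaches 1 as the sample sizes grow, so it matters mainly for small samples.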
A table of Hedges' g values with confidence intervals for different traits.
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 6(2), 107-128.
Goulet-Pelletier, J.-C., & Cousineau, D. (2018). A review of effect sizes and their confidence intervals, part I: The Cohen's d family. The Quantitative Methods for Psychology, 14(4), 242-265.
library(TestDimorph)
data("Cremains_measurements")
# Confidence intervals with non-central t distribution
Hedges_g(Cremains_measurements[1, ])
## Not run:
# confidence interval with bootstrapping
Hedges_g(Cremains_measurements[1, ], rand = FALSE, B = 1000)
## End(Not run)
A subset of a dataset that consists of 82 craniometric measurements taken from approximately 2,500 human crania from 28 geographically diverse populations. The full data set can be found at https://rdrr.io/github/geanes/bioanth/man/howell.html
Howells
A data frame with 441 rows and 10 variables:
'M' for male and 'F' for female
Populations' names
Glabello occipital length
Nasio occipital length
Basion nasion length
Basion bregma height
Maximum cranial breadth
Maximum frontal breadth
Bizygomatic breadth
Biauricular breadth
Howells WW. (1989). Skull Shapes and the Map. Craniometric Analyses in the Dispersion of Modern Homo. Papers of the Peabody Museum of Archaeology and Ethnology, vol. 79, pp. 189. Cambridge, Mass.: Peabody Museum.
Howells WW. (1995). Who's Who in Skulls. Ethnic Identification of Crania from Measurements. Papers of the Peabody Museum of Archaeology and Ethnology, vol. 82, pp. 108. Cambridge, Mass.: Peabody Museum.
Howells, W. W. (1973). Cranial Variation in Man: A Study by Multivariate Analysis of Patterns of Difference Among Recent Human Populations (Vol. 67). Cambridge, MA: Peabody Museum of Archaeology and Ethnology.
Howells, W. W. (1996). Notes and Comments: Howells' craniometric data on the internet. American Journal of Physical Anthropology, 101(3), 441-442
Pooled within group correlation matrix for Howells' data
Howells_R
An 8*8 numerical matrix
Summary statistics of the Howells' data subset.
Howells_summary
A data frame with 32 rows and 8 variables:
Measured feature
Population name
Means of males
Means of females
Male sample sizes
Female sample sizes
Standard deviations for males
Standard deviations for females
List format of Howells_summary for multivariate analysis
Howells_summary_list
A list of 5 matrices (R.res, M.mu, F.mu, M.sdev, and F.sdev) and two vectors (m and f) with structure similar to baboon.parms_list
Pooled within-group variance-covariance matrix for Howells' data
Howells_V
An 8*8 numerical matrix
Ipina and Durand's (2010) mixture intersection (MI) measure of sexual dimorphism. This measure is an overlap coefficient where the sum of the frequency of males and the frequency of females equals 1.0. Ipina and Durand (2010) also define a normal intersection (NI) measure which is the overlap coefficient of two normal distributions (each integrating to 1.0), equivalent to Inman and Bradley's (1989) "overlap coefficient." As a result of this rescaling, the "MI" and "NI" plots will appear identical save for the scale on the y-axis.
MI_index( x, plot = FALSE, Trait = 1, B = NULL, verbose = FALSE, CI = 0.95, p.f = 0, index_type = "MI", rand = TRUE, digits = 4 )
x |
A data frame containing summary statistics. |
plot |
logical; if TRUE a plot of densities for both sexes is returned, Default: FALSE |
Trait |
Number of the column containing names of measured parameters, Default: 1 |
B |
number of bootstrap samples for generating confidence intervals. Higher number means greater accuracy but slower execution. If NULL bootstrap confidence intervals are not produced, Default:NULL |
verbose |
logical; if TRUE number of bootstraps is displayed, Default: FALSE |
CI |
confidence interval coverage takes value from 0 to 1, Default: 0.95. |
p.f |
proportion of sample that is female (if p.f>0 then p.m=1-p.f, where p.m is the proportion of males and bootstrap won't be available) , Default: 0 |
index_type |
Type of coefficient: "MI" fits the mixture index; "NI" fits the overlap coefficient for two normal distributions, which is equal to 1 - D_index, Default: 'MI' |
rand |
logical; if TRUE, uses random seed. If FALSE, then set.seed(42) for repeatability, Default: TRUE |
digits |
Number of significant digits, Default: 4 |
See D_index for the bootstrap method.
Returns a table of Ipina and Durand's (2010) mixture index ("MI") for different traits with a graphical representation.
Inman, H. F., & Bradley Jr, E. L. (1989). The overlapping coefficient as a measure of agreement between probability distributions and point estimation of the overlap of two normal densities. Communications in Statistics-Theory and Methods, 18(10), 3851-3874.
Ipina, S. L., & Durand, A. I. (2010). Assessment of sexual dimorphism: a critical discussion in a (paleo-) anthropological context. Human Biology, 82(2), 199-220.
# plot and calculation of MI
MI_index(Cremains_measurements[1, ], plot = TRUE)
# NI index
MI_index(Cremains_measurements[1, ], index_type = "NI")
1 - D_index(Cremains_measurements[1, ])$D
## Not run:
# confidence interval with bootstrapping
MI_index(Cremains_measurements[1, ], rand = FALSE, B = 1000)
## End(Not run)
Multivariate extension of Greene's t-test (t_greene).
multivariate( x, R.res = NULL, Trait = 1, Pop = 2, type_manova = "II", manova_test_statistic = "W", interact_manova = TRUE, es_manova = "none", univariate = FALSE, padjust = "none", ..., lower.tail = FALSE, CI = 0.95, digits = 4 )
x |
Data frame or list containing summary statistics for multiple parameters measured in both sexes in two or more populations. |
R.res |
Pooled within correlation matrix, Default: NULL |
Trait |
Number of the column containing names of measured parameters, Default: 1 |
Pop |
Number of the column containing populations' names, Default: 2 |
type_manova |
type of MANOVA test "I","II" or "III", Default:"II". |
manova_test_statistic |
type of test statistic used either "W" for "Wilks","P" for "Pillai", "HL" for "Hotelling-Lawley" or "R" for "Roy's largest root", Default: "W". |
interact_manova |
Logical; if TRUE calculates MANOVA for the interaction effects,Default: TRUE. |
es_manova |
Effect size: either "eta" for eta squared or "none" for not reporting an effect size, Default: "none". |
univariate |
Logical; if TRUE conducts multiple univariate analyses on different parameters separately, Default: FALSE |
padjust |
Method of p.value adjustment for multiple comparisons following p.adjust Default: "none". |
... |
Additional arguments that could be passed to univariate |
lower.tail |
Logical; if TRUE probabilities are 'P[X <= x]', otherwise, 'P[X > x]'., Default: FALSE |
CI |
confidence interval coverage for the chosen effect size takes value from 0 to 1, Default: 0.95. |
digits |
Number of significant digits, Default: 4 |
Data can be entered in either of two formats. The first is a data frame of summary statistics as in baboon.parms_df, in which case the pooled within correlation matrix 'R.res' must be supplied as a separate argument as in baboon.parms_R. The second is a named list of matrices and vectors containing the summary statistics together with the correlation matrix, as in baboon.parms_list. Setting 'univariate' to 'TRUE' additionally runs a separate ANOVA on each parameter.
MANOVA table. When a term is followed by '(E)', an exact F-value is calculated.
# x is a data frame with separate correlation matrix
multivariate(baboon.parms_df, R.res = baboon.parms_R)
# x is a list with the correlation matrix included
multivariate(baboon.parms_list, univariate = TRUE)
# reproduces results from Konigsberg (1991)
multivariate(baboon.parms_df, R.res = baboon.parms_R)[3, ]
multivariate(baboon.parms_df, R.res = baboon.parms_R, interact_manova = FALSE)
Raw data from 1999-2000 NHANES (National Health and Nutrition Examination Survey). Centers for Disease Control and Prevention (CDC). National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2020, https://www.cdc.gov/nchs/nhanes/index.htm
NHANES_1999
A data frame with 1430 rows and 5 variables:
(RIAGENDR) Sex coded as "F" or "M"
(RIDRETH1) Self-reported race, coded as "Black" = Non-Hispanic Black, "Mex.Am" = Mexican American, or "White" = Non-Hispanic White
Body weight in kilograms
Standing height in centimeters
Upper arm length in centimeters
This is not the complete dataset. It is restricted to participants aged 20 to 40 years inclusive.
Generates raw data from summary statistics using a univariate or multivariate truncated normal distribution.
raw_gen( x, Trait = 1, Pop = 2, R.res = NULL, lower = -Inf, upper = Inf, verbose = FALSE )
x |
Data frame or list containing summary statistics for multiple parameters measured in both sexes in two or more populations. |
Trait |
Number of the column containing names of measured parameters, Default: 1 |
Pop |
Number of the column containing populations' names, Default: 2 |
R.res |
Pooled within correlation matrix, Default: NULL |
lower |
scalar of lower bounds, Default: -Inf |
upper |
scalar of upper bounds, Default: Inf |
verbose |
Logical; if TRUE displays a message with the method used for generation , Default: FALSE |
For generation from a multivariate distribution, data are entered either as a list of summary statistics and the pooled within correlation matrix, as in baboon.parms_list, or as a data frame of summary statistics, as in baboon.parms_df, with a separate correlation matrix, as in baboon.parms_R. If a data frame is entered without a correlation matrix, data are generated from univariate distributions.
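For the univariate case, one common way to draw from a truncated normal distribution is inverse-CDF sampling, sketched below in base R. This is an illustrative technique only; it is not claimed to be the generator raw_gen uses internally, and the function name rtruncnorm_sketch is hypothetical.

```r
# Inverse-CDF sampling from a normal distribution truncated to [lower, upper].
# rtruncnorm_sketch is a hypothetical helper, not a package function.
rtruncnorm_sketch <- function(n, mu, sdev, lower = -Inf, upper = Inf) {
  # Probability mass below each truncation bound
  p_lo <- pnorm(lower, mu, sdev)
  p_hi <- pnorm(upper, mu, sdev)
  # Uniform draws restricted to the allowed probability range,
  # mapped back to the measurement scale
  qnorm(runif(n, p_lo, p_hi), mu, sdev)
}

set.seed(42)
# Simulated male values from one row of summary statistics (m, M.mu, M.sdev)
males <- rtruncnorm_sketch(150, mu = 49.39, sdev = 3.01, lower = 0)
summary(males)
```

Truncation shifts the mean and shrinks the standard deviation of the generated values whenever a bound lies close to the mean, so simulated summaries will not match the inputs exactly in that situation.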
a data frame of raw data
# Data generation using univariate distributions
raw_gen(baboon.parms_df, lower = 0)
# another univariate example
library(dplyr)
data <- Cremains_measurements[1, ] %>%
  mutate(Pop = c("A")) %>%
  relocate(Pop, .after = 1)
raw_gen(data)[, -2]
# Data generation using multivariate distribution
raw_gen(baboon.parms_list, lower = 0)
Example data set from Shaw and Mitchell-Olds (1993)
SMO
A data frame with 11 rows and 3 variables:
'M' for male and 'F' for female
Populations' names
Dependent variable
Shaw, Ruth G., and Thomas Mitchell-Olds. "ANOVA for unbalanced data: an overview. " Ecology 74.6 (1993): 1638-1645.
Calculation and visualization of the differences in the degree of sexual dimorphism between two populations, using summary statistics as input.
t_greene( x, Pop = 1, plot = FALSE, colors = c("#DD5129", "#985F51", "#536D79", "#0F7BA2", "#208D98", "#319F8E", "#43B284", "#7FB274", "#BCB264", "#FAB255"), alternative = c("two.sided", "less", "greater"), padjust = "none", letters = FALSE, digits = 4, CI = 0.95 )
x |
A data frame containing summary statistics. |
Pop |
Number of the column containing populations' names, Default: 1 |
plot |
Logical; if TRUE returns a graphical matrix of p values, Default: FALSE |
colors |
color palette used in the corrplot |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided", "greater" or "less", Default:"two.sided" |
padjust |
Method of p.value adjustment for multiple comparisons following p.adjust Default: "none". |
letters |
Logical; if TRUE returns letters for pairwise comparisons where significantly different populations are given different letters, Default: FALSE |
digits |
Number of significant digits, Default: 4 |
CI |
confidence interval coverage takes value from 0 to 1, Default: 0.95. |
The input is a data frame of summary statistics. The column containing population names is chosen by position (first by default); the other columns of summary data must have specific, case-sensitive names matching those in baboon.parms_df. In the corrplot of pairwise comparisons, the rounder the image in a grid cell, the lower the p-value (the color scale conveys similar information). The default colors are from the "MetBrewer" "Egypt" palette, which is listed under "colorblind_palettes". Other color palettes can be selected from the "RColorBrewer" package.
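For a single pair of populations, the test statistic can be sketched in base R as follows, assuming the pooled-variance form of the t-test for a difference in dimorphism: the difference between the two male-female mean differences, divided by a standard error built from the variance pooled over all four groups. The function name greene_t_sketch is hypothetical, and t_greene itself adds p-value adjustment, plotting and other options.

```r
# t-test for a difference in sexual dimorphism between two populations.
# Each argument is a length-2 vector: population 1, then population 2.
# greene_t_sketch is a hypothetical helper, not a package function.
greene_t_sketch <- function(m, f, M.mu, F.mu, M.sdev, F.sdev) {
  df <- sum(m) + sum(f) - 4
  # Variance pooled over the four sex-by-population groups
  pooled_var <- (sum((m - 1) * M.sdev^2) + sum((f - 1) * F.sdev^2)) / df
  # Difference between the two male-female mean differences
  diff_dimorphism <- (M.mu[1] - F.mu[1]) - (M.mu[2] - F.mu[2])
  se <- sqrt(pooled_var * sum(1 / m, 1 / f))
  t_value <- as.numeric(diff_dimorphism / se)
  p_value <- 2 * pt(abs(t_value), df, lower.tail = FALSE)  # two-sided
  list(t = t_value, df = as.numeric(df), p = p_value)
}

# Turkish versus Bulgarian femur head diameter summary values
turk_bulg <- greene_t_sketch(
  m = c(150, 82), f = c(150, 58),
  M.mu = c(49.39, 48.33), F.mu = c(42.91, 42.89),
  M.sdev = c(3.01, 2.53), F.sdev = c(2.90, 2.84)
)
print(turk_bulg)
```

In a two-by-two design this statistic corresponds to the sex-by-population interaction: its square equals the interaction F-value from a two-way ANOVA on the raw data.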
Data frame of t-test results.
# for the t-test
Greene, David Lee. "Comparison of t-tests for differences in sexual dimorphism between populations." American Journal of Physical Anthropology 79.1 (1989): 121-125.
Relethford, John H., and Denise C. Hodges. "A statistical test for differences in sexual dimorphism between populations." American Journal of Physical Anthropology 66.1 (1985): 55-61.
# For the femur head diameter data
F. Curate, C. Umbelino, A. Perinha, C. Nogueira, A.M. Silva, E. Cunha, Sex determination from the femur in Portuguese populations with classical and machine learning classifiers, J. Forensic Leg. Med. (2017), doi:http://dx.doi.org/10.1016/j.jflm.2017.08.011.
O. Gulhan, Skeletal Sexing Standards of Human Remains in Turkey (PhD thesis), Cranfield University, 2017 [Dataset].
P. Timonov, A. Fasova, D. Radoinova, A.Alexandrov, D. Delev, A study of sexual dimorphism in the femur among contemporary Bulgarian population, Euras. J. Anthropol. 5 (2014) 46–53.
E.F. Kranioti, N. Vorniotakis, C. Galiatsou, M.Y. Iscan , M. Michalodimitrakis, Sex identification and software development using digital femoral head radiographs, Forensic Sci. Int. 189 (2009) 113.e1–7.
# Comparisons of femur head diameter in four populations
df <- data.frame(
  Pop = c("Turkish", "Bulgarian", "Greek", "Portuguese"),
  m = c(150.00, 82.00, 36.00, 34.00),
  f = c(150.00, 58.00, 34.00, 24.00),
  M.mu = c(49.39, 48.33, 46.99, 45.20),
  F.mu = c(42.91, 42.89, 42.44, 40.90),
  M.sdev = c(3.01, 2.53, 2.47, 2.00),
  F.sdev = c(2.90, 2.84, 2.26, 2.90)
)
t_greene(
  df,
  plot = TRUE,
  padjust = "none"
)
Calculation and visualization of the differences in the degree of sexual dimorphism between multiple populations, using a modified one-way ANOVA and summary statistics as input.
univariate( x, Pop = 1, type_anova = "II", interact_anova = TRUE, es_anova = "none", pairwise = FALSE, padjust = "none", ..., lower.tail = FALSE, CI = 0.95, digits = 4 )
x |
A data frame containing summary statistics. |
Pop |
Number of the column containing populations' names, Default: 1 |
type_anova |
type of ANOVA test "I","II" or "III", Default:"II". |
interact_anova |
Logical; if TRUE calculates interaction effect, Default: TRUE. |
es_anova |
Type of effect size either "f2" for f squared,"eta2" for eta squared, "omega2" for omega squared or "none", Default:"none". |
pairwise |
Logical; if TRUE runs multiple pairwise comparisons on different populations using t_greene Default: FALSE |
padjust |
Method of p.value adjustment for multiple comparisons following p.adjust Default: "none". |
... |
Additional arguments that could be passed to the t_greene function |
lower.tail |
Logical; if TRUE probabilities are 'P[X <= x]', otherwise, 'P[X > x]'., Default: FALSE |
CI |
confidence interval coverage takes value from 0 to 1, Default: 0.95. |
digits |
Number of significant digits, Default: 4 |
Data are entered as a data frame of summary statistics. The column containing population names is chosen by position (first by default); the other columns of summary data must have specific, case-sensitive names matching those in baboon.parms_df.
ANOVA table.
Hector, Andy, Stefanie Von Felten, and Bernhard Schmid. "Analysis of variance with unbalanced data: an update for ecology & evolution." Journal of animal ecology 79.2 (2010): 308-316.
# See Tables 6 and 8 from Fidler and Thompson (2001).
# The "eta2" and "omega2" CIs match those in Table 8.
# See "FT" dataset for Fidler and Thompson (2001) reference
# acquiring summary data
FT_sum <- extract_sum(FT, test = "uni", run = FALSE)
# univariate analysis on summary data
univariate(FT_sum, CI = 0.90, es_anova = "eta2", digits = 5)
univariate(FT_sum, CI = 0.90, es_anova = "omega2", digits = 5)
# Reproduces Table 2 from Shaw and Mitchell-Olds (1993) using their Table 1.
# See "SMO" dataset for Shaw and Mitchell-Olds (1993) reference
# Note that Table 2 residual df is incorrectly given as 6,
# but is correctly given as 7 in Hector et al. (2010)
# acquiring summary data
univ_SMO <- extract_sum(SMO, test = "uni", run = FALSE)
# univariate analysis on summary data
print(univariate(univ_SMO, type_anova = "I")[[1]])
print(univariate(univ_SMO, type_anova = "II"))
univariate(univ_SMO, type_anova = "III")
Provides testing for differences in patterning of sexual dimorphism between populations, as well as for evolutionary trends that may characterize other species. The test is based on the computation of the first q canonical variates (q=2 by default) or multiple discriminant functions to develop various tests of sexual dimorphism in any two populations A and B.
van_vark( x, W = NULL, q = 2, Trait = 1, Pop = 2, plot = TRUE, lower.tail = FALSE, digits = 4 )
x |
A data frame of means and sample sizes for different populations, or a list containing the summary data frame together with the pooled within-group variance-covariance matrix. |
W |
Pooled within-group variance-covariance matrix, supplied if x is a data frame, Default: NULL |
q |
Number of canonical variates to retain for chi square test, Default: 2 |
Trait |
Number of the column containing names of traits, Default: 1 |
Pop |
Number of the column containing populations' names, Default: 2 |
plot |
Logical; if TRUE returns a graphical representation of dimorphism differences, Default: TRUE |
lower.tail |
Logical; if TRUE probabilities are 'P[X <= x]', otherwise, 'P[X > x]'., Default: FALSE |
digits |
Number of significant digits, Default: 4 |
Input is a data frame of means and sample sizes similar to Howells_summary with the same naming conventions used throughout the functions but with the standard deviation columns removed.
The output includes a two-dimensional plot that illustrates the differences between the tested populations and a statistical test of significance for the difference in dimorphism using the chi-square distribution.
For plot labels to be fully visualized, maximizing image size is advised.
Van Vark, G. N., et al. "Some multivariate tests for differences in sexual dimorphism between human populations." Annals of human biology 16.4 (1989): 301-310.
# selecting means and sample sizes
van_vark_data <- Howells_summary[!endsWith(
  x = names(Howells_summary),
  suffix = "dev"
)]
# running the function
van_vark(van_vark_data, Howells_V)