Package 'distdichoR' reference manual

Title:	Distributional Method for the Dichotomisation of Continuous Outcomes
Description:	Contains a range of functions covering the present development of the distributional method for the dichotomisation of continuous outcomes. The method provides estimates with standard error of a comparison of proportions (difference, odds ratio and risk ratio) derived, with similar precision, from a comparison of means. See the URL below or <arXiv:1809.03279> for more information.
Authors:	Odile Sauzet
Maintainer:	Odile Sauzet <odile.sauzet@uni-bielefeld.de>
License:	MIT + file LICENSE
Version:	0.1-1
Built:	2025-03-06 06:51:04 UTC
Source:	CRAN

BMI of 1,781 mothers

Description

A dataset containing the Body Mass Index (BMI) of mothers at the beginning of their pregnancy and a variable showing whether it is the first pregnancy (parity = 0) or a subsequent one (parity > 0).

Usage

bmi

Format

A data frame with 1781 rows and 4 variables:

bmi: Body Mass Index (15.04-45.18)
inv_bmi: inverse Body Mass Index (0.022-0.066)
parity: parity of the mother (primi, multi)
group_par: 0 = primiparous, 1 = multiparous
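
The ranges listed above are consistent with inv_bmi being the reciprocal 1/bmi. Under that assumption, a BMI cut point of 30 kg/m^2 corresponds to an inverse-BMI cut point of about 0.033, and the upper tail of bmi becomes the lower tail of inv_bmi. A minimal base-R check, using made-up BMI values (toy_bmi is hypothetical and not taken from the data set):

```r
# Assumption: inv_bmi is the reciprocal of bmi (consistent with the ranges above).
bmi_range <- c(15.04, 45.18)
inv_range <- rev(1 / bmi_range)   # taking the reciprocal reverses the ordering
round(inv_range, 3)

# A BMI cut point of 30 kg/m^2 expressed on the inverse scale
cp_inv <- 1 / 30
round(cp_inv, 3)

# Illustrative values only (not taken from the bmi data set)
toy_bmi <- c(19.5, 24.0, 30.5, 36.2)
obese_on_bmi_scale     <- toy_bmi > 30
obese_on_inverse_scale <- (1 / toy_bmi) < cp_inv

# Both definitions select the same observations
identical(obese_on_bmi_scale, obese_on_inverse_scale)
```

This is why the inverse-BMI examples in this manual use cp = 0.033 with the default lower tail to describe obesity.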

BMI of 1,560 mothers

Description

A dataset containing the Body Mass Index (BMI) of employed and unemployed mothers.

Usage

bmi2

Format

A data frame with 1560 rows and 4 variables:

bmi: Body Mass Index (15.04-45.18)
inv_bmi: inverse Body Mass Index (0.022-0.066)
employ: employment status (unemployed, employed)
group_emp: 1 = employed, 2 = unemployed

Birth weight of 1,458 babies

Description

A dataset containing the smoking status of the mothers during pregnancy, the gestational age and the birth weight of their babies.

Usage

bwsmoke

Format

A data frame with 1458 rows and 3 variables:

birthwt: birth weight in grams (1780-4870)
smoke: smoking status of the mother (smoker, non-smoker)
gest: gestational age (37-44.43)

Apgar score of 1,755 babies

Description

A dataset containing the Apgar score after 5 minutes, the condition of the babies, the birth weight of the babies, the gestational age and the smoking status of the mothers.

Usage

bwsmokecompl

Format

A data frame with 1755 rows and 7 variables:

apgar5: Apgar score after 5 minutes
babycon: Condition of the baby
birthwt: Birthweight of the babies
gest: gestational age
smoke: smoking status of the mother (0 non-smoker, 1 smoker)
apgar_10: 10 minus apgar5, so that the Apgar score can be treated as gamma distributed
smoke2: Allowing the smoking status to have three characteristics (0 non-smoker, 1 smoker, 2 no information)
momid: ID-number of the mother

normal data

Description

The distributional method for dichotomising normal data allowing for assumptions of unequal variances (based on Sauzet et al. 2014 and Peacock et al. 2012).

Usage

distdicho(x, ...)

## Default S3 method:
distdicho(x, y, cp = 0, tail = c("lower", "upper"),
  R = 1, correction = FALSE, unequal = FALSE, conf.level = 0.95,
  bootci = FALSE, nrep = 2000, ...)

## S3 method for class 'formula'
distdicho(formula, data, exposed, ...)

Arguments

`x`	A numeric vector of data values.
`...`	Further arguments to be passed to or from methods.
`y`	A numeric vector of data values.
`cp`	A numeric value specifying the cut point under or over which (depending on `tail`) the distributional proportions are computed.
`tail`	A character string specifying the tail of the distribution in which the proportions are computed. Must be either 'lower' (default) or 'upper'.
`R`	A numeric value indicating the true ratio of variances (R = Var(x)/Var(y)). A value of 0 specifies that the true ratio of variances is unknown.
`correction`	A logical indicating whether to use a correction factor for large effect sizes (>0.7) (valid for difference in proportions only).
`unequal`	A logical variable indicating if a correction for an unknown variance ratio should be used if no assumption can be made about the variance ratio.
`conf.level`	Confidence level of the interval.
`bootci`	A logical variable indicating whether bootstrap bias-corrected confidence intervals are calculated instead of distributional ones.
`nrep`	A numeric value specifying the number of bootstrap replications (nrep must be higher than the number of observations).
`formula`	A formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding exposed and unexposed groups.
`data`	An optional matrix or data frame containing the variables in the formula. By default, the variables are taken from `environment(formula)`.
`exposed`	A character string specifying the grouping value of the exposed group.
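
The formula interface and the two-vector interface describe the same data. The sketch below shows, in base R and with a small made-up data frame (the names toy, outcome and group are hypothetical), how a formula lhs ~ rhs together with exposed identifies the two vectors x (exposed group) and y (reference group). It illustrates the correspondence only and is not the package's internal code.

```r
# Made-up data for illustration
toy <- data.frame(
  outcome = c(3100, 2950, 3400, 3600, 2800, 3300),
  group   = c("smoker", "non-smoker", "non-smoker",
              "non-smoker", "smoker", "smoker")
)

# Extract the response and the grouping variable named in the formula
mf     <- model.frame(outcome ~ group, data = toy)
values <- mf[[1]]
groups <- as.character(mf[[2]])

# The value passed as 'exposed' selects the exposed group;
# the remaining level forms the unexposed (reference) group
exposed <- "smoker"
x <- values[groups == exposed]
y <- values[groups != exposed]

x
y
```

With two vectors built this way, a call such as distdicho(x = x, y = y, cp = ...) would be expected to correspond to the formula call with exposed = 'smoker', as in the Examples of this section.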

Details

distdicho first returns the results of a two-group unpaired t-test (allowing for unequal variances in the unequal-variance cases), followed by the distributional estimates and their standard errors (see Sauzet et al. 2014 and Peacock et al. 2012) for a difference in proportions, risk ratio and odds ratio. It also provides the distributional confidence intervals for the estimated statistics; these assume an asymptotic normal distribution of the estimates and might not be valid for small sample sizes (see Sauzet et al. 2014 for details).

Estimates are calculated under either the assumption of equal variances in both groups (default, R = 1) or the assumption of unequal variances (R != 1 and R != 0 for a known variance ratio, R = 0 for the correction for an unknown variance ratio).

The data can be given either as two variables, each providing the outcome in one group, or as a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding exposed and unexposed groups. In all cases, it is assumed that there are only two groups.
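
To make the idea concrete, the following base-R sketch computes distributional proportions and the three comparison statistics under the equal-variance normal assumption (R = 1), using simulated data. It assumes, in line with the approach of Peacock et al. 2012, that the proportion below the cut point in each group is the normal probability evaluated at the group mean and a pooled standard deviation. Standard errors, confidence intervals and the unequal-variance corrections are omitted; this illustrates the principle and is not the package's implementation.

```r
set.seed(1)
x  <- rnorm(200, mean = 3250, sd = 450)   # simulated exposed group
y  <- rnorm(300, mean = 3450, sd = 450)   # simulated reference group
cp <- 2500                                # cut point

# Pooled standard deviation under the equal-variance assumption
n1 <- length(x)
n2 <- length(y)
sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))

# Distributional proportions in the lower tail (tail = 'lower')
p1 <- pnorm(cp, mean = mean(x), sd = sd_pooled)
p2 <- pnorm(cp, mean = mean(y), sd = sd_pooled)

# Comparison of proportions
diff_prop  <- p1 - p2
risk_ratio <- p1 / p2
odds_ratio <- (p1 / (1 - p1)) / (p2 / (1 - p2))

c(p1 = p1, p2 = p2, diff = diff_prop, RR = risk_ratio, OR = odds_ratio)

# For tail = 'upper' the proportions would be 1 - p1 and 1 - p2
```

Because the proportions are functions of the estimated means and standard deviation rather than counts of observations beyond the cut point, they use the information in the whole sample.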

Value

A list with class 'distdicho' containing the following components:

`data.name`	The names of the data.
`arguments`	A list with the specified arguments.
`parameter`	The mean, standard error and number of observations for both groups.
`prop`	The estimated proportions below / above the cut point for both groups.
`dist.estimates`	The difference in proportions, risk ratio and odds ratio of the groups.
`se`	The estimated standard error of the difference in proportions, the risk ratio and the odds ratio.
`ci`	The confidence intervals of the difference in proportions, the risk ratio and the odds ratio.
`method`	A character string indicating the used method.
`ttest`	A list containing the results of a t-test.

References

Peacock J.L., Sauzet O., Ewings S.M., Kerry S.M. Dichotomising continuous data while retaining statistical power using a distributional approach. Statist. Med. 2012; 26:3089-3103.

Sauzet O., Peacock J.L. Estimating dichotomised outcomes in two groups with unequal variances: a distributional approach. Statist. Med. 2014; 33:4547-4559. DOI: 10.1002/sim.6255.

Peacock J.L., Bland J.M., Anderson H.R. Preterm delivery: effects of socioeconomic factors, psychological stress, smoking, alcohol, and caffeine. BMJ 1995; 311(7004):531-535.

Examples

## Proportions of low birth weight babies among smoking and non-smoking mothers
## (data from Peacock et al. 1995). Returns distributional estimates, standard 
## errors and distributional confidence intervals for differences in proportions,
## RR and OR of babies having a birth weight under 2500g (low birth weight)
## for group smoker (mother smokes) over the odds of LBW in group non-smoker 
## (mother doesn't smoke)
# Formula interface
distdicho(birthwt ~ smoke, cp = 2500, data = bwsmoke, exposed = 'smoker')
# Data stored in two vectors
bw_smoker <- bwsmoke$birthwt[bwsmoke$smoke == 'smoker']
bw_nonsmoker <- bwsmoke$birthwt[bwsmoke$smoke == 'non-smoker']
distdicho(x = bw_smoker, y = bw_nonsmoker, cp = 2500)


## Inverse Body Mass Index (transformation required to have a normal outcome)
## and parity (data from Peacock et al. 1995). Returns distributional estimates,
## standard errors and distributional confidence intervals for differences in 
## proportions, RR and OR of obese mothers (BMI of >30 kg/m^2) for multiparas 
## (group_par=1) over the odds of obesity in group primiparity (group_par=0).
distdicho(inv_bmi ~ group_par, cp = 0.033, data = bmi, exposed = '1')


## Inverse Body Mass Index (BMI) and employment. Returns distributional estimates,
## standard errors and distributional confidence intervals for differences in
## proportions, RR and OR with correction for unknown variance ratio of obese 
## mothers (BMI of >30 kg/m^2) for group_emp = 2 (mother unemployed) over
## the odds of obesity in group_emp = 1 (mother employed)
distdicho(inv_bmi ~ group_emp, cp = 0.033, R = 0, data = bmi2, exposed = '2')


## Inverse Body Mass Index (BMI) and employment. Returns distributional estimates,
## standard errors and distributional confidence intervals for differences in
## proportions, RR and OR computed under the hypothesis that the ratio of variances
## is equal to 1.3 of obese mothers (BMI of >30 kg/m^2) for group_emp = 2
## (mother unemployed) over the odds of obesity in group_emp = 1 (mother employed)
distdicho(inv_bmi ~ group_emp, cp = 0.033, R = 1.3, data = bmi2, exposed = '2')


normal, skew-normal or gamma distributed data

Description

distdichogen first returns the results of a two-group unpaired t-test, followed by the distributional estimates and their standard errors (see Sauzet et al. 2014 and Peacock et al. 2012) for a difference in proportions, risk ratio and odds ratio. It also provides the distributional confidence intervals for the estimated statistics. distdichogen takes normal (dist = 'normal'), skew normal (dist = 'sk_normal') and gamma (dist = 'gamma') distributed data.

The data can be given either as two variables, each providing the outcome in one group, or as a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding exposed and unexposed groups. In all cases, it is assumed that there are only two groups.

Usage

distdichogen(x, ...)

## Default S3 method:
distdichogen(x, y, cp = 0, tail = c("lower", "upper"),
  conf.level = 0.95, dist = c("normal", "sk_normal", "gamma"),
  bootci = FALSE, nrep = 2000, ...)

## S3 method for class 'formula'
distdichogen(formula, data, exposed, ...)

Arguments

`x`	A numeric vector of data values.
`...`	Further arguments to be passed to or from methods.
`y`	A numeric vector of data values.
`cp`	A numeric value specifying the cut point under or over which (depending on `tail`) the distributional proportions are computed.
`tail`	A character string specifying the tail of the distribution in which the proportions are computed. Must be either 'lower' (default) or 'upper'.
`conf.level`	Confidence level of the interval.
`dist`	A character string specifying the distribution of the data. Must be either 'normal' (default), 'sk_normal' or 'gamma'.
`bootci`	A logical variable indicating whether bootstrap bias-corrected confidence intervals are calculated instead of distributional ones.
`nrep`	A numeric value specifying the number of bootstrap replications (nrep must be higher than the number of observations).
`formula`	A formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding exposed and unexposed groups.
`data`	An optional matrix or data frame containing the variables in the formula. By default, the variables are taken from `environment(formula)`.
`exposed`	A character string specifying the grouping value of the exposed group.
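
The role of bootci and nrep can be illustrated with a small resampling sketch. The version below uses simulated normal data and a plain percentile bootstrap for the difference in distributional proportions; the package documents bias-corrected intervals, so this is a deliberately simpler stand-in and its limits need not match the package output. The helper dist_diff is hypothetical.

```r
set.seed(42)
x    <- rnorm(60, mean = 3250, sd = 450)   # simulated exposed group
y    <- rnorm(80, mean = 3450, sd = 450)   # simulated reference group
cp   <- 2500
nrep <- 500    # kept small so the sketch runs quickly; exceeds the 140 observations

# Difference in distributional proportions below cp (normal, pooled SD)
dist_diff <- function(x, y, cp) {
  n1 <- length(x)
  n2 <- length(y)
  s  <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  pnorm(cp, mean = mean(x), sd = s) - pnorm(cp, mean = mean(y), sd = s)
}

# Resample each group with replacement and recompute the statistic
boot_diff <- replicate(
  nrep,
  dist_diff(sample(x, replace = TRUE), sample(y, replace = TRUE), cp)
)

estimate <- dist_diff(x, y, cp)
ci <- quantile(boot_diff, probs = c(0.025, 0.975))   # percentile interval, 95%

estimate
ci
```

Resampling within each group keeps the group sizes fixed, which matches the two-group design assumed throughout this manual.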

Value

A list with class 'distdicho' containing the following components:

`data.name`	The names of the data.
`arguments`	A list with the specified arguments.
`parameter`	The mean, standard error and number of observations for both groups.
`prop`	The estimated proportions below / above the cut point for both groups.
`dist.estimates`	The difference in proportions, risk ratio and odds ratio of the groups.
`se`	The estimated standard error of the difference in proportions, the risk ratio and the odds ratio.
`ci`	The confidence intervals of the difference in proportions, the risk ratio and the odds ratio.
`method`	A character string indicating the used method.
`ttest`	A list containing the results of a t-test.

References

Peacock J.L., Sauzet O., Ewings S.M., Kerry S.M. Dichotomising continuous data while retaining statistical power using a distributional approach. Statist. Med. 2012; 26:3089-3103.

Sauzet O., Peacock J.L. Estimating dichotomised outcomes in two groups with unequal variances: a distributional approach. Statist. Med. 2014; 33:4547-4559. DOI: 10.1002/sim.6255.

Sauzet O., Ofuya M., Peacock J.L. Dichotomisation using a distributional approach when the outcome is skewed. BMC Medical Research Methodology 2015; 15:40. DOI: 10.1186/s12874-015-0028-8.

Peacock J.L., Bland J.M., Anderson H.R. Preterm delivery: effects of socioeconomic factors, psychological stress, smoking, alcohol, and caffeine. BMJ 1995; 311(7004):531-535.

Examples

## Proportions of low birth weight babies among smoking and non-smoking mothers
## (data from Peacock et al. 1995). Returns distributional estimates, standard 
## errors and distributional confidence intervals for differences in proportions,
## RR and OR of babies having a birth weight under 2500g (low birth weight)
## for group smoker (mother smokes) over the odds of LBW in group non-smoker 
## (mother doesn't smoke)
# Formula interface
distdichogen(birthwt ~ smoke, cp = 2500, data = bwsmoke, exposed = 'smoker',
             dist = 'sk_normal')
# Data stored in two vectors
bw_smoker <- bwsmoke$birthwt[bwsmoke$smoke == 'smoker']
bw_nonsmoker <- bwsmoke$birthwt[bwsmoke$smoke == 'non-smoker']
distdichogen(x = bw_smoker, y = bw_nonsmoker, 
              cp = 2500, tail = 'lower', dist = 'sk_normal')


## Body Mass Index (BMI) and parity. Returns distributional estimates, standard
## errors and distributional confidence intervals for difference in proportions,
## RR and OR of obese mothers (BMI of >30kg/m^2) for group_par=1 (multiparity) 
## over the odds of obesity in group_par=0 (primiparity)
distdichogen(bmi ~ group_par, cp = 30, data = bmi, exposed = '1',
             tail = 'upper', dist = 'sk_normal')

normal data (immediate form, allowing unequal variances)

Description

Immediate form of the distributional method for dichotomising normal data allowing for assumptions of unequal variances (based on Sauzet et al. 2014 and Peacock et al. 2012).

Usage

distdichoi(n1, m1, s1, n2, m2, s2, cp = 0, tail = c("lower", "upper"),
  R = 1, conf.level = 0.95)

Arguments

`n1`	A number specifying the number of observations in the exposed group.
`m1`	A number specifying the mean of the exposed group.
`s1`	A number specifying the standard deviation of the exposed group.
`n2`	A number specifying the number of observations in the unexposed (reference) group.
`m2`	A number specifying the mean of the unexposed (reference) group.
`s2`	A number specifying the standard deviation of the unexposed (reference) group.
`cp`	A numeric value specifying the cut point under or over which the distributional proportions are computed.
`tail`	A character string specifying the tail of the distribution in which the proportions are computed. Must be either 'lower' (default) or 'upper'.
`R`	A numeric value indicating the true ratio of variances (R = Var(group1)/Var(group2)). A value of 0 specifies that the true ratio of variances is unknown.
`conf.level`	Confidence level of the interval.

Details

distdichoi takes no data, only the number of observations, the mean and the standard deviation of each group. It first returns the results of a two-group unpaired t-test (allowing for unequal variances in the unequal-variance cases), followed by the distributional estimates and their standard errors (see Sauzet et al. 2014 and Peacock et al. 2012) for a difference in proportions, risk ratio and odds ratio. It also provides the distributional confidence intervals for the estimated statistics; these assume an asymptotic normal distribution of the estimates and might not be valid for small sample sizes (see Sauzet et al. 2014 for details).

Estimates are calculated under either the assumption of equal variances in both groups (default, R = 1) or the assumption of unequal variances (R != 1 and R != 0 for a known variance ratio, R = 0 for the correction for an unknown variance ratio).
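
The following base-R sketch shows how far summary statistics alone go in the equal-variance case (R = 1). It uses the summary values that appear in the Examples of this section and assumes that the t-test uses the pooled variance and that the proportions are normal probabilities at the group means and pooled standard deviation. It is an illustration under those assumptions, not the package's code, and it omits standard errors and confidence intervals.

```r
# Summary statistics (values as used in the Examples of this section)
n1 <- 494; m1 <- 3267.4; s1 <- 441.3   # exposed group
n2 <- 983; m2 <- 3452;   s2 <- 435.9   # unexposed (reference) group
cp <- 2500

# Two-sample t statistic with pooled variance
s_pooled <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
t_stat   <- (m1 - m2) / (s_pooled * sqrt(1 / n1 + 1 / n2))
p_value  <- 2 * pt(-abs(t_stat), df = n1 + n2 - 2)

# Distributional proportions below the cut point (tail = 'lower')
p1 <- pnorm(cp, mean = m1, sd = s_pooled)
p2 <- pnorm(cp, mean = m2, sd = s_pooled)

c(t = t_stat, p.value = p_value, p1 = p1, p2 = p2, diff = p1 - p2)
```

No individual observations are needed at any step, which is what makes the immediate form usable with published summary tables.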

Value

A list with class 'distdicho' containing the following components:

`data.name`	The names of the data.
`arguments`	A list with the specified arguments.
`parameter`	The mean, standard error and number of observations for both groups.
`prop`	The estimated proportions below / above the cut point for both groups.
`dist.estimates`	The difference in proportions, risk ratio and odds ratio of the groups.
`se`	The estimated standard error of the difference in proportions, the risk ratio and the odds ratio.
`ci`	The confidence intervals of the difference in proportions, the risk ratio and the odds ratio.
`method`	A character string indicating the used method.
`ttest`	A list containing the results of a t-test.

Examples

# Immediate form of distdicho
distdichoi(n1 = 494, m1 = 3267.4, s1 = 441.3,
           n2 = 983, m2 = 3452, s2 = 435.9,
           cp = 2500, tail = 'upper')

## Proportions of low birth weight babies among smoking and non-smoking mothers
## (data from Peacock et al. 1995). Returns distributional estimates, standard 
## errors and distributional confidence intervals for differences in proportions,
## RR and OR of babies having a birth weight under 2500g (low birth weight LBW)
## for group smoker (mother smokes) over the odds of LBW in group non-smoker 
## (mother doesn't smoke)
# distdicho and distdichoi return the same results
bw_smoker <- bwsmoke$birthwt[bwsmoke$smoke == 'smoker']
bw_nonsmoker <- bwsmoke$birthwt[bwsmoke$smoke == 'non-smoker']
distdicho(x = bw_smoker, y = bw_nonsmoker, cp = 2500)
distdichoi(n1 = length(bw_smoker[!is.na(bw_smoker)]), 
           m1 = mean(bw_smoker, na.rm = TRUE), 
           s1 = sd(bw_smoker, na.rm = TRUE),
           n2 = length(bw_nonsmoker[!is.na(bw_nonsmoker)]), 
           m2 = mean(bw_nonsmoker, na.rm = TRUE), 
           s2 = sd(bw_nonsmoker, na.rm = TRUE), 
           cp = 2500)

normal, skew-normal or gamma distributed data (immediate form)

Description

Immediate form of the distributional method for dichotomising normal, skew normal or gamma distributed data (based on Sauzet et al. 2015).

Usage

distdichoigen(n1, m1, s1, n2, m2, s2, alpha = 1, cp = 0, tail = c("lower",
  "upper"), conf.level = 0.95, dist = c("normal", "sk_normal", "gamma"))

Arguments

`n1`	A number specifying the number of observations in the exposed group.
`m1`	A number specifying the mean of the exposed group.
`s1`	A number specifying the standard deviation of the exposed group.
`n2`	A number specifying the number of observations in the unexposed (reference) group.
`m2`	A number specifying the mean of the unexposed (reference) group.
`s2`	A number specifying the standard deviation of the unexposed (reference) group.
`alpha`	A numeric value specifying the additional parameter of the skew normal or gamma distribution (see Details).
`cp`	A numeric value specifying the cut point under or over which (depending on `tail`) the distributional proportions are computed.
`tail`	A character string specifying the tail of the distribution in which the proportions are computed. Must be either 'lower' (default) or 'upper'.
`conf.level`	Confidence level of the interval.
`dist`	A character string specifying the distribution. Must be either 'normal' (default), 'sk_normal' or 'gamma'.

Details

distdichoigen takes no data, only the number of observations, the mean and the standard deviation of each group. It first returns the results of a two-group unpaired t-test, followed by the distributional estimates and their standard errors (see Sauzet et al. 2014 and Peacock et al. 2012) for a difference in proportions, risk ratio and odds ratio. It also provides the distributional confidence intervals for the estimated statistics.

If a skew normal (dist = 'sk_normal') or gamma (dist = 'gamma') distribution is assumed, a third parameter alpha needs to be specified. For dist = 'sk_normal', alpha is the parameter described in psn (package sn). For dist = 'gamma', alpha is the shape parameter described in pgamma.
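
As an illustration of the gamma case, the sketch below computes upper-tail proportions with pgamma, treating alpha as the shape parameter. How the package combines the means, standard deviations and alpha is not spelled out in this manual; the sketch assumes, purely for illustration, that the scale follows from the relation mean = shape * scale. All numerical values are made up.

```r
# Hypothetical summary values on a '10 minus Apgar' scale
m1    <- 1.2    # mean in the exposed group (illustrative)
m2    <- 0.9    # mean in the reference group (illustrative)
alpha <- 1.5    # gamma shape, as in pgamma(shape = )
cp    <- 3      # 10 - Apgar > 3 corresponds to an Apgar score below 7

# Assumption: the scale is fixed by mean = shape * scale
scale1 <- m1 / alpha
scale2 <- m2 / alpha

# Upper-tail proportions (tail = 'upper')
p1 <- pgamma(cp, shape = alpha, scale = scale1, lower.tail = FALSE)
p2 <- pgamma(cp, shape = alpha, scale = scale2, lower.tail = FALSE)

c(p1 = p1, p2 = p2, diff = p1 - p2, RR = p1 / p2)
```

The same pattern applies to the skew normal case with psn from the sn package in place of pgamma, with alpha playing the role described in the psn documentation.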

Value

A list with class 'distdicho' containing the following components:

`data.name`	The names of the data.
`arguments`	A list with the specified arguments.
`parameter`	The mean, standard error and number of observations for both groups.
`prop`	The estimated proportions below / above the cut point for both groups.
`dist.estimates`	The difference in proportions, risk ratio and odds ratio of the groups.
`se`	The estimated standard error of the difference in proportions, the risk ratio and the odds ratio.
`ci`	The confidence intervals of the difference in proportions, the risk ratio and the odds ratio.
`method`	A character string indicating the used method.
`ttest`	A list containing the results of a t-test.

References

Peacock J.L., Sauzet O., Ewings S.M., Kerry S.M. Dichotomising continuous data while retaining statistical power using a distributional approach. Statist. Med; 2012; 26:3089-3103. Sauzet, O., Peacock, J. L. Estimating dichotomised outcomes in two groups with unequal variances: a distributional approach. Statist. Med; 2014 33 4547-4559 ;DOI: 10.1002/sim.6255. Sauzet, O., Ofuya, M., Peacock, J. L. Dichotomisation using a distributional approach when the outcome is skewed BMC Medical Research Methodology 2015, 15:40; doi:10.1186/s12874-015-0028-8. Peacock, J.L., Bland, J.M., Anderson, H.R.: Preterm delivery: effects of socioeconomic factors, psychological stress, smoking, alcohol, and caffeine. BMJ 311(7004), 531-535 (1995).

Examples

# Immediate form of distdichogen for skew normal data
distdichoigen(n1 = 75, m1 = 3250, s1 = 450, n2 = 110, m2 = 2950, s2 = 475,
               cp = 2500, tail = 'lower', alpha = -2.3, dist = 'sk_normal')

           
normal, skew-normal or gamma distributed data (via linear regression)

Description

Provides adjusted distributional estimates for the comparison of proportions for a dichotomised dependent continuous variable derived from a linear regression of the continuous outcome on the grouping variable and other covariates as described in Sauzet et al. 2015.

Usage

regdistdicho(mod, group_var, cp = 0, tail = c("lower", "upper"),
  conf.level = 0.95, dist = c("normal", "sk_normal", "gamma"), alpha = 1)

Arguments

`mod`	A linear model of the form lm(lhs ~ rhs) where lhs is a numeric variable giving the data values and rhs is the grouping variable and other covariates.
`group_var`	A character string specifying the name of the grouping variable.
`cp`	A numeric value specifying the cut point under or over which (depending on `tail`) the distributional proportions are computed.
`tail`	A character string specifying the tail of the distribution in which the proportions are computed. Must be either 'lower' (default) or 'upper'.
`conf.level`	Confidence level of the interval.
`dist`	A character string specifying the distribution of the error variable in the linear regression. Must be either 'normal' (default), 'sk_normal' or 'gamma'.
`alpha`	A numeric value specifying the additional parameter of the skew normal or gamma distribution.

Details

regdistdicho returns the distributional estimates and their standard errors (see Sauzet et al. 2014 and Peacock et al. 2012) for a difference in proportions, risk ratio and odds ratio. It also provides the distributional confidence intervals for the statistics estimated. The estimation is based on the marginal means of a linear regression of the outcome on the grouping variable and other covariates.
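
The notion of marginal means can be sketched in base R as follows. The example simulates data (the data frame toy and the helper marginal_mean are hypothetical), fits a linear model, obtains one marginal mean per group by averaging predictions with every observation assigned to that group, and then evaluates normal tail probabilities at those means using the residual standard deviation. Whether the package computes its marginal means and standard deviation in exactly this way is an assumption; standard errors are omitted.

```r
set.seed(7)
n <- 400
toy <- data.frame(
  smoke = factor(sample(c("non-smoker", "smoker"), n, replace = TRUE)),
  gest  = runif(n, min = 37, max = 44)
)
# Simulated birth weight: lower for smokers, increasing with gestational age
toy$birthwt <- 3400 - 180 * (toy$smoke == "smoker") +
  120 * (toy$gest - 40) + rnorm(n, sd = 420)

mod <- lm(birthwt ~ smoke + gest, data = toy)

# Marginal mean per group: average prediction with everyone set to that group
marginal_mean <- function(level) {
  nd <- toy
  nd$smoke <- factor(level, levels = levels(toy$smoke))
  mean(predict(mod, newdata = nd))
}
m_smoker    <- marginal_mean("smoker")
m_nonsmoker <- marginal_mean("non-smoker")

# Distributional proportions below 2500 g, using the residual SD of the model
cp <- 2500
s  <- sigma(mod)
p_smoker    <- pnorm(cp, mean = m_smoker, sd = s)
p_nonsmoker <- pnorm(cp, mean = m_nonsmoker, sd = s)

c(diff = p_smoker - p_nonsmoker, RR = p_smoker / p_nonsmoker)
```

In a model without interactions, the difference between the two marginal means equals the regression coefficient of the grouping variable, so the adjustment for covariates carries over directly to the comparison of proportions.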

Value

A list with class 'distdicho' containing the following components:

`data.name`	The names of the data.
`arguments`	A list with the specified arguments.
`parameter`	The marginal mean, standard error and number of observations for both groups.
`prop`	The estimated proportions below / above the cut point for both groups.
`dist.estimates`	The difference in proportions, risk ratio and odds ratio of the groups.
`se`	The estimated standard error of the difference in proportions, the risk ratio and the odds ratio.
`ci`	The confidence intervals of the difference in proportions, the risk ratio and the odds ratio.
`method`	A character string indicating the used method.

References

Peacock J.L., Sauzet O., Ewings S.M., Kerry S.M. Dichotomising continuous data while retaining statistical power using a distributional approach. Statist. Med. 2012; 26:3089-3103.

Sauzet O., Peacock J.L. Estimating dichotomised outcomes in two groups with unequal variances: a distributional approach. Statist. Med. 2014; 33:4547-4559. DOI: 10.1002/sim.6255.

Sauzet O., Brekenkamp J., Brenne S., Borde T., David M., Razum O., Peacock J.L. A distributional approach to obtain adjusted differences in population at risk with a comparison with other regressions methods using perinatal data. 2015. In preparation.

Peacock J.L., Bland J.M., Anderson H.R. Preterm delivery: effects of socioeconomic factors, psychological stress, smoking, alcohol, and caffeine. BMJ 1995; 311(7004):531-535.

Examples

## Proportions of low birth weight babies among smoking and non-smoking mothers
## (data from Peacock et al. 1995)
mod_smoke <- lm(birthwt ~ smoke + gest, data = bwsmoke)
regdistdicho(mod = mod_smoke, group_var = 'smoke', cp = 2500, tail = 'lower')