Title: | Measuring the Difference Between Two Empirical Distributions |
---|---|
Description: | Provides a function for measuring the difference between two independent or non-independent empirical distributions and returning a significance level of the difference. |
Authors: | Hideo Aizaki |
Maintainer: | Hideo Aizaki <[email protected]> |
License: | CC0 |
Version: | 0.1-2 |
Built: | 2024-12-08 06:54:31 UTC |
Source: | CRAN |
The package provides a function for measuring the difference between two independent or non-independent empirical distributions and returning a significance level of the difference.
I would like to thank Professor Gregory L. Poe for his kindness.
Recommended citations:
Aizaki H (2014). mded: Measuring the difference between two empirical distributions, R package version 0.1-1. URL http://CRAN.R-project.org/package=mded.
Poe GL, Giraud KL, Loomis JB (2005). Computational methods for measuring the difference of empirical distributions. American Journal of Agricultural Economics, 87, 353–365.
Poe GL, Welsh MP, Champ PA (1997). Measuring the difference in mean willingness to pay when dichotomous choice contingent valuation responses are not independent. Land Economics, 73, 255–267.
Hideo Aizaki
Poe GL, Giraud KL, Loomis JB (2005). Computational methods for measuring the difference of empirical distributions. American Journal of Agricultural Economics, 87, 353–365.
Poe GL, Severance-Lossin EK, Welsh MP (1994). Measuring the difference (X - Y) of simulated distributions: A convolutions approach. American Journal of Agricultural Economics, 76, 904–915.
Poe GL, Welsh MP, Champ PA (1997). Measuring the difference in mean willingness to pay when dichotomous choice contingent valuation responses are not independent. Land Economics, 73, 255–267.
The function measures the difference between two independent or non-independent empirical distributions and returns a significance level of the difference.
mded(distr1, distr2, detail = FALSE, independent = TRUE)

## S3 method for class 'mded'
print(x, digits = max(3, getOption("digits") - 3), ...)
distr1 |
A vector of values from an empirical distribution. |
distr2 |
A vector of values from an empirical distribution. |
detail |
If TRUE, the vector of the difference is stored in the returned object. |
independent |
Set as FALSE if distr1 and distr2 are not independent of each other. |
x |
An object of S3 class 'mded'. |
digits |
The number of significant digits. |
... |
Arguments passed to the function |
The function measures the difference between two independent or non-independent empirical distributions and returns a significance level of the difference on the basis of the methods proposed by Poe et al. (1997, 2005). Such calculations are frequently needed in empirical econometric studies wherein (marginal) willingness-to-pay distributions that are estimated using contingent valuation methods or discrete choice experiments have to be compared to each other.
Let us assume that X and Y are empirical distributions represented by the vectors x = (x1, x2, ..., xm) and y = (y1, y2, ..., yn). The null hypothesis (H0) is X - Y = 0, while the alternative hypothesis (H1) is X - Y > 0. When X and Y are independent of each other, the complete combinatorial method (Poe et al. 2005) provides the one-sided significance level of H0, calculated as #{xi - yj <= 0} / (m * n), where #{cond} is the number of times that cond is true over all pairs (i, j). When X and Y are not independent of each other, the paired difference method (Poe et al. 1997) provides the one-sided significance level of H0, calculated as #{xi - yi <= 0} / m, where m is equal to n.
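The two calculations described above can be sketched in base R as follows. This is a minimal illustration of the formulas under the definitions given here, not the package's own implementation; the helper names signif_independent and signif_paired are hypothetical.

```r
# Hypothetical helpers illustrating the two formulas; not part of mded.

# Complete combinatorial method: compare every x_i with every y_j.
signif_independent <- function(x, y) {
  d <- outer(x, y, "-")  # m-by-n matrix of all differences x_i - y_j
  sum(d <= 0) / (length(x) * length(y))
}

# Paired difference method: compare x_i with y_i only (requires m == n).
signif_paired <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x - y <= 0) / length(x)
}

x <- c(3, 4, 5)
y <- c(1, 4, 6)
signif_independent(x, y)  # 5 of the 9 differences are <= 0
signif_paired(x, y)       # 2 of the 3 paired differences are <= 0
```

The small vectors make it easy to check both counts by hand.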
Note that the function may take a long time and require a large amount of memory when calculating the difference between two independent distributions if the argument detail is set as TRUE, because the resulting difference is stored as a vector. For example, when distr1 and distr2 each contain 10,000 elements (observations), the vector of the difference contains 100,000,000 elements. If memory is insufficient, R stops running the function and shows an error message related to the memory limitation.
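When only the significance level is needed, the count can be accumulated one element of x at a time so that the full vector of differences is never held in memory. The sketch below is one possible base R approach under that assumption, not a description of how mded works internally; count_nonpositive is a hypothetical name.

```r
# Hypothetical helper, not part of mded: counts #{x_i - y_j <= 0}
# while holding only one length-n vector of differences at a time.
count_nonpositive <- function(x, y) {
  sum(vapply(x, function(xi) sum(xi - y <= 0), numeric(1)))
}

# Rough size of the full difference vector for two inputs of 10,000
# elements each, assuming 8 bytes per double-precision value.
m <- 1e4
n <- 1e4
m * n * 8 / 1024^2  # size in MiB, roughly 763

set.seed(123)
x <- rnorm(200, 3)
y <- rnorm(200, 1)
count_nonpositive(x, y) / (length(x) * length(y))
```

This trades memory for a loop over x; the total number of comparisons is still m * n.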
stat |
One-sided significance level of the difference between distr1 and distr2. |
means |
A vector of the mean values of distr1 and distr2. |
cases |
A vector of integer values giving the number of cases in which cond is true and the number in which it is false. |
distr1 |
A vector assigned to distr1. |
distr2 |
A vector assigned to distr2. |
distr.names |
A vector of the names of the objects assigned to distr1 and distr2. |
diff |
A vector of the difference, stored when detail is TRUE. |
Hideo Aizaki
Poe GL, Giraud KL, Loomis JB (2005). Computational methods for measuring the difference of empirical distributions. American Journal of Agricultural Economics, 87, 353–365.
Poe GL, Severance-Lossin EK, Welsh MP (1994). Measuring the difference (X - Y) of simulated distributions: A convolutions approach. American Journal of Agricultural Economics, 76, 904–915.
Poe GL, Welsh MP, Champ PA (1997). Measuring the difference in mean willingness to pay when dichotomous choice contingent valuation responses are not independent. Land Economics, 73, 255–267.
set.seed(123)
x <- rnorm(100, 3)
y <- rnorm(100, 1)
out <- mded(distr1 = x, distr2 = y, detail = TRUE)
out
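For distributions that are not independent, the same function can be called with independent = FALSE, which corresponds to the paired difference method described earlier. The sketch below assumes that the mded package is installed and that the returned object can be indexed with $ for the documented components stat and cases; the correlated data are simulated purely for illustration.

```r
# Illustrative only: simulated paired data in which y is built from x,
# so the two vectors are not independent and have equal length.
set.seed(123)
x <- rnorm(100, 3)
y <- x - rnorm(100, 2)

# Run only if the package is available (assumption: installed from CRAN).
if (requireNamespace("mded", quietly = TRUE)) {
  out <- mded::mded(distr1 = x, distr2 = y, independent = FALSE)
  print(out)
  print(out$stat)   # one-sided significance level
  print(out$cases)  # counts of cases where cond is true and false
}
```

Because the paired method compares xi with yi only, the two input vectors must have the same length.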