Title: | Replicate Tariff Method for Verbal Autopsy |
---|---|
Description: | Implements the Tariff algorithm for coding cause of death from verbal autopsies. The Tariff method was originally proposed in James et al. (2011) <DOI:10.1186/1478-7954-9-31> and later refined as Tariff 2.0 in Serina et al. (2015) <DOI:10.1186/s12916-015-0527-9>. Note that this package was not developed by authors affiliated with the Institute for Health Metrics and Evaluation (IHME), and thus unintentional discrepancies may exist between this implementation and the implementation available from IHME. |
Authors: | Zehang Li, Tyler McCormick, Sam Clark |
Maintainer: | Zehang Li |
License: | GPL-2 |
Version: | 1.0.5 |
Built: | 2024-11-05 06:38:13 UTC |
Source: | CRAN |
This function plots the CSMF of the fitted results.
## S3 method for class 'tariff'
plot(x, top = NULL, min.prob = 0, ...)
x | fitted object of class 'tariff' |
---|---|
top | maximum number of causes to plot |
min.prob | minimum fraction for a cause to be plotted |
... | arguments passed to or from the underlying graphics function |
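The interplay of `top` and `min.prob` can be illustrated on a plain named vector. The sketch below is not the package's plotting code: the helper `filter_csmf` and the CSMF values are hypothetical, and it assumes that `top` keeps the largest fractions while `min.prob` drops causes below the threshold.

```r
# Hypothetical helper: keep the largest `top` fractions, then drop
# any that fall below `min.prob`. Illustration only, not package code.
filter_csmf <- function(csmf, top = NULL, min.prob = 0) {
  csmf <- sort(csmf, decreasing = TRUE)
  if (!is.null(top)) {
    csmf <- head(csmf, top)
  }
  csmf[csmf >= min.prob]
}

# Toy CSMF vector (made-up values that sum to 1)
csmf <- c(Stroke = 0.40, Pneumonia = 0.30, Malaria = 0.20,
          Falls = 0.07, Other = 0.03)

top3 <- filter_csmf(csmf, top = 3)
at_least_5pct <- filter_csmf(csmf, min.prob = 0.05)

# Bar chart of the retained causes
barplot(top3, main = "Top 3 causes (toy data)")
```

With these toy values, `top = 3` retains three causes and `min.prob = 0.05` retains every cause except `Other`.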
data("RandomVA3")
test <- RandomVA3[1:200, ]
train <- RandomVA3[201:400, ]
allcauses <- unique(train$cause)
fit <- tariff(causes.train = "cause", symps.train = train,
              symps.test = test, causes.table = allcauses)
# plot the ten largest cause-specific mortality fractions
plot(fit, top = 10, main = "Top 10 population COD distribution")
# plot only causes with a fraction of at least 5%
plot(fit, min.prob = 0.05, main = "Population COD distribution (at least 5%)")
This function prints the summary message of the fitted results.
## S3 method for class 'tariff_summary'
print(x, ...)
x | summary object for a Tariff fit |
---|---|
... | not used |
This is a dataset consisting of 400 arbitrary sample input deaths randomly sampled from cleaned PHMRC data.
400 arbitrary input records.
data(RandomVA3)
# RandomVA3 is indexed by row elsewhere in this manual, so inspect it directly
head(RandomVA3)
This is a matrix specifying a default grouping of the causes used in RandomVA3.
17 by 2 matrix
data(SampleCategory3)
SampleCategory3
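A two-column grouping matrix of this kind can be used to aggregate assigned causes into broader categories. The sketch below uses a small made-up matrix rather than the real `SampleCategory3`, and it assumes that the first column holds the cause and the second the category; check the actual column layout before adapting it.

```r
# Hypothetical grouping matrix: column 1 = cause, column 2 = category.
# The real SampleCategory3 has 17 rows; this layout is an assumption.
grouping <- matrix(
  c("Stroke",    "Non-communicable",
    "Diabetes",  "Non-communicable",
    "Malaria",   "Infectious",
    "Pneumonia", "Infectious"),
  ncol = 2, byrow = TRUE
)

# Made-up assigned causes for six deaths
causes <- c("Stroke", "Malaria", "Pneumonia", "Stroke", "Diabetes", "Malaria")

# Look up each cause in column 1 and take the category from column 2
categories <- grouping[match(causes, grouping[, 1]), 2]

# Aggregate to category-level fractions
category_fraction <- prop.table(table(categories))
print(category_fraction)
```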
This function summarizes the fitted results, reporting either the top CSMFs or the top causes for a specific death.
## S3 method for class 'tariff'
summary(object, top = 5, id = NULL, ...)
object | fitted object of class 'tariff' |
---|---|
top | number of top CSMFs to show |
id | ID of a specific death to show |
... | not used |
data("RandomVA3")
test <- RandomVA3[1:200, ]
train <- RandomVA3[201:400, ]
allcauses <- unique(train$cause)
fit <- tariff(causes.train = "cause", symps.train = train,
              symps.test = test, causes.table = allcauses)
correct <- which(fit$causes.test[,2] == test$cause)
accuracy <- length(correct) / dim(test)[1]
summary(fit)
summary(fit, top = 10)
summary(fit, id = "p849", top = 3)
This function implements the Tariff method.
tariff(causes.train, symps.train, symps.test, causes.table = NULL,
       use.rank = TRUE, nboot.rank = 1, use.sig = TRUE, nboot.sig = 500,
       use.top = FALSE, ntop = 40, ...)
causes.train | character vector of causes, or the name of the cause column in the training data |
---|---|
symps.train | N.train by S matrix |
symps.test | N.test by S matrix |
causes.table | list of causes in the data |
use.rank | logical indicator for whether to use ranks instead of scores |
nboot.rank | number of resamples for the baseline rank comparison. Defaults to 1, which resamples the training data to have a uniform cause distribution of the same size. Setting this to 0 disables bootstrapping of the training dataset. |
use.sig | logical indicator for whether to use only significant tariffs |
nboot.sig | number of resamples for testing significance |
use.top | logical indicator for whether the tariff matrix should be cleaned to keep only the top symptoms |
ntop | number of top tariffs kept for each cause |
... | not used |
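As background for these arguments, the core of the Tariff method described in James et al. (2011) can be sketched in a few lines of base R: a tariff is the endorsement rate of a symptom within a cause, centered by the median and scaled by the interquartile range across causes, and a death's score for a cause is the sum of the tariffs of its endorsed symptoms. This is a simplified illustration on made-up binary data, not the package's implementation. It omits the significance filtering, rank transformation, and resampling controlled by `use.sig`, `use.rank`, and `nboot.rank`, and all object names are hypothetical.

```r
# Toy training data: 6 deaths, 3 binary symptoms (made-up values)
symps <- matrix(
  c(1, 0, 1,
    1, 0, 0,
    0, 1, 1,
    0, 1, 0,
    1, 1, 1,
    0, 0, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("fever", "cough", "rash"))
)
causes <- c("A", "A", "B", "B", "C", "C")

# Endorsement rate of each symptom within each cause (causes x symptoms)
rate <- rowsum(symps, causes) / as.vector(table(causes))

# Tariff: center each symptom by its median across causes, scale by its IQR.
# A symptom with zero IQR would need special handling, omitted in this sketch.
med <- apply(rate, 2, median)
iqr <- apply(rate, 2, IQR)
tariff_mat <- sweep(sweep(rate, 2, med, "-"), 2, iqr, "/")

# Score a new death: sum the tariffs of the symptoms it endorses
new_death <- c(fever = 1, cough = 0, rash = 1)
score <- drop(tariff_mat %*% new_death)

# The assigned cause is the one with the highest score
predicted <- names(which.max(score))
print(score)
print(predicted)
```

In this toy setting the new death is assigned to cause `C`, whose high tariff for `rash` outweighs the positive tariff that cause `A` has for `fever`.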
score | matrix of scores for each cause within each death |
---|---|
causes.train | vector of most likely causes in the training data |
causes.test | vector of most likely causes in the testing data |
csmf | vector of CSMFs |
causes.table | cause list used for output, i.e., the list of causes present in the training data |
use.rank | logical indicator for whether ranks were used instead of scores |
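The relationship between the individual cause assignments and the CSMF can be sketched as a simple tabulation. The example below uses made-up assignments and a hypothetical `causes_table`, and it assumes the CSMF is the fraction of deaths assigned to each cause; it is not the package's code.

```r
# Hypothetical cause list and made-up assignments for four deaths
causes_table <- c("Stroke", "Malaria", "Pneumonia")
assigned <- c("Stroke", "Malaria", "Stroke", "Stroke")

# Tabulate over the full cause list so unassigned causes get a fraction of 0
counts <- table(factor(assigned, levels = causes_table))
csmf <- setNames(as.vector(counts) / length(assigned), causes_table)
print(csmf)
```

Using a factor with explicit levels keeps causes that were never assigned in the output, so the resulting vector always lines up with the cause list.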
Zehang Li, Tyler McCormick, Sam Clark
Maintainer: Zehang Li
James, S. L., Flaxman, A. D., Murray, C. J., & Population Health Metrics Research Consortium. (2011). Performance of the Tariff Method: validation of a simple additive algorithm for analysis of verbal autopsies. Population Health Metrics, 9(1), 1-16.
Serina, P., Riley, I., Stewart, A., James, S. L., Flaxman, A. D., Lozano, R., ... & Ahuja, R. (2015). Improving performance of the Tariff Method for assigning causes of death to verbal autopsies. BMC Medicine, 13(1), 1.
McCormick, T. H., Li, Z. R., Calvert, C., Crampin, A. C., Kahn, K., & Clark, S. J. (2016). Probabilistic cause-of-death assignment using verbal autopsies. To appear, Journal of the American Statistical Association. http://arxiv.org/abs/1411.3042
data("RandomVA3")
test <- RandomVA3[1:200, ]
train <- RandomVA3[201:400, ]
allcauses <- unique(train$cause)
fit <- tariff(causes.train = "cause", symps.train = train,
              symps.test = test, causes.table = allcauses)
correct <- which(fit$causes.test[,2] == test$cause)
accuracy <- length(correct) / dim(test)[1]
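The accuracy calculation in the example reduces to comparing two vectors element by element. A minimal sketch on made-up labels shows the overall rate alongside a cross-tabulation that reveals which causes are confused with each other. The vectors `predicted` and `truth` are hypothetical stand-ins for the assigned and recorded causes.

```r
# Made-up assigned and recorded causes for five deaths
predicted <- c("Stroke", "Malaria", "Stroke", "Pneumonia", "Malaria")
truth     <- c("Stroke", "Malaria", "Pneumonia", "Pneumonia", "Stroke")

# Fraction of deaths whose assigned cause matches the recorded cause
accuracy <- mean(predicted == truth)

# Cross-tabulate on a shared set of levels so the table is square
lv <- sort(unique(c(predicted, truth)))
confusion <- table(truth = factor(truth, levels = lv),
                   predicted = factor(predicted, levels = lv))
print(accuracy)
print(confusion)
```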