Title: | Tools for the Continuous Convolution Trick in Nonparametric Estimation |
---|---|
Description: | Implements the uniform scaled beta distribution and the continuous convolution kernel density estimator. |
Authors: | Thomas Nagler [aut, cre] |
Maintainer: | Thomas Nagler <[email protected]> |
License: | GPL-3 |
Version: | 0.1.2 |
Built: | 2024-12-25 06:46:35 UTC |
Source: | CRAN |
Implements the uniform scaled beta distribution dusb(), a generic function for continuous convolution cont_conv(), and the continuous convolution kernel density estimator cckde().
Thomas Nagler
Nagler, T. (2017). A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457
The continuous convolution kernel density estimator is defined as the classical kernel density estimator based on continuously convoluted data (see cont_conv()). cckde() fits the estimator (including bandwidth selection); dcckde() and predict.cckde() can be used to evaluate the estimator.
    cckde(x, bw = NULL, mult = 1, theta = 0, nu = 5, ...)

    dcckde(x, object)

    ## S3 method for class 'cckde'
    predict(object, newdata, ...)
x |
a matrix or data frame containing the data (or evaluation points). |
bw |
vector of bandwidth parameters; if |
mult |
bandwidth multiplier; either a single positive number or a vector of positive numbers. Each bandwidth parameter is multiplied by the corresponding multiplier. |
theta |
scale parameter of the USB distribution (see, |
nu |
smoothness parameter of the USB distribution (see, |
... |
unused. |
object |
|
newdata |
matrix or data frame containing evaluation points. |
If a variable should be treated as ordered discrete, declare it as ordered(). Factors are expanded into discrete dummy codings.
    # dummy data with discrete variables
    dat <- data.frame(
        F1 = factor(rbinom(10, 4, 0.1), 0:4),
        Z1 = ordered(rbinom(10, 5, 0.5), 0:5),
        Z2 = ordered(rpois(10, 1), 0:10),
        X1 = rnorm(10),
        X2 = rexp(10)
    )

    fit <- cckde(dat)  # fit estimator
    dcckde(dat, fit)   # evaluate density
    predict(fit, dat)  # equivalent
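The idea that the estimator is a classical kernel density estimate computed on continuously convoluted data can be illustrated in base R without the package. The following is a minimal univariate sketch, not the package implementation: it assumes uniform noise on [-0.5, 0.5] and uses stats::density() with its default bandwidth instead of the bandwidth selection performed by cckde(). The variable names are illustrative only.

```r
set.seed(42)

# Count data: a discrete variable taking integer values
z <- rpois(200, lambda = 2)

# Continuous convolution: add noise supported on [-0.5, 0.5]
# (uniform noise is an assumption made for this sketch)
z_cc <- z + runif(length(z), min = -0.5, max = 0.5)

# Classical kernel density estimate on the convoluted data
kde <- density(z_cc, from = min(z) - 1, to = max(z) + 1)

# Evaluate the estimate at the observed integer values; under the
# uniform-noise assumption these can be read as estimates of P(Z = k)
pts <- sort(unique(z))
pmf_hat <- approx(kde$x, kde$y, xout = pts)$y

# Compare with the empirical relative frequencies
freq <- as.numeric(table(factor(z, levels = pts))) / length(z)
round(cbind(k = pts, kde = pmf_hat, empirical = freq), 3)
```

With uniform noise, the density of the convoluted variable at an integer k equals P(Z = k), which is why evaluating the smooth estimate at integer points is meaningful here.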
Applies the continuous convolution trick, i.e. adding continuous noise to all discrete variables. If a variable should be treated as discrete, declare it as ordered() (passed to expand_as_numeric()).
    cont_conv(x, theta = 0, nu = 5, quasi = TRUE)
x |
data; numeric matrix or data frame. |
theta |
scale parameter of the USB distribution (see, |
nu |
smoothness parameter of the USB distribution (see, |
quasi |
logical indicating whether quasi random numbers should be used
( |
The USB distribution (dusb()) is used as the noise distribution.
Discrete variables are assumed to be integer-valued.
A data frame with noise added to each discrete variable (ordered columns).
    # dummy data with discrete variables
    dat <- data.frame(
        F1 = factor(rbinom(10, 4, 0.1), 0:4),
        Z1 = ordered(rbinom(10, 5, 0.5), 0:5),
        Z2 = ordered(rpois(10, 1), 0:10),
        X1 = rnorm(10),
        X2 = rexp(10)
    )

    pairs(dat)
    pairs(expand_as_numeric(dat))  # expanded variables without noise
    pairs(cont_conv(dat))          # continuously convoluted data
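To make the trick itself concrete, here is a minimal base-R sketch of adding continuous noise to columns declared as ordered(). The helper jitter_ordered() is hypothetical and is not part of the package. It uses plain uniform noise and integer level codes, whereas cont_conv() draws USB noise (optionally quasi-random) and may code the levels differently.

```r
set.seed(1)

dat <- data.frame(
  Z1 = ordered(rbinom(10, 5, 0.5), 0:5),  # ordered -> treated as discrete
  X1 = rnorm(10)                          # numeric -> left unchanged
)

# Hypothetical helper (not the package implementation):
# add uniform noise on [-0.5, 0.5] to every ordered column
jitter_ordered <- function(df) {
  out <- lapply(df, function(v) {
    if (is.ordered(v)) {
      k <- as.integer(v)  # integer level codes
      k + runif(length(k), -0.5, 0.5)
    } else {
      v
    }
  })
  as.data.frame(out)
}

dat_cc <- jitter_ordered(dat)

# The integer codes can be recovered from the noisy values by rounding
all(round(dat_cc$Z1) == as.integer(dat$Z1))
```

Because the noise stays within half a unit of each integer, the convoluted variable is continuous while the original discrete value remains identifiable.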
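The uniform scaled beta (USB) distribution describes the distribution of the random variable
$$U_{\theta, \nu} = U + \theta (B - 0.5),$$
where $U$ is a $\mathrm{Uniform}[-0.5, 0.5]$ random variable, $B$ is a $\mathrm{Beta}(\nu, \nu)$ random variable, and $\theta$ and $\nu$ are the scale and smoothness parameters.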
    dusb(x, theta = 0, nu = 5)

    rusb(n, theta = 0, nu = 5, quasi = FALSE)
x |
vector of quantiles. |
theta |
scale parameter of the USB distribution. |
nu |
smoothness parameter of the USB distribution. |
n |
number of observations. |
quasi |
logical indicating whether quasi random numbers
( |
    # plot distribution
    sq <- seq(-0.8, 0.8, by = 0.01)
    plot(sq, dusb(sq), type = "l")
    lines(sq, dusb(sq, theta = 0.25), col = 2)
    lines(sq, dusb(sq, theta = 0.25, nu = 10), col = 3)

    # simulate from the distribution
    x <- rusb(100, theta = 0.3, nu = 0)
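The construction of a USB variable can also be mimicked directly in base R. The sketch below assumes the representation U + theta * (B - 0.5), with U uniform on [-0.5, 0.5] and B following a Beta(nu, nu) distribution, as in the cited paper. rusb_sketch() is a hypothetical helper, uses ordinary pseudo-random numbers, and is not the package's rusb().

```r
set.seed(1)

# Hypothetical helper: simulate U + theta * (B - 0.5)
rusb_sketch <- function(n, theta = 0.25, nu = 5) {
  u <- runif(n, min = -0.5, max = 0.5)   # uniform part
  b <- rbeta(n, shape1 = nu, shape2 = nu) # symmetric beta part
  u + theta * (b - 0.5)
}

x <- rusb_sketch(1000, theta = 0.25, nu = 5)

# Under this construction the support is [-0.5 - theta / 2, 0.5 + theta / 2]
range(x)
hist(x, breaks = 40, freq = FALSE, main = "Simulated USB sample (sketch)")
```

Setting theta = 0 in this sketch removes the beta part, leaving uniform noise on [-0.5, 0.5].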
Turns ordered variables into integers and expands factors as binary dummy codes. cont_conv() additionally adds noise to discrete variables, but this is only useful for estimation. cc_prepare() can be used to evaluate an already fitted estimate.
    expand_as_numeric(x)
x |
a vector or data frame with numeric, ordered, or factor columns. |
A numeric matrix containing the expanded variables. It has the additional type expanded_as_numeric, and attr(, "i_disc") contains the indices of discrete variables.
    # dummy data with discrete variables
    dat <- data.frame(
        F1 = factor(rbinom(100, 4, 0.1), 0:4),
        Z1 = as.ordered(rbinom(100, 5, 0.5)),
        Z2 = as.ordered(rpois(100, 1)),
        X1 = rnorm(100),
        X2 = rexp(100)
    )

    pairs(dat)
    pairs(expand_as_numeric(dat))  # expanded variables without noise
    pairs(cont_conv(dat))          # continuously convoluted data
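The expansion step described above (ordered variables to integers, factors to binary dummy codes) can be sketched in base R as follows. expand_sketch() is a hypothetical helper written for illustration. It assumes integer level codes for ordered columns and treatment-style dummies (one 0/1 column per non-reference level) for factors; the exact coding and attributes produced by expand_as_numeric() may differ.

```r
# Hypothetical helper, not part of the package
expand_sketch <- function(df) {
  cols <- lapply(names(df), function(nm) {
    v <- df[[nm]]
    if (is.ordered(v)) {
      # ordered -> a single column of integer level codes
      out <- matrix(as.integer(v), ncol = 1, dimnames = list(NULL, nm))
    } else if (is.factor(v)) {
      # unordered factor -> 0/1 dummy columns, first level as reference
      out <- model.matrix(~ v)[, -1, drop = FALSE]
      colnames(out) <- paste0(nm, levels(v)[-1])
    } else {
      # numeric -> kept as is
      out <- matrix(as.numeric(v), ncol = 1, dimnames = list(NULL, nm))
    }
    out
  })
  do.call(cbind, cols)
}

dat_small <- data.frame(
  F1 = factor(c("a", "b", "c", "a")),
  Z1 = ordered(c(1, 3, 2, 1)),
  X1 = c(0.1, 0.2, 0.3, 0.4)
)

m <- expand_sketch(dat_small)
colnames(m)
```

Note that the is.ordered() check comes first, because ordered factors also satisfy is.factor().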
Expands each element according to the factor expansions of columns in expand_as_numeric().
    expand_names(x)
x |
as in |
A vector of size ncol(expand_as_numeric(x)).
Expands each element according to the factor expansions of columns in expand_as_numeric().
    expand_vec(y, x)
y |
a vector of length 1 or |
x |
as in |
A vector of size ncol(expand_as_numeric(x)).
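As an illustration of expanding a per-variable vector to match the expanded columns, the following base-R sketch repeats each element once per column that its variable would occupy after expansion. expand_vec_sketch() is a hypothetical helper. It assumes that y has length 1 or one entry per column of the data, and that an unordered factor with k levels expands to k - 1 dummy columns; neither assumption is confirmed by the text above.

```r
# Hypothetical helper, not part of the package
expand_vec_sketch <- function(y, df) {
  # number of expanded columns per original column (assumed coding)
  n_cols <- vapply(df, function(v) {
    if (is.factor(v) && !is.ordered(v)) nlevels(v) - 1L else 1L
  }, integer(1))
  # recycle a scalar to one entry per original column
  if (length(y) == 1) y <- rep(y, ncol(df))
  stopifnot(length(y) == ncol(df))
  unname(rep(y, times = n_cols))
}

dat_small <- data.frame(
  F1 = factor(c("a", "b", "c", "a")),  # 3 levels -> 2 dummy columns
  Z1 = ordered(c(1, 3, 2, 1)),         # 1 column
  X1 = c(0.1, 0.2, 0.3, 0.4)           # 1 column
)

# e.g. one bandwidth multiplier per original variable
expand_vec_sketch(c(0.5, 1, 2), dat_small)
expand_vec_sketch(1, dat_small)
```

The same repetition pattern applied to names(df) would give one name per expanded column, which is the role expand_names() plays.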