Title: Fitting Deep Distributional Regression
Description: Allows for the specification of semi-structured deep distributional regression models which are fitted in a neural network as proposed by Ruegamer et al. (2023) <doi:10.18637/jss.v105.i02>. Predictors can be modeled using structured (penalized) linear effects, structured non-linear effects, or an unstructured deep network model.
Authors: David Ruegamer [aut, cre], Florian Pfisterer [ctb], Philipp Baumann [ctb], Chris Kolb [ctb], Lucas Kook [ctb]
Maintainer: David Ruegamer <[email protected]>
License: GPL-3
Version: 1.0.0
Built: 2024-11-15 06:45:20 UTC
Source: CRAN
If you encounter problems installing the required Python modules, please make sure that a correct Python version is configured using py_discover_config, and change the Python version if required. Internally uses keras::install_keras.
check_and_install(force = FALSE)
force: if TRUE, forces the installation
Function that checks if a Python environment is available and contains TensorFlow. If not, the recommended version is installed.
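A minimal usage sketch (note that the call may trigger a download of Python packages):

library(deepregression)

# check for a Python environment with TensorFlow and
# install the recommended versions if they are missing
check_and_install()

# force a fresh installation even if an environment exists
check_and_install(force = TRUE)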
Method for extracting ensemble coefficient estimates
## S3 method for class 'drEnsemble'
coef(object, which_param = 1, type = NULL, ...)
object: object of class drEnsemble
which_param: integer, indicating for which distribution parameter coefficients should be returned (default is first parameter)
type: either NULL (all types of coefficients are returned), "linear" for linear coefficients or "smooth" for coefficients of smooth terms
...: further arguments supplied to coef.deepregression
list of coefficient estimates of all ensemble members
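A short sketch, assuming mod is a fitted deepregression model as in the example of deepregression:

# fit a small ensemble and extract the members' linear coefficients
ens <- ensemble(mod, n_ensemble = 3, epochs = 10)
coef(ens, which_param = 1, type = "linear")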
Function to combine two penalties
combine_penalties(penalties, dims)
penalties: a list of penalties
dims: dimensions of the parameters to penalize
a TensorFlow penalty combining the two penalties
Function to create (custom) family
create_family(tfd_dist, trafo_list, output_dim = 1L)
tfd_dist: a tensorflow probability distribution
trafo_list: list of transformations h for each parameter (e.g., exp for a scale parameter)
output_dim: integer defining the size of the response
a function that can be used by tfp$layers$DistributionLambda to create a new distributional layer
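A sketch of a hand-built family mirroring the built-in "normal" family; the identity/exp transformations are illustrative choices:

library(deepregression)
library(tensorflow)
library(tfprobability)

my_family <- create_family(
  tfd_dist = tfd_normal,
  trafo_list = list(
    function(x) x,          # location: identity
    function(x) tf$exp(x)   # scale: exp
  )
)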
Function to create mgcv-type penalty
create_penalty(evaluated_gam_term, df, controls, Z = NULL)
evaluated_gam_term: a list resulting from a smoothConstruct call
df: integer; specified degrees-of-freedom for the gam term
controls: list; further arguments defining the smooth
Z: matrix; matrix for constraint(s)
a list with penalty parameter and penalty matrix
Generic cv function
cv(x, ...)
x: model to do cv on
...: further arguments passed to the class-specific function
Fitting Semi-Structured Deep Distributional Regression
deepregression(
  y,
  list_of_formulas,
  list_of_deep_models = NULL,
  family = "normal",
  data,
  tf_seed = as.integer(1991 - 5 - 4),
  return_prepoc = FALSE,
  subnetwork_builder = subnetwork_init,
  model_builder = keras_dr,
  fitting_function = utils::getFromNamespace("fit.keras.engine.training.Model", "keras"),
  additional_processors = list(),
  penalty_options = penalty_control(),
  orthog_options = orthog_control(),
  weight_options = weight_control(),
  formula_options = form_control(),
  output_dim = 1L,
  verbose = FALSE,
  ...
)
y: response variable
list_of_formulas: a named list of right hand side formulas, one for each parameter of the distribution specified in family
list_of_deep_models: a named list of functions specifying a keras model. See the examples for more details.
family: a character specifying the distribution. For information on possible distributions and parameters, see make_tfd_dist.
data: data.frame or named list with input features
tf_seed: a seed for TensorFlow (only works with R version >= 2.2.0)
return_prepoc: logical; if TRUE only the pre-processed data and layers are returned (default FALSE)
subnetwork_builder: function to build each subnetwork (network for each distribution parameter; per default subnetwork_init)
model_builder: function to build the model based on additive predictors (per default keras_dr)
fitting_function: function to fit the instantiated model when calling fit
additional_processors: a named list with additional processors to convert the formula(s). Can have an attribute "controls" to pass additional controls.
penalty_options: options for smoothing and penalty terms defined by penalty_control
orthog_options: options for the orthogonalization defined by orthog_control
weight_options: options for layer weights defined by weight_control
formula_options: options for formula parsing (mainly used to make calculations more efficient)
output_dim: dimension of the output, per default 1L
verbose: logical; whether to print progress of model initialization to console
...: further arguments passed to the model_builder function
Ruegamer, D. et al. (2023): deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression. doi:10.18637/jss.v105.i02.
library(deepregression)

n <- 1000
data <- data.frame(matrix(rnorm(4 * n), c(n, 4)))
colnames(data) <- c("x1", "x2", "x3", "xa")
formula <- ~ 1 + deep_model(x1, x2, x3) + s(xa) + x1

deep_model <- function(x) x %>%
  layer_dense(units = 32, activation = "relu", use_bias = FALSE) %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 8, activation = "relu") %>%
  layer_dense(units = 1, activation = "linear")

y <- rnorm(n) + data$xa^2 + data$x1

mod <- deepregression(
  list_of_formulas = list(loc = formula, scale = ~ 1),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model)
)

if (!is.null(mod)) {
  # train for more than 10 epochs to get a better model
  mod %>% fit(epochs = 10, early_stopping = TRUE)
  mod %>% fitted() %>% head()
  cvres <- mod %>% cv()
  mod %>% get_partial_effect(name = "s(xa)")
  mod %>% coef()
  mod %>% plot()
}

mod <- deepregression(
  list_of_formulas = list(
    loc = ~ 1 + s(xa) + x1,
    scale = ~ 1,
    dummy = ~ -1 + deep_model(x1, x2, x3) %OZ% 1
  ),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model),
  mapping = list(1, 2, 1:2)
)
Function to define output distribution based on dist_fun
distfun_to_dist(dist_fun, preds)
dist_fun: a distribution function as defined by make_tfd_dist
preds: tensors with predictions
a symbolic tfp distribution
Generic deep ensemble function
ensemble(x, ...)
x: model to ensemble
...: further arguments passed to the class-specific function
Ensembling deepregression models
## S3 method for class 'deepregression'
ensemble(
  x,
  n_ensemble = 5,
  reinitialize = TRUE,
  mylapply = lapply,
  verbose = FALSE,
  patience = 20,
  plot = TRUE,
  print_members = TRUE,
  stop_if_nan = TRUE,
  save_weights = TRUE,
  callbacks = list(),
  save_fun = NULL,
  seed = seq_len(n_ensemble),
  ...
)
x: object of class "deepregression" to ensemble
n_ensemble: numeric; number of ensemble members to fit
reinitialize: logical; if TRUE (default), model weights are re-initialized before fitting each ensemble member
mylapply: lapply function to be used; defaults to lapply
verbose: whether to print training in each fold
patience: patience (number of epochs) for early stopping
plot: whether to plot the resulting losses in each fold
print_members: logical; print results for each member
stop_if_nan: logical; whether to stop the ensembling if NaN values occur
save_weights: whether to save final weights of each ensemble member; defaults to TRUE
callbacks: a list of callbacks used for fitting
save_fun: function applied to the model in each fold to be stored in the final result
seed: seed for reproducibility
...: further arguments passed to the fit function
object of class "drEnsemble", containing the original "deepregression" model together with a list of ensembling results (training history and, if save_weights is TRUE, the trained weights of each ensemble member)
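A sketch, assuming mod is a compiled deepregression model (e.g., from the example in deepregression):

ens <- ensemble(mod, n_ensemble = 3, epochs = 50, print_members = FALSE)
str(fitted(ens), max.level = 1)      # fitted values per ensemble member
d <- get_ensemble_distribution(ens)  # mixture distribution over members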
Extract the smooth term from a deepregression term specification
extract_pure_gam_part(term, remove_other_options = TRUE)
term: term specified in a formula
remove_other_options: logical; whether to remove other options within the smooth term
pure gam part of term
Convenience function to extract penalty matrix and value
extract_S(x)
x: evaluated smooth term object
Formula helpers
extractval(term, name, default_for_missing = FALSE, default = NULL)
extractlen(term, data)
form2text(form)
term: formula term
name: character; the value to extract
default_for_missing: logical; if TRUE, returns default if the value is missing
default: value returned when missing
data: a data.frame or list
form: formula that is converted to a character string
the value used for name
extractval("s(a, la = 2)", "la")
extractval("s(a, la = 2)", "la")
Extract variable from term
extractvar(term, allow_ia = FALSE)
term: term specified in formula
allow_ia: logical; whether to allow interactions of terms using the ":" notation
variable as string
Character-tfd mapping function
family_to_tfd(family)
family: character defining the distribution
a tfp distribution
Character-to-transformation mapping function
family_to_trafo(family, add_const = 1e-08)
family: character defining the distribution
add_const: see make_tfd_dist
a list of transformations for each distribution parameter
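A minimal sketch inspecting both mappings for the built-in normal family:

library(deepregression)

family_to_tfd("normal")    # the tfd distribution constructor
family_to_trafo("normal")  # one transformation per distribution parameter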
Method for extracting the fitted values of an ensemble
## S3 method for class 'drEnsemble'
fitted(object, apply_fun = tfd_mean, ...)
object: object of class "drEnsemble"
apply_fun: function applied to the fitted distribution, per default tfd_mean
...: arguments passed to the predict function
list of fitted values for each ensemble member
Options for formula parsing
form_control(precalculate_gamparts = TRUE, check_form = TRUE)
precalculate_gamparts: logical; if TRUE (default), additive parts are pre-calculated and can later be used more efficiently. Set to FALSE only if no smooth effects are in the formula(s) and a formula is very large, so that extracting all terms takes long or might fail.
check_form: logical; if TRUE (default), the formula is checked in deepregression
Returns a list with options
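A sketch of passing these options to deepregression; y, data and the purely linear formula are placeholders:

mod <- deepregression(
  y = y, data = data,
  list_of_formulas = list(loc = ~ 1 + x1 + x2, scale = ~ 1),
  formula_options = form_control(precalculate_gamparts = FALSE)
)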
Function to transform a distribution layer output into a loss function
from_dist_to_loss(
  family,
  ind_fun = function(x) tfd_independent(x),
  weights = NULL
)
family: see deepregression
ind_fun: function applied to the model output before calculating the log-likelihood; per default, independence is assumed by applying tfd_independent
weights: sample weights
loss function
Define Predictor of a Deep Distributional Regression Model
from_preds_to_dist(
  list_pred_param,
  family = NULL,
  output_dim = 1L,
  mapping = NULL,
  from_family_to_distfun = make_tfd_dist,
  from_distfun_to_dist = distfun_to_dist,
  add_layer_shared_pred = function(x, units) layer_dense(x, units = units, use_bias = FALSE),
  trafo_list = NULL
)
list_pred_param: list of input-output(-lists) generated from subnetwork_init
family: see deepregression
output_dim: dimension of the output
mapping: a list of integers; the i-th list item defines which element(s) of list_pred_param are used for the i-th distribution parameter
from_family_to_distfun: function to create a dist_fun from the given family (per default make_tfd_dist)
from_distfun_to_dist: function creating a tfp distribution based on the prediction tensors and dist_fun (per default distfun_to_dist)
add_layer_shared_pred: layer to extend shared layers defined in mapping
trafo_list: a list of transformation functions to convert the scale of the additive predictors to the respective distribution parameters
a list with input tensors and output tensors that can be passed to, e.g., keras_model
Used by gam_processor
gam_plot_data(pp, weights, grid_length = 40, pe_fun = pe_gen)
pp: processed term
weights: layer weights
grid_length: length of the grid for evaluating the basis
pe_fun: function used to generate partial effects
Function to return the fitted distribution
get_distribution(x, data = NULL, force_float = FALSE)
x: the fitted deepregression object
data: an optional data set
force_float: forces conversion into float tensors
Obtain the conditional ensemble distribution
get_ensemble_distribution(object, data = NULL, topK = NULL, ...)
object: object of class "drEnsemble"
data: data for which to return the fitted distribution
topK: not implemented yet
...: further arguments, currently ignored
tfd_distribution of the ensemble, i.e., a mixture of the ensemble members' predicted distributions conditional on data
Extract gam part from wrapped term
get_gam_part(term, wrapper = "vc")
term: character; gam model term
wrapper: character; function name that is wrapped around the gam part
Extract property of gamdata
get_gamdata(
  term,
  param_nr,
  gamdata,
  what = c("data_trafo", "predict_trafo", "input_dim", "partial_effect", "sp_and_S", "df")
)
term: term in formula
param_nr: integer; number of the distribution parameter
gamdata: list as returned by precalc_gam
what: string specifying what to return
property of the gamdata object as defined by what
Extract number in matching table of reduced gam term
get_gamdata_reduced_nr(term, param_nr, gamdata)
term: term in formula
param_nr: integer; number of the distribution parameter
gamdata: list as returned by precalc_gam
integer with number of gam term in matching table
Function to return layer given model and name
get_layer_by_opname(mod, name, partial_match = FALSE)
mod: deepregression model
name: character
partial_match: logical; whether to also check for a partial match
Function to return layer number given model and name
get_layernr_by_opname(mod, name, partial_match = FALSE)
mod: deepregression model
name: character
partial_match: logical; whether to also check for a partial match
Function to return layer numbers with trainable weights
get_layernr_trainable(mod, logic = FALSE)
mod: deepregression model
logic: logical; if TRUE, return a logical vector; if FALSE (default), return indices
Extract term names from the parsed formula content
get_names_pfc(pfc)
pfc: parsed formula content
vector of term names
Return partial effect of one smooth term
get_partial_effect(
  object,
  names = NULL,
  return_matrix = FALSE,
  which_param = 1,
  newdata = NULL,
  ...
)
object: deepregression object
names: string; for partial match with smooth term
return_matrix: logical; whether to return the design matrix instead of the partial effect (default FALSE)
which_param: integer; the distribution parameter the partial effect (names) belongs to
newdata: data.frame; new data (optional)
...: arguments passed to get_weight_by_name
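A sketch, assuming mod was fitted with a smooth term s(xa) as in the example of deepregression:

pe <- get_partial_effect(mod, names = "s(xa)")
plot(data$xa, pe)  # partial effect of xa on the first parameter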
Extract processor name from term
get_processor_name(term)
term: term in formula
processor name as string
Extract terms defined by specials in formula
get_special(term, specials, simplify = FALSE)
term: term in formula
specials: string(s); special name(s)
simplify: logical; shortcut for returning only the name of the special in the term
specials in formula
Function to subset parsed formulas
get_type_pfc(pfc, type = NULL)
pfc: list of parsed formulas
type: either NULL (all types of coefficients are returned), "linear" for linear coefficients or "smooth" for coefficients of smooth terms
Function to retrieve the weights of a structured layer
get_weight_by_name(mod, name, param_nr = 1, postfixes = "")
mod: fitted deepregression object
name: name of the partial effect
param_nr: distribution parameter number
postfixes: character (vector) appended to the layer name
weight matrix
Function to return weight given model and name
get_weight_by_opname(mod, name, partial_match = FALSE)
mod: deepregression model
name: character
partial_match: logical; whether to also check for a partial match
Function to define smoothness and call mgcv's smooth constructor
handle_gam_term(object, data, controls)
object: character defining the model term
data: data.frame or list
controls: controls for penalization
constructed smooth term
Compile a Deep Distributional Regression Model
keras_dr(
  list_pred_param,
  weights = NULL,
  optimizer = tf$keras$optimizers$Adam(),
  model_fun = keras_model,
  monitor_metrics = list(),
  from_preds_to_output = from_preds_to_dist,
  loss = from_dist_to_loss(family = list(...)$family, weights = weights),
  additional_penalty = NULL,
  ...
)
list_pred_param: list of input-output(-lists) generated from subnetwork_init
weights: vector of positive values; optional (default = 1 for all observations)
optimizer: optimizer used; per default Adam
model_fun: which function to use for model building (per default keras_model)
monitor_metrics: further metrics to monitor
from_preds_to_output: function taking the list_pred_param outputs and transforming them into a single network output
loss: the model's loss function; per default evaluated based on the arguments family and weights
additional_penalty: a penalty that is added to the negative log-likelihood; must be a function of model$trainable_weights with suitable subsetting
...: arguments passed to from_preds_to_output
a list with input tensors and output tensors that can be passed to, e.g., keras_model
set.seed(24)
n <- 500
x <- runif(n) %>% as.matrix()
z <- runif(n) %>% as.matrix()
y <- x - z
data <- data.frame(x = x, z = z, y = y)

# change loss to mse and adapt from_preds_to_output
# to work only on the first output column
mod <- deepregression(
  y = y,
  data = data,
  list_of_formulas = list(loc = ~ 1 + x + z, scale = ~ 1),
  list_of_deep_models = NULL,
  family = "normal",
  from_preds_to_output = function(x, ...) x[[1]],
  loss = "mse"
)
Convenience layer function
layer_add_identity(inputs)
layer_concatenate_identity(inputs)
inputs: list of tensors
Convenience layers to work with lists of inputs, where the list can also have length one.
tensor
Function that creates layer for each processor
layer_generator(
  term,
  output_dim,
  param_nr,
  controls,
  layer_class = tf$keras$layers$Dense,
  without_layer = tf$identity,
  name = makelayername(term, param_nr),
  further_layer_args = NULL,
  layer_args_names = NULL,
  units = as.integer(output_dim),
  ...
)

int_processor(term, data, output_dim, param_nr, controls)
lin_processor(term, data, output_dim, param_nr, controls)
gam_processor(term, data, output_dim, param_nr, controls)
term: character; term in the formula
output_dim: integer; number of units in the layer
param_nr: integer; identifier for models with more than one additive predictor
controls: list; control arguments which allow to pass further information
layer_class: a tf or keras layer function
without_layer: function to be used as layer if no layer is applied (per default tf$identity)
name: character; name of the layer (per default generated by makelayername(term, param_nr))
further_layer_args: named list; further arguments passed to the layer
layer_args_names: character vector; if NULL, default layer args will be used. Needs to be set for layers that do not provide the arguments of a default Dense layer.
units: integer; number of units for the layer
...: other keras layer parameters
data: data frame; the data used in processors
a basic processor list structure
Sparse 2D Convolutional layer
layer_sparse_conv_2d(filters, kernel_size, lam = NULL, depth = 2, ...)
filters: number of filters
kernel_size: size of the convolutional filter
lam: regularization strength
depth: depth of the weight factorization
...: arguments passed to the TensorFlow layer
layer object
Function to define spline as TensorFlow layer
layer_spline(
  units = 1L,
  P,
  name,
  trainable = TRUE,
  kernel_initializer = "glorot_uniform"
)
units: integer; number of output units
P: matrix; penalty matrix
name: string; the layer's name
trainable: logical; whether the layer is trainable
kernel_initializer: initializer for the basis coefficients
TensorFlow layer
Function to return the log_score
log_score(
  x,
  data = NULL,
  this_y = NULL,
  ind_fun = function(x) tfd_independent(x),
  convert_fun = as.matrix,
  summary_fun = function(x) x
)
x: the fitted deepregression object
data: an optional data set
this_y: new y for optional data
ind_fun: function indicating the dependency; per default (iid assumption) tfd_independent
convert_fun: function that converts the Tensor; per default as.matrix
summary_fun: function summarizing the output; per default the identity
Function to loop through parsed formulas and apply data trafo
loop_through_pfc_and_call_trafo(pfc, newdata = NULL)
pfc: list of processor transformed formulas
newdata: list in the same format as the original data
list of matrices or arrays
Generate folds for CV out of one hot encoded matrix
make_folds(mat, val_train = 0, val_test = 1)
mat: matrix with columns corresponding to folds and entries corresponding to a one-hot encoding
val_train: the value corresponding to train, per default 0
val_test: the value corresponding to test, per default 1
val_train and val_test can both be a set of values.
Creates a generator for training
make_generator(
  input_x,
  input_y = NULL,
  batch_size,
  sizes,
  shuffle = TRUE,
  seed = 42L
)
input_x: list of matrices
input_y: list of matrices
batch_size: integer
sizes: sizes of the image including colour channel
shuffle: logical for shuffling data
seed: seed for shuffling in generators
generator for all x and y
Creates a Python Class that internally iterates over the data.
make_generator_from_matrix(
  x,
  y = NULL,
  generator = image_data_generator(),
  batch_size = 32L,
  shuffle = TRUE,
  seed = 1L
)
x: matrix
y: vector
generator: generator as e.g. obtained from 'keras::image_data_generator'. Used for consistent train-test splits.
batch_size: integer
shuffle: logical; should data be shuffled?
seed: integer; seed for shuffling data
Families for deepregression
make_tfd_dist(family, add_const = 1e-08, output_dim = 1L, trafo_list = NULL)
family: character vector
add_const: small positive constant to stabilize calculations
output_dim: number of output dimensions of the response (larger than 1 for the multivariate case)
trafo_list: list of transformations for each distribution parameter. Per default the transformation listed in the details is applied.
To specify a custom distribution, define a function as follows

function(x) do.call(your_tfd_dist,
                    lapply(1:ncol(x)[[1]],
                           function(i) your_trafo_list_on_inputs[[i]](
                             x[, i, drop = FALSE])))

and pass it to deepregression via the dist_fun argument.
Currently the following distributions are supported, with parameters (and corresponding inverse link function in brackets):
"normal": normal distribution with location (identity), scale (exp)
"bernoulli": bernoulli distribution with logits (identity)
"bernoulli_prob": bernoulli distribution with probabilities (sigmoid)
"beta": beta with concentration 1 = alpha (exp) and concentration 0 = beta (exp)
"betar": beta with mean (sigmoid) and scale (sigmoid)
"cauchy": location (identity), scale (exp)
"chi2": cauchy with df (exp)
"chi": cauchy with df (exp)
"exponential": exponential with lambda (exp)
"gamma": gamma with concentration (exp) and rate (exp)
"gammar": gamma with location (exp) and scale (exp), following
gamlss.dist::GA
, which implies that the expectation is the location,
and the variance of the distribution is the location^2 scale^2
"gumbel": gumbel with location (identity), scale (exp)
"half_cauchy": half cauchy with location (identity), scale (exp)
"half_normal": half normal with scale (exp)
"horseshoe": horseshoe with scale (exp)
"inverse_gamma": inverse gamma with concentation (exp) and rate (exp)
"inverse_gamma_ls": inverse gamma with location (exp) and variance (1/exp)
"inverse_gaussian": inverse Gaussian with location (exp) and concentation (exp)
"laplace": Laplace with location (identity) and scale (exp)
"log_normal": Log-normal with location (identity) and scale (exp) of underlying normal distribution
"logistic": logistic with location (identity) and scale (exp)
"negbinom": neg. binomial with count (exp) and prob (sigmoid)
"negbinom_ls": neg. binomail with mean (exp) and clutter factor (exp)
"pareto": Pareto with concentration (exp) and scale (1/exp)
"pareto_ls": Pareto location scale version with mean (exp) and scale (exp), which corresponds to a Pareto distribution with parameters scale = mean and concentration = 1/sigma, where sigma is the scale in the pareto_ls version
"poisson": poisson with rate (exp)
"poisson_lograte": poisson with lograte (identity))
"student_t": Student's t with df (exp)
"student_t_ls": Student's t with df (exp), location (identity) and scale (exp)
"uniform": uniform with upper and lower (both identity)
"zinb": Zero-inflated negative binomial with mean (exp), variance (exp) and prob (sigmoid)
"zip": Zero-inflated poisson distribution with mean (exp) and prob (sigmoid)
Convenience layer function
makeInputs(pp, param_nr)
pp: processed predictors
param_nr: integer for the parameter
input tensors with appropriate names
Function that takes term and create layer name
makelayername(term, param_nr, truncate = 60)
term: term in formula
param_nr: integer; defining the number of the distribution's parameter
truncate: integer; value from which on names are truncated
name (string) for layer
Function to define an optimizer combining multiple optimizers
multioptimizer(optimizers_and_layers)
optimizers_and_layers: a list of optimizers and layers
an optimizer
Returns the parameter names for a given family
names_families(family)
family: character specifying the family as defined by make_tfd_dist
vector of parameter names
Options for orthogonalization
orthog_control(
  split_fun = split_model,
  orthog_type = c("tf", "manual"),
  orthogonalize = options()$orthogonalize,
  identify_intercept = options()$identify_intercept,
  deep_top = NULL,
  orthog_fun = NULL,
  deactivate_oz_at_test = TRUE
)
split_fun: a function separating the deep neural network in two parts so that the orthogonalization can be applied to the first part before applying the second network part; per default, the function split_model is used
orthog_type: one of two options, "tf" (default) or "manual", determining how the orthogonalization is computed
orthogonalize: logical; whether to apply orthogonalization (default taken from options()$orthogonalize)
identify_intercept: whether to orthogonalize the deep network w.r.t. the intercept to make the intercept identifiable (default taken from options()$identify_intercept)
deep_top: function; optional function to put on top of the deep network instead of splitting the function using split_fun
orthog_fun: function; for custom orthogonalization; if NULL, the function implied by orthog_type is used
deactivate_oz_at_test: logical; whether to deactivate the orthogonalization cell at test time when using manual orthogonalization
Returns a list with options
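A sketch of deactivating the orthogonalization; y, data and deep_model are as in the example of deepregression:

mod <- deepregression(
  y = y, data = data,
  list_of_formulas = list(loc = ~ 1 + x1 + deep_model(x1, x2, x3),
                          scale = ~ 1),
  list_of_deep_models = list(deep_model = deep_model),
  orthog_options = orthog_control(orthogonalize = FALSE)
)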
Function to compute adjusted penalty when orthogonalizing
orthog_P(P, Z)
P: matrix; original penalty matrix
Z: matrix; constraint matrix
adjusted penalty matrix
Orthogonalize a Semi-Structured Model Post-hoc
orthog_post_fitting(mod, name_penult, param_nr = 1)
mod: deepregression model
name_penult: character; name of the penultimate layer of the deep part
param_nr: integer; number of the parameter to be returned
a deepregression object with weights frozen and the deep part specified by name_penult orthogonalized
Orthogonalize structured term by another matrix
orthog_structured_smooths_Z(S, L)
S: matrix; matrix to orthogonalize
L: matrix; matrix which defines the projection and its orthogonal complement, into which S is projected
constraint matrix
Options for penalty setup in the pre-processing
penalty_control(
  defaultSmoothing = NULL,
  df = 10,
  null_space_penalty = FALSE,
  absorb_cons = FALSE,
  anisotropic = TRUE,
  zero_constraint_for_smooths = TRUE,
  no_linear_trend_for_smooths = FALSE,
  hat1 = FALSE,
  sp_scale = function(x) ifelse(is.list(x) | is.data.frame(x), 1/NROW(x[[1]]), 1/NROW(x))
)
defaultSmoothing: function applied to all s-terms; per default (NULL) the minimum df of all possible terms is used. Must be a function of the smooth term from mgcv's smoothCon and an argument df.
df: degrees of freedom for all non-linear structural terms (default = 10); either one common value or a list of the same length as the number of parameters; if different df values need to be assigned to different smooth terms, use df as an argument for the respective term in the formula
null_space_penalty: logical value; if TRUE, the null space will also be penalized for smooth effects. Per default, this is equal to the value given in absorb_cons.
absorb_cons: logical; adds identifiability constraint to the basis. See mgcv::smoothCon.
anisotropic: whether or not to use anisotropic smoothing (default is TRUE)
zero_constraint_for_smooths: logical; the same as absorb_cons, but done explicitly. If TRUE, a constraint is put on each smooth to have zero mean. Can be a vector of logicals, one entry per distribution parameter.
no_linear_trend_for_smooths: logical; similar to zero_constraint_for_smooths, but removing the linear trend from each smooth
hat1: logical; if TRUE, the smoothing parameter is defined by the trace of the hat matrix sum(diag(H)), else sum(diag(2*H-HH))
sp_scale: function of the response; for scaling the penalty (1/n per default)
Returns a list with options
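A sketch using a common df for all smooths plus a null space penalty; y and data are as in the example of deepregression:

mod <- deepregression(
  y = y, data = data,
  list_of_formulas = list(loc = ~ 1 + s(xa), scale = ~ 1),
  penalty_options = penalty_control(df = 5, null_space_penalty = TRUE)
)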
Plot CV results from deepregression
plot_cv(x, what = c("loss", "weight"), ...)
x: drCV object as returned by cv
what: character indicating what to plot (currently supported: 'loss' or 'weight')
...: further arguments passed to the plotting function
Generic functions for deepregression models:
Predict based on a deepregression object
Function to extract the fitted distribution
Fit a deepregression model (pendant to fit for keras)
Extract layer weights / coefficients from the model
Print function for a deepregression model
Cross-validation for deepregression objects
Mean of the model fit
Standard deviation of the fit distribution
Calculate the distribution quantiles
## S3 method for class 'deepregression'
plot(
  x,
  which = NULL,
  which_param = 1,
  only_data = FALSE,
  grid_length = 40,
  main_multiple = NULL,
  type = "b",
  get_weight_fun = get_weight_by_name,
  ...
)

## S3 method for class 'deepregression'
predict(
  object,
  newdata = NULL,
  batch_size = NULL,
  apply_fun = tfd_mean,
  convert_fun = as.matrix,
  ...
)

## S3 method for class 'deepregression'
fitted(object, apply_fun = tfd_mean, ...)

## S3 method for class 'deepregression'
fit(
  object,
  batch_size = 32,
  epochs = 10,
  early_stopping = FALSE,
  early_stopping_metric = "val_loss",
  verbose = TRUE,
  view_metrics = FALSE,
  patience = 20,
  save_weights = FALSE,
  validation_data = NULL,
  validation_split = ifelse(is.null(validation_data), 0.1, 0),
  callbacks = list(),
  convertfun = function(x) tf$constant(x, dtype = "float32"),
  ...
)

## S3 method for class 'deepregression'
coef(object, which_param = 1, type = NULL, ...)

## S3 method for class 'deepregression'
print(x, ...)

## S3 method for class 'deepregression'
cv(
  x,
  verbose = FALSE,
  patience = 20,
  plot = TRUE,
  print_folds = TRUE,
  cv_folds = 5,
  stop_if_nan = TRUE,
  mylapply = lapply,
  save_weights = FALSE,
  callbacks = list(),
  save_fun = NULL,
  ...
)

## S3 method for class 'deepregression'
mean(x, data = NULL, ...)

## S3 method for class 'deepregression'
stddev(x, data = NULL, ...)

## S3 method for class 'deepregression'
quant(x, data = NULL, probs, ...)
x: a deepregression object
which: character vector or number(s) identifying the effect to plot; default plots all effects
which_param: integer, indicating for which distribution parameter coefficients should be returned (default is first parameter)
only_data: logical, if TRUE, only the data for plotting is returned
grid_length: the length of an equidistant grid at which a two-dimensional function is evaluated for plotting
main_multiple: vector of strings; plot main titles if multiple plots are selected
type: either NULL (all types of coefficients are returned), "linear" for linear coefficients or "smooth" for coefficients of smooth terms
get_weight_fun: function to extract the weight from the model given x, a name and param_nr
...: arguments passed to the predict function
object: a deepregression model
newdata: optional new data, either data.frame or list
batch_size: integer, the batch size used for mini-batch training
apply_fun: function applied to the fitted distribution, per default tfd_mean
convert_fun: how should the resulting tensor be converted, per default as.matrix
epochs: integer, the number of epochs to fit the model
early_stopping: logical, whether early stopping should be used
early_stopping_metric: character, the metric based on which early stopping is triggered (default: "val_loss")
verbose: whether to print training in each fold
view_metrics: logical, whether to trigger the Viewer in RStudio / Browser
patience: patience (number of epochs) for early stopping
save_weights: logical, whether to save weights in each epoch
validation_data: optional specified validation data
validation_split: float in [0,1] defining the amount of data used for validation
callbacks: a list of callbacks used for fitting
convertfun: function to convert R objects into Tensor objects
plot: whether to plot the resulting losses in each fold
print_folds: whether to print the current fold
cv_folds: an integer; can also be a list of lists with train and test data sets per fold
stop_if_nan: logical; whether to stop CV if NaN values occur
mylapply: lapply function to be used; defaults to lapply
save_fun: function applied to the model in each fold to be stored in the final result
data: either NULL or a new data set
probs: the quantile value(s)
Returns an object drCV, a list with one element per fold, containing the model fit and the weighthistory.
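A sketch of cross-validating a model and extracting the suggested stopping iteration; mod is as in the example of deepregression:

cvres <- cv(mod, cv_folds = 3, epochs = 100)
plot_cv(cvres)
stop_iter_cv_result(cvres)  # fold-aggregated best epoch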
Pre-calculate all gam parts from the list of formulas
precalc_gam(lof, data, controls)
lof: list of formulas
data: the data list
controls: controls from deepregression
a list of length 2 with a matching table to link every unique gam term to formula entries and the respective data transformation functions
Generator function for deepregression objects
predict_gen(
  object,
  newdata = NULL,
  batch_size = NULL,
  apply_fun = tfd_mean,
  convert_fun = as.matrix,
  ret_dist = FALSE
)
object: a deepregression model
newdata: data.frame or list; (optional) new data
batch_size: integer
apply_fun: see predict.deepregression
convert_fun: see predict.deepregression
ret_dist: logical; whether to return the whole distribution or only the (mean) prediction
matrix or list of distributions
Function to prepare data based on parsed formulas
prepare_data(pfc, gamdata = NULL)
pfc: list of processor transformed formulas
gamdata: processor for gam part
list of matrices or arrays
Function to prepare new data based on parsed formulas
prepare_newdata(pfc, newdata, gamdata = NULL)
pfc: list of processor transformed formulas
newdata: list in the same format as the original data
gamdata: processor for gam part
list of matrices or arrays
Control function to define the processor for terms in the formula
process_terms(
  form,
  data,
  controls,
  output_dim,
  param_nr,
  parsing_options,
  specials_to_oz = c(),
  automatic_oz_check = TRUE,
  identify_intercept = FALSE,
  ...
)
form: the formula to be processed
data: the data for the terms in the formula
controls: controls for gam terms
output_dim: the output dimension of the response
param_nr: integer; identifier for the distribution parameter
parsing_options: options for formula parsing
specials_to_oz: specials that should be automatically checked for orthogonalization
automatic_oz_check: logical; whether to automatically check for DNNs to be orthogonalized
identify_intercept: logical; whether to make the intercept automatically identifiable
...: further processors
returns a processor function
Generic quantile function
quant(x, ...)
x: object
...: further arguments passed to the class-specific function
Generic function to re-initialize model weights
reinit_weights(object, seed)
object: model to re-initialize
seed: seed for reproducibility
"deepregression"
modelMethod to re-initialize weights of a "deepregression"
model
## S3 method for class 'deepregression'
reinit_weights(object, seed)
object: object of class "deepregression"
seed: seed for reproducibility
invisible NULL
Function to define orthogonalization connections in the formula
separate_define_relation(
  form,
  specials,
  specials_to_oz,
  automatic_oz_check = TRUE,
  identify_intercept = FALSE,
  simplify = FALSE
)
form: a formula for one distribution parameter
specials: specials in formula to handle separately
specials_to_oz: parts of the formula to orthogonalize
automatic_oz_check: logical; automatically check if terms must be orthogonalized
identify_intercept: logical; whether to make the intercept identifiable
simplify: logical; if FALSE, formulas are parsed more carefully
Returns a list of formula components with ids and assignments for orthogonalization
Generic sd function
stddev(x, ...)
x: object
...: further arguments passed to the class-specific function
Function to get the stopping iteration from CV
stop_iter_cv_result(
  res,
  thisFUN = mean,
  loss = "validloss",
  whichFUN = which.min
)
res: result of cv call
thisFUN: aggregating function applied over folds
loss: which loss to use for the decision
whichFUN: which function to use for the decision
Initializes a Subnetwork based on the Processed Additive Predictor
subnetwork_init(
  pp,
  deep_top = NULL,
  orthog_fun = orthog_tf,
  split_fun = split_model,
  shared_layers = NULL,
  param_nr = 1,
  selectfun_in = function(pp) pp[[param_nr]],
  selectfun_lay = function(pp) pp[[param_nr]],
  gaminputs,
  summary_layer = layer_add_identity
)
pp: list of processed predictor lists from the processor functions
deep_top: keras layer if the top part of the deep network after orthogonalization is different to the one extracted from the provided network
orthog_fun: function used for orthogonalization
split_fun: function to split the network to extract the head
shared_layers: list defining shared weights within one predictor; each list item is a vector of characters of terms as given in the parameter formula
param_nr: integer number of the distribution parameter
selectfun_in, selectfun_lay: functions defining which subset of pp to take as inputs and layers for this subnetwork; per default the param_nr-th element of pp (see the usage above)
gaminputs: input tensors for gam terms
summary_layer: keras layer that combines inputs (typically adding or concatenating)
returns a list of input and output for this additive predictor
TensorFlow repeat function which is not available for TF 2.0
tf_repeat(a, dim)
a: tensor
dim: dimension for repeating
Row-wise tensor product using TensorFlow
tf_row_tensor(a, b, ...)
a, b: tensors
...: arguments passed to the TensorFlow layer
a TensorFlow layer
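A minimal sketch, assuming a working TensorFlow installation:

library(tensorflow)

a <- tf$constant(matrix(1:4,  nrow = 2), dtype = "float32")  # shape (2, 2)
b <- tf$constant(matrix(5:10, nrow = 2), dtype = "float32")  # shape (2, 3)
tf_row_tensor(a, b)  # row-wise tensor product, shape (2, 6)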
Split tensor in multiple parts
tf_split_multiple(A, len)
A: tensor
len: integer; defines the split lengths
list of tensors
Function to index tensors columns
tf_stride_cols(A, start, end = NULL)
A: tensor
start: first index
end: last index (equals start index if NULL)
sliced tensor
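A minimal sketch, assuming 1-based column indices as usual in R:

library(tensorflow)

A <- tf$constant(matrix(1:12, nrow = 3), dtype = "float32")  # shape (3, 4)
tf_stride_cols(A, 2L, 3L)  # columns 2 to 3
tf_stride_cols(A, 1L)      # only column 1 (end = NULL)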
Function to index tensors last dimension
tf_stride_last_dim_tensor(A, start, end = NULL)
A: tensor
start: first index
end: last index (equals start index if NULL)
sliced tensor
For using mean squared error via TFP
tfd_mse(mean)
mean: parameter for the mean
deepregression allows training based on the MSE by using loss = "mse" as argument to deepregression. This tfd function just provides a dummy family.
a TFP distribution
Implementation of a zero-inflated negbinom distribution for TFP
tfd_zinb(mu, r, probs)
mu, r: parameters of the negbinom_ls distribution
probs: vector of probabilities of length 2 (probability for the count component and probability for 0s)
Implementation of a zero-inflated poisson distribution for TFP
tfd_zip(lambda, probs)
lambda: scalar value for the rate of the poisson distribution
probs: vector of probabilities of length 2 (probability for poisson and probability for 0s)
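A minimal sketch of a zero-inflated Poisson with rate 2 and 10% excess zeros:

library(tfprobability)

d <- tfd_zip(lambda = 2, probs = c(0.9, 0.1))
tfd_sample(d, 5L)  # draw five samples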
Hadamard-type layers
tib_layer(units, la, ...)
simplyconnected_layer(la, ...)
inverse_group_lasso_pen(la)
regularizer_group_lasso(la, group_idx)
tibgroup_layer(units, group_idx, la, ...)
layer_hadamard(units = 1, la = 0, depth = 3, ...)
layer_group_hadamard(units, la, group_idx, depth, ...)
layer_hadamard_diff(units, la, initu = "glorot_uniform", initv = "glorot_uniform", ...)
units: integer; number of units
la: numeric; regularization value (> 0)
...: arguments passed to the TensorFlow layer
group_idx: list of group indices
depth: integer; depth of the weight factorization
initu, initv: initializers for parameters
layer object
Function to update miniconda and packages
update_miniconda_deepregression(
  python = VERSIONPY,
  uninstall = TRUE,
  also_packages = TRUE
)
python: string; version of python
uninstall: logical; whether to uninstall the previous conda environment
also_packages: logical; whether to also install all required packages
Options for weights of layers
weight_control(
  specific_weight_options = NULL,
  general_weight_options = list(activation = NULL, use_bias = FALSE, trainable = TRUE,
    kernel_initializer = "glorot_uniform", bias_initializer = "zeros",
    kernel_regularizer = NULL, bias_regularizer = NULL, activity_regularizer = NULL,
    kernel_constraint = NULL, bias_constraint = NULL),
  warmstart_weights = NULL,
  shared_layers = NULL
)
specific_weight_options: specific options for certain weight terms; must be a list with one entry per distribution parameter
general_weight_options: default options for layers
warmstart_weights: while all keras layer options are available, the user can further specify a list for each distribution parameter, with list elements corresponding to term names and values as vectors corresponding to start weights of the respective weights
shared_layers: list for each distribution parameter; each list item can again be a list of character vectors specifying terms which share layers
Returns a list with options
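A sketch of warm-starting a single linear coefficient; the term name "x1" and the two-parameter family are assumptions:

wopt <- weight_control(
  warmstart_weights = list(list("x1" = 1), list())
)
# pass via deepregression(..., weight_options = wopt)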