Title: | Implementation of Bayesian Neural Networks |
---|---|
Description: | Implementation of 'BayesFlux.jl' for R. It extends the famous 'Flux.jl' machine learning library to Bayesian Neural Networks. The goal is not to provide the fastest production-ready library, but rather to allow more people to use and do research on Bayesian Neural Networks. |
Authors: | Enrico Wegner [aut, cre] |
Maintainer: | Enrico Wegner <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.3 |
Built: | 2024-11-05 06:23:39 UTC |
Source: | CRAN |
Installs Julia packages if needed
.install_pkg(...)
... |
strings of package names |
Obtain the status of the current Julia project
.julia_project_status()
Set a seed both in Julia and R
.set_seed(seed)
seed |
seed to be used |
No return value, called for side effects.
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
.set_seed(123)
## End(Not run)
Loads Julia packages
.using(...)
... |
strings of package names |
Use Bayes by Backprop (BBB) to find a variational approximation to the posterior of a BNN. This was proposed in Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015, June). Weight uncertainty in neural network. In International conference on machine learning (pp. 1613-1622). PMLR.
bayes_by_backprop(bnn, batchsize, epochs, mc_samples = 1, opt = opt.ADAM(), n_samples_convergence = 10)
bnn |
a BNN obtained using BNN |
batchsize |
batch size |
epochs |
number of epochs to run for |
mc_samples |
Number of samples to use in each iteration for the MC approximation; usually one is enough. |
opt |
An optimiser. These all start with 'opt.'. See for example opt.ADAM |
n_samples_convergence |
At the end of each iteration convergence is checked using this many MC samples. |
a list containing
'juliavar' - julia variable storing VI
'juliacode' - julia representation of function call
'params' - variational family parameters for each iteration
'losses' - BBB loss in each iteration
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(RNN(5, 1))
like <- likelihood.seqtoone_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
data <- matrix(rnorm(10*1000), ncol = 10)
# Choosing sequences of length 10 and predicting one period ahead
tensor <- tensor_embed_mat(data, 10+1)
x <- tensor[1:10, , , drop = FALSE]
# Last value in each sequence is the target value
y <- tensor[11,,]
bnn <- BNN(x, y, like, prior, init)
vi <- bayes_by_backprop(bnn, 100, 100)
vi_samples <- vi.get_samples(vi, n = 1000)
## End(Not run)
This will set up a new Julia environment in the current working directory or another folder if provided. This environment will then be set with all Julia dependencies needed.
BayesFluxR_setup(pkg_check = TRUE, nthreads = 4, seed = NULL, env_path = getwd(), installJulia = FALSE, ...)
pkg_check |
(Default=TRUE) Check whether needed Julia packages are installed |
nthreads |
(Default=4) How many threads to make available to Julia |
seed |
Seed to be used. |
env_path |
The path to where the Julia environment should be created. By default, this is the current working directory. |
installJulia |
(Default=FALSE) Whether to install Julia |
... |
Other parameters passed on to |
No return value, called for side effects.
## Not run:
## Time consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
## End(Not run)
Create a Bayesian Neural Network
BNN(x, y, like, prior, init)
x |
For a feedforward structure, this must be a matrix of dimensions variables x observations; for a recurrent structure, this must be a tensor of dimensions sequence_length x number_variables x number_sequences. In general, the last dimension is always the one over which batching is done. |
y |
A vector or matrix with observations. |
like |
Likelihood; See for example likelihood.feedforward_normal |
prior |
Prior; See for example prior.gaussian |
init |
Initialiser; See for example initialise.allsame |
List with the following content
'juliavar' - the julia variable containing the BNN
'juliacode' - the string representation of the BNN
'x' - x
'juliax' - julia variable holding x
'y' - y
'juliay' - julia variable holding y
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGLD()
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
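The layout that BNN expects for 'x' is easy to get wrong, because R data usually stores observations in rows. The following base R sketch only illustrates the reshaping and calls nothing from BayesFluxR. The data frame 'df' and the choice of non-overlapping sequences are assumptions made for illustration.

```r
# Hypothetical data: 100 observations (rows) of 5 variables (columns).
set.seed(1)
df <- as.data.frame(matrix(rnorm(100 * 5), nrow = 100, ncol = 5))

# Feedforward input: variables x observations, so transpose.
x_ff <- t(as.matrix(df))
dim(x_ff)   # 5 100

# Recurrent input: sequence_length x number_variables x number_sequences.
# Assumption: 20 non-overlapping sequences of length 5 cut from the rows.
len_seq <- 5
n_seq <- nrow(df) %/% len_seq
x_rnn <- array(NA_real_, dim = c(len_seq, ncol(df), n_seq))
for (s in seq_len(n_seq)) {
  rows <- ((s - 1) * len_seq + 1):(s * len_seq)
  x_rnn[, , s] <- as.matrix(df[rows, ])
}
dim(x_rnn)  # 5 5 20
```

In both layouts the last dimension indexes the units that are batched over: observations for the matrix, sequences for the tensor.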
Obtain the total parameters of the BNN
BNN.totparams(bnn)
bnn |
A BNN formed using BNN |
The total number of parameters in the BNN
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
BNN.totparams(bnn)
## End(Not run)
Chain various layers together to form a network
Chain(...)
... |
Comma separated layers |
List with the following content
juliavar - the julia variable containing the network
specification - the string representation of the network
nc - the julia variable for the network constructor
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
Chain(LSTM(5, 5))
Chain(RNN(5, 5, "tanh"))
Chain(Dense(1, 5))
## End(Not run)
Create a Dense layer with 'in_size' inputs and 'out_size' outputs using 'act' activation function
Dense(in_size, out_size, act = c("identity", "sigmoid", "tanh", "relu"))
in_size |
Input size |
out_size |
Output size |
act |
Activation function |
A list with the following content
in_size - Input Size
out_size - Output Size
activation - Activation Function
julia - Julia code representing the Layer
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 5, "relu"))
## End(Not run)
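To make concrete what a dense layer computes, here is a minimal base R sketch of the forward pass act(W x + b) with the four activation names accepted by Dense. The helper 'dense_forward' is hypothetical and not part of BayesFluxR; the actual computation happens in Julia through Flux.jl.

```r
# Hypothetical helper: forward pass of a dense layer in base R.
# W is out_size x in_size, b has length out_size,
# x is in_size x observations (last dimension is batched over).
dense_forward <- function(W, b, x,
                          act = c("identity", "sigmoid", "tanh", "relu")) {
  act <- match.arg(act)
  z <- W %*% x + as.vector(b)   # bias is recycled down each column
  switch(act,
    identity = z,
    sigmoid  = 1 / (1 + exp(-z)),
    tanh     = tanh(z),
    relu     = pmax(z, 0)
  )
}

set.seed(1)
W <- matrix(rnorm(5 * 5), nrow = 5)   # out_size = 5, in_size = 5
b <- rep(0, 5)
x <- matrix(rnorm(5 * 3), nrow = 5)   # 5 variables, 3 observations
out <- dense_forward(W, b, x, "relu")
dim(out)  # 5 3
```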
Find the MAP of a BNN using SGD
find_mode(bnn, optimiser, batchsize, epochs)
bnn |
a BNN obtained using BNN |
optimiser |
an optimiser. These start with 'opt.'. See for example opt.ADAM |
batchsize |
batch size |
epochs |
number of epochs to run for |
Returns a vector. Use posterior_predictive to obtain a prediction using this MAP estimate.
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
find_mode(bnn, opt.RMSProp(), 10, 100)
## End(Not run)
Creates a Gamma prior in Julia using Distributions.jl
Gamma(shape = 2, scale = 2)
shape |
shape parameter |
scale |
scale parameter |
A list with the following content
juliavar - julia variable containing the distribution
juliacode - julia code used to create the distribution
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
## End(Not run)
Creates a random string that is used as variable in julia
get_random_symbol()
Initialises all parameters of the network, all hyper parameters of the prior and all additional parameters of the likelihood by drawing random values from 'dist'.
initialise.allsame(dist, like, prior)
dist |
A distribution; See for example Normal |
like |
A likelihood; See for example likelihood.feedforward_normal |
prior |
A prior; See for example prior.gaussian |
A list containing the following
'juliavar' - julia variable storing the initialiser
'juliacode' - julia code used to create the initialiser
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
BNN.totparams(bnn)
## End(Not run)
Creates an Inverse Gamma prior in Julia using Distributions.jl
InverseGamma(shape = 2, scale = 2)
shape |
shape parameter |
scale |
scale parameter |
A list with the following content
juliavar - julia variable containing the distribution
juliacode - julia code used to create the distribution
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, InverseGamma(2.0, 0.5))
## End(Not run)
This creates a likelihood of the form $y_i \sim \mathrm{Normal}(\mathrm{net}(x_i), \sigma)$ for $i = 1, \ldots, N$, where $x_i$ is fed through the network in the standard feedforward way.
likelihood.feedforward_normal(chain, sig_prior)
chain |
Network structure obtained using Chain |
sig_prior |
A prior distribution for sigma defined using, for example, Gamma, InverseGamma, or Truncated |
A list containing the following
juliavar - julia variable containing the likelihood
juliacode - julia code used to create the likelihood
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
BNN.totparams(bnn)
## End(Not run)
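As an illustration of what a Normal likelihood around the network output amounts to, the following base R sketch evaluates the log-likelihood for given network outputs 'yhat' and a given 'sigma'. The helper 'normal_loglik' is hypothetical; in the package the likelihood is built and evaluated in Julia, and sigma is itself a parameter with prior 'sig_prior'.

```r
# Hypothetical helper: log-likelihood of y under Normal(yhat, sigma),
# where yhat stands in for the network output net(x).
normal_loglik <- function(y, yhat, sigma) {
  sum(dnorm(y, mean = yhat, sd = sigma, log = TRUE))
}

set.seed(1)
y <- rnorm(100)
yhat <- rep(0, 100)   # assumed network output, for illustration only
ll_true <- normal_loglik(y, yhat, sigma = 1)
ll_wide <- normal_loglik(y, yhat, sigma = 10)
ll_true > ll_wide     # a badly chosen sigma lowers the log-likelihood
```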
This creates a likelihood of the form $(y_i - \mathrm{net}(x_i)) / \sigma \sim T_\nu$ for $i = 1, \ldots, N$, where $x_i$ is fed through the network in the standard feedforward way.
likelihood.feedforward_tdist(chain, sig_prior, nu = 30)
chain |
Network structure obtained using Chain |
sig_prior |
A prior distribution for sigma defined using, for example, Gamma, InverseGamma, or Truncated |
nu |
DF of TDist |
see likelihood.feedforward_normal
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_tdist(net, Gamma(2.0, 0.5), nu=8)
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
BNN.totparams(bnn)
## End(Not run)
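A Student-t likelihood differs from the Normal one mainly in how it treats outliers. The base R sketch below assumes the standardised residual (y - yhat) / sigma follows a t distribution with 'nu' degrees of freedom, which implies the extra -log(sigma) term from the change of variables. The helper 'tdist_loglik' is hypothetical and not part of BayesFluxR.

```r
# Hypothetical helper: log-likelihood when (y - yhat) / sigma ~ T_nu.
# The -log(sigma) term is the Jacobian of the scaling.
tdist_loglik <- function(y, yhat, sigma, nu = 30) {
  z <- (y - yhat) / sigma
  sum(dt(z, df = nu, log = TRUE) - log(sigma))
}

set.seed(1)
y <- rnorm(100)
y[1] <- 15            # one gross outlier
yhat <- rep(0, 100)   # assumed network output, for illustration only

# Heavy tails (small nu) penalise the outlier far less than light tails.
ll_heavy <- tdist_loglik(y, yhat, sigma = 1, nu = 5)
ll_light <- tdist_loglik(y, yhat, sigma = 1, nu = 1e7)
ll_heavy > ll_light
```

For very large 'nu' the value approaches the Normal log-likelihood, so 'nu' controls how robust the fit is to extreme observations.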
This creates a likelihood of the form $y_i \sim \mathrm{Normal}(\mathrm{net}(x_i), \sigma)$ for $i = 1, \ldots, N$. Here $x_i$ is a subsequence which will be fed through the recurrent network to obtain the final output $\mathrm{net}(x_i) = \hat{y}_i$. Thus, if one has a single time series and splits it into subsequences of length K, which are then used to predict the next output of the time series, then each $x_i$ consists of K consecutive observations of the time series. In a sense, one constrains the maximum memory length of the network this way.
likelihood.seqtoone_normal(chain, sig_prior)
chain |
Network structure obtained using Chain |
sig_prior |
A prior distribution for sigma defined using, for example, Gamma, InverseGamma, or Truncated |
see likelihood.feedforward_normal
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(RNN(5, 1))
like <- likelihood.seqtoone_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- array(rnorm(5*100*10), dim=c(10,5,100))
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
BNN.totparams(bnn)
## End(Not run)
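The splitting of a single time series into subsequences of length K, each paired with the next observation as its target, can be sketched in base R as follows. The helper 'make_seq_to_one' is hypothetical and written only for illustration; it handles a univariate series, so the middle (variables) dimension has size one.

```r
# Hypothetical helper: overlapping windows of K consecutive observations,
# each paired with the observation that follows the window.
make_seq_to_one <- function(series, K) {
  n_seq <- length(series) - K
  stopifnot(n_seq >= 1)
  # sequence_length x number_variables x number_sequences
  x <- array(NA_real_, dim = c(K, 1, n_seq))
  y <- numeric(n_seq)
  for (i in seq_len(n_seq)) {
    x[, 1, i] <- series[i:(i + K - 1)]
    y[i] <- series[i + K]
  }
  list(x = x, y = y)
}

series <- as.numeric(1:20)
d <- make_seq_to_one(series, K = 5)
dim(d$x)      # 5 1 15
d$x[, 1, 1]   # 1 2 3 4 5
d$y[1]        # 6
```

Because each window only contains K observations, the network never sees anything older than K steps when predicting a target, which is the sense in which K bounds the memory length.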
See likelihood.seqtoone_normal and likelihood.feedforward_tdist for details.
likelihood.seqtoone_tdist(chain, sig_prior, nu = 30)
chain |
Network structure obtained using Chain |
sig_prior |
A prior distribution for sigma defined using, for example, Gamma, InverseGamma, or Truncated |
nu |
DF of TDist |
see likelihood.feedforward_normal
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(RNN(5, 1))
like <- likelihood.seqtoone_tdist(net, Gamma(2.0, 0.5), nu=5)
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- array(rnorm(5*100*10), dim=c(10,5,100))
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
BNN.totparams(bnn)
## End(Not run)
Create an LSTM layer with 'in_size' input size, and 'out_size' hidden state size
LSTM(in_size, out_size)
in_size |
Input size |
out_size |
Output size |
A list with the following content
in_size - Input Size
out_size - Output Size
julia - Julia code representing the Layer
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(LSTM(5, 5))
## End(Not run)
Use the diagonal of sample covariance matrix as inverse mass matrix.
madapter.DiagCov(adapt_steps, windowlength, kappa = 0.5, epsilon = 1e-06)
adapt_steps |
Number of adaptation steps |
windowlength |
Lookback window length for calculation of covariance |
kappa |
How much to shrink towards the identity |
epsilon |
Small value added to the diagonal to avoid numerical problems with non-positive-definite matrices |
list containing 'juliavar' and 'juliacode' and all given arguments.
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
madapter <- madapter.DiagCov(100, 10)
sampler <- sampler.GGMC(madapter = madapter)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
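The roles of 'kappa' and 'epsilon' can be illustrated with a small base R sketch. The exact shrinkage formula used by BayesFlux.jl is not stated in this manual, so the convex combination below is an assumption, and 'diag_cov_inverse_mass' is a hypothetical helper rather than a package function.

```r
# Hypothetical helper. Assumed formula: a convex combination of the
# sample variances and the identity, plus a small diagonal jitter.
diag_cov_inverse_mass <- function(samples, kappa = 0.5, epsilon = 1e-06) {
  # samples: parameters x draws (the lookback window)
  v <- apply(samples, 1, var)           # diagonal of the sample covariance
  d <- (1 - kappa) * v + kappa * 1      # shrink towards the identity
  diag(d + epsilon, nrow = length(d))   # keep the diagonal strictly positive
}

set.seed(1)
window <- matrix(rnorm(3 * 10, sd = c(0.1, 1, 10)), nrow = 3)  # 3 params, 10 draws
Minv <- diag_cov_inverse_mass(window)
diag(Minv)
```

With kappa = 1 the result is the identity (up to epsilon), and with kappa = 0 it is the raw diagonal of the sample covariance.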
Use a fixed mass matrix
madapter.FixedMassMatrix(mat = NULL)
mat |
(Default=NULL); inverse mass matrix; If 'NULL', then identity matrix will be used |
list with 'juliavar' and 'juliacode' and given matrix or 'NULL'
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
madapter <- madapter.FixedMassMatrix()
sampler <- sampler.GGMC(madapter = madapter)
ch <- mcmc(bnn, 10, 1000, sampler)
# Providing a non-sense weight matrix
weight_matrix <- matrix(runif(BNN.totparams(bnn)^2, 0, 1), nrow = BNN.totparams(bnn))
madapter2 <- madapter.FixedMassMatrix(weight_matrix)
sampler2 <- sampler.GGMC(madapter = madapter2)
ch2 <- mcmc(bnn, 10, 1000, sampler2)
## End(Not run)
Use the full covariance matrix as inverse mass matrix
madapter.FullCov(adapt_steps, windowlength, kappa = 0.5, epsilon = 1e-06)
adapt_steps |
Number of adaptation steps |
windowlength |
Lookback window length for calculation of covariance |
kappa |
How much to shrink towards the identity |
epsilon |
Small value added to the diagonal to avoid numerical problems with non-positive-definite matrices |
see madapter.DiagCov
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
madapter <- madapter.FullCov(100, 10)
sampler <- sampler.GGMC(madapter = madapter)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
Use RMSProp as a preconditioner/mass matrix adapter. This was proposed for use in SGLD and related methods in Li, C., Chen, C., Carlson, D., & Carin, L. (2016, February). Preconditioned stochastic gradient Langevin dynamics for deep neural networks. In Thirtieth AAAI Conference on Artificial Intelligence.
madapter.RMSProp(adapt_steps, lambda = 1e-05, alpha = 0.99)
adapt_steps |
number of adaptation steps |
lambda |
see above paper |
alpha |
see above paper |
list with 'juliavar' and 'juliacode' and all given arguments
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
madapter <- madapter.RMSProp(100)
sampler <- sampler.GGMC(madapter = madapter)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
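Since 'lambda' and 'alpha' are only described by reference to the paper, a short base R sketch of the RMSProp preconditioner update from Li et al. (2016) may help. The helper 'rmsprop_precondition_step' is hypothetical; how BayesFlux.jl maps the resulting preconditioner onto its mass matrix is not shown in this manual and is not modelled here.

```r
# Hypothetical helper: one update of the RMSProp preconditioner.
# V is the running average of squared gradients; G is the diagonal
# preconditioner 1 / (lambda + sqrt(V)).
rmsprop_precondition_step <- function(V, grad, lambda = 1e-05, alpha = 0.99) {
  V <- alpha * V + (1 - alpha) * grad^2
  G <- 1 / (lambda + sqrt(V))
  list(V = V, G = G)
}

V <- rep(0, 3)
grads <- list(c(1, 0.1, 0.01), c(0.8, 0.12, 0.02))  # made-up gradients
for (g in grads) {
  res <- rmsprop_precondition_step(V, g)
  V <- res$V
}
res$G  # coordinates with small gradients receive larger step scaling
```

Here 'alpha' sets how quickly old gradients are forgotten, and 'lambda' bounds the preconditioner when gradients are close to zero.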
Sample from a BNN using MCMC
mcmc(bnn, batchsize, numsamples, sampler = sampler.SGLD(stepsize_a = 1), continue_sampling = FALSE, start_value = NULL)
bnn |
A BNN obtained using BNN |
batchsize |
batchsize to use; Most samplers allow for batching. For some, theoretical justifications are missing (HMC) |
numsamples |
Number of mcmc samples |
sampler |
Sampler to use; See for example sampler.SGLD |
continue_sampling |
Do not start new sampling, but rather continue sampling. For this, numsamples must be greater than the number of samples already drawn. |
start_value |
Values to start from. By default these will be sampled using the initialiser in 'bnn'. |
a list containing the 'samples' and the 'sampler' used.
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGNHTS(1e-3)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
Creates a Normal prior in Julia using Distributions.jl. This can then be truncated using Truncated to obtain a prior that could then be used as a variance prior.
Normal(mu = 0, sigma = 1)
mu |
Mean |
sigma |
Standard Deviation |
see Gamma
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Truncated(Normal(0, 0.5), 0, Inf))
## End(Not run)
ADAM optimiser
opt.ADAM(eta = 0.001, beta = c(0.9, 0.999), eps = 1e-08)
eta |
stepsize |
beta |
momentum decays; must be a numeric vector of length 2 |
eps |
Flux does not document this |
see opt.Descent
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
find_mode(bnn, opt.ADAM(), 10, 100)
## End(Not run)
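For readers unfamiliar with the three arguments, the following base R sketch implements the textbook Adam update, in which 'eta' is the step size, 'beta' holds the two decay rates of the moment estimates, and 'eps' is a small constant that keeps the denominator away from zero. The helper 'adam_step' is hypothetical, and the precise placement of 'eps' in Flux.jl may differ from this sketch.

```r
# Hypothetical helper: one textbook Adam update.
adam_step <- function(theta, grad, state, eta = 0.001,
                      beta = c(0.9, 0.999), eps = 1e-08) {
  state$t <- state$t + 1
  state$m <- beta[1] * state$m + (1 - beta[1]) * grad     # first moment
  state$v <- beta[2] * state$v + (1 - beta[2]) * grad^2   # second moment
  m_hat <- state$m / (1 - beta[1]^state$t)                # bias correction
  v_hat <- state$v / (1 - beta[2]^state$t)
  theta <- theta - eta * m_hat / (sqrt(v_hat) + eps)
  list(theta = theta, state = state)
}

# Minimise f(theta) = sum(theta^2), whose gradient is 2 * theta.
theta <- c(1, -2)
state <- list(t = 0, m = 0, v = 0)
for (i in 1:2000) {
  res <- adam_step(theta, 2 * theta, state, eta = 0.01)
  theta <- res$theta
  state <- res$state
}
theta  # close to c(0, 0)
```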
Standard gradient descent
opt.Descent(eta = 0.1)
eta |
stepsize |
list containing
'juliavar' - julia variable holding the optimiser
'juliacode' - string representation
## Not run:
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
find_mode(bnn, opt.Descent(1e-5), 10, 100)
## End(Not run)
RMSProp optimiser
opt.RMSProp(eta = 0.001, rho = 0.9, eps = 1e-08)
eta |
learning rate |
rho |
momentum |
eps |
not documented by Flux |
see opt.Descent
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
find_mode(bnn, opt.RMSProp(), 10, 100)
## End(Not run)
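To make the roles of 'eta', 'rho' and 'eps' more concrete, here is a base-R sketch of the textbook RMSProp update. It assumes the standard rule (a running average of squared gradients, with 'eps' added to the denominator for numerical stability); the exact Flux implementation may differ in detail, and `rmsprop_step` is a hypothetical helper.

```r
# Hypothetical helper: one textbook RMSProp step (an assumption, not Flux's code).
rmsprop_step <- function(theta, grad, acc, eta = 0.001, rho = 0.9, eps = 1e-8) {
  acc <- rho * acc + (1 - rho) * grad^2            # running mean of squared gradients
  theta <- theta - eta * grad / (sqrt(acc) + eps)  # per-parameter scaled step
  list(theta = theta, acc = acc)
}

# Toy loss f(theta) = sum(theta^2) with gradient 2 * theta.
state <- list(theta = c(5, -3), acc = c(0, 0))
for (i in 1:2000) {
  state <- rmsprop_step(state$theta, 2 * state$theta, state$acc, eta = 0.01)
}
state$theta  # ends up in a small neighbourhood of zero
```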
Draw from the posterior predictive distribution
posterior_predictive(bnn, posterior_samples, x = NULL)
bnn |
a BNN obtained using |
posterior_samples |
a vector or matrix containing posterior
samples. This can be obtained using |
x |
input variables. If 'NULL' (default), training values will be used. |
A matrix whose columns are the posterior predictive draws.
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGLD()
ch <- mcmc(bnn, 10, 1000, sampler)
pp <- posterior_predictive(bnn, ch$samples)
## End(Not run)
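Because the returned matrix stores one posterior predictive draw per column, row-wise summaries give per-observation predictions. The sketch below uses a simulated stand-in for 'pp' (so it runs without Julia) and computes predictive means and 95% intervals; the stand-in data are purely illustrative.

```r
set.seed(1)
n_obs <- 100
n_draws <- 1000
# Stand-in for the output of posterior_predictive(): observations x draws.
pp <- matrix(rnorm(n_obs * n_draws, mean = 2), nrow = n_obs, ncol = n_draws)

pp_mean <- rowMeans(pp)                                            # point prediction
pp_interval <- t(apply(pp, 1, quantile, probs = c(0.025, 0.975)))  # 95% interval
head(cbind(mean = pp_mean, pp_interval))
```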
Sample from the prior predictive of a Bayesian Neural Network
prior_predictive(bnn, n = 1)
bnn |
BNN obtained using |
n |
Number of samples |
matrix of prior predictive samples; Columns are the different samples
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
pp <- prior_predictive(bnn, n = 10)
## End(Not run)
Use a multivariate Gaussian prior for all network parameters. The covariance matrix is set equal to 'sigma * I', where 'I' is the identity matrix. The mean is zero.
prior.gaussian(chain, sigma)
chain |
Chain obtained using |
sigma |
Standard deviation of Gaussian prior |
a list containing the following
'juliavar' the julia variable used to store the prior
'juliacode' the julia code
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGLD()
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
Uses a scale mixture of Gaussians for each network parameter. That is, the prior is given by 'pi1 * Normal(0, sigma1) + (1 - pi1) * Normal(0, sigma2)'.
prior.mixturescale(chain, sigma1, sigma2, pi1)
chain |
Chain obtained using |
sigma1 |
Standard deviation of first Gaussian |
sigma2 |
Standard deviation of second Gaussian |
pi1 |
Weight of first Gaussian |
a list containing the following
'juliavar' the julia variable used to store the prior
'juliacode' the julia code
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.mixturescale(net, 10, 0.1, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGLD()
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
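The shape of such a prior can be explored in base R. The sketch below assumes the mixture has the form 'pi1 * Normal(0, sigma1) + (1 - pi1) * Normal(0, sigma2)', with both 'sigma' arguments being standard deviations as in the argument list; `dmixturescale` and `rmixturescale` are hypothetical helpers, not BayesFluxR functions.

```r
# Hypothetical helpers for a two-component zero-mean Gaussian scale mixture.
dmixturescale <- function(theta, sigma1, sigma2, pi1) {
  pi1 * dnorm(theta, 0, sigma1) + (1 - pi1) * dnorm(theta, 0, sigma2)
}

rmixturescale <- function(n, sigma1, sigma2, pi1) {
  first <- runif(n) < pi1                     # pick a component per draw
  rnorm(n, 0, ifelse(first, sigma1, sigma2))  # draw with that component's sd
}

set.seed(1)
draws <- rmixturescale(1000, sigma1 = 10, sigma2 = 0.1, pi1 = 0.5)
dmixturescale(c(0, 1, 5), sigma1 = 10, sigma2 = 0.1, pi1 = 0.5)
```

With a wide and a narrow component, as in the example above, the prior keeps most weights near zero while still allowing occasional large values.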
Create an RNN layer with 'in_size' inputs, a hidden state of size 'out_size', and activation function 'act'
RNN(in_size, out_size, act = c("sigmoid", "tanh", "identity", "relu"))
in_size |
Input size |
out_size |
Output size |
act |
Activation function |
A list with the following content
in_size - Input Size
out_size - Output Size
activation - Activation Function
julia - Julia code representing the Layer
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(RNN(5, 5, "tanh"))
## End(Not run)
Use a constant stepsize in mcmc
sadapter.Const(l)
l |
stepsize |
list with 'juliavar', 'juliacode' and the given arguments
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sadapter <- sadapter.Const(1e-5)
sampler <- sampler.GGMC(sadapter = sadapter)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
Use dual averaging, as in STAN, to tune the stepsize
sadapter.DualAverage( adapt_steps, initial_stepsize = 1, target_accept = 0.65, gamma = 0.05, t0 = 10, kappa = 0.75 )
adapt_steps |
number of adaptation steps |
initial_stepsize |
initial stepsize |
target_accept |
target acceptance ratio |
gamma |
See STAN manual or NUTS paper |
t0 |
See STAN manual or NUTS paper |
kappa |
See STAN manual or NUTS paper |
list with 'juliavar', 'juliacode', and all given arguments
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sadapter <- sadapter.DualAverage(100)
sampler <- sampler.GGMC(sadapter = sadapter)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
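For readers unfamiliar with 'gamma', 't0' and 'kappa', the following base-R sketch shows the dual averaging recursion as described in the NUTS paper (Hoffman and Gelman). It is an illustration under that assumption, not the BayesFlux implementation; `dual_average` and the toy acceptance function are hypothetical.

```r
# Hypothetical sketch of dual averaging stepsize adaptation (NUTS paper form).
# `accept_prob` maps a stepsize to an acceptance probability.
dual_average <- function(accept_prob, adapt_steps, initial_stepsize = 1,
                         target_accept = 0.65, gamma = 0.05, t0 = 10,
                         kappa = 0.75) {
  mu <- log(10 * initial_stepsize)  # point the iterates are shrunk towards
  h_bar <- 0                        # running average of acceptance errors
  log_eps <- log(initial_stepsize)
  log_eps_bar <- 0                  # averaged log stepsize
  for (m in seq_len(adapt_steps)) {
    alpha <- accept_prob(exp(log_eps))
    h_bar <- (1 - 1 / (m + t0)) * h_bar + (target_accept - alpha) / (m + t0)
    log_eps <- mu - sqrt(m) / gamma * h_bar
    w <- m^(-kappa)
    log_eps_bar <- w * log_eps + (1 - w) * log_eps_bar
  }
  exp(log_eps_bar)
}

# Toy stand-in for a sampler: acceptance falls as the stepsize grows.
toy_accept <- function(eps) exp(-eps)
tuned <- dual_average(toy_accept, adapt_steps = 500)
tuned  # close to -log(0.65), the stepsize giving 65% acceptance here
```

Larger 't0' damps the early iterations, while 'kappa' controls how quickly older stepsizes are forgotten in the average.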
Haario, H., Saksman, E., & Tamminen, J. (2001). An adaptive Metropolis algorithm. Bernoulli, 223-242.
sampler.AdaptiveMH(bnn, t0, sd, eps = 1e-06)
bnn |
BNN obtained using |
t0 |
Number of iterations before covariance adaptation will be started. Also the lookback period for covariance adaptation. |
sd |
Tuning parameter; See paper |
eps |
Used for numerical reasons. Increase this if a positive-definiteness error is thrown. |
a list with 'juliavar', 'juliacode', and all given arguments
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.AdaptiveMH(bnn, 10, 1)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
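The roles of 'sd' and 'eps' can be seen in the proposal covariance of the adaptive Metropolis algorithm, which in Haario et al. (2001) is 'sd * Cov(history) + sd * eps * I'. The base-R sketch below assumes that form; `adaptive_cov` is a hypothetical helper and the history matrix is simulated.

```r
# Hypothetical helper: proposal covariance in the style of Haario et al. (2001).
# `history` is a draws x params matrix holding the lookback window.
adaptive_cov <- function(history, sd, eps = 1e-6) {
  sd * cov(history) + sd * eps * diag(ncol(history))
}

set.seed(1)
history <- matrix(rnorm(10 * 3), nrow = 10)  # 10 past draws of 3 parameters
prop_cov <- adaptive_cov(history, sd = 2.4^2 / 3)

# Propose a new state around the latest draw using the adapted covariance.
current <- history[nrow(history), ]
proposal <- current + drop(t(chol(prop_cov)) %*% rnorm(3))
```

The 'eps' term keeps the matrix positive definite when the empirical covariance is (nearly) singular, which is why increasing it can resolve Cholesky failures.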
Proposed in Garriga-Alonso, A., & Fortuin, V. (2021). Exact langevin dynamics with stochastic gradients. arXiv preprint arXiv:2102.01691.
sampler.GGMC( beta = 0.1, l = 1, sadapter = sadapter.DualAverage(1000), madapter = madapter.FixedMassMatrix(), steps = 3 )
beta |
See paper |
l |
stepsize |
sadapter |
Stepsize adapter; Not used in original paper |
madapter |
Mass adapter; Not used in original paper |
steps |
Number of steps before accept/reject |
a list with 'juliavar', 'juliacode' and all provided arguments.
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sadapter <- sadapter.DualAverage(100)
sampler <- sampler.GGMC(sadapter = sadapter)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
Allows for the use of stochastic gradients, but the validity of doing so is not clear.
sampler.HMC( l, path_len, sadapter = sadapter.DualAverage(1000), madapter = madapter.FixedMassMatrix() )
l |
stepsize |
path_len |
number of leapfrog steps |
sadapter |
Stepsize adapter |
madapter |
Mass adapter |
This is motivated by parts of the discussion in Neal, R. M. (1996). Bayesian Learning for Neural Networks (Vol. 118). Springer New York. https://doi.org/10.1007/978-1-4612-0745-0
a list with 'juliavar', 'juliacode', and all given arguments
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sadapter <- sadapter.DualAverage(100)
sampler <- sampler.HMC(1e-3, 3, sadapter = sadapter)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
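To show how the stepsize 'l' and the number of leapfrog steps 'path_len' interact, here is a base-R sketch of a standard leapfrog trajectory with a unit mass matrix on a toy standard-normal target. The function `leapfrog` and the target are hypothetical; the package's own integrator lives in BayesFlux.jl and may differ.

```r
# Hypothetical sketch: leapfrog integration with unit mass matrix.
leapfrog <- function(theta, r, grad_logp, l, path_len) {
  r <- r + 0.5 * l * grad_logp(theta)                # initial half momentum step
  for (i in seq_len(path_len)) {
    theta <- theta + l * r                           # full position step
    if (i < path_len) r <- r + l * grad_logp(theta)  # full momentum step
  }
  r <- r + 0.5 * l * grad_logp(theta)                # final half momentum step
  list(theta = theta, r = r)
}

# Toy target: standard normal, so grad log p(theta) = -theta.
grad_logp <- function(theta) -theta
hamiltonian <- function(theta, r) 0.5 * sum(theta^2) + 0.5 * sum(r^2)

start <- list(theta = c(1, -0.5), r = c(0.3, 0.8))
final <- leapfrog(start$theta, start$r, grad_logp, l = 0.01, path_len = 100)
h_start <- hamiltonian(start$theta, start$r)
h_final <- hamiltonian(final$theta, final$r)
c(h_start, h_final)  # nearly equal when the stepsize is small enough
```

The change in the Hamiltonian along the trajectory drives the accept/reject decision, so a stepsize that is too large lowers the acceptance rate.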
Stepsizes will be adapted according to 'stepsize_a * (stepsize_b + t)^(-stepsize_gamma)', where 't' is the iteration number.
sampler.SGLD( stepsize_a = 0.1, stepsize_b = 0, stepsize_gamma = 0.55, min_stepsize = -Inf )
stepsize_a |
See eq. above |
stepsize_b |
See eq. above |
stepsize_gamma |
see eq. above |
min_stepsize |
Do not decrease stepsize beyond this |
a list with 'juliavar', 'juliacode', and all given arguments
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGLD()
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
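The effect of the stepsize arguments can be previewed before sampling. The sketch below assumes the polynomial decay schedule 'a * (b + t)^(-gamma)' commonly used for SGLD (Welling and Teh), floored at 'min_stepsize'; `sgld_stepsize` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: assumed SGLD stepsize schedule with a lower bound.
sgld_stepsize <- function(t, stepsize_a = 0.1, stepsize_b = 0,
                          stepsize_gamma = 0.55, min_stepsize = -Inf) {
  pmax(stepsize_a * (stepsize_b + t)^(-stepsize_gamma), min_stepsize)
}

steps <- sgld_stepsize(1:1000)                         # default: decays without a floor
floored <- sgld_stepsize(1:1000, min_stepsize = 0.01)  # never drops below 0.01
range(steps)
range(floored)
```

With the default 'min_stepsize = -Inf' the floor is inactive, so the stepsize keeps shrinking for the whole run.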
Proposed in Leimkuhler, B., & Shang, X. (2016). Adaptive thermostats for noisy gradient systems. SIAM Journal on Scientific Computing, 38(2), A712-A736.
sampler.SGNHTS( l, sigmaA = 1, xi = 1, mu = 1, madapter = madapter.FixedMassMatrix() )
l |
Stepsize |
sigmaA |
Diffusion factor |
xi |
Thermostat |
mu |
Free parameter of thermostat |
madapter |
Mass Adapter; Not used in original paper and thus has no theoretical backing |
This is similar to SGNHT as proposed in Ding, N., Fang, Y., Babbush, R., Chen, C., Skeel, R. D., & Neven, H. (2014). Bayesian sampling using stochastic gradient thermostats. Advances in neural information processing systems, 27.
a list with 'juliavar', 'juliacode' and all arguments provided
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGNHTS(1e-3)
ch <- mcmc(bnn, 10, 1000, sampler)
## End(Not run)
Print a summary of a BNN
## S3 method for class 'BNN'
summary(object, ...)
object |
A BNN created using |
... |
Not used |
This is used when working with recurrent networks, especially in the case of seq-to-one modelling. Creates overlapping subsequences of the data with length 'len_seq'. Returned dimensions are len_seq x num_vars x num_subsequences.
tensor_embed_mat(mat, len_seq)
mat |
Matrix of time series |
len_seq |
subsequence length |
A tensor of dimension: len_seq x num_vars x num_subsequences
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(RNN(5, 1))
like <- likelihood.seqtoone_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
data <- matrix(rnorm(5*1000), ncol = 5)
# Choosing sequences of length 10 and predicting one period ahead
tensor <- tensor_embed_mat(data, 10+1)
x <- tensor[1:10, , , drop = FALSE]
# Last value in each sequence is the target value
y <- tensor[11,1,]
bnn <- BNN(x, y, like, prior, init)
BNN.totparams(bnn)
## End(Not run)
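The embedding itself can be pictured with a small base-R sketch. It assumes that rows of the input are time points, columns are variables, and consecutive subsequences start one time step apart; `embed_sketch` is a hypothetical re-implementation for illustration, not the package function.

```r
# Hypothetical sketch: overlapping subsequences of length `len_seq`.
embed_sketch <- function(mat, len_seq) {
  n_sub <- nrow(mat) - len_seq + 1
  out <- array(NA_real_, dim = c(len_seq, ncol(mat), n_sub))
  for (s in seq_len(n_sub)) {
    out[, , s] <- mat[s:(s + len_seq - 1), , drop = FALSE]  # window starting at s
  }
  out
}

mat <- matrix(1:20, ncol = 2)   # 10 time points, 2 variables
tensor <- embed_sketch(mat, 4)
dim(tensor)                     # len_seq x num_vars x num_subsequences
tensor[, 1, 1]                  # first window of the first variable
```

For seq-to-one modelling, the last slice along the first dimension serves as the target and the preceding slices as the input sequence, as in the example above.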
BayesFluxR returns draws in a matrix of dimension params x draws. This cannot be used with the 'bayesplot' package which expects an array of dimensions draws x chains x params.
to_bayesplot(ch, param_names = NULL)
ch |
Chain of draws obtained using |
param_names |
If 'NULL', the parameter names will be of the form 'param_1', 'param_2', etc. If 'param_names' is a string, the parameter names will start with the string with the number of the parameter attached to it. If 'param_names' is a vector, it has to provide a name for each parameter in the chain. |
Returns an array of dimensions draws x chains x params.
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
x <- matrix(rnorm(5*100), nrow = 5)
y <- rnorm(100)
bnn <- BNN(x, y, like, prior, init)
sampler <- sampler.SGLD()
ch <- mcmc(bnn, 10, 1000, sampler)
ch <- to_bayesplot(ch)
library(bayesplot)
# Dense(5, 1) has fewer than 10 parameters, so only plot the first five
mcmc_intervals(ch, pars = paste0("param_", 1:5))
## End(Not run)
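The reshaping described above can be sketched in base R for a single chain. The stand-in 'draws' matrix is simulated, and the 'param_' naming simply mirrors the default described for 'param_names'; this is an illustration of the dimension change, not the package's code.

```r
set.seed(1)
draws <- matrix(rnorm(3 * 50), nrow = 3)  # stand-in: params x draws

# Transpose to draws x params, then add a chain dimension of length one.
arr <- array(
  t(draws),
  dim = c(ncol(draws), 1, nrow(draws)),
  dimnames = list(NULL, NULL, paste0("param_", seq_len(nrow(draws))))
)
dim(arr)  # draws x chains x params
```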
Truncates a Julia Distribution between 'lower' and 'upper'.
Truncated(dist, lower, upper)
dist |
A Julia Distribution created using |
lower |
lower bound |
upper |
upper bound |
see Gamma
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(Dense(5, 1))
like <- likelihood.feedforward_normal(net, Truncated(Normal(0, 0.5), 0, Inf))
## End(Not run)
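What truncation does to a distribution can be checked in base R without Julia. The sketch below draws from a normal distribution truncated to '[lower, upper]' by inverse-CDF sampling; `rtruncnorm_sketch` is a hypothetical helper that only mimics the idea behind 'Truncated(Normal(0, 0.5), 0, Inf)'.

```r
# Hypothetical helper: inverse-CDF sampling from a truncated normal.
rtruncnorm_sketch <- function(n, mean, sd, lower, upper) {
  p_lower <- pnorm(lower, mean, sd)           # probability mass below `lower`
  p_upper <- pnorm(upper, mean, sd)           # probability mass below `upper`
  qnorm(runif(n, p_lower, p_upper), mean, sd) # map uniforms back to the support
}

set.seed(1)
draws <- rtruncnorm_sketch(1000, mean = 0, sd = 0.5, lower = 0, upper = Inf)
range(draws)  # all draws are non-negative
```

A non-negative support like this is what makes a truncated normal usable as a prior for a standard deviation in the likelihood.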
Draw samples from a variational family.
vi.get_samples(vi, n = 1)
vi |
obtained using |
n |
number of samples |
a matrix whose columns are draws from the variational posterior
## Not run: 
## Needs previous call to `BayesFluxR_setup` which is time
## consuming and requires Julia and BayesFlux.jl
BayesFluxR_setup(installJulia=TRUE, seed=123)
net <- Chain(RNN(5, 1))
like <- likelihood.seqtoone_normal(net, Gamma(2.0, 0.5))
prior <- prior.gaussian(net, 0.5)
init <- initialise.allsame(Normal(0, 0.5), like, prior)
# Five variables, matching the input size of RNN(5, 1)
data <- matrix(rnorm(5*1000), ncol = 5)
# Choosing sequences of length 10 and predicting one period ahead
tensor <- tensor_embed_mat(data, 10+1)
x <- tensor[1:10, , , drop = FALSE]
# Last value of the first variable in each sequence is the target value
y <- tensor[11,1,]
bnn <- BNN(x, y, like, prior, init)
vi <- bayes_by_backprop(bnn, 100, 100)
vi_samples <- vi.get_samples(vi, n = 1000)
pp <- posterior_predictive(bnn, vi_samples)
## End(Not run)