Title: | Fit and Predict a Gaussian Process Model with (Time-Series) Binary Response |
---|---|
Description: | Allows the estimation and prediction for binary Gaussian process model. The mean function can be assumed to have time-series structure. The estimation methods for the unknown parameters are based on penalized quasi-likelihood/penalized quasi-partial likelihood and restricted maximum likelihood. The predicted probability and its confidence interval are computed by Metropolis-Hastings algorithm. More details can be seen in Sung et al (2017) <arXiv:1705.02511>. |
Authors: | Chih-Li Sung |
Maintainer: | Chih-Li Sung <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 0.2 |
Built: | 2024-11-07 06:41:18 UTC |
Source: | CRAN |
The function fits Gaussian process models with binary response. The models can also include time-series term if a time-series binary response is observed. The estimation methods are based on PQL/PQPL and REML (for mean function and correlation parameter, respectively).
binaryGP_fit(X, Y, R = 0, L = 0, corr = list(type = "exponential", power = 2), nugget = 1e-10, constantMean = FALSE, orthogonalGP = FALSE, converge.tol = 1e-10, maxit = 15, maxit.PQPL = 20, maxit.REML = 100, xtol_rel = 1e-10, standardize = FALSE, verbose = TRUE)
binaryGP_fit(X, Y, R = 0, L = 0, corr = list(type = "exponential", power = 2), nugget = 1e-10, constantMean = FALSE, orthogonalGP = FALSE, converge.tol = 1e-10, maxit = 15, maxit.PQPL = 20, maxit.REML = 100, xtol_rel = 1e-10, standardize = FALSE, verbose = TRUE)
X |
a design matrix with dimension |
Y |
a response matrix with dimension |
R |
a positive integer specifying the order of autoregression only if time-series is included. The default is 1. |
L |
a positive integer specifying the order of interaction between |
corr |
a list of parameters for the specifing the correlation to be used. Either exponential correlation function or Matern correlation function can be used. See |
nugget |
a positive value to use for the nugget. If |
constantMean |
logical. |
orthogonalGP |
logical. |
converge.tol |
convergence tolerance. It converges when relative difference with respect to |
maxit |
a positive integer specifying the maximum number of iterations for estimation to be performed before the estimates are convergent. The default is 15. |
maxit.PQPL |
a positive integer specifying the maximum number of iterations for PQL/PQPL estimation to be performed before the estimates are convergent. The default is 20. |
maxit.REML |
a positive integer specifying the maximum number of iterations in |
xtol_rel |
a postive value specifying the convergence tolerance for |
standardize |
logical. If |
verbose |
logical. If |
Consider the model
where and
is a Gaussian process with zero mean, variance
, and correlation function
with unknown correlation parameters
. The power exponential correlation function is used and defined by
where is given by
power
. The parameters in the mean functions including are estimated by PQL/PQPL, depending on whether the mean function has the time-series structure. The parameters in the Gaussian process including
and
are estimated by REML. If
orthogonalGP
is TRUE
, then orthogonal covariance function (Plumlee and Joseph, 2016) is employed. More details can be seen in Sung et al. (unpublished).
alpha_hat |
a vector of coefficient estimates for the linear relationship with X. |
varphi_hat |
a vector of autoregression coefficient estimates. |
gamma_hat |
a vector of the interaction effect estimates. |
theta_hat |
a vector of correlation parameters. |
sigma_hat |
a value of sigma estimate (standard deviation). |
nugget_hat |
if |
orthogonalGP |
|
X |
data |
Y |
data |
R |
order of autoregression. |
L |
order of interaction between X and previous Y. |
corr |
a list of parameters for the specifing the correlation to be used. |
Model.mat |
model matrix. |
eta_hat |
eta_hat. |
standardize |
|
mean.x |
mean of each column of |
scale.x |
|
Chih-Li Sung <[email protected]>
predict.binaryGP
for prediction of the binary Gaussian process, print.binaryGP
for the list of estimation results, and summary.binaryGP
for summary of significance results.
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector binaryGP.model <- binaryGP_fit(X = X, Y = Y) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results ##### with time-series, lag 1 ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector binaryGP.model <- binaryGP_fit(X = X, Y = Y) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results ##### with time-series, lag 1 ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results
The function computes the predicted response and its variance as well as its confidence interval.
## S3 method for class 'binaryGP' predict(object, xnew, conf.level = 0.95, sim.number = 101, ...)
## S3 method for class 'binaryGP' predict(object, xnew, conf.level = 0.95, sim.number = 101, ...)
object |
a class binaryGP object estimated by |
xnew |
a testing matrix with dimension |
conf.level |
a value from 0 to 1 specifying the level of confidence interval. The default is 0.95. |
sim.number |
a positive integer specifying the simulation number for Monte-Carlo method. The default is 101. |
... |
for compatibility with generic method |
mean |
a matrix with dimension |
var |
a matrix with dimension |
upper.bound |
a matrix with dimension |
lower.bound |
a matrix with dimension |
y_pred |
a matrix with dimension |
Chih-Li Sung <[email protected]>
binaryGP_fit
for estimation of the binary Gaussian process.
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) X.test <- matrix(runif(d * n.test), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector test.out <- test_function(X.test, 1) Y.test <- test.out$Y P.true <- test.out$P # fitting binaryGP.model <- binaryGP_fit(X = X, Y = Y) # prediction binaryGP.prediction <- predict(binaryGP.model, xnew = X.test) print(binaryGP.prediction$mean) print(binaryGP.prediction$var) print(binaryGP.prediction$upper.bound) print(binaryGP.prediction$lower.bound) ##### with time-series ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns test.out <- test_function(X.test, 10) Y.test <- test.out$Y P.true <- test.out$P # fitting binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) # prediction binaryGP.prediction <- predict(binaryGP.model, xnew = X.test) print(binaryGP.prediction$mean) print(binaryGP.prediction$var) print(binaryGP.prediction$upper.bound) print(binaryGP.prediction$lower.bound)
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) X.test <- matrix(runif(d * n.test), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector test.out <- test_function(X.test, 1) Y.test <- test.out$Y P.true <- test.out$P # fitting binaryGP.model <- binaryGP_fit(X = X, Y = Y) # prediction binaryGP.prediction <- predict(binaryGP.model, xnew = X.test) print(binaryGP.prediction$mean) print(binaryGP.prediction$var) print(binaryGP.prediction$upper.bound) print(binaryGP.prediction$lower.bound) ##### with time-series ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns test.out <- test_function(X.test, 10) Y.test <- test.out$Y P.true <- test.out$P # fitting binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) # prediction binaryGP.prediction <- predict(binaryGP.model, xnew = X.test) print(binaryGP.prediction$mean) print(binaryGP.prediction$var) print(binaryGP.prediction$upper.bound) print(binaryGP.prediction$lower.bound)
The function shows the estimation results by binaryGP_fit
.
## S3 method for class 'binaryGP' print(x, ...)
## S3 method for class 'binaryGP' print(x, ...)
x |
a class binaryGP object estimated by binaryGP_fit. |
... |
for compatibility with generic method |
a list of estimates by binaryGP_fit.
Chih-Li Sung <[email protected]>
binaryGP_fit
for estimation of the binary Gaussian process.
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector binaryGP.model <- binaryGP_fit(X = X, Y = Y) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results ##### with time-series, lag 1 ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector binaryGP.model <- binaryGP_fit(X = X, Y = Y) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results ##### with time-series, lag 1 ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results
The function summarizes estimation and significance results by binaryGP_fit
.
## S3 method for class 'binaryGP' summary(object, ...)
## S3 method for class 'binaryGP' summary(object, ...)
object |
a class binaryGP object estimated by |
... |
for compatibility with generic method |
A table including the estimates by binaryGP_fit
, and the correponding standard deviations, Z-values and p-values.
Chih-Li Sung <[email protected]>
binaryGP_fit
for estimation of the binary Gaussian process.
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector binaryGP.model <- binaryGP_fit(X = X, Y = Y) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results ##### with time-series, lag 1 ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results
library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector binaryGP.model <- binaryGP_fit(X = X, Y = Y) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results ##### with time-series, lag 1 ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results