Package 'EzGP' reference manual

Title:	Easy-to-Interpret Gaussian Process Models for Computer Experiments
Description:	Fits easy-to-interpret Gaussian process models to datasets and predicts responses for new inputs. The input variables can be quantitative, qualitative/categorical, or mixed. The output variable is a scalar (quantitative). The optimization method for the likelihood function can be chosen by the user (see the documentation of EzGP_fit()). The modeling method is published in "EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors" by Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (2022) <doi:10.1137/19M1288462>.
Authors:	Jiayi Li [cre, aut], Qian Xiao [aut], Abhyuday Mandal [aut], C. Devon Lin [aut], Xinwei Deng [aut]
Maintainer:	Jiayi Li <jiayili0123@outlook.com>
License:	GPL-2
Version:	0.1.0
Built:	2025-03-24 06:41:35 UTC
Source:	CRAN

The Function for Constructing the Covariance Matrix in `EzGP` Package

Description

Builds the covariance matrix for the given dataset according to different models.

Usage

cov_m(X, p, q, m, n, parv, tau = 0, models = 0)

Arguments

`X`	Matrix or data frame containing the inputs of training data. Each row represents the input setting of a data point and the columns are values of quantitative variables and qualitative variables.
`p`	Number of quantitative factors in the given dataset `X`.
`q`	Number of qualitative factors in the given dataset `X`.
`m`	A vector containing numbers of levels in qualitative factors.
`n`	Number of training data points.
`parv`	Parameters in the EzGP/EEzGP model.
`tau`	Nugget if needed. The default nugget is 0, otherwise it has to be a non-negative real value.
`models`	Model indicator that indicates which model the covariance matrix is built for. 0 for EzGP model, 1 for EEzGP model. The default setting is 0.

Details

This function is called by EzGP_fit, EzGP_predict, EEzGP_fit, EEzGP_predict, LEzGP_fit, and LLF_gradients.

Value

The covariance matrix for the given dataset.
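
To make the role of the nugget `tau` concrete, the following base-R sketch builds a covariance matrix for quantitative inputs only, using a plain Gaussian kernel. It is not the EzGP or EEzGP covariance structure, which also accounts for qualitative factors (see the reference). The function name `gauss_cov` and the parameters `sigma2` and `theta` are hypothetical, and adding `tau` to the diagonal is an assumption about how a nugget typically enters.

```r
# Generic Gaussian-kernel covariance with a nugget (illustration only;
# NOT the covariance structure built by cov_m).
gauss_cov <- function(X, sigma2 = 1, theta = rep(1, ncol(X)), tau = 0) {
  X <- as.matrix(X)
  n <- nrow(X)
  # Scale each column by sqrt(theta) so that the squared Euclidean distance
  # equals the theta-weighted squared distance between rows.
  Xs <- sweep(X, 2, sqrt(theta), `*`)
  D2 <- as.matrix(dist(Xs))^2
  K <- sigma2 * exp(-D2)
  # Assumed nugget handling: add tau to the diagonal.
  K + diag(tau, n)
}

set.seed(1)
X_toy <- matrix(runif(10), nrow = 5, ncol = 2)
K <- gauss_cov(X_toy, sigma2 = 2, theta = c(1, 0.5), tau = 1e-6)
dim(K)
isSymmetric(K)
```

In this sketch every diagonal entry equals `sigma2 + tau`, which shows why a positive nugget can help when the matrix is close to singular.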

Note

This function is used internally by other functions in this package and is not exported when the EzGP package is loaded.

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

# see the examples in the documentation of the function EzGP_fit.

The Fitting Function of `EEzGP` Model

Description

Fits an Efficient Easy-to-Interpret Gaussian process (EEzGP) model to a dataset as described in reference 1. The input variables are mixed (with both quantitative and qualitative inputs). The output variable is quantitative and scalar.

Usage

EEzGP_fit(
  X,
  Y,
  p,
  q,
  m,
  tau = 0,
  lb = "T",
  ub = "T",
  x0 = "T",
  xtol_rel = 1e-05,
  maxeval = 100,
  algorithm = "NLOPT_LD_LBFGS"
)

Arguments

`X`	Matrix or data frame containing the inputs of training data. Each row represents the input setting of a data point and the columns are values of quantitative variables and qualitative variables.
`Y`	Vector containing the outputs of training data points.
`p`	Number of quantitative factors in the given dataset `X`.
`q`	Number of qualitative factors in the given dataset `X`.
`m`	A vector containing numbers of levels in qualitative factors.
`tau`	Nugget if needed. The default nugget is 0, otherwise it has to be a non-negative real value.
`lb`	Vector of lower bounds for the parameter estimates. Use "T" for the default (a vector whose length is the number of parameters and whose elements are all 0.1); otherwise provide a vector whose length equals the number of parameters.
`ub`	Vector of upper bounds for the parameter estimates. Use "T" for the default (a vector whose length is the number of parameters, whose first `q+1` elements are 100 and whose remaining `number of parameters - q - 1` elements are 10); otherwise provide a vector whose length equals the number of parameters.
`x0`	Vector of starting values for the optimization. Use "T" for the default (the vector `(lb + ub)/2`); otherwise provide a vector whose length equals the number of parameters.
`xtol_rel`	Stopping criterion: the optimization stops when the relative change in the parameters falls below this tolerance.
`maxeval`	Termination condition: the maximum number of function evaluations.
`algorithm`	Optimization algorithm. See the NLopt Algorithms documentation for other available algorithms.
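
The default bounds and starting values described for `lb`, `ub`, and `x0` can be reproduced in a few lines of base R. The sketch below assumes a hypothetical parameter count `n_par`; the actual number of parameters depends on the model and on `p`, `q`, and `m`, and is not stated in this manual.

```r
q <- 3
n_par <- 16  # hypothetical number of model parameters, for illustration only

# Default lower bounds: all 0.1
lb <- rep(0.1, n_par)
# Default upper bounds: first q + 1 elements are 100, the rest are 10
ub <- c(rep(100, q + 1), rep(10, n_par - q - 1))
# Default starting values: midpoint of the bounds
x0 <- (lb + ub) / 2

x0[1:(q + 1)]      # each element is 50.05
x0[(q + 2):n_par]  # each element is 5.05
```

Custom vectors built the same way, with the correct length for the fitted model, could then be passed through the `lb`, `ub`, and `x0` arguments.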

Value

A model of class "EzGP model", which is a list of the following items:

param A list containing the estimated parameters
data A list containing the fitted dataset and the information for fitting

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

# Example with 3 quantitative and 3 qualitative variables (dataset included in the package):
#     Fit an EEzGP model (with default settings), and then perform the prediction.
p = 3
q = 3
m=c(3,3,3)
tau = 0
X = EzGP_data[1:25, 1:(p+q)]
Y = EzGP_data[1:25, p+q+1]
X_new = EzGP_data[26:30, 1:(p+q)]
# EEzGP Model and Prediction
model <- EEzGP_fit(X, Y, p, q, m)
pred <- EEzGP_predict(X_new, model, MSE_on = 1)
result <- LLF_gradients(X, Y, p, q, m, model$param, tau = 0, models = 1)
# Results showing
model
pred
result

The Prediction Function of `EEzGP` Model

Description

Predicts the output of the EEzGP model fitted by EEzGP_fit.

Usage

EEzGP_predict(X_new, model, MSE_on = 0)

Arguments

`X_new`	Matrix or vector containing the input(s) where the predictions are to be made. Each row is an input vector.
`model`	The EEzGP model fitted by `EEzGP_fit`.
`MSE_on`	A scalar indicating whether the uncertainty (i.e., mean squared error `MSE`) is calculated. Set to a non-zero value to calculate `MSE`.

Value

A prediction list containing the following components:

Y_hat A vector containing the prediction values
MSE A vector containing the prediction uncertainty (i.e., the covariance or covariance matrix for the output(s) at prediction location(s))
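
As a rough illustration of how `Y_hat` and `MSE` might be combined, the sketch below computes approximate pointwise intervals. The helper `pred_interval` is hypothetical and not part of the package; it assumes the prediction error is approximately normal, and it takes the diagonal when `MSE` is a square matrix. The toy list stands in for the output of `EEzGP_predict(X_new, model, MSE_on = 1)`.

```r
# Approximate pointwise intervals from a list with components Y_hat and MSE.
# Hypothetical helper; normality of the prediction error is an assumption.
pred_interval <- function(pred, level = 0.95) {
  mse <- pred$MSE
  # If MSE is a square covariance matrix, use its diagonal as variances.
  if (is.matrix(mse) && nrow(mse) == ncol(mse)) mse <- diag(mse)
  se <- sqrt(pmax(as.numeric(mse), 0))  # guard against tiny negative values
  z <- qnorm(1 - (1 - level) / 2)
  fit <- as.numeric(pred$Y_hat)
  data.frame(fit = fit, lower = fit - z * se, upper = fit + z * se)
}

# Toy stand-in for a prediction list (made-up numbers)
pred_toy <- list(Y_hat = c(1.0, 2.5, -0.3), MSE = c(0.04, 0.09, 0.01))
pred_interval(pred_toy)
```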

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

# This function is used in the same way as EzGP_predict.
# See the examples in the documentation of the function EEzGP_fit.

Dataset for the example in function 'EzGP_fit'

Description

Data are sampled from the modified math function based on Example 4.1 in the paper listed in the references. There are 3 quantitative factors and 3 qualitative factors, each having 3 levels. The dataset contains 1296 data points. For simplicity of illustration, we take the first 81 rows as training data points and the last 1215 rows as testing data points.

Usage

data(EzGP_data)

Format

A named list containing training data and testing data:

"x1": 1st quantitative factor
"x2": 2nd quantitative factor
"x3": 3rd quantitative factor
"z1": 1st qualitative factor, which has 3 levels
"z2": 2nd qualitative factor, which has 3 levels
"z3": 3rd qualitative factor, which has 3 levels
"y": Response vector
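
The column layout above (quantitative factors first, then qualitative factors, then the response) is what the indexing `1:(p+q)` and `p+q+1` in the examples relies on. The sketch below mimics that layout with a small made-up data frame; the values and the integer level coding `1:3` are assumptions for illustration and are not taken from `EzGP_data`.

```r
set.seed(42)
p <- 3  # number of quantitative factors
q <- 3  # number of qualitative factors

# Made-up data with the same column names as the documented format
toy <- data.frame(
  x1 = runif(6), x2 = runif(6), x3 = runif(6),
  z1 = sample(1:3, 6, replace = TRUE),
  z2 = sample(1:3, 6, replace = TRUE),
  z3 = sample(1:3, 6, replace = TRUE),
  y  = rnorm(6)
)

X_quant <- toy[, 1:p]              # columns x1..x3
X_qual  <- toy[, (p + 1):(p + q)]  # columns z1..z3
Y       <- toy[, p + q + 1]        # response column y

# Number of distinct levels observed in each qualitative column; in a small
# sample this can be smaller than the true number of levels supplied as m.
m_obs <- sapply(X_qual, function(z) length(unique(z)))
m_obs
```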

Source

The dataset can be generated with the code at the end of this description file.

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

data(EzGP_data)
#Number of quantitative factors
p = 3
#Number of qualitative factors
q = 3
#Vector containing numbers of levels in qualitative factors
m=c(3,3,3)
# Nugget
tau = 0

X = EzGP_data[1:81, 1:(p+q)]
Y = EzGP_data[1:81, p+q+1]
X_new = EzGP_data[82:1296, 1:(p+q)]

The Fitting Function of `EzGP` Model

Description

Fits an Easy-to-Interpret Gaussian process (EzGP) model to a dataset as described in reference 1. The input variables are mixed (with both quantitative and qualitative inputs). The output variable is quantitative and scalar.

Usage

EzGP_fit(
  X,
  Y,
  p,
  q,
  m,
  tau = 0,
  lb = "T",
  ub = "T",
  x0 = "T",
  xtol_rel = 1e-05,
  maxeval = 100,
  algorithm = "NLOPT_LD_LBFGS"
)

Arguments

`X`	Matrix or data frame containing the inputs of training data. Each row represents the input setting of a data point and the columns are values of quantitative variables and qualitative variables.
`Y`	Vector containing the outputs of training data points.
`p`	Number of quantitative factors in the given dataset `X`.
`q`	Number of qualitative factors in the given dataset `X`.
`m`	A vector containing numbers of levels in the qualitative factors.
`tau`	Nugget if needed. The default nugget is 0, otherwise it has to be a non-negative real value.
`lb`	Vector of lower bounds for the parameter estimates. Use "T" for the default (a vector whose length is the number of parameters and whose elements are all 0.1); otherwise provide a vector whose length equals the number of parameters.
`ub`	Vector of upper bounds for the parameter estimates. Use "T" for the default (a vector whose length is the number of parameters, whose first `q+1` elements are 100 and whose remaining `number of parameters - q - 1` elements are 10); otherwise provide a vector whose length equals the number of parameters.
`x0`	Vector of starting values for the optimization. Use "T" for the default (the vector `(lb + ub)/2`); otherwise provide a vector whose length equals the number of parameters.
`xtol_rel`	Stopping criterion: the optimization stops when the relative change in the parameters falls below this tolerance.
`maxeval`	Termination condition: the maximum number of function evaluations.
`algorithm`	Optimization algorithm. See the NLopt Algorithms documentation for other available algorithms.
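
The optimizer-related arguments can be overridden in the call itself. The sketch below uses only the arguments documented above; the specific values (a small nugget, a tighter tolerance, and a larger evaluation budget) are arbitrary choices for illustration, not recommendations from the package authors. The call is guarded so that the code does nothing when the EzGP package is not installed.

```r
# Non-default settings, restricted to arguments documented for EzGP_fit.
# The values are illustrative assumptions.
fit_args <- list(tau = 1e-6, xtol_rel = 1e-6, maxeval = 200)

if (requireNamespace("EzGP", quietly = TRUE)) {
  library(EzGP)
  data(EzGP_data)
  p <- 3
  q <- 3
  m <- c(3, 3, 3)
  X <- EzGP_data[1:15, 1:(p + q)]
  Y <- EzGP_data[1:15, p + q + 1]
  # Pass the data positionally and the settings by name
  model <- do.call(EzGP_fit, c(list(X, Y, p, q, m), fit_args))
  print(model$param)
}
```

A small positive nugget is a common way to improve the numerical conditioning of a Gaussian process covariance matrix; whether it is needed depends on the dataset.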

Value

A model of class "EzGP model", which is a list of the following items:

param A list containing the estimated parameters
data A list containing the dataset and the information for fitting

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

# Example with 3 quantitative and 3 qualitative variables (dataset included in the package):
#     Fit an EzGP model (with default settings), and then perform the prediction.
#     This example may run for a while.
p = 3
q = 3
m=c(3,3,3)
tau = 0
X = EzGP_data[1:15, 1:(p+q)]
Y = EzGP_data[1:15, p+q+1]
X_new = EzGP_data[16:20, 1:(p+q)]
# EzGP Model and Prediction
model <- EzGP_fit(X, Y, p, q, m)
pred <- EzGP_predict(X_new, model, MSE_on = 1)
result <- LLF_gradients(X, Y, p, q, m, model$param)
# Results showing
model
pred
result

The Prediction Function of `EzGP` Model

Description

Predicts the output of the EzGP model fitted by EzGP_fit.

Usage

EzGP_predict(X_new, model, MSE_on = 0)

Arguments

`X_new`	Matrix or vector containing the input(s) where the predictions are to be made. Each row is an input vector.
`model`	The EzGP model fitted by `EzGP_fit`.
`MSE_on`	A scalar indicating whether the uncertainty (i.e., mean squared error `MSE`) is calculated. Set to a non-zero value to calculate `MSE`.

Value

A prediction list containing the following components:

Y_hat A vector containing the prediction values
MSE A vector containing the prediction uncertainty (i.e., the covariance or covariance matrix for the output(s) at prediction location(s))
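
Because the packaged datasets include the response for the held-out rows, the predictions in `Y_hat` can be compared with the true values. The sketch below defines a hypothetical `rmse` helper (not part of the package) and applies it to made-up numbers.

```r
# Root mean squared error between true and predicted responses
# (hypothetical helper, not part of the EzGP package).
rmse <- function(y_true, y_hat) {
  sqrt(mean((as.numeric(y_true) - as.numeric(y_hat))^2))
}

# Made-up values: only the third prediction is off, by 2
y_true <- c(1.0, 2.0, 3.0)
y_hat  <- c(1.0, 2.0, 5.0)
rmse(y_true, y_hat)  # sqrt(4/3), about 1.1547

# With the package example one might write, assuming the objects from the
# EzGP_fit example exist:
#   Y_new <- EzGP_data[16:20, p + q + 1]
#   rmse(Y_new, pred$Y_hat)
```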

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

# see the examples in the documentation of the function EzGP_fit.

Dataset for the example in function 'LEzGP_fit'

Description

Data are sampled from the modified math function based on Example 4.2 and Example 4.3 in the paper listed in the references. There are 9 quantitative factors and 9 qualitative factors, each having 3 levels. The dataset contains 8250 data points. For simplicity of illustration, we take the first 8150 rows as training data points and the last 100 rows as testing data points.

Usage

data(LEzGP_data)

Format

A named list containing training data and testing data:

"x1-x9": 1st quantitative factor to the 9th quantitative factor
"z1-z9": 1st qualitative factor to the 9th qualitative factor, which all have 3 levels
"ry": Response vector

Source

The dataset can be generated with the code at the end of this description file.

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

data(LEzGP_data)
#Number of quantitative factors
p = 9
#Number of qualitative factors
q = 9
#Vector containing numbers of levels in qualitative factors
m=rep(3,9)
# Nugget
tau = 0

X = LEzGP_data[1:8150, 1:(p+q)]
Y = LEzGP_data[1:8150, p+q+1]
X_new = LEzGP_data[8151:8250, 1:(p+q)]

The Fitting Function of `LEzGP` Model

Description

Fits a Localized Easy-to-Interpret Gaussian process (LEzGP) model to a dataset as described in reference 1. The input variables are mixed (with both quantitative and qualitative inputs). The output variable is quantitative and scalar.

Usage

LEzGP_fit(
  X,
  Y,
  p,
  q,
  m,
  tar_z,
  ns,
  models = 1,
  tau = 0,
  lb = "T",
  ub = "T",
  x0 = "T",
  xtol_rel = 1e-05,
  maxeval = 100,
  algorithm = "NLOPT_LD_LBFGS"
)

Arguments

`X`	Matrix or data frame containing the inputs of training data. Each row represents the input setting of a data point and the columns are values of quantitative variables and qualitative variables.
`Y`	Vector containing the outputs of training data points.
`p`	Number of quantitative factors in the given dataset `X`.
`q`	Number of qualitative factors in the given dataset `X`.
`m`	A vector containing numbers of levels in qualitative factors.
`tar_z`	A vector containing the qualitative part of the chosen target input (described in `reference 1`).
`ns`	The chosen tuning parameter (described in `reference 1`).
`models`	The model for fitting the selected proper subset of the dataset `X`. 0 for EzGP model, 1 for EEzGP model.
`tau`	Nugget if needed. The default nugget is 0, otherwise it has to be a non-negative real value.
`lb`	Vector of lower bounds for the parameter estimates. Use "T" for the default (a vector whose length is the number of parameters and whose elements are all 0.1); otherwise provide a vector whose length equals the number of parameters.
`ub`	Vector of upper bounds for the parameter estimates. Use "T" for the default (a vector whose length is the number of parameters, whose first `q+1` elements are 100 and whose remaining `number of parameters - q - 1` elements are 10); otherwise provide a vector whose length equals the number of parameters.
`x0`	Vector of starting values for the optimization. Use "T" for the default (the vector `(lb + ub)/2`); otherwise provide a vector whose length equals the number of parameters.
`xtol_rel`	Stopping criterion: the optimization stops when the relative change in the parameters falls below this tolerance.
`maxeval`	Termination condition: the maximum number of function evaluations.
`algorithm`	Optimization algorithm. See the NLopt Algorithms documentation for other available algorithms.

Value

A model of class "LEzGP model", which is a list of the following items:

param A list containing the estimated parameters
data A list containing the fitted dataset and the information for fitting

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

# Example with 9 quantitative and 9 qualitative variables (dataset included in the package):
#     Fit an LEzGP model based on the EEzGP/EzGP model (with default settings), and then
#     perform the prediction.
p = 9
q = 9
m=rep(3,9)
tau = 0
X = LEzGP_data[1:60, 1:(p+q)]
Y = LEzGP_data[1:60, p+q+1]
X_new = LEzGP_data[61:70, 1:(p+q)]
tar_z = X_new[1, (p+1):(p+q)]
ns = 7
# LEzGP Model Based on EEzGP Model
model <- LEzGP_fit(X, Y, p, q, m, tar_z, ns)
y_hat <- EEzGP_predict(X_new, model)
# Results showing
model
y_hat

The Log-likelihood Function and The Analytical Gradients in `EzGP` Package

Description

Calculates the log-likelihood function value and the analytical gradients as described in reference 1.

Usage

LLF_gradients(X, Y, p, q, m, parv, tau = 0, models = 0)

Arguments

`X`	Matrix or data frame containing the inputs of training data. Each row represents the input setting of a data point and the columns are values of quantitative variables and qualitative variables.
`Y`	Vector containing the outputs of training data points.
`p`	Number of quantitative factors in the given dataset `X`.
`q`	Number of qualitative factors in the given dataset `X`.
`m`	A vector containing numbers of levels in qualitative factors.
`parv`	Parameters in the EzGP/EEzGP model.
`tau`	Nugget if needed. The default nugget is 0, otherwise it has to be a non-negative real value.
`models`	Model indicator that indicates which model the likelihoods and analytical gradients are applied to. 0 for EzGP model, 1 for EEzGP model.

Value

A list of the following items:

objective The log-likelihood function value.
gradient The analytical gradients.
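
A list with `objective` and `gradient` components is the shape that gradient-based optimizers typically expect. One way to sanity-check an analytical gradient of this form is to compare it with central finite differences. The sketch below does this for a toy objective; `toy_obj` and `num_grad` are hypothetical helpers, and the toy function is not the EzGP likelihood.

```r
# Toy objective returning the same list shape as LLF_gradients:
# f(a, b) = (a - 1)^2 + (b - 2)^2 + a * b
toy_obj <- function(par) {
  list(
    objective = sum((par - c(1, 2))^2) + prod(par),
    # df/da = 2(a - 1) + b, df/db = 2(b - 2) + a
    gradient = 2 * (par - c(1, 2)) + rev(par)
  )
}

# Central finite-difference approximation of the gradient
num_grad <- function(f, par, h = 1e-6) {
  sapply(seq_along(par), function(i) {
    e <- replace(numeric(length(par)), i, h)
    (f(par + e)$objective - f(par - e)$objective) / (2 * h)
  })
}

par0 <- c(0.5, 1.5)
# Largest absolute difference between numerical and analytical gradients
max(abs(num_grad(toy_obj, par0) - toy_obj(par0)$gradient))
```

The same check could in principle be applied to `LLF_gradients` by wrapping it as a function of the parameters alone, assuming `parv` accepts a plain numeric vector; that assumption is not confirmed by this manual.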

References

"EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors", Qian Xiao, Abhyuday Mandal, C. Devon Lin, and Xinwei Deng (doi:10.1137/19M1288462)

Examples

# see the examples in the documentation of the function EzGP_fit.
