---
title: "An introduction to `FGLMtrunc`"
output: 
  rmarkdown::html_vignette:
    toc: yes
    toc_depth: 3
    number_sections: false
vignette: >
  %\VignetteIndexEntry{FGLMtrunc}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
hook_output <- knitr::knit_hooks$get("output")
# set a new output hook to truncate text output
knitr::knit_hooks$set(output = function(x, options) {
  if (!is.null(n <- options$out.lines)) {
    x <- xfun::split_lines(x)
    if (length(x) > n) {
        
      # truncate the output
      x <- c(head(x, n), "....\n")
    }
    x <- paste(x, collapse = "\n")
  }
  hook_output(x, options)
})
```

## Introduction
`FGLMtrunc` is a package that fits truncated Functional Generalized Linear Models as described in [Liu, Divani, and Petersen (2020)](https://doi.org/10.1016/j.csda.2022.107421). It implements methods for both functional linear and functional logistic regression models. The solution path is computed efficiently using an active set algorithm with warm starts. The optimal smoothing and truncation parameters ($\lambda_s, \lambda_t$) are chosen by the Bayesian information criterion (BIC).
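
As a rough orientation, a truncated functional generalized linear model on $[0,1]$ can be written as below. The notation is an assumption made for this sketch ($\alpha$ for the intercept, $g$ for the link function, $\delta$ for the truncation point) and may differ from the notation in the paper; $g$ is the identity link for `family="gaussian"` and the logit link for `family="binomial"`.

$$
g\{E(Y \mid X)\} = \alpha + \int_0^{\delta} X(t)\,\beta(t)\,dt, \qquad \beta(t) = 0 \ \text{ for } t > \delta .
$$

In words, the functional predictor $X(t)$ influences the response only up to time $\delta$, and both $\beta$ and $\delta$ are estimated from the data.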

To install `FGLMtrunc` directly from CRAN, run this command in the R console:
```{r, eval=F}
install.packages("FGLMtrunc")
```
To load the `FGLMtrunc` package, run:
```{r}
library(FGLMtrunc)
```
The function for fitting the model is `fglm_trunc`, which has several arguments to customize the fit. The required arguments are:

* `X.curves`: the matrix of functional predictors.

* `Y`: the response vector.

* Either `nbasis` or `knots`, which defines the interior knots of the B-spline basis.

See `?fglm_trunc` for more details on the function arguments. The examples below demonstrate other commonly used arguments.

## Functional Linear Regression (`family="gaussian"`)
The functional linear regression model is the default in `fglm_trunc`, corresponding to the argument `family="gaussian"`.
For illustration, we use the dataset `LinearExample`, which we created beforehand following Case I in the simulation studies section of Liu et al. (2020). This dataset contains $n=200$ observations, and the functional predictors are observed at $p=101$ timepoints on the interval $[0,1]$. The true truncation point is $\delta = 0.54$.

```{r, fig.align='center', fig.height=4, fig.width=4}
data(LinearExample)
Y_linear = LinearExample$Y
Xcurves_linear = LinearExample$X.curves
timeGrid = seq(0, 1, length.out = 101)
plot(timeGrid, LinearExample$beta.true, type = 'l', 
     main = 'True coefficient function', xlab = "t", ylab=expression(beta(t)))
```
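
To make the shape of such data concrete, the sketch below simulates a toy dataset with the same dimensions using base R only. It is not the code that generated `LinearExample`: the coefficient function `beta_toy`, the cosine-based curves `X_toy`, and the noise level are all hypothetical choices made for illustration.

```r
set.seed(1)
n <- 200                      # number of observations
p <- 101                      # number of timepoints
grid <- seq(0, 1, length.out = p)
delta <- 0.54                 # truncation point used for the toy example

# Hypothetical coefficient function: smooth before delta, exactly zero after it
beta_toy <- ifelse(grid < delta, sin(pi * grid / delta), 0)

# Hypothetical functional predictors: random combinations of three smooth curves
basis  <- cbind(1, cos(pi * grid), cos(2 * pi * grid))  # p x 3
scores <- matrix(rnorm(n * 3), nrow = n)                # n x 3
X_toy  <- scores %*% t(basis)                           # n x p, one curve per row

# Riemann-sum approximation of the integral of X(t) * beta(t), plus Gaussian noise
dt <- grid[2] - grid[1]
Y_toy <- as.vector(X_toy %*% beta_toy) * dt + rnorm(n, sd = 0.1)

dim(X_toy)     # rows are observations, columns are timepoints
length(Y_toy)  # one response per observation
```

Each row of `X_toy` is one observed curve on the grid, which mirrors the layout described above for `LinearExample$X.curves`.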

### Fitting `FGLMtrunc` model for linear regression

We fit the model using 50 B-spline basis functions with the default `degree=3` for cubic splines. Since the argument `grid` is not specified, an equally spaced sequence of length $p=101$ on the interval $[0,1]$ (including the boundaries) is used automatically.
```{r}
fit = fglm_trunc(Y_linear, Xcurves_linear, nbasis = 50)
```
`fglm_trunc` also supports parallel computing to speed up the tuning of the regularization parameters. A parallel backend must be registered beforehand. Here is an example using the `doMC` backend (the code is not run here because `doMC` is not available on Windows):
```{r, eval=F}
library(doMC)
registerDoMC(cores = 2)
fit = fglm_trunc(Y_linear, Xcurves_linear, nbasis = 50, parallel = TRUE)
```

One can also manually provide the `grid` or `knots` sequence (or both). If `knots` is specified, `nbasis` is ignored.
```{r, eval=F}
k <- 50 - 3 - 1 # Number of interior knots = nbasis - degree - 1
knots_n <- seq(0, 1, length.out = k+2)[-c(1, k+2)] # Remove boundary knots
fit2 = fglm_trunc(Y_linear, Xcurves_linear, grid = timeGrid, knots = knots_n)
```

The fitted models `fit` and `fit2` give the same results.
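
The relation between the number of interior knots and the number of basis functions can be checked with the `splines` package that ships with R. This is only a sketch of the counting rule; whether `fglm_trunc` constructs its basis internally in exactly this way is an assumption.

```r
library(splines)

degree <- 3
nbasis <- 50
k <- nbasis - degree - 1                                  # number of interior knots
knots_n <- seq(0, 1, length.out = k + 2)[-c(1, k + 2)]    # drop the boundary knots
grid <- seq(0, 1, length.out = 101)

# B-spline design matrix evaluated on the grid (intercept = TRUE keeps all basis functions)
B <- bs(grid, knots = knots_n, degree = degree,
        intercept = TRUE, Boundary.knots = c(0, 1))

length(knots_n)  # number of interior knots
ncol(B)          # number of basis functions = interior knots + degree + 1
```

With 46 equally spaced interior knots and cubic splines, the design matrix has 50 columns, matching `nbasis = 50`.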

`fit` is an object of class `FGLMtrunc` that contains the relevant estimation results. See `?fglm_trunc` for more details on the function outputs. The `print` function displays the function call and the estimated truncation point:
```{r}
print(fit)
``` 

### Plotting with fitted `FGLMtrunc` model
We can visualize the estimates of the functional parameter $\beta$ directly with `plot`:
```{r, fig.align='center', fig.height=4, fig.width=4}
plot(fit)
```

The plot shows both the smooth and the truncated estimates of $\beta$. We can set the argument `include_smooth=FALSE` to show only the truncated estimate.

### Predicting with fitted `FGLMtrunc` model
The `predict` method for `FGLMtrunc` objects works similarly to `predict.glm`. The default is `type = "link"`. For linear regression, both `type = "link"` and `type = "response"` return fitted values. The argument `newX.curves` is required for these predictions.
```{r}
predict(fit, newX.curves = Xcurves_linear[1:5,])
```

To get the truncated estimate of $\beta$, we can use either `fit$beta.truncated` or the `predict` function:
```{r out.lines = 12}
predict(fit, type = "coefficients")
```

## Functional Logistic Regression (`family="binomial"`)

For logistic regression, we use the dataset `LogisticExample`, which is similar to `LinearExample`, except that the response $Y$ was generated as a Bernoulli random variable.

```{r}
data(LogisticExample)
Y_logistic = LogisticExample$Y
Xcurves_logistic = LogisticExample$X.curves
```

### Fitting `FGLMtrunc` model for logistic regression

As before, we fit the model using 50 B-spline basis functions with the default choice of cubic splines. We need to set `family="binomial"` for logistic regression. Printing and plotting work the same way as before.
```{r}
fit4 = fglm_trunc(Y_logistic, Xcurves_logistic, family="binomial", nbasis = 50)
```

```{r, fig.align='center', fig.height=4, fig.width=4}
print(fit4)
plot(fit4)
```

### Predicting with fitted `FGLMtrunc` model for logistic regression
For functional logistic regression, each `type` option returns a different prediction:

* `type="link"` gives the linear predictors, which are log-odds.

* `type="response"` gives the predicted probabilities.

* `type="coefficients"` gives the truncated estimate of the functional parameter $\beta$, as before.

```{r, fig.align='center', fig.height=4, fig.width=4}
logistic_link_pred = predict(fit4, newX.curves = Xcurves_logistic, type="link")
plot(logistic_link_pred, ylab="log-odds")
```

```{r, fig.align='center', fig.height=4, fig.width=4}
logistic_response_pred = predict(fit4, newX.curves = Xcurves_logistic, type="response")
plot(logistic_response_pred, ylab="predicted probabilities")
```
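
The two prediction scales are linked by the logistic function: applying the inverse logit to the log-odds gives the probabilities. The base R sketch below illustrates this with a few hypothetical log-odds values (`eta` is made up for the example and is not output from `fit4`).

```r
eta <- c(-2, -0.5, 0, 0.5, 2)   # hypothetical linear predictors (log-odds)

prob <- plogis(eta)             # inverse logit: log-odds -> probabilities

# plogis() agrees with the explicit formula 1 / (1 + exp(-eta))
all.equal(prob, 1 / (1 + exp(-eta)))

# qlogis() is the logit, so it maps the probabilities back to log-odds
all.equal(qlogis(prob), eta)
```

Under the logit link, one would therefore expect `plogis(logistic_link_pred)` to match `logistic_response_pred`.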

## Functional Linear Regression with scalar predictors
### Fitting `FGLMtrunc` model
`FGLMtrunc` allows using scalar predictors together with functional predictors. First, we randomly generate observations for scalar predictors: 

```{r}
scalar_coef <- c(1, -1, 0.5) # True coefficients for scalar predictors
set.seed(1234)
# Randomly generated observations for scalar predictors:
# two Gaussian columns and one binary column coded as 0 and 1
S <- cbind(matrix(rnorm(400), nrow=200), rbinom(200, 1, 0.5))
colnames(S) <- c("s1", "s2", "s3")
```

Next, we modify the response vector from `LinearExample` so that it takes into account scalar predictors:

```{r}
Y_scalar <- Y_linear + (S %*% scalar_coef)
```
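
The matrix product `S %*% scalar_coef` adds, for each observation, the weighted sum of its scalar predictors. The self-contained sketch below regenerates `S` with the same seed and checks the product against the column-by-column computation; the names `contrib` and `manual` are introduced only for this illustration.

```r
set.seed(1234)
scalar_coef <- c(1, -1, 0.5)
S <- cbind(matrix(rnorm(400), nrow = 200), rbinom(200, 1, 0.5))

# Matrix product: one scalar contribution per observation, returned as a 200 x 1 matrix
contrib <- S %*% scalar_coef

# The same quantity written out column by column
manual <- 1 * S[, 1] - 1 * S[, 2] + 0.5 * S[, 3]

dim(contrib)
all.equal(as.vector(contrib), manual)
```

Because `contrib` is a one-column matrix, adding it to a response vector of length 200 also produces a one-column matrix; `as.vector()` converts it back to a plain vector if needed.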

Then we fit the `FGLMtrunc` model with the matrix of scalar predictors `S`:

```{r}
fit_scalar = fglm_trunc(X.curves=Xcurves_linear, Y=Y_scalar, S=S, nbasis = 50)
fit_scalar
```

Fitted coefficients for scalar predictors are close to the true values.

### Predicting with scalar predictors

To make predictions from a model fitted with scalar predictors, we need to specify the argument `newS`:

```{r}
predict(fit_scalar, newX.curves = Xcurves_linear[1:5,], newS=S[1:5,])
```