---
title: Disciplined Convex Optimization in R
author: Anqi Fu, Balasubramanian Narasimhan and Stephen Boyd
bibliography: bibtex/cvxr_refs.bib
date: '`r Sys.Date()`'
output:
  html_document:
    fig_caption: yes
    theme: cerulean
    toc: yes
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{Disciplined Convex Optimization}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

## What is `CVXR`?

`CVXR` is an R package that provides an object-oriented modeling language for convex optimization, similar to `CVX`, `CVXPY`, `YALMIP`, and `Convex.jl`. It allows the user to formulate convex optimization problems in a natural mathematical syntax rather than the restrictive standard form required by most solvers. The user specifies an objective and set of constraints by combining constants, variables, and parameters using a library of functions with known mathematical properties. `CVXR` then applies signed [disciplined convex programming (DCP)](https://cvxr.rbind.io) to verify the problem's convexity. Once verified, the problem is converted into standard form using graph implementations and passed to a convex solver such as [OSQP](https://osqp.org), [ECOS](https://github.com/embotech/ecos), or [SCS](https://github.com/cvxgrp/scs).

## Where can I learn more?

The paper by [@cvxr2020](https://doi.org/10.18637/jss.v094.i14) is the main reference. Further documentation, along with a number of tutorial examples, is also available on the [CVXR website](https://cvxr.rbind.io).

Below we provide a simple example to get you started.

## A Simple Example

Consider a simple linear regression problem in which we wish to estimate a set of parameters using a least squares criterion. We generate some synthetic data where we know the model completely, that is,

$$
Y = X\beta + \epsilon
$$

where $Y$ is a $100\times 1$ vector, $X$ is a $100\times 10$ matrix, $\beta = [-4, \ldots, -1, 0, 1, \ldots, 5]$ is a $10\times 1$ vector, and $\epsilon \sim N(0, 1)$.

```{r}
set.seed(123)
n <- 100
p <- 10
beta <- -4:5   # beta is just -4 through 5
X <- matrix(rnorm(n * p), nrow = n)
colnames(X) <- paste0("beta_", beta)
Y <- X %*% beta + rnorm(n)
```

Given the data $X$ and $Y$, we can estimate the $\beta$ vector using the `lm` function in R, which fits a standard linear regression model.

```{r}
ls.model <- lm(Y ~ 0 + X)   # There is no intercept in our model above
m <- data.frame(ls.est = coef(ls.model))
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m)
```

These are the least-squares estimates and can be seen to be reasonably close to the original $\beta$ values -4 through 5.

## The `CVXR` formulation

The `CVXR` formulation states the above as an optimization problem:

$$
\begin{array}{ll}
\underset{\beta}{\mbox{minimize}} & \|Y - X\beta\|_2^2,
\end{array}
$$

which directly translates into a problem that `CVXR` can solve as shown in the steps below.

- Step 0. Load the `CVXR` library

```{r}
suppressWarnings(library(CVXR, warn.conflicts = FALSE))
```

- Step 1. Define the variable to be estimated

```{r}
betaHat <- Variable(p)
```

- Step 2. Define the objective to be optimized

```{r}
objective <- Minimize(sum((Y - X %*% betaHat)^2))
```

Notice how the objective is specified using functions such as `sum`, `%*%`, and `^` that are familiar to R users, despite the fact that `betaHat` is not an ordinary R expression but a `CVXR` expression.

- Step 3. Create a problem to solve

```{r}
problem <- Problem(objective)
```

- Step 4. Solve it!

```{r}
result <- solve(problem)
```
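Before extracting the solution, it can be useful to confirm that the problem passes the DCP check and that the solver terminated successfully. The chunk below is an optional, illustrative check that reuses the `problem` and `result` objects from Steps 3 and 4:

```{r, eval = FALSE}
## Optional sanity checks (illustrative only):
## is_dcp() returns TRUE when the problem follows the DCP composition rules,
## and result$status records how the solver terminated.
is_dcp(problem)
result$status    # should be "optimal" for this problem
```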
- Step 5. Extract the solution and objective value

```{r}
solution <- result$getValue(betaHat)
cat(sprintf("Objective value: %f\n", result$value))
```

We can indeed satisfy ourselves that the results we get match those from `lm`.

```{r}
m <- cbind(coef(ls.model), result$getValue(betaHat))
colnames(m) <- c("lm est.", "CVXR est.")
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m)
```

## Wait a minute! What have we gained?

On the surface, it appears that we have replaced one call to `lm` with at least five or six lines of new R code. On top of that, the code actually runs slower, so it is not clear what was really achieved.

Now suppose we know for a fact that the $\beta$s are nonnegative and we wish to take this fact into account. This is [nonnegative least squares regression](https://en.wikipedia.org/wiki/Non-negative_least_squares), and `lm` would no longer do the job. In `CVXR`, the modified problem merely requires the addition of a constraint to the problem definition.

```{r}
problem <- Problem(objective, constraints = list(betaHat >= 0))
result <- solve(problem)
m <- data.frame(CVXR.est = result$getValue(betaHat))
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m)
```

We can verify once again that these values are comparable to those obtained from another R package, say [nnls](https://CRAN.R-project.org/package=nnls).

```{r}
if (requireNamespace("nnls", quietly = TRUE)) {
    nnls.fit <- nnls::nnls(X, Y)$x
} else {
    nnls.fit <- rep(NA, p)
}
```

```{r}
m <- cbind(result$getValue(betaHat), nnls.fit)
colnames(m) <- c("CVXR est.", "nnls est.")
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m)
```

## Okay that was cool, but...

As you no doubt noticed, we have done nothing that other R packages could not do. So now suppose that we know, for some extraneous reason, that the sum of $\beta_2$ and $\beta_3$ is nonpositive but all other $\beta$s are nonnegative. This problem does not fit into any standard regression package. In `CVXR`, however, it is easily handled by adding a few constraints.

To express the fact that $\beta_2 + \beta_3$ is nonpositive, we construct a row matrix with zeros everywhere, except in positions 2 and 3 (for $\beta_2$ and $\beta_3$ respectively).

```{r}
A <- matrix(c(0, 1, 1, rep(0, 7)), nrow = 1)
colnames(A) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(A)
```

The sum constraint is nothing but

$$
A\beta \leq 0,
$$

which we express in R as

```{r}
constraint1 <- A %*% betaHat <= 0
```

_NOTE_: The above constraint can also be expressed simply as

```{r, eval = FALSE}
constraint1 <- betaHat[2] + betaHat[3] <= 0
```

but it is generally easier to work with matrices in `CVXR`.

For the nonnegativity of the rest of the variables, we construct a $10\times 10$ diagonal matrix $B$ with 1's along the diagonal everywhere except rows 2 and 3, and zeros everywhere else.

```{r}
B <- diag(c(1, 0, 0, rep(1, 7)))
colnames(B) <- rownames(B) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(B)
```

The nonnegativity constraint is

$$
B\beta \geq 0,
$$

which we express in R as

```{r}
constraint2 <- B %*% betaHat >= 0
```

Now we are ready to solve the problem just as before.

```{r}
problem <- Problem(objective, constraints = list(constraint1, constraint2))
result <- solve(problem)
```

And we can get the estimates of $\beta$.

```{r}
m <- data.frame(CVXR.soln = result$getValue(betaHat))
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m)
```
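As an optional sanity check (not required for the analysis), we can confirm numerically that the returned estimates respect both constraints up to solver tolerance. The sketch below simply reuses the `A`, `B`, `betaHat`, and `result` objects defined above with base R arithmetic:

```{r, eval = FALSE}
## Illustrative check of the two constraints on the fitted coefficients.
est <- result$getValue(betaHat)
A %*% est        # beta_2 + beta_3: should be (approximately) nonpositive
min(B %*% est)   # remaining coefficients: should be (approximately) nonnegative
```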
This demonstrates the chief advantage of `CVXR`: flexibility. Users can quickly modify and re-solve a problem, making our package ideal for prototyping new statistical methods. Its syntax is simple and mathematically intuitive. Furthermore, `CVXR` combines seamlessly with native R code as well as several popular packages, allowing it to be incorporated easily into a larger analytical framework. The user is free to construct statistical estimators that are solutions to a convex optimization problem for which there may be no closed-form solution or existing implementation. Such solutions can then be combined with resampling techniques like the bootstrap to estimate their variability.

## References