--- title: "Functional Programming with cppally" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Functional Programming with cppally} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` This vignette is mainly a gallery of examples, serving to provide intuition behind the usage of cppally functionals, as well as showcasing their utility. Let's load cppally in R and include cppally in our cpp or header file to get started ```{r setup} library(cppally) ``` ```cpp #include using namespace cppally; ``` ```{r, include=FALSE} # Helpers to compile all examples in debug mode cpp_source <- function(..., code, debug = TRUE, env = parent.frame()){ preamble <- c("#include ", "using namespace cppally;") code <- paste(c(preamble, code), collapse = "\n") cppally::cpp_source(debug = debug, env = env, code = code, ...) } # Helpers to source and display C++/R code chunk_impl <- function(x, language){ paste0("```", language, "\n", x, "\n```\n") } as_code_chunk <- function(x, language){ cat(chunk_impl(x, language)) } as_cpp_chunk <- function(x){ as_code_chunk(x, "cpp") } ``` ```{r, include=FALSE} # Compile necessary examples in one-go # as it's faster when building the vignette examples <- c( cpp_sum = ' [[cppally::register]] r_dbl cpp_sum(r_vector x){ return x.reduce([](auto a, auto b){ return a + b; }); } ', cpp_sum2 = ' [[cppally::register]] r_dbl cpp_sum2(r_vector x){ return x.reduce(std::plus<>{}); } ', cpp_gcd = ' [[cppally::register]] r_int cpp_gcd(r_vector x){ return x.reduce([](auto acc, auto curr){ auto res = cppally::gcd(acc, curr); // cppally has its own NA-aware gcd if ( (res == 1).is_true() ){ return done(res); } else { return keep(res); } }); } ', cpp_cumsum = ' [[cppally::register]] r_vector cpp_cumsum(r_vector x){ return x.cumulative_reduce(std::plus<>{}); } ', cpp_any_all_na = ' [[cppally::register]] bool cpp_any_na(r_vector x){ return x.reduce([](auto, auto curr){ return is_na(curr) ? done(true) : keep(false); }, /*init = */ false); } [[cppally::register]] bool cpp_all_na(r_vector x){ return x.reduce([](auto, auto curr){ return is_na(curr) ? keep(true) : done(false); }, /*init = */ true); } ', cpp_pmax = ' [[cppally::register]] r_vector cpp_pmax(r_vector x, r_vector y){ return pmap([](auto a, auto b){ return max(a, b); }, x, y); } ', cpp_if_else = ' template [[cppally::register]] T cpp_if_else(r_vec condition, T if_true, T if_false, T if_na){ return pmap([](r_lgl condition_, auto yes, auto no, auto missing) { if (condition_.is_true()){ return yes; } else if (condition_.is_false()){ return no; } else { return missing; } }, condition, if_true, if_false, if_na); } ', cpp_seq_along = ' template [[cppally::register]] r_vector cpp_seq_along(T x){ return pmap_with_index([](int i, auto){ // 2nd arg included so function can compile return r_int(i + 1); // R is 1-indexed }, x); } ', cpp_lag = ' [[cppally::register]] r_vector cpp_lag(r_vector x, int k){ return pmap_with_shift([&](auto a){ return lag(a, k); }, x); } ', cpp_diff = ' [[cppally::register]] r_vector cpp_diff(r_vector x){ return pmap_with_shift([&](auto a){ return curr(a) - lag(a); }, x); } ', cpp_in_place_abs = ' [[cppally::register]] r_vector cpp_in_place_abs(r_vector& x){ x.apply([](auto a){ return abs(a);}); return x; } ', cpp_in_place_lag = ' [[cppally::register]] r_vector cpp_in_place_lag(r_vector& x, int k){ x.shift(k); return x; } ', cpp_add2 = ' [[cppally::register]] r_vector cpp_add2(r_vector x, r_vector y){ return pmap_simd([](auto a, auto b){ return a + b; }, x, y); } ', cpp_pythagorean_theorem = ' [[cppally::register]] r_vector cpp_pythagorean_theorem(r_vector a, r_vector b){ return (a * a) + (b * b); // Pythagorean theorem - a^2 + b^2 = c^2 } ', cpp_rounding = ' [[cppally::register]] r_vector cpp_round(r_vector x, r_vector digits){ return pmap([](auto a, auto b){ return round(a, b); }, x, digits); } [[cppally::register]] r_vector cpp_floor(r_vector x){ return pmap([](auto a){ return floor(a); }, x); } [[cppally::register]] r_vector cpp_ceiling(r_vector x){ return pmap([](auto a){ return ceiling(a); }, x); } ' ) cpp_source(code = paste(examples, collapse = "\n"), debug = TRUE) ``` ### reduce `r_vec::reduce()` is a left-fold reduction functional that successively applies a binary function along the elements of the vector (from left-to-right). It allows for returning early and explicitly continuing by calling `done()` and `keep()`. The input function is typically a lambda, but can also be a callable (struct with `operator()`). Example of summing a vector with `reduce` ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_sum"]]) ``` ```{r} cpp_sum(1:10) ``` We could have also passed the callable `std::plus<>{}`, which makes it even more readable ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_sum2"]]) ``` ```{r} cpp_sum2(1:10) ``` To perform a reduction until a condition is met, use the helpers `done()` and `keep()` **Example:** Check vector has any `NA` or all `NA` ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_any_all_na"]]) ``` ```{r} x <- c(1, 2, NA, 4, 5) cpp_any_na(x) cpp_all_na(x) ``` Notice that the two folds are identical - only `done`/`keep` and `init` are swapped. This is the general pattern for any/all-style predicates. **Example:** greatest-common-divisor across integer vector. The trick here is to break when the result is 1 as `gcd(1, x) = 1` for any x ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_gcd"]]) ``` ```{r} cpp_gcd(c(5L, 25L, 125L)) cpp_gcd(c(5L, 25L, 1L, 125L)) ``` Use `cumulative_reduce` to return a vector of all the intermediate results of the reduction ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_cumsum"]]) ``` ```{r} cpp_cumsum(1:10) ``` ### pmap `pmap` is a C++ variadic function that allows one to apply a function across corresponding elements of multiple vectors. **Example:** vectorised binary max ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_pmax"]]) ``` ```{r} x <- c(10, 20, 30) y <- c(10, 50, 0) cpp_pmax(x, y) # pmap also recycles vectors cpp_pmax(x, 15) ``` **Example:** vectorised if else ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_if_else"]]) ``` ```{r} cpp_if_else(c(TRUE, FALSE, NA), "yes", "no", "missing") ``` `pmap_with_index()` is a variant that allows one to capture the index as we iterate along the vector **Example:** Integer sequence along vector ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_seq_along"]]) ``` ```{r} cpp_seq_along(letters) ``` ### Lagged operations Use `pmap_with_shift` to perform lagged operations. It checks for out-of-bounds access and therefore is safer than hand-writing it via `pmap_with_index` ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_lag"]]) ``` ```{r} # Lags cpp_lag(1:10, k = 1) cpp_lag(1:10, k = 2) cpp_lag(1:10, k = 3) # Leads cpp_lag(1:10, k = -1) cpp_lag(1:10, k = -2) cpp_lag(1:10, k = -3) ``` `pmap_with_shift` has three helpers: `lag()`, `lead()` and `curr()`. These are designed to assist in performing efficient lagged operations in a vectorised context, while maintaining readability **Example:** Lagged differencing ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_diff"]]) ``` ```{r} cpp_diff(1:10) cpp_diff(seq(10, 100, by = 5)) ``` ### In-place functionals To perform in-place transformations, use `r_vec::apply` as `pmap` always allocates a fresh vector and therefore cannot do in-place modification. ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_in_place_abs"]]) ``` ```{r} x <- c(-20, -10) cpp_in_place_abs(x) x # Modified in-place ``` `r_vec::shift` is a helper which can shift an entire vector in-place ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_in_place_lag"]]) ``` ```{r} x <- c(1, 2, 3, 4, 5) cpp_in_place_lag(x, k = 1) x # lagged in-place # keep lagging until we run out of elements to lag cpp_in_place_lag(x, k = 1) cpp_in_place_lag(x, k = 1) cpp_in_place_lag(x, k = 1) cpp_in_place_lag(x, k = 1) ``` ### Vectorised math `pmap` makes it easy to write vectorised math functions **Example:** Binary addition Here we are using `pmap_simd`, a variant of `pmap` that applies the supplied transformation under an omp simd directive. SIMD (single-instruction-multiple-data) is when the machine performs the same operation on multiple data points instead of one data point at time. In this case we are performing addition across multiple `r_int` values simultaneously. To safely use `pmap_simd`, iterations must be independent, not throw any errors, and the body must not have any side effects, which for math operations on `RMathType` classes is true. ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_add2"]]) ``` ```{r} cpp_add2(1:5, 10) ``` While we could go ahead and write vectorised versions of all the core operators, cppally has already done this with particular focus on efficiency. All operators are parallelised via simd and avoid allocating new vectors (like R does) where possible. The vectorised operators currently defined: binary: `+,-,*,/,+=,-=,*=,/=,==,<=,<,>=,>,|,&` unary: `!,-` **Example:** Mixed math operations ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_pythagorean_theorem"]]) ``` ```{r} cpp_pythagorean_theorem(1:10, 10:1) ``` cppally provides a rich set of scalar math functions (defined in r_math.h) which can be trivially vectorised. **Example:** vectorising `round`, `floor` and `ceiling` ```{r, echo=FALSE, comment="", results='asis'} as_cpp_chunk(examples[["cpp_rounding"]]) ``` ```{r} x <- seq(-2, 2, by = 0.5) cpp_round(x, digits = 0) cpp_floor(x) cpp_ceiling(x) ```