This document describes the final
results of the “Joint Profiling of Native and R Code” project. It
describes two new packages, winch
, and its helper package,
procmaps
. After a definition of the problem, the packages’
features are described, with an outline of work still to be done. It
concludes with a discussion of technical caveats and a connection to the
first part of the project. The appendix is a scratch pad for links and
other bits of information that don’t fit elsewhere.
The winch package combines native and R stack traces in a controlled way. For now this is outside of a profiling context. This deviates from the original goal of the project, with good reason: robust combined stack traces are a cornerstone for joint profiling and are useful independent of profiling.
A stack trace is a sequence of code locations that describes the execution path from a program’s main entry point to a specific point of interest. Each entry represents a nested function call.
Use cases include (source):
- In a debugger, to show the user the stack.
- In a profiler, to take a sample of the stack.
- From the program itself (e.g. from a crash handler to show the stack).
In R, stack traces can be obtained via sys.calls()
:
foo <- function() {
bar()
}
bar <- function() {
baz()
}
baz <- function() {
sys.calls()
}
tail(foo(), 3)
#> [[1]]
#> foo()
#>
#> [[2]]
#> bar()
#>
#> [[3]]
#> baz()
Native code cannot easily be deduced from the stack trace. This is
seen when replacing function calls with
winch::winch_call()
, which simulates a round-trip from
native code into R by calling a function passed as argument from C code.
The example becomes a double round-trip: R → native → R → native →
R.
library(winch)
foo <- function() {
winch_call(bar)
}
bar <- function() {
winch_call(baz)
}
tail(foo(), 5)
#> [[1]]
#> foo()
#>
#> [[2]]
#> winch_call(bar)
#>
#> [[3]]
#> (function ()
#> {
#> winch_call(baz)
#> })()
#>
#> [[4]]
#> winch_call(baz)
#>
#> [[5]]
#> (function ()
#> {
#> sys.calls()
#> })()
A bit of guesswork is required here: we can inspect the bodies of the
functions in the call stack to check for calls to .Call()
,
.External()
or .External2()
. In our example
the candidates are located at indexes 2 and 4:
baz <- function() {
funs <- winch:::sys_functions()
lapply(funs, body)
}
tail(foo(), 5)
#> [[1]]
#> UseMethod("tail")
#>
#> [[2]]
#> {
#> winch_call(bar)
#> }
#>
#> [[3]]
#> {
#> .Call(winch_c_call, fun, env)
#> }
#>
#> [[4]]
#> {
#> winch_call(baz)
#> }
#>
#> [[5]]
#> {
#> .Call(winch_c_call, fun, env)
#> }
It would be much more accurate if the information was available directly in the stack trace. This would require changes to base R.
R is an interpreted language. It can be expected that it knows how to track the execution state. (It is much more surprising that it allows programs to compute on their own execution state – R has very strong support for reflection indeed.) How do we achieve this for native code? Can an application programmatically generate its own stack trace?
For native code, this is provided by the compiler or by specialized libraries. The main result is a sequence of instruction pointers (IP for short): addresses in the process’s memory from where function calls occurred. In some cases, additional information such as the function name is provided. In general, function name, file + line information and even library name must be deduced separately, using other specialized libraries.
The winch package wraps two libraries (for portability across
different OSes) that help obtain IP sequences and function names. This
functionality is accessible via winch::winch_trace_back()
.
For demonstration purposes, this is what the native stack trace looks
like for our double round-trip. Calls into native code can be deduced
from the pathname
column:
baz <- function() {
winch_trace_back()
}
trace <- foo()
nrow(trace)
#> [1] 72
head(trace, 25)
#> func ip
#> 1 R_ParseString 00007ffb83006c80
#> 2 R_ParseString 00007ffb83025030
#> 3 Rf_eval 00007ffb83025480
#> 4 Rf_eval 00007ffb830276a0
#> 5 Rf_eval 00007ffb83028850
#> 6 Rf_eval 00007ffb83025480
#> 7 R_execMethod 00007ffb830295d0
#> 8 Rf_eval 00007ffb83025480
#> 9 Rf_eval 00007ffb830276a0
#> 10 Rf_eval 00007ffb83028850
#> 11 Rf_eval 00007ffb83025480
#> 12 winch_call 00007ffb5aeeb710
#> 13 Rf_NewFrameConfirm 00007ffb82fca290
#> 14 R_ParseString 00007ffb83006c80
#> 15 R_ParseString 00007ffb83025030
#> 16 Rf_eval 00007ffb83025480
#> 17 Rf_eval 00007ffb830276a0
#> 18 Rf_eval 00007ffb83028850
#> 19 Rf_eval 00007ffb83025480
#> 20 winch_call 00007ffb5aeeb710
#> 21 Rf_NewFrameConfirm 00007ffb82fca290
#> 22 R_ParseString 00007ffb83006c80
#> 23 R_ParseString 00007ffb83025030
#> 24 Rf_eval 00007ffb83025480
#> 25 Rf_eval 00007ffb830276a0
#> pathname is_libr
#> 1 /usr/lib/R/lib/libR.so TRUE
#> 2 /usr/lib/R/lib/libR.so TRUE
#> 3 /usr/lib/R/lib/libR.so TRUE
#> 4 /usr/lib/R/lib/libR.so TRUE
#> 5 /usr/lib/R/lib/libR.so TRUE
#> 6 /usr/lib/R/lib/libR.so TRUE
#> 7 /usr/lib/R/lib/libR.so TRUE
#> 8 /usr/lib/R/lib/libR.so TRUE
#> 9 /usr/lib/R/lib/libR.so TRUE
#> 10 /usr/lib/R/lib/libR.so TRUE
#> 11 /usr/lib/R/lib/libR.so TRUE
#> 12 /tmp/RtmpkqEvnQ/Rinst97b156a0a2b/winch/libs/winch.so FALSE
#> 13 /usr/lib/R/lib/libR.so TRUE
#> 14 /usr/lib/R/lib/libR.so TRUE
#> 15 /usr/lib/R/lib/libR.so TRUE
#> 16 /usr/lib/R/lib/libR.so TRUE
#> 17 /usr/lib/R/lib/libR.so TRUE
#> 18 /usr/lib/R/lib/libR.so TRUE
#> 19 /usr/lib/R/lib/libR.so TRUE
#> 20 /tmp/RtmpkqEvnQ/Rinst97b156a0a2b/winch/libs/winch.so FALSE
#> 21 /usr/lib/R/lib/libR.so TRUE
#> 22 /usr/lib/R/lib/libR.so TRUE
#> 23 /usr/lib/R/lib/libR.so TRUE
#> 24 /usr/lib/R/lib/libR.so TRUE
#> 25 /usr/lib/R/lib/libR.so TRUE
table(trace$pathname)
#>
#> /tmp/RtmpkqEvnQ/Rinst97b156a0a2b/winch/libs/winch.so
#> 2
#> /usr/lib/R/bin/exec/R
#> 2
#> /usr/lib/R/lib/libR.so
#> 66
#> /usr/lib/x86_64-linux-gnu/libc.so.6
#> 2
winch_trace_back()
is a low-level function not intended
to be called by the user. It is a cornerstone for joint backtraces
provided by winch_add_trace_back()
, as shown in the next
section. For now, winch_trace_back()
provides the native
function name and the library name on all tested platforms (Linux,
macOS, Windows). Information about file + line remains an open
issue.
R has always been an interface language, with excellent integration of calls into foreign interfaces. R code often calls into specialized native code for speed or to avoid re-implementation in R. Due to this duality, it is helpful to access both kinds of stack trace: R and native.
These stack traces are not very helpful if taken separately. Native
code is hidden from R backtrace, and R code is represented as sequence
of function calls in libR.so
or equivalent. Fusing these
stack traces allows representing the chain of function calls from both
domains when R code calls into native code (which then might call back
into R).
The combination of R and native stack traces in winch relies on two heuristics:
.Call()
,
.External()
or .External2()
.libR.so
.For our example, a simplified view on the R and native stack traces, side by side, looks like this:
# R # Native
baz() libR.so::... (repeated)
.Call(...) <--> winch.so::winch_call
bar() libR.so::... (repeated)
.Call(...) <--> winch.so::winch_call
foo() libR.so::... (repeated)
Combining stack traces now means pairing corresponding
.Call()
entries with native traces from other modules than
libR.so
, denoted by <-->
in the above
example.
The winch package implements this on top of
rlang::trace_back()
, a wrapper around
sys.calls()
that displays as a tree and creates better
output in the presence of lazy evaluation.
█
1. └─global::foo()
2. ├─winch::winch_call(bar)
3. └─(function () ... R/call.R:4:2
4. ├─winch::winch_call(baz)
5. └─(function () ... R/call.R:4:2
The function winch::winch_add_trace_back()
enriches an
rlang stack trace with relevant parts of the native stack trace, by
pairing .Call()
and similar entries with native code from
outside of libR.so
, as described above.
█
1. └─global::foo()
2. └─winch::winch_call(bar)
3. └─`/winch.so`::winch_call()
4. └─(function () ... R/call.R:4:2
5. └─winch::winch_call(baz)
6. └─`/winch.so`::winch_call()
7. └─(function () ... R/call.R:4:2
Stack traces must be collected at the time that the error occurs,
this includes calls to rlang::abort()
which already collect
a stack trace. The integration is a small change to
rlang that calls winch_add_trace_back()
during
collection if winch is installed and the
rlang_trace_use_winch
option is identical to
1L
. With this option set, manual addition of the native
stack trace is no longer necessary.
█
1. └─global::foo()
2. └─winch::winch_call(bar)
3. └─`/winch.so`::winch_call()
4. └─(function () ... R/call.R:4:2
5. └─winch::winch_call(baz)
6. └─`/winch.so`::winch_call()
7. └─(function () ... R/call.R:4:2
The vctrs
package uses C for performance-critical parts,
and often calls back into R for e.g. S3 dispatch or error reporting. The
example below triggers an error and native code calls into R to throw
the error.
options(
error = rlang::entrace,
rlang_backtrace_on_error = "full",
rlang_trace_use_winch = TRUE
)
vctrs::vec_as_location(quote, 2)
Error: Must subset elements with a valid subscript vector.
✖ Subscript has the wrong type `function`.
ℹ It must be logical, numeric, or character.
Backtrace:
█
1. └─vctrs::vec_as_location(quote, 2)
2. └─`/vctrs.so`::vctrs_as_location()
3. └─`/vctrs.so`::vec_as_location_opts()
A bare-bones solution that enhances sys.calls()
and
sys.frames()
and is usable outside of rlang remains an open
issue.
Building an R package on Linux and Windows seems to include the
necessary debugging information that helps libunwind and libbacktrace.
On Windows, it seems necessary to set DEBUG=true
, e.g. in
~/.R/Makevars
. This adds -gdwarf-2
to the
compiler flags.
Also, Windows users currently need
winch::winch_init_library()
to trace errors in their
library. This is a limitation
of the upstream library.
On Windows, 64-bit R is required.
The current implementation is useful for displaying combined stack traces for debugging. Testing in practice is likely to discover problems or corner cases that were missed during initial testing.
A few things need to be done for fully portable joint profiling based on the new infrastructure:
Providing file + line information for stack traces (requires work for macOS)
Collect IPs or native stack traces during profiling
For errors thrown from Rcpp code, the stack trace ends at the auto-generated stub, because this is the code that catches the C++ exception. It would be useful if stack trace information could be collected at the point of error and provided as part of the error condition.
Stack unwinding, or retrieving the sequence of IPs for the stack trace, is dependent on the compiler, and also on the executable format used. As the blog post already linked above puts it,
stack unwinding … [is an] esoteric topic [that] lies at the intersection of compilers, linkers, loaders, debuggers, ABIs, and language runtimes. There is precious little documentation about it and, from what I can tell, a small handful of guys who answer all the questions about it.
It’s not surprising that there is no portable “one size fits all” solution. Some but not all options give the function name for each entry, sometimes file + line information is available too. Determining the shared library that provided the code seems to be an entirely separate topic.
The following options were experimented with:
The libc function execinfo()
was a first obvious candidate. It works well enough on mainstream Linux,
but not on Windows. Also, function names were not discovered reliably,
in particular for optimized code.
The libunwind library supports similar functionality and discovers function names more reliably. Unfortunately, it also does not work on Windows. What’s more, there are two implementations: the macOS implementation, part of clang, is a newer rewrite of the original non-GNU implementation that is available on GitHub.
The libbacktrace library, part of GCC and also available standalone, with its own tests, seems to give slightly worse results for the detection of function names. On the up side, it is portable, with support for Windows (MSYS2, on which the most recent rtools is based). It also gives file + line information for each stack frame.
Wrapping the libbacktrace library looks like the best option in terms of portability and features. Unfortunately, it currently fails on macOS. Therefore, a combination of libbacktrace and libunwind is used depending on the platform:
The execinfo()
approach is no longer used because
libunwind is clearly better.
The libbacktrace tests succeed
The winch package can be built with libbacktrace enabled
Running the winch examples gives a segmentation fault
libR.so
symbols
boost stacktrace has not been tried yet
The stacktrace library is from 2009, was last changed 2013
A fully native route has not been considered. Without library support this would amount to rewriting large parts of existing library code.
Perhaps not surprising, mapping IPs to module names also requires
platform-specific code. On Linux, the mapping is provided through the
/proc/self/maps
virtual file. Other systems require vastly
different approaches. It appears that there is not a library dedicated
specifically to this problem.
The gperftools repository contains relevant code targeting all platforms of interest. I extracted this code in the procmaps package, now on CRAN.
A first release, version 0.0.1, was flagged due to a too complex
./configure
script. With further simplification it was
possible to avoid configuration entirely. It should be possible to
rewrite the relevant code in C to provide a standalone solution that is
applicable in a wider context.
When file and line are provided for function calls in the stack trace, debugging is even easier. This information is also relevant for profiling, so that the costs of each source line can be estimated.
The libbacktrace library optionally provides file + line information for each entry in the stack trace.
Unfortunately it currently does not work on macOS:
Its internal tests show “libbacktrace: no debug info in Mach-O executable”, perhaps related to an issue that also limits usability on Windows.
Using libbacktrace in winch gives a segmentation fault.
The gperftools library relies on the addr2line
executable for retrieving file + line information (source,
relevant
StackOverflow question).
Once a fully portable solution is found, this information should be
provided by winch_add_trace_back()
.
The main result of the first part of the project is the
jointprof
package, which implements a proof of concept for
recording native and R stack traces in parallel, exclusively during
profiling. It uses the gperftools
library to record native stack traces, and daisy chains into the
profiling routine provided by R to record R stack traces. A helper
package, profile
, processes the profiler data created by
gperftools.
This approach works on Linux only. On macOS, a patch is required; Windows is completely unsupported. The approach is a bit brittle. Occasionally stack traces could not be matched. Fusion of stack traces for debugging purposes is not possible.
Will Landau contributed the proffer
package to the
project, a drop-in replacement to profvis that uses pprof to visualize
profiles.
Scratch pad for information that doesn’t fit elsewhere.
symbolic backtrace:
related project in Nim: https://github.com/timotheecour/Nim/issues/49
r-windows: submitted ccache pull request – package and usage
pending: caching for R packages, https://github.com/r-windows/rtools-packages/issues/143
install or provide ag on msys2: https://packages.msys2.org/package/mingw-w64-x86_64-ag?repo=mingw64
libbacktrace: implemented GitHub Actions, https://github.com/ianlancetaylor/libbacktrace/pull/51
gcc has libbacktrace built in
works, is included on Ubuntu, but not on Windows
inclusion in Debian/Ubuntu seems to be a maintainer choice, https://bugs.gentoo.org/552098
sjlj vs. seh (why the difference between 32 and 64 bits?):
R_GlobalContext
is accessed from several R packages,
CRAN checks catch this