Various rephrasings

Théophile Bastian 2018-08-17 18:07:17 +02:00
parent 7c99d3056e
commit 825b6c3e36
2 changed files with 32 additions and 27 deletions


@ -112,23 +112,21 @@ yielded convincing results: on the experimental setup created (detailed later
in this report), the compiled version is around 26 times faster than the DWARF
version, while it remains only around 2.5 times bigger than the original data.
Even though the implementation is more a research prototype than a release
version, it is still reasonably robust, compared to \prog{libunwind}, which is
built for robustness. Corner cases are frequent while analyzing stack data, and
even more when analyzing them through a profiler; yet the prototype fails only
on around 200 cases more than \prog{libunwind} on a 27000-sample test (1099
failures, against 885 for \prog{libunwind}).
The implementation is not yet release-ready, as it does not support 100\ \% of
the DWARF5 specification~\cite{dwarf5std} --~see Section~\ref{ssec:ehelfs}
below. Yet, it supports the vast majority --~around $99.9$\ \%~-- of the cases
seen in the wild, and is decently robust compared to \prog{libunwind}, the
reference implementation. Indeed, corner cases occur often, and on a
27000-sample test, 885 failures were observed for \prog{libunwind}, against
1099 for the compiled DWARF version.
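As a rough check on these figures, assuming that each of the 27000 samples
counts as exactly one unwinding attempt (an assumption made here for
illustration), the corresponding failure rates are
\[
  \frac{885}{27000} \approx 3.28\,\%
  \qquad\mbox{and}\qquad
  \frac{1099}{27000} \approx 4.07\,\%,
\]
that is, $1099 - 885 = 214$ additional failures, or about $0.79$ percentage
points.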
The prototype, unlike \prog{libunwind}, does not support $100\,\%$ of the DWARF
instructions present in the DWARF5 standard~\cite{dwarf5std}. It is also
limited to the x86\_64 architecture, and relies to some extent on the Linux
operating system. But none of those limitations are real problems in practice.
As argued later on, the vast majority of the DWARF instruction set actually
used in the wild is implemented; other processor architectures and ABIs are
only a matter of time spent and engineering work; and the operating system
dependency is only present in the libraries developed in order to interact with
the compiled unwinding data, which can be developed for virtually any operating
system.
The implementation, however, has a few other limitations. It only supports the
x86\_64 architecture, and relies to some extent on the Linux operating system.
But none of those are real problems in practice. Other processor architectures
and ABIs are only a matter of time spent and engineering work; and the
operating system dependency is only present in the libraries developed in order
to interact with the compiled unwinding data, which can be developed for
virtually any operating system.
\subsection*{Summary and future work}


@ -219,8 +219,9 @@ in order to mitigate the slowness.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{DWARF format}
The DWARF format was first standardized as the format for debugging
information of the ELF executable binaries. It is now commonly used across a
The DWARF format was first standardized as the format for debugging information
of the ELF executable binaries, which are standard on most UNIX-like systems,
including Linux and the BSDs --~but not macOS or Windows. It is now commonly
used across a
wide variety of binary formats to store debugging information. As of now, the
latest DWARF standard is DWARF 5~\cite{dwarf5std}, which is openly accessible.
@ -717,12 +718,15 @@ dump of a long-terminated process.
Unlike in the \ehframe, and unlike what should be done in a release,
real-world-proof version of the \ehelfs, the choice was made to keep this
prototype simple, and only handle the few registers that were needed to simply
unwind the stack. Thus, the only registers handled in \ehelfs{} are \reg{rip},
\reg{rbp}, \reg{rsp} and \reg{rbx}, the latter being used quite often in
\prog{libc} to hold the CFA address. This is enough to unwind the stack, but
is not sufficient to analyze every stack frame as \prog{gdb} would do after a
\lstbash{frame n} command.
implementation simple, and only handle the few registers that were needed to
simply unwind the stack. Thus, the only registers handled in \ehelfs{} are
\reg{rip}, \reg{rbp}, \reg{rsp} and \reg{rbx}, the latter being used quite
often in \prog{libc} to hold the CFA address. This is enough to unwind the
stack reliably, and thus enough for profiling, but is not sufficient to analyze
every stack frame as \prog{gdb} would do after a \lstbash{frame n} command.
Yet, enhancing the code to handle every register would not be much harder: it
would probably take only a few hours of code refactoring and rewriting.
\lstinputlisting[language=C, caption={Unwinding context}, label={lst:unw_ctx}]
{src/dwarf_assembly_context/unwind_context.c}
@ -886,9 +890,12 @@ recording the time spent in each function, including within nested calls. This
analysis often enables programmers to optimize critical paths and functions in
their programs, while leaving unoptimized functions that are seldom traversed.
For this purpose, the basic idea is to stop the traced program at regular
intervals, unwind its stack, write down the current nested function calls, and
integrate the sampled data in the end.
\prog{Perf} is a \emph{polling} profiler, as opposed to \emph{instrumenting}
profilers. This means that with \prog{perf}, the basic idea is to stop the
traced program at regular intervals, unwind its stack, write down the current
nested function calls, and aggregate the sampled data at the end. Instrumenting
profilers, on the other hand, do not interrupt the program, but instead inject
code into it.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%