Various rephrasings

Théophile Bastian 2018-08-17 18:07:17 +02:00
parent 7c99d3056e
commit 825b6c3e36
2 changed files with 32 additions and 27 deletions

@ -112,23 +112,21 @@ yielded convincing results: on the experimental setup created (detailed later
in this report), the compiled version is around 26 times faster than the DWARF in this report), the compiled version is around 26 times faster than the DWARF
version, while it remains only around 2.5 times bigger than the original data. version, while it remains only around 2.5 times bigger than the original data.
Even though the implementation is more a research prototype than a release The implementation is not yet release-ready, as it does not support 100\ \% of
version, is still reasonably robust, compared to \prog{libunwind}, which is the DWARF5 specification~\cite{dwarf5std} --~see Section~\ref{ssec:ehelfs}
built for robustness. Corner cases are frequent while analyzing stack data, and below. Yet, it supports the vast majority --~around $99.9$\ \%~-- of the cases
even more when analyzing them through a profiler; yet the prototype fails only seen in the wild, and is decently robust compared to \prog{libunwind}, the
on around 200 cases more than \prog{libunwind} on a 27000 samples test (1099 reference implementation. Indeed, corner cases occur often, and on a 27000
failures, against 885 for \prog{libunwind}). samples test, 885 failures were observed for \prog{libunwind}, against 1099 for
the compiled DWARF version.
The prototype, unlike \prog{libunwind}, does not support $100\,\%$ of the DWARF The implementation, however, has a few other limitations. It only supports the
instructions present in the DWARF5 standard~\cite{dwarf5std}. It is also x86\_64 architecture, and relies to some extent on the Linux operating system.
limited to the x86\_64 architecture, and relies to some extent on the Linux But none of those are real problems in practice. Other processor architectures
operating system. But none of those limitations are real problems in practice. and ABIs are only a matter of time spent and engineering work; and the
As argued later on, the vast majority of the DWARF instruction set actually operating system dependency is only present in the libraries developed in order
used in the wild is implemented; other processor architectures and ABIs are to interact with the compiled unwinding data, which can be developed for
only a matter of time spent and engineering work; and the operating system virtually any operating system.
dependency is only present in the libraries developed in order to interact with
the compiled unwinding data, which can be developed for virtually any operating
system.
\subsection*{Summary and future work} \subsection*{Summary and future work}

@ -219,8 +219,9 @@ in order to mitigate the slowness.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{DWARF format} \subsection{DWARF format}
The DWARF format was first standardized as the format for debugging The DWARF format was first standardized as the format for debugging information
information of the ELF executable binaries. It is now commonly used across a of the ELF executable binaries, which are standard on UNIX-like systems,
including Linux and the BSDs --~but not macOS or Windows. It is now commonly used across a
wide variety of binary formats to store debugging information. As of now, the wide variety of binary formats to store debugging information. As of now, the
latest DWARF standard is DWARF 5~\cite{dwarf5std}, which is openly accessible. latest DWARF standard is DWARF 5~\cite{dwarf5std}, which is openly accessible.
@ -717,12 +718,15 @@ dump of a long-terminated process.
Unlike in the \ehframe, and unlike what should be done in a release, Unlike in the \ehframe, and unlike what should be done in a release,
real-world-proof version of the \ehelfs, the choice was made to keep this real-world-proof version of the \ehelfs, the choice was made to keep this
prototype simple, and only handle the few registers that were needed to simply implementation simple, and only handle the few registers that were needed to
unwind the stack. Thus, the only registers handled in \ehelfs{} are \reg{rip}, simply unwind the stack. Thus, the only registers handled in \ehelfs{} are
\reg{rbp}, \reg{rsp} and \reg{rbx}, the latter being used quite often in \reg{rip}, \reg{rbp}, \reg{rsp} and \reg{rbx}, the latter being used quite
\prog{libc} to hold the CFA address. This is enough to unwind the stack, but often in \prog{libc} to hold the CFA address. This is enough to unwind the
is not sufficient to analyze every stack frame as \prog{gdb} would do after a stack reliably, and thus enough for profiling, but is not sufficient to analyze
\lstbash{frame n} command. every stack frame as \prog{gdb} would do after a \lstbash{frame n} command.
Yet, enhancing the code to handle every register would not be much harder: it
would probably take only a few hours of code refactoring and rewriting.
\lstinputlisting[language=C, caption={Unwinding context}, label={lst:unw_ctx}] \lstinputlisting[language=C, caption={Unwinding context}, label={lst:unw_ctx}]
{src/dwarf_assembly_context/unwind_context.c} {src/dwarf_assembly_context/unwind_context.c}
@ -886,9 +890,12 @@ recording the time spent in each function, including within nested calls. This
analysis often enables programmers to optimize critical paths and functions in analysis often enables programmers to optimize critical paths and functions in
their programs, while leaving unoptimized functions that are seldom traversed. their programs, while leaving unoptimized functions that are seldom traversed.
For this purpose, the basic idea is to stop the traced program at regular \prog{Perf} is a \emph{polling} profiler, as opposed to
intervals, unwind its stack, write down the current nested function calls, and \emph{instrumenting} profilers. This means that with \prog{perf}, the basic
integrate the sampled data in the end. idea is to stop the traced program at regular intervals, unwind its stack,
write down the current nested function calls, and integrate the sampled data in
the end. Instrumenting profilers, on the other hand, do not interrupt the
program, but instead inject code into it.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%