From 825b6c3e36e27446b46c27bf71cfcdb6b796219c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Th=C3=A9ophile=20Bastian?=
Date: Fri, 17 Aug 2018 18:07:17 +0200
Subject: [PATCH] Various rephrasings

---
 report/fiche_synthese.tex | 30 ++++++++++++++----------------
 report/report.tex         | 29 ++++++++++++++++++-----------
 2 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/report/fiche_synthese.tex b/report/fiche_synthese.tex
index 7e94903..ac5b144 100644
--- a/report/fiche_synthese.tex
+++ b/report/fiche_synthese.tex
@@ -112,23 +112,21 @@ yielded convincing results: on the experimental setup created (detailed later in
 this report), the compiled version is around 26 times faster than the DWARF
 version, while it remains only around 2.5 times bigger than the original data.
 
-Even though the implementation is more a research prototype than a release
-version, is still reasonably robust, compared to \prog{libunwind}, which is
-built for robustness. Corner cases are frequent while analyzing stack data, and
-even more when analyzing them through a profiler; yet the prototype fails only
-on around 200 cases more than \prog{libunwind} on a 27000 samples test (1099
-failures, against 885 for \prog{libunwind}).
+The implementation is not yet release-ready, as it does not support 100\ \% of
+the DWARF5 specification~\cite{dwarf5std} --~see Section~\ref{ssec:ehelfs}
+below. Yet, it supports the vast majority --~around $99.9$\ \%~-- of the cases
+seen in the wild, and is decently robust compared to \prog{libunwind}, the
+reference implementation. Indeed, corner cases occur often, and on a 27000
+samples test, 885 failures were observed for \prog{libunwind}, against 1099 for
+the compiled DWARF version.
 
-The prototype, unlike \prog{libunwind}, does not support $100\,\%$ of the DWARF
-instructions present in the DWARF5 standard~\cite{dwarf5std}. It is also
-limited to the x86\_64 architecture, and relies to some extent on the Linux
-operating system. But none of those limitations are real problems in practice.
-As argued later on, the vast majority of the DWARF instruction set actually
-used in the wild is implemented; other processor architectures and ABIs are
-only a matter of time spent and engineering work; and the operating system
-dependency is only present in the libraries developed in order to interact with
-the compiled unwinding data, which can be developed for virtually any operating
-system.
+The implementation, however, has a few other limitations. It only supports the
+x86\_64 architecture, and relies to some extent on the Linux operating system.
+But none of those are real problems in practice. Other processor architectures
+and ABIs are only a matter of time spent and engineering work; and the
+operating system dependency is only present in the libraries developed in order
+to interact with the compiled unwinding data, which can be developed for
+virtually any operating system.
 
 \subsection*{Summary and future work}
 
diff --git a/report/report.tex b/report/report.tex
index b7e6d15..45aa3b7 100644
--- a/report/report.tex
+++ b/report/report.tex
@@ -219,8 +219,9 @@ in order to mitigate the slowness.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{DWARF format}
 
-The DWARF format was first standardized as the format for debugging
-information of the ELF executable binaries. It is now commonly used across a
+The DWARF format was first standardized as the format for debugging information
+of the ELF executable binaries, which are standard on UNIX-like systems,
+including Linux and MacOS --~but not Windows. It is now commonly used across a
 wide variety of binary formats to store debugging information. As of now, the
 latest DWARF standard is DWARF 5~\cite{dwarf5std}, which is openly accessible.
 
@@ -717,12 +718,15 @@ dump of a long-terminated process.
 
 Unlike in the \ehframe, and unlike what should be done in a release,
 real-world-proof version of the \ehelfs, the choice was made to keep this
-prototype simple, and only handle the few registers that were needed to simply
-unwind the stack. Thus, the only registers handled in \ehelfs{} are \reg{rip},
-\reg{rbp}, \reg{rsp} and \reg{rbx}, the latter being used quite often in
-\prog{libc} to hold the CFA address. This is enough to unwind the stack, but
-is not sufficient to analyze every stack frame as \prog{gdb} would do after a
-\lstbash{frame n} command.
+implementation simple, and only handle the few registers that were needed to
+simply unwind the stack. Thus, the only registers handled in \ehelfs{} are
+\reg{rip}, \reg{rbp}, \reg{rsp} and \reg{rbx}, the latter being used quite
+often in \prog{libc} to hold the CFA address. This is enough to unwind the
+stack reliably, and thus enough for profiling, but is not sufficient to analyze
+every stack frame as \prog{gdb} would do after a \lstbash{frame n} command.
+Yet, if one were to enhance the code to handle every register, it would not be
+much harder and would probably be only a few hours of code refactoring and
+rewriting.
 
 \lstinputlisting[language=C, caption={Unwinding context}, label={lst:unw_ctx}]
 {src/dwarf_assembly_context/unwind_context.c}
@@ -886,9 +890,12 @@ recording the time spent in each function, including within nested calls. This
 analysis often enables programmers to optimize critical paths and functions in
 their programs, while leaving unoptimized functions that are seldom traversed.
 
-For this purpose, the basic idea is to stop the traced program at regular
-intervals, unwind its stack, write down the current nested function calls, and
-integrate the sampled data in the end.
+\prog{Perf} is a \emph{polling} profiler, as opposed to
+\emph{instrumenting} profilers. This means that with \prog{perf}, the basic
+idea is to stop the traced program at regular intervals, unwind its stack,
+write down the current nested function calls, and integrate the sampled data in
+the end. Instrumenting profilers, on the other hand, do not interrupt the
+program, but instead inject code in it.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%