Tentative summary sheet (fiche de synthèse)

2018-08-01 17:34:32 +02:00 · 2018-08-01 17:34:32 +02:00 · 7c2dbd228d
commit 7c2dbd228d
parent e8652acca5
3 changed files with 108 additions and 28 deletions
--- a/report/fiche_synthese.tex
+++ b/report/fiche_synthese.tex
@@ -12,47 +12,127 @@ unwind stack frames, restoring machine registers to their proper values, for
 instance within the context of a debugger.
 As debugging data can easily grow unreasonably large if stored carelessly,
-the DWARF standard pays a great attention to data compactness and compression.
+the DWARF standard pays great attention to data compactness and compression,
-This, as always, is at the expense of efficiency: accessing stack unwinding
+and succeeds particularly well at it.  But this, as is often the case, comes at the expense
-data for a particular program point can be quite costly.
+of efficiency: accessing stack unwinding data for a particular program point
 can be quite costly.
 This is often not a huge problem, as stack unwinding is mostly thought of as a
 debugging procedure: when something behaves unexpectedly, the programmer might
 be interested in exploring the stack, moving around between stack frames,
 tracing the program path leading to some bug, \ldots{} Yet, stack unwinding
 might, in some cases, be performance-critical: for instance, profiler programs
 need to perform a great many stack unwindings. Even worse, exception
 handling relies on stack unwinding in order to find a suitable catch-block!
 The most widely used library for stack unwinding,
-\texttt{libunwind}~\cite{libunwind}, 
+\texttt{libunwind}~\cite{libunwind}, essentially makes use of aggressive but
 fine-tuned caching and optimized code to mitigate this problem.
 \subsection*{The research problem}
-This internship explored the possibility to compile the standard ELF debugging
+\todo{Split the previous paragraph into two paragraphs, fitting this section as
-information format, DWARF, into x86\_64 assembly. 
+well}
 \note{I have trouble figuring out what is expected here, and what is expected
 in the previous section…}
-\qtodo{Delete question} \textit{
+% What is the question that you studied?
-What is the question that you studied?
+% Why is it important, what are the applications/consequences?
-Why is it important, what are the applications/consequences?
+% Is it a new problem?
-Is it a new problem?
+% If so, why are you the first researcher in the universe who consider it?
-If so, why are you the first researcher in the universe who consider it?
+% If not, why did you think that you could bring an original contribution?
 If not, why did you think that you could bring an original contribution?
 }
 \subsection*{Your contribution}
-What is your solution to the question described in the last paragraph?
+This internship explored the possibility of compiling the standard ELF debugging
 information format, DWARF, directly into native assembly on the x86\_64
 architecture. Instead of parsing and interpreting the debug data at runtime,
 the stack unwinding data is accessed by calling a function of a dynamically loaded
 shared library.
-Be careful, do \emph{not} give technical details, only rough ideas!
+Multiple approaches have been tried, in order to determine which compilation
 process leads to the best time/space trade-off.
-Pay a special attention to the description  of the \emph{scientific} approach.
+Quite unexpectedly, the hardest part of the project proved to be finding a
 benchmarking protocol that was both relevant and reliable. Unwinding one single
 frame is far too fast to be benchmarked on a few samples (around $10\,\mu\mathrm{s}$ per
 frame), and gathering many samples is difficult, since one must avoid
 unwinding the same frame over and over again, which would only benchmark the
 caching mechanism. The other problem is to distribute the unwinding
 measurements evenly across the various program positions, including inside the
 loaded libraries (\eg{} the \texttt{libc}).
 The solution eventually chosen was to modify \texttt{perf}, the standard
 profiling program for Linux, so that it gathers statistics and benchmarks of
 its unwindings, and to produce an alternative version of \texttt{libunwind} that
 uses the compiled debugging data. Interfacing the two makes it possible to
 benchmark \texttt{perf} with both the standard stack unwinding data
 and the alternative experimental compiled format. As a welcome
 side effect, the experimental unwinding data is fully interfaced with
 \texttt{libunwind}, and can thus be used at practically no cost by any
 existing project that relies on \texttt{libunwind}.
 % What is your solution to the question described in the last paragraph?
 %
 % Be careful, do \emph{not} give technical details, only rough ideas!
 %
 % Pay a special attention to the description  of the \emph{scientific} approach.
 \subsection*{Arguments supporting its validity}
-What is the evidence that your solution is a good solution?
+% What is the evidence that your solution is a good solution?
-Experiments? Proofs?
+% Experiments? Proofs?
 % 
 % Comment the robustness of your solution: how does it rely/depend on the working assumptions?
-Comment the robustness of your solution: how does it rely/depend on the working assumptions?
+The goal was to obtain a compiled version of unwinding data that was faster
 than DWARF, not unreasonably larger, and reliable. The benchmarks mentioned above have
 yielded convincing results: on the experimental setup created (detailed later
 in this report), the compiled version is up to 25 times faster than the DWARF
 version, while it remains only around 2.5 times bigger than the original data.
 Even though the implementation is more a research prototype than a release
 version, it is still reasonably robust compared to \texttt{libunwind}, which is
 built for robustness. Corner cases are frequent while analyzing stack data, and
 even more so when analyzing them through a profiler; yet the prototype fails on only
 around 200 more cases than \texttt{libunwind} on a test of 27000 samples (1099 failures,
 against 885 for \texttt{libunwind}).
 The prototype, unlike \texttt{libunwind}, does not support $100\,\%$ of the DWARF
 instructions present in the DWARF5 standard~\cite{dwarf5std}. It is also limited
 to the x86\_64 architecture, and relies to some extent on the Linux operating
 system. But none of those limitations are real problems in practice. As argued
 later on, the vast majority of the DWARF instructions actually used in the wild
 are implemented; other processor architectures and ABIs are only a matter of
 time spent and engineering work; and the operating system dependency is only
 present in the libraries developed in order to interact with the compiled
 unwinding data, which can be developed for virtually any operating system.
 \subsection*{Summary and future work}
-What is next? In which respect is your approach general?
+In most everyday cases, the slowness of stack unwinding is not a
-What did your contribution bring to the area?
+problem, or even an annoyance. Yet a speed-up of up to 25 times on stack
-What should be done now?
+unwinding-heavy tasks, such as profiling, can be really useful to analyze heavy
-What is the good \emph{next} question?
+programs, particularly if one wants to profile many times in order to analyze
 the impact of multiple changes. It can also be useful for exception-heavy
 programs~\qtodo{cite Stephen's software?}. Thus, it might be interesting to
 implement a more stable version, and try to interface it cleanly with
 mainstream tools, such as \texttt{perf}.
 It might also be interesting to investigate whether it is possible to reach
 even greater speeds by using a more complex compilation process that
 has yet to be determined.
 Another question worth exploring might be whether the original DWARF unwinding
 data can be shrunk even further, while remaining stored in a format not
 too far from the original standard, by applying techniques close to those
 used to shrink the compiled unwinding data.
 % What is next? In which respect is your approach general?
 % What did your contribution bring to the area?
 % What should be done now?
 % What is the good \emph{next} question?
 \pagestyle{plain}
--- a/report/report.tex
+++ b/report/report.tex
@@ -1,4 +1,4 @@
-\title{DWARF debugging data, compilation and verification}
+\title{DWARF debugging data, compilation and optimization}
 \author{Théophile Bastian\\
 Under supervision of Francesco Zappa-Nardelli\\
--- a/shared/todo.sty
+++ b/shared/todo.sty
@@ -3,9 +3,9 @@
 \definecolor{todobg}{HTML}{FF5F00}
 \definecolor{todofg}{HTML}{3700DA}
 \definecolor{notebg}{HTML}{87C23C}
-\definecolor{notefg}{HTML}{DF4431}
+\definecolor{notefg}{HTML}{BC3423}
 \newcommand{\qtodo}[1]{\colorbox{todobg}{\textcolor{todofg}{#1}}}
-\newcommand{\todo}[1]{\qtodo{\textbf{TODO:}\.#1}}
+\newcommand{\todo}[1]{\qtodo{\textbf{TODO:}\,#1}}
-\newcommand{\qnote}[1]{\colorbox{notebg}{\textcolor{notefg}{[#1]}}}
+\newcommand{\qnote}[1]{\colorbox{notebg}{\textcolor{notefg}{#1}}}
-\newcommand{\note}[1]{\qnote{\textbf{NOTE:}\.#1}}
+\newcommand{\note}[1]{\qnote{\textbf{NOTE:}\,#1}}
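
The robustness figures quoted in the new ``Arguments supporting its validity'' paragraph of \texttt{fiche\_synthese.tex} can be sanity-checked with a short worked computation. This is only a check on the numbers given in the text (1099 and 885 failures), assuming the test contains exactly 27000 samples; the rounded percentages are derived here and do not appear in the original.

```latex
\[
  1099 - 885 = 214, \qquad
  \frac{1099}{27000} \approx 4.07\,\%, \qquad
  \frac{885}{27000} \approx 3.28\,\%.
\]
```

Under that assumption, the ``around 200 cases'' gap corresponds to a difference of roughly $0.79$ percentage points in failure rate between the prototype and \texttt{libunwind}.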