Tentative summary sheet (fiche de synthèse)
parent
e8652acca5
commit
7c2dbd228d
3 changed files with 108 additions and 28 deletions
|
@ -12,47 +12,127 @@ unwind stack frames, restoring machine registers to their proper values, for
|
|||
instance within the context of a debugger.
|
||||
|
||||
As debugging data can easily grow unreasonably large if stored carelessly,
|
||||
the DWARF standard pays great attention to data compactness and compression.
|
||||
This, as always, is at the expense of efficiency: accessing stack unwinding
|
||||
data for a particular program point can be quite costly.
|
||||
the DWARF standard pays great attention to data compactness and compression,
|
||||
and succeeds particularly well at it. But this, as always, is at the expense
|
||||
of efficiency: accessing stack unwinding data for a particular program point
|
||||
can be quite costly.
|
||||
|
||||
This is often not a huge problem, as stack unwinding is mostly thought of as a
|
||||
debugging procedure: when something behaves unexpectedly, the programmer might
|
||||
be interested in exploring the stack, moving around between stack frames,
|
||||
tracing the program path leading to some bug, \ldots{} Yet, stack unwinding
|
||||
might, in some cases, be performance-critical: for instance, profiler programs
|
||||
need to perform a great many stack unwindings. Even worse, exception
|
||||
handling relies on stack unwinding in order to find a suitable catch-block!
|
||||
|
||||
The most widely used library for stack unwinding,
|
||||
\texttt{libunwind}~\cite{libunwind},
|
||||
\texttt{libunwind}~\cite{libunwind}, essentially makes use of aggressive but
|
||||
fine-tuned caching and optimized code to mitigate this problem.
|
||||
|
||||
\subsection*{The research problem}
|
||||
|
||||
This internship explored the possibility of compiling the standard ELF debugging
|
||||
information format, DWARF, into x86\_64 assembly.
|
||||
\todo{Split the previous paragraph into two paragraphs, fitting this section as
|
||||
well}
|
||||
|
||||
\note{I have trouble figuring out what is expected here, and what is expected
|
||||
in the previous section…}
|
||||
|
||||
|
||||
\qtodo{Delete question} \textit{
|
||||
What is the question that you studied?
|
||||
Why is it important, what are the applications/consequences?
|
||||
Is it a new problem?
|
||||
If so, why are you the first researcher in the universe who considers it?
|
||||
If not, why did you think that you could bring an original contribution?
|
||||
}
|
||||
% What is the question that you studied?
|
||||
% Why is it important, what are the applications/consequences?
|
||||
% Is it a new problem?
|
||||
% If so, why are you the first researcher in the universe who considers it?
|
||||
% If not, why did you think that you could bring an original contribution?
|
||||
|
||||
\subsection*{Your contribution}
|
||||
|
||||
What is your solution to the question described in the last paragraph?
|
||||
This internship explored the possibility of compiling the standard ELF debugging
|
||||
information format, DWARF, directly into native assembly on the x86\_64
|
||||
architecture. Instead of parsing and interpreting the debug data at runtime,
|
||||
the stack unwinding data is exposed as a function of a dynamically loaded
|
||||
shared library.
|
||||
|
||||
Be careful, do \emph{not} give technical details, only rough ideas!
|
||||
Multiple approaches have been tried, in order to determine which compilation
|
||||
process leads to the best time/space trade-off.
|
||||
|
||||
Pay special attention to the description of the \emph{scientific} approach.
|
||||
Quite unexpectedly, the hardest part of the project proved to be finding a
|
||||
benchmarking protocol that was both relevant and reliable. Unwinding a single
|
||||
frame is far too fast to be benchmarked on a few samples (around $10\,\mu\mathrm{s}$ per
|
||||
frame), and gathering many samples is difficult, since one must avoid
|
||||
unwinding the same frame over and over again, which would only benchmark the
|
||||
caching mechanism. The other problem is to evenly distribute the unwinding
|
||||
measurements across the various program positions, including inside the
|
||||
loaded libraries (\eg{} the \texttt{libc}).
|
||||
|
||||
The solution eventually chosen was to modify \texttt{perf}, the standard
|
||||
profiling program for Linux, in order to gather statistics and benchmarks of
|
||||
its unwindings, and produce an alternative version of \texttt{libunwind} using
|
||||
the compiled debugging data, in order to interface it with \texttt{perf},
|
||||
so that \texttt{perf} can be benchmarked with both the standard stack unwinding data
|
||||
and the alternative experimental compiled format. As a welcome
|
||||
side effect, the experimental unwinding data is fully compatible with
|
||||
\texttt{libunwind}, and thus usable at practically no cost by any
|
||||
existing project using the common library \texttt{libunwind}.
|
||||
|
||||
% What is your solution to the question described in the last paragraph?
|
||||
%
|
||||
% Be careful, do \emph{not} give technical details, only rough ideas!
|
||||
%
|
||||
% Pay special attention to the description of the \emph{scientific} approach.
|
||||
|
||||
\subsection*{Arguments supporting its validity}
|
||||
|
||||
What is the evidence that your solution is a good solution?
|
||||
Experiments? Proofs?
|
||||
% What is the evidence that your solution is a good solution?
|
||||
% Experiments? Proofs?
|
||||
%
|
||||
% Comment the robustness of your solution: how does it rely/depend on the working assumptions?
|
||||
|
||||
Comment the robustness of your solution: how does it rely/depend on the working assumptions?
|
||||
The goal was to obtain a compiled version of unwinding data that was faster
|
||||
than DWARF, only reasonably larger, and reliable. The benchmarks mentioned have
|
||||
yielded convincing results: on the experimental setup created (detailed later
|
||||
in this report), the compiled version is up to 25 times faster than the DWARF
|
||||
version, while it remains only around 2.5 times bigger than the original data.
|
||||
|
||||
Even though the implementation is more a research prototype than a release
|
||||
version, it is still reasonably robust compared to \texttt{libunwind}, which is
|
||||
built for robustness. Corner cases are frequent while analyzing stack data, and
|
||||
even more so when analyzing them through a profiler; yet the prototype fails only
|
||||
on around 200 more cases than libunwind on a 27000-sample test (1099 failures,
|
||||
against 885 for libunwind).
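As a rough indication, and assuming both failure counts are taken over the same
27000 samples, these figures correspond to the following failure rates:
\[
  \frac{1099}{27000} \approx 4.1\,\%
  \qquad\text{against}\qquad
  \frac{885}{27000} \approx 3.3\,\%,
\]
that is, $1099 - 885 = 214$ additional failures, or about $0.8$ percentage
points.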
|
||||
|
||||
The prototype, unlike libunwind, does not support $100\,\%$ of the DWARF
|
||||
instructions present in the DWARF5 standard~\cite{dwarf5std}. It is also limited
|
||||
to the x86\_64 architecture, and relies to some extent on the Linux operating
|
||||
system. But none of those limitations are real problems in practice. As argued
|
||||
later on, the vast majority of the DWARF instructions actually used in the wild
|
||||
are implemented; supporting other processor architectures and ABIs is only a matter of
|
||||
time spent and engineering work; and the operating system dependency is only
|
||||
present in the libraries developed in order to interact with the compiled
|
||||
unwinding data, which can be developed for virtually any operating system.
|
||||
|
||||
\subsection*{Summary and future work}
|
||||
|
||||
What is next? In which respect is your approach general?
|
||||
What did your contribution bring to the area?
|
||||
What should be done now?
|
||||
What is the good \emph{next} question?
|
||||
In most everyday situations, the slowness of stack unwinding is not a
|
||||
problem, or even an annoyance. Yet a speed-up of up to 25 times on stack
|
||||
unwinding-heavy tasks, such as profiling, can be very useful for analyzing heavy
|
||||
programs, particularly if one wants to profile many times in order to analyze
|
||||
the impact of multiple changes. It can also be useful for exception-heavy
|
||||
programs~\qtodo{cite Stephen's software?}. Thus, it might be interesting to
|
||||
implement a more stable version, and try to interface it cleanly with
|
||||
mainstream tools, such as \texttt{perf}.
|
||||
|
||||
It might also be interesting to investigate whether it is possible to reach
|
||||
even greater speeds by using a more complex compilation process that has yet
|
||||
to be determined.
|
||||
|
||||
Another question worth exploring might be whether it is possible to further
|
||||
shrink the original DWARF unwinding data, which would be stored in a format not
|
||||
too far from the original standard, by applying techniques close to those
|
||||
used to shrink the compiled unwinding data.
|
||||
|
||||
% What is next? In which respect is your approach general?
|
||||
% What did your contribution bring to the area?
|
||||
% What should be done now?
|
||||
% What is the good \emph{next} question?
|
||||
|
||||
\pagestyle{plain}
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
\title{DWARF debugging data, compilation and verification}
|
||||
\title{DWARF debugging data, compilation and optimization}
|
||||
|
||||
\author{Théophile Bastian\\
|
||||
Under supervision of Francesco Zappa-Nardelli\\
|
||||
|
|
|
@ -3,9 +3,9 @@
|
|||
\definecolor{todobg}{HTML}{FF5F00}
|
||||
\definecolor{todofg}{HTML}{3700DA}
|
||||
\definecolor{notebg}{HTML}{87C23C}
|
||||
\definecolor{notefg}{HTML}{DF4431}
|
||||
\definecolor{notefg}{HTML}{BC3423}
|
||||
|
||||
\newcommand{\qtodo}[1]{\colorbox{todobg}{\textcolor{todofg}{#1}}}
|
||||
\newcommand{\todo}[1]{\qtodo{\textbf{TODO:}\.#1}}
|
||||
\newcommand{\qnote}[1]{\colorbox{notebg}{\textcolor{notefg}{[#1]}}}
|
||||
\newcommand{\note}[1]{\qnote{\textbf{NOTE:}\.#1}}
|
||||
\newcommand{\todo}[1]{\qtodo{\textbf{TODO:}\,#1}}
|
||||
\newcommand{\qnote}[1]{\colorbox{notebg}{\textcolor{notefg}{#1}}}
|
||||
\newcommand{\note}[1]{\qnote{\textbf{NOTE:}\,#1}}
|
||||
|
|