\title{DWARF debugging data, compilation and optimization}

\author{Théophile Bastian\\
Under supervision of Francesco Zappa-Nardelli\\
{\textsc{parkas}, \'Ecole Normale Supérieure de Paris}}

\date{March -- August 2018\\August 20, 2018}

\documentclass[11pt]{article}

\usepackage[left=2cm,right=2cm,top=2cm,bottom=2cm]{geometry}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{stmaryrd}
\usepackage{mathtools}
\usepackage{indentfirst}
\usepackage[utf8]{inputenc}
%\usepackage[backend=biber,style=alphabetic]{biblatex}
\usepackage[backend=biber]{biblatex}

\usepackage{../shared/my_listings}
\usepackage{../shared/my_hyperref}
\usepackage{../shared/specific}
\usepackage{../shared/common}
\usepackage{../shared/todo}

\addbibresource{../shared/report.bib}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}

%% Main title %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\maketitle

%% Fiche de synthèse %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\input{fiche_synthese}

%% Abstract %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{abstract}
\todo{Is there a need for an abstract, given the presence above of the
``fiche de synthèse''?}
\end{abstract}

%% Table of contents %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\tableofcontents

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% Main text content %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Stack unwinding data presentation}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Stack frames and unwinding}

On most platforms, programs use a \emph{call stack} to store information about
the nested function calls active at the current execution point, and to keep
track of their nesting. Each function call has its own \emph{stack frame}, an
entry of the call stack, whose precise contents are often specified in the
Application Binary Interface (ABI) of the platform, and otherwise left, to
varying extents, to the compiler. Those frames typically store function
arguments, machine registers that must be restored before returning, the
function's return address and local variables.

For various reasons, it can be useful, at some point of the execution of a
program, to inspect its call stack and extract information from it. For
instance, when running a debugger such as \prog{gdb}, a frequent operation is
to obtain a \emph{backtrace}, that is, the list of all nested function calls at
this point. To produce it, the debugger reads the stack to find the different
stack frames, and decodes them to identify the function names, parameter
values, etc.

This operation is far from trivial. Often, a stack frame only makes sense
given the correct machine register values, which can be restored from the
previous stack frame. This forces one to \emph{walk} the stack, reading the
entries one after the other, instead of peeking at some frame directly.
Moreover, the size of a stack frame is often hard to determine when looking at
an instruction other than \texttt{return}, which makes it hard to extract
single frames from the whole stack.

Interpreting a frame in order to get the machine state \emph{before} this
frame, and thus be able to decode the next frame recursively, is called
\emph{unwinding} a frame. For all the reasons above and more, additional data
is often necessary to perform stack unwinding. This data is often stored among
the debugging information of a program, and one common format for debugging
data is DWARF\@.
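
As a rough illustration of what \emph{walking} the stack means, the following C
sketch walks a toy stack. It assumes the simplest possible layout, in which
every frame begins with the caller's saved frame pointer, followed by the
return address. This layout, the array-based stack, the made-up return
addresses and the name \lstc{walk_stack} are all assumptions made for the
example; real code (\eg{} compiled without frame pointers) offers no such
guarantee, which is why additional unwinding data is needed.

\begin{lstlisting}[language=C]
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model: the stack is an array of 64-bit words indexed by "address".
 * ASSUMPTION: each frame starts with the caller's saved frame pointer,
 * immediately followed by the return address. */
enum { SAVED_FP = 0, RET_ADDR = 1 };

/* Walk the chain of frames starting at frame pointer `fp`, storing up to
 * `max` return addresses in `out`. Returns the number of frames visited. */
static size_t walk_stack(const uint64_t *stack, uint64_t fp,
                         uint64_t *out, size_t max)
{
    size_t depth = 0;
    while (fp != 0 && depth < max) {
        out[depth++] = stack[fp + RET_ADDR];
        fp = stack[fp + SAVED_FP];  /* restore the caller's frame pointer */
    }
    return depth;
}

int main(void)
{
    /* Frames of main (index 1), fct_a (index 3) and fct_b (index 5);
     * index 0 is unused so that 0 can mark the end of the chain. */
    const uint64_t stack[] = { 0, 0, 0x1000, 1, 0x2000, 3, 0x3000 };
    uint64_t ret[8];
    size_t n = walk_stack(stack, 5, ret, 8);

    for (size_t i = 0; i < n; i++)
        printf("#%zu return address 0x%" PRIx64 "\n", i, ret[i]);
    return 0;
}
\end{lstlisting}

Each iteration needs the frame pointer recovered by the previous one, so the
third frame cannot be decoded without first going through the two innermost
ones.
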

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Unwinding usage and frequency}

Stack unwinding is a more common operation than one might think at first. The
use case that first comes to mind is simply to get a stack trace of a program,
and to provide a debugger with the information it needs: for instance, when
inspecting a stack trace in \prog{gdb}, it is quite common to jump to a
previous frame:

\lstinputlisting{src/segfault/gdb_session}

To be able to do this, \prog{gdb} must restore \lstc{fct_a}'s context by
unwinding \lstc{fct_b}'s frame.

\medskip

Yet, stack unwinding (and thus debugging data) \emph{is not limited to
debugging}.

Another common usage is profiling. A profiling tool, such as \prog{perf} under
Linux, is used to measure and analyze in which functions a program spends its
time, to identify bottlenecks and to find out which parts are critical to
optimize. To do so, modern profilers pause the traced program at regular, short
intervals, inspect its stack, and determine which function is currently being
run. They also often unwind the stack to determine the call path to this
function, and thus which functions take time indirectly: \eg, a function
\lstc{fct_a} can call both \lstc{fct_b} and \lstc{fct_c}, which are quite
heavy; the program then spends practically no time directly in \lstc{fct_a},
but a lot of time in the calls to the other two functions made by \lstc{fct_a}.
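
The difference between time spent directly in a function and time spent in its
callees can be sketched as follows. This is an illustrative C model, not how
\prog{perf} is implemented: the \lstc{sample} structure, the function
\lstc{count_samples} and the four hard-coded samples are hypothetical, and
each sample is assumed to already contain the unwound call path.

\begin{lstlisting}[language=C]
#include <stddef.h>
#include <stdio.h>

enum fn { FCT_MAIN, FCT_A, FCT_B, FCT_C, FN_COUNT };

#define MAX_DEPTH 4

/* One profiler sample: the unwound call path, outermost caller first. */
struct sample {
    size_t depth;
    enum fn path[MAX_DEPTH];
};

/* For every function, count the samples in which it is the running function
 * (self) and those in which it appears anywhere on the path (inclusive). */
static void count_samples(const struct sample *samples, size_t n,
                          unsigned self[FN_COUNT],
                          unsigned inclusive[FN_COUNT])
{
    for (size_t f = 0; f < FN_COUNT; f++)
        self[f] = inclusive[f] = 0;

    for (size_t i = 0; i < n; i++) {
        const struct sample *s = &samples[i];
        if (s->depth == 0)
            continue;
        self[s->path[s->depth - 1]]++;

        int seen[FN_COUNT] = {0};  /* count a function once per sample */
        for (size_t d = 0; d < s->depth; d++) {
            if (!seen[s->path[d]]) {
                seen[s->path[d]] = 1;
                inclusive[s->path[d]]++;
            }
        }
    }
}

int main(void)
{
    /* HYPOTHETICAL samples: fct_a is almost never the running function. */
    const struct sample samples[] = {
        { 3, { FCT_MAIN, FCT_A, FCT_B } },
        { 3, { FCT_MAIN, FCT_A, FCT_B } },
        { 3, { FCT_MAIN, FCT_A, FCT_C } },
        { 2, { FCT_MAIN, FCT_A } },
    };
    unsigned self[FN_COUNT], inclusive[FN_COUNT];

    count_samples(samples, 4, self, inclusive);
    printf("fct_a: self=%u inclusive=%u\n", self[FCT_A], inclusive[FCT_A]);
    return 0;
}
\end{lstlisting}

Without unwinding, only the \lstc{self} counters could be filled in, and
\lstc{fct_a} would look nearly free even though every sample goes through it.
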

Exception handling also requires a stack unwinding mechanism in most languages.
Indeed, an exception is completely different from a \lstc{return}: while the
latter returns to the previous function, the former can be caught by virtually
any function in the call path, at any point of the function. It is thus
necessary to be able to unwind frames, one by one, until a suitable
\lstc{catch} block is found. The C++ language, for one, includes a
stack-unwinding library similar to \prog{libunwind} in its runtime.

In both of these cases, performance \emph{can} be a problem. In the latter,
slow unwinding directly impacts the overall program performance, particularly
if many exceptions are thrown and caught far up their call path. In the former,
profiling \emph{is} performance-heavy and often quite slow when analyzing large
programs anyway.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{DWARF format}

The DWARF format was first standardized as the debugging information format
for ELF executable binaries. It is now commonly used across a wide variety of
binary formats to store debugging information. As of now, the latest DWARF
standard is DWARF 5~\cite{dwarf5std}, which is openly accessible.

The DWARF data commonly includes type information about the variables in the
original programming language, the correspondence between assembly instructions
and lines of the original source file, \ldots
The format also specifies a way to represent unwinding data, as described
above, in an ELF section originally called \lstc{.debug_frame}, most often
found as \ehframe.

For any binary, debugging information can easily get quite large if no
attention is paid to keeping it as compact as possible. In this matter, DWARF
does an excellent job, and everything is stored in a very compact way. This,
however, as we will see, makes it both difficult to parse correctly (with \eg{}
variable-length integers) and quite slow to interpret.
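
To give an idea of what parsing such compact data involves, here is a minimal
sketch of a decoder for unsigned LEB128, the variable-length integer encoding
used by DWARF. The function name \lstc{uleb128_decode} and its error
convention are choices made for this example, not part of any library.

\begin{lstlisting}[language=C]
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one unsigned LEB128 integer from `buf` (at most `len` bytes).
 * Stores the value in `*value` and returns the number of bytes consumed,
 * or 0 if the input is truncated or the encoding exceeds 10 bytes. */
static size_t uleb128_decode(const uint8_t *buf, size_t len, uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t byte = buf[i];
        if (shift >= 64)
            return 0;                   /* too long for a 64-bit value */
        result |= (uint64_t)(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) {       /* high bit clear: last byte */
            *value = result;
            return i + 1;
        }
        shift += 7;                     /* 7 payload bits per byte */
    }
    return 0;                           /* ran out of bytes */
}

int main(void)
{
    const uint8_t encoded[] = { 0xE5, 0x8E, 0x26 };
    uint64_t value;
    size_t used = uleb128_decode(encoded, sizeof encoded, &value);

    if (used != 0)
        printf("%" PRIu64 " (%zu bytes)\n", value, used);
    return 0;
}
\end{lstlisting}

The length of an encoded integer is only known once its last byte has been
read, so a parser cannot skip ahead to a given field without decoding
everything that precedes it.
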

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{DWARF unwinding data}

The unwinding data, which we will from now on call the \ehframe, contains, for
each possible instruction pointer (that is, an instruction address within the
program), a set of ``registers'' that can be unwound, and a rule describing how
to do so.

The DWARF language is completely agnostic of the platform and ABI, and in
particular of a platform's registers. Thus, when talking about DWARF, a
register is merely a numerical identifier that is often, but not necessarily,
mapped to a real machine register by the ABI\@.

In practice, this data takes the form of a collection of tables, one table per
Frame Description Entry (FDE), which most often corresponds to a function. Each
column of the table is a register (\eg{} \reg{rsp}), with two additional
special registers, CFA (Canonical Frame Address) and RA (Return Address),
containing respectively the base pointer of the current stack frame and the
return address of the current function (\ie{} for x86\_64, the unwound value of
\reg{rip}, the instruction pointer). Each row of the table corresponds to a
particular instruction pointer within the instruction pointer range of the
tabulated FDE (assuming an FDE maps directly to a function, this range is
simply the IP range of the given function in the \lstc{.text} section of the
binary). A row is valid from its start IP to the start IP of the next row, or
to the end IP of the FDE if it is the last row.
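
The validity rule can be stated as a formula. The notation is introduced here
for illustration only: write $\mathit{ip}_1 < \dots < \mathit{ip}_n$ for the
start IPs of the $n$ rows of an FDE, and $\mathit{ip}_{\mathrm{end}}$ for the
end IP of that FDE. Row $i$ then applies to an instruction pointer
$\mathit{ip}$ exactly when
\[
    \mathit{ip}_i \leq \mathit{ip} < \mathit{ip}_{i+1},
    \qquad \text{with the convention } \mathit{ip}_{n+1} =
    \mathit{ip}_{\mathrm{end}}.
\]
The intervals are disjoint and cover the whole IP range of the FDE, so every
instruction of the FDE is described by exactly one row.
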

\begin{minipage}{0.45\textwidth}
\lstinputlisting[language=C, firstline=3, lastline=12]
{src/fib7/fib7.c}
\end{minipage} \hfill \begin{minipage}{0.45\textwidth}
\lstinputlisting[language=C]{src/fib7/fib7.fde}
\end{minipage}

For instance, the C source code above, when compiled with
\lstbash{gcc -O0 -fomit-frame-pointer}, yields the table on its right. During
the function prologue, \ie{} for $\mhex{675} \leq \reg{rip} < \mhex{679}$, the
stack frame only contains the return address, thus the CFA is 8 bytes above
\reg{rsp} (that is, the value of \reg{rsp} before the call), and the return
address is precisely at \reg{rsp}. Then, 9 integers of 8 bytes each (8 for
\lstc{fibo}, one for \lstc{pos}) are allocated on the stack, which puts the CFA
80 bytes above \reg{rsp}, and the return address still 8 bytes below the
CFA\@. Finally, by the end of the function, the local variables are discarded
and \reg{rsp} is reset to its value from the first row.
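
Looking up such a table can be sketched in C as follows. The structures and
the function \lstc{find_row} are hypothetical names for this example. The
first two rows use the values discussed above; the start address of the third
row and the end address of the FDE are placeholders, not values read from the
actual table.

\begin{lstlisting}[language=C]
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One row of the unwinding table, restricted to the two special columns. */
struct row {
    uint64_t start_ip;    /* first IP for which the row is valid */
    int64_t  cfa_offset;  /* CFA = rsp + cfa_offset */
    int64_t  ra_offset;   /* return address stored at CFA + ra_offset */
};

struct fde {
    uint64_t end_ip;          /* first IP past the FDE */
    size_t   row_count;
    const struct row *rows;   /* sorted by increasing start_ip */
};

/* Return the row valid at `ip`, or NULL if `ip` is outside the FDE. */
static const struct row *find_row(const struct fde *fde, uint64_t ip)
{
    const struct row *found = NULL;

    if (fde->row_count == 0 || ip < fde->rows[0].start_ip
            || ip >= fde->end_ip)
        return NULL;
    for (size_t i = 0; i < fde->row_count && fde->rows[i].start_ip <= ip; i++)
        found = &fde->rows[i];
    return found;
}

int main(void)
{
    static const struct row rows[] = {
        { 0x675,  8, -8 },  /* only the return address is on the stack */
        { 0x679, 80, -8 },  /* 8 + 9 * 8 = 80 bytes with the locals */
        { 0x6a0,  8, -8 },  /* HYPOTHETICAL address: locals discarded */
    };
    const struct fde fde = { 0x6a1 /* HYPOTHETICAL end */, 3, rows };

    const struct row *r = find_row(&fde, 0x680);
    if (r != NULL)
        printf("at 0x680: CFA = rsp + %lld\n", (long long)r->cfa_offset);
    return 0;
}
\end{lstlisting}

With an explicit table like this one, a lookup only compares start addresses
and could even use a binary search.
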

However, DWARF data isn't actually stored as a table in the binary files. The
first row starts at the first IP of the FDE, and must define at least its
CFA\@. Then, once all relevant registers are defined, a new row can be defined
by providing a location offset (\eg{} here $4$); the new row starts as a clone
of the previous one, and can then be altered (\eg{} here by setting \lstc{CFA}
to $\reg{rsp} + 80$). This means that every row is defined \wrt{} the previous
one, and that the IP of a row cannot be determined without first evaluating
all the rows before it. Thus, unwinding a frame from an IP close to the end of
the FDE requires evaluating almost every DWARF row of the table before reaching
the relevant information, which drastically slows down the unwinding process.
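
The cost of this encoding can be made concrete with a small C sketch. The two
operations below are a deliberately simplified model, loosely inspired by the
DWARF call frame instructions that advance the location and redefine the CFA
offset; the real encoding is more compact and has many more opcodes. All names
are hypothetical, and the second location offset (\lstc{0x27}) is a
placeholder.

\begin{lstlisting}[language=C]
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* SIMPLIFIED model of the row-building instructions. */
enum op_kind {
    OP_ADVANCE_LOC,     /* start a new row `arg` bytes further */
    OP_DEF_CFA_OFFSET   /* in the current row, set CFA = rsp + arg */
};

struct op {
    enum op_kind kind;
    uint64_t arg;
};

/* Evaluate `ops` from the start of the FDE until the row covering `ip` is
 * complete. Stores the CFA offset of that row in `*cfa_offset` and returns
 * the number of instructions that had to be evaluated. */
static size_t eval_until(const struct op *ops, size_t n, uint64_t fde_start,
                         uint64_t ip, uint64_t *cfa_offset)
{
    uint64_t loc = fde_start;
    size_t i;

    *cfa_offset = 0;
    for (i = 0; i < n; i++) {
        if (ops[i].kind == OP_ADVANCE_LOC) {
            if (loc + ops[i].arg > ip)
                break;              /* the next row starts past `ip` */
            loc += ops[i].arg;      /* clone the row, move to its start */
        } else {
            *cfa_offset = ops[i].arg;
        }
    }
    return i;
}

int main(void)
{
    const struct op ops[] = {
        { OP_DEF_CFA_OFFSET, 8 },
        { OP_ADVANCE_LOC, 4 },
        { OP_DEF_CFA_OFFSET, 80 },
        { OP_ADVANCE_LOC, 0x27 },   /* HYPOTHETICAL offset */
        { OP_DEF_CFA_OFFSET, 8 },
    };
    uint64_t cfa;
    size_t steps = eval_until(ops, 5, 0x675, 0x6a0, &cfa);

    printf("CFA = rsp + %llu after %zu instructions\n",
           (unsigned long long)cfa, steps);
    return 0;
}
\end{lstlisting}

An IP in the first row is resolved after a single instruction, whereas an IP
in the last row requires evaluating the whole sequence: the work grows
linearly with the position of the row.
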

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{How big are FDEs?}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Unwinding state-of-the-art}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{General statistics}
\todo{}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Stack unwinding data compilation}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Compilation: \ehelfs}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{First results}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Space optimization}
\todo{}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Benchmarking}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Requirements}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Presentation of \prog{perf}}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Benchmarking with \prog{perf}}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Other explored methods}
\todo{}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Results}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Measured time performance}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Measured compactness}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Instructions coverage}
\todo{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End main text content %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% Bibliography %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\printbibliography{}

\end{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%