Start describing DWARF

This commit is contained in:
Théophile Bastian 2018-08-01 19:55:13 +02:00
parent 5d716351da
commit 0a7b8b4e64

View file

@ -55,14 +55,55 @@ Under supervision of Francesco Zappa-Nardelli\\
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Stack frames and unwinding} \subsection{Stack frames and unwinding}
\todo{}
On most platforms, programs make use of a \emph{call stack} to store
information about the nested function calls at the current execution point, and
keep track of their nesting. Each function call has its own \emph{stack frame},
an entry of the call stack, whose precise contents are often specified in the
Application Binary Interface (ABI) of the platform, and left to various extents
up to the compiler. Those frames are typically used for storing function
arguments, machine registers that must be restored before returning, the
function's return address and local variables.
For various reasons, it might be interesting, at some point of the execution of
a program, to glance at its program stack and be able to extract informations
from it. For instance, when running a debugger such as \prog{gdb}, a frequent
usage is to obtain a \emph{backtrace}, that is, the list of all nested function
calls at this point. This actually reads the stack to find the different stack
frames, and decode them to identify the function names, parameter values, etc.
This operation is far from trivial. Often, a stack frame will only make sense
with correct machine registers values, which can be restored from the previous
stack frame, imposing to \emph{walk} the stack, reading the entries one after
the other, instead of peeking at some frame directly. Moreover, the size of one
stack frame is often not that easy to determine when looking at some
instruction other than \texttt{return}, making it hard to extract single frames
from the whole stack.
Interpreting a frame in order to get the machine state \emph{before} this
frame, and thus be able to decode the next frame recursively, is called
\emph{unwinding} a frame. For all the reasons above and more, it is often
necessary to have additional data to perform stack unwinding. This data is
often stored among the debugging informations of a program, and one common
format of debugging data is DWARF\@.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{DWARF format} \subsection{DWARF format}
\todo{}
The DWARF format was first standardized as the format for debugging
information of the ELF executable binaries. It is now commonly used across a
wide variety of binary formats to store debugging information. As of now, the
latest DWARF standard is DWARF 5~\cite{dwarf5std}, which is openly accessible.
The DWARF data commonly includes type information about the variables in the
original programming language, correspondence of assembly instructions with a
line in the original source file, \ldots
The format also specifies a way to represent unwinding data, as described in
the previous paragraph, in an ELF section originally called
\lstc{.debug_frame}, most often found as \lstc{.eh_frame}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{DWARF functioning} \subsection{DWARF unwinding data}
\todo{} \todo{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%