More DWARF details
This commit is contained in:
parent
0a7b8b4e64
commit
be47fefd98
7 changed files with 91 additions and 2 deletions
|
@ -100,11 +100,69 @@ original programming language, correspondence of assembly instructions with a
|
|||
line in the original source file, \ldots
|
||||
The format also specifies a way to represent unwinding data, as described in
|
||||
the previous paragraph, in an ELF section originally called
|
||||
\lstc{.debug_frame}, most often found as \lstc{.eh_frame}.
|
||||
\lstc{.debug_frame}, most often found as \ehframe.
|
||||
|
||||
For any binary, debugging information can easily get quite large if no
|
||||
attention is payed to keeping it as compact as possible. In this matter, DWARF
|
||||
does an excellent job, and everything is stored in a very compact way. This,
|
||||
however, as we will see, makes it both difficult to parse correctly (with \eg{}
|
||||
variable-length integers) and quite slow to interpret.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{DWARF unwinding data}
|
||||
\todo{}
|
||||
|
||||
The unwinding data, which we will call from now on the \ehframe, contains, for
|
||||
each possible instruction pointer (that is, an instruction address within the
|
||||
program), a set of ``registers'' that can be unwound, and a rule describing how
|
||||
to do so.
|
||||
|
||||
The DWARF language is completely agnostic of the platform and ABI, and in
|
||||
particular, is completely agnostic of a particular platform's registers. Thus,
|
||||
when talking about DWARF, a register is merely a numerical identifier that is
|
||||
often, but not necessarily, mapped to a real machine register by the ABI\@.
|
||||
|
||||
In practice, this data takes the form of a collection of tables, one table per
|
||||
Frame Description Entry (FDE), which most often corresponds to a function. Each
|
||||
column of the table is a register (\eg{} \reg{rsp}), with two additional
|
||||
special registers, CFA (Canonical Frame Address) and RA (Return Address),
|
||||
containing respectively the base pointer of the current stack frame and the
|
||||
return address of the current function (\ie{} for x86\_64, the unwound value of
|
||||
\reg{rip}, the instruction pointer). Each row of the table is a particular
|
||||
instruction pointer, within the instruction pointer range of the tabulated FDE
|
||||
(assuming a FDE maps directly to a function, this range is simply the IP range
|
||||
of the given function in the \lstc{.text} section of the binary), a row being
|
||||
valid from its start IP to the start IP of the next row, or the end IP of the
|
||||
FDE if it is the last row.
|
||||
|
||||
\begin{minipage}{0.45\textwidth}
|
||||
\lstinputlisting[language=C, firstline=3, lastline=12]
|
||||
{src/fib7/fib7.c}
|
||||
\end{minipage} \hfill \begin{minipage}{0.45\textwidth}
|
||||
\lstinputlisting[language=C]{src/fib7/fib7.fde}
|
||||
\end{minipage}
|
||||
|
||||
For instance, the C source code above, when compiled with \lstbash{gcc -O0
|
||||
-fomit-frame-pointer}, gives the table at its right. During the function
|
||||
prelude, \ie{} for $\mhex{675} \leq \reg{rip} < \mhex{679}$, the stack frame
|
||||
only contains the return address, thus the CFA is 8 bytes above \reg{rsp}
|
||||
(which was the value of \reg{rsp} before the call), and the return address is
|
||||
precisely at \reg{rsp}. Then, 9 integers of 8 bytes each (8 for \lstc{fibo},
|
||||
one for \lstc{pos}) are allocated on the stack, which puts the CFA 80 bytes
|
||||
above \reg{rsp}, and the return address still 8 bytes below the CFA\@. Then, by
|
||||
the end of the function, the local variables are discarded and \reg{rsp} is
|
||||
reset to its value from the first row.
|
||||
|
||||
However, DWARF data isn't actually stored as a table in the binary files. The
|
||||
first row has the location of the first IP in the FDE, and must define at least
|
||||
its CFA\@. Then, when all relevant registers are defined, it is possible to
|
||||
define a new row by providing a location offset (\eg{} here $4$), and the new
|
||||
row is defined as a clone of the previous one, which can then be altered (\eg{}
|
||||
here by setting \lstc{CFA} to $\reg{rsp} + 80$). This means that every line is
|
||||
defined \wrt{} the previous one, and that the IPs of the successive rows cannot
|
||||
be determined before evaluating every row before. Thus, unwinding a frame from
|
||||
an IP close to the end of the frame will require evaluating pretty much every
|
||||
DWARF row in the table before reaching the relevant information, slowing down
|
||||
drastically the unwinding process.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{How big are FDEs?}
|
||||
|
|
1
report/src/.gitignore
vendored
Normal file
1
report/src/.gitignore
vendored
Normal file
|
@ -0,0 +1 @@
|
|||
*.bin
|
4
report/src/fib7/Makefile
Normal file
4
report/src/fib7/Makefile
Normal file
|
@ -0,0 +1,4 @@
|
|||
all: fib7.bin
|
||||
|
||||
fib7.bin: fib7.c
|
||||
gcc -O1 $< -o $@
|
17
report/src/fib7/fib7.c
Normal file
17
report/src/fib7/fib7.c
Normal file
|
@ -0,0 +1,17 @@
|
|||
#include <stdio.h>
|
||||
|
||||
int fib7() {
|
||||
int fibo[8];
|
||||
fibo[0] = 1;
|
||||
fibo[1] = 1;
|
||||
for(int pos = 2; pos < 8; ++pos)
|
||||
fibo[pos] =
|
||||
fibo[pos - 1]
|
||||
+ fibo[pos - 2];
|
||||
return fibo[7];
|
||||
}
|
||||
|
||||
int main(void) {
|
||||
printf("%d\n", fib7());
|
||||
return 0;
|
||||
}
|
5
report/src/fib7/fib7.fde
Normal file
5
report/src/fib7/fib7.fde
Normal file
|
@ -0,0 +1,5 @@
|
|||
[...] FDE [...] pc=675..6f3
|
||||
LOC CFA ra
|
||||
0000000000000675 rsp+8 c-8
|
||||
0000000000000679 rsp+80 c-8
|
||||
00000000000006f2 rsp+8 c-8
|
|
@ -2,6 +2,7 @@
|
|||
|
||||
\newcommand{\ie}{\textit{ie.}}
|
||||
\newcommand{\eg}{\textit{eg.}}
|
||||
\newcommand{\wrt}{\textit{wrt.}}
|
||||
|
||||
\newcommand{\set}[1]{\left\{ #1 \right\}}
|
||||
\newcommand{\card}[1]{\left\vert{} #1 \right\vert}
|
||||
|
|
|
@ -3,6 +3,9 @@
|
|||
\newcommand{\prog}[1]{\texttt{#1}}
|
||||
\newcommand{\ehelf}{\texttt{eh\_elf}}
|
||||
\newcommand{\ehelfs}{\texttt{eh\_elfs}}
|
||||
\newcommand{\ehframe}{\lstc{.eh_frame}}
|
||||
|
||||
\newcommand{\mhex}[1]{0\texttt{x}#1}
|
||||
|
||||
%% DWARF semantics
|
||||
\newcommand{\dwcfa}[1]{\texttt{DW\_CFA\_#1}}
|
||||
|
|
Loading…
Reference in a new issue