|
|
|
@@ -43,12 +43,6 @@ Under supervision of Francesco Zappa-Nardelli\\
|
|
|
|
|
%% Fiche de synthèse %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
|
\input{fiche_synthese}
|
|
|
|
|
|
|
|
|
|
%% Abstract %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
|
\begin{abstract}
|
|
|
|
|
\todo{Is there a need for an abstract, given the presence above of the
|
|
|
|
|
``fiche de synthèse''?}
|
|
|
|
|
\end{abstract}
|
|
|
|
|
|
|
|
|
|
%% Table of contents %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
|
\tableofcontents
|
|
|
|
|
|
|
|
|
@@ -185,34 +179,52 @@ valid from its start IP to the start IP of the next row, or the end IP of the
|
|
|
|
|
FDE if it is the last row.
|
|
|
|
|
|
|
|
|
|
\begin{minipage}{0.45\textwidth}
|
|
|
|
|
\lstinputlisting[language=C, firstline=3, lastline=12]
|
|
|
|
|
\lstinputlisting[language=C, firstline=3, lastline=12,
|
|
|
|
|
caption={Original C},label={lst:ex1_c}]
|
|
|
|
|
{src/fib7/fib7.c}
|
|
|
|
|
\end{minipage} \hfill \begin{minipage}{0.45\textwidth}
|
|
|
|
|
\lstinputlisting[language=C]{src/fib7/fib7.fde}
|
|
|
|
|
\lstinputlisting[language=C,caption={Processed DWARF},label={lst:ex1_dw}]
|
|
|
|
|
{src/fib7/fib7.fde}
|
|
|
|
|
\lstinputlisting[language=C,caption={Raw DWARF},label={lst:ex1_dwraw}]
|
|
|
|
|
{src/fib7/fib7.raw_fde}
|
|
|
|
|
\end{minipage}
|
|
|
|
|
|
|
|
|
|
For instance, the C source code above, when compiled with \lstbash{gcc -O0
-fomit-frame-pointer}, gives the table to its right. During the function
prologue, \ie{} for $\mhex{675} \leq \reg{rip} < \mhex{679}$, the stack frame
contains only the return address. The CFA is thus 8 bytes above \reg{rsp} (the
CFA being the value of \reg{rsp} before the call), and the return address is
stored precisely at \reg{rsp}. Then, 9 integers of 8 bytes each (8 for
\lstc{fibo}, one for \lstc{pos}) are allocated on the stack. These $9 \times 8
= 72$ bytes, added to the 8 bytes of the return address, put the CFA 80 bytes
above \reg{rsp}, while the return address remains 8 bytes below the CFA\@.
Finally, by the end of the function, the local variables are discarded and
\reg{rsp} is reset to its value from the first row.
|
|
|
|
|
\begin{minipage}{0.45\textwidth}
|
|
|
|
|
\lstinputlisting[language={[x86masm]Assembler},lastline=11,
|
|
|
|
|
caption={Generated assembly},label={lst:ex1_asm}]
|
|
|
|
|
{src/fib7/fib7.s}
|
|
|
|
|
\end{minipage} \hfill \begin{minipage}{0.45\textwidth}
|
|
|
|
|
\lstinputlisting[language={[x86masm]Assembler},firstline=12,
|
|
|
|
|
firstnumber=last]
|
|
|
|
|
{src/fib7/fib7.s}
|
|
|
|
|
\end{minipage}
|
|
|
|
|
|
|
|
|
|
However, DWARF data is not actually stored as a table in the binary files. The
first row holds the location of the first IP in the FDE, and must define at
least its CFA\@. Then, once all relevant registers are defined, a new row can
be started by providing a location offset (\eg{} here $4$). The new row begins
as a clone of the previous one, and can then be altered (\eg{} here by setting
\lstc{CFA} to $\reg{rsp} + 80$). This means that every row is defined \wrt{}
the previous one, and that the IP of a given row cannot be determined without
first evaluating every row that precedes it. Thus, unwinding a frame from an IP
close to the end of the FDE requires evaluating nearly every DWARF row in the
table before reaching the relevant information, which drastically slows down
the unwinding process.
|
|
|
|
|
For instance, the C source code in Listing~\ref{lst:ex1_c} above, when compiled
with \lstbash{gcc -O1 -fomit-frame-pointer -fno-stack-protector}, yields the
assembly code in Listing~\ref{lst:ex1_asm}. Interpreting the generated
\ehframe{} with \lstbash{readelf -wF} gives the (slightly edited)
Listing~\ref{lst:ex1_dw}. During the function prologue, \ie{} for $\mhex{615}
\leq \reg{rip} < \mhex{619}$, the stack frame contains only the return address.
The CFA is thus 8 bytes above \reg{rsp} (the CFA being the value of \reg{rsp}
before the call), and the return address is stored precisely at \reg{rsp}.
Then, 9 integers of 8 bytes each (8 for \lstc{fibo}, one for \lstc{pos}) are
allocated on the stack. These $9 \times 8 = 72$ bytes, added to the 8 bytes of
the return address, put the CFA 80 bytes above \reg{rsp}, while the return
address remains 8 bytes below the CFA\@. Finally, by the end of the function,
the local variables are discarded and \reg{rsp} is reset to its value from the
first row.
|
|
|
|
|
|
|
|
|
|
However, DWARF data is not actually stored as a table in the binary files, but
rather in the form shown in Listing~\ref{lst:ex1_dwraw}. The first row holds
the location of the first IP in the FDE, and must define at least its CFA\@.
Then, once all relevant registers are defined, a new row can be started by
providing a location offset (\eg{} here $4$). The new row begins as a clone of
the previous one, and can then be altered (\eg{} here by setting \lstc{CFA} to
$\reg{rsp} + 80$). This means that every row is defined \wrt{} the previous
one, and that the IP of a given row cannot be determined without first
evaluating every row that precedes it. Thus, unwinding a frame from an IP close
to the end of the FDE requires evaluating nearly every DWARF row in the table
before reaching the relevant information, which drastically slows down the
unwinding process.
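
The sequential nature of this lookup can be illustrated with a minimal C
sketch. It is not the actual DWARF encoding: the two operations below are
simplified, hypothetical stand-ins for the real call frame instructions, only
the CFA offset is tracked, and the location offset \lstc{0x3f} of the last row
is an assumed value chosen for illustration. The start address, the offset of
$4$ and the CFA offsets $8$ and $80$ are those of the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for DWARF call frame instructions. */
typedef enum { OP_ADVANCE_LOC, OP_DEF_CFA_OFFSET } op_kind;

typedef struct {
    op_kind kind;
    uint64_t arg; /* IP delta for OP_ADVANCE_LOC, offset from rsp otherwise */
} cfa_op;

typedef struct {
    uint64_t start_ip;   /* first IP for which the row is valid */
    uint64_t cfa_offset; /* CFA = rsp + cfa_offset */
} cfa_row;

/* Replays the instruction stream from the start of the FDE until the row
 * covering `ip` is complete. The number of instructions evaluated is stored
 * in *steps; it grows with the distance of `ip` from the start of the FDE. */
cfa_row row_for_ip(uint64_t fde_start, const cfa_op *ops, size_t n_ops,
                   uint64_t ip, size_t *steps)
{
    assert(ops != NULL && steps != NULL);
    cfa_row row = { fde_start, 0 };
    *steps = 0;
    for (size_t i = 0; i < n_ops; i++) {
        if (ops[i].kind == OP_ADVANCE_LOC) {
            uint64_t next_start = row.start_ip + ops[i].arg;
            if (ip < next_start)
                break;                 /* the current row already covers ip */
            row.start_ip = next_start; /* new row, cloned from the previous */
        } else {
            row.cfa_offset = ops[i].arg; /* alter the current row */
        }
        (*steps)++;
    }
    return row;
}

/* Instruction stream modelled on the example; 0x3f is an assumed delta. */
static const cfa_op demo_ops[] = {
    { OP_DEF_CFA_OFFSET, 8 },  /* first row: CFA = rsp + 8 */
    { OP_ADVANCE_LOC, 4 },     /* next row starts 4 bytes later */
    { OP_DEF_CFA_OFFSET, 80 }, /* CFA = rsp + 80 */
    { OP_ADVANCE_LOC, 0x3f },  /* assumed location offset */
    { OP_DEF_CFA_OFFSET, 8 },  /* locals discarded: CFA = rsp + 8 */
};

/* Looks up `ip` in the demo stream, whose FDE is assumed to start at 0x615. */
cfa_row demo_lookup(uint64_t ip, size_t *steps)
{
    return row_for_ip(0x615, demo_ops, sizeof demo_ops / sizeof demo_ops[0],
                      ip, steps);
}
```

Under these assumptions, looking up an IP in the first row evaluates a single
instruction, whereas an IP in the last row forces the whole stream to be
replayed. A precomputed table sorted by start IP would instead allow a binary
search, at the cost of the space that the compact encoding saves.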
|
|
|
|
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
|
\subsection{How big are FDEs?}
|
|
|
|
@@ -565,6 +577,7 @@ The major drawback of this approach, without any particular care taken, is the
|
|
|
|
|
space waste.
|
|
|
|
|
|
|
|
|
|
\begin{table}[h]
|
|
|
|
|
\centering
|
|
|
|
|
\begin{tabular}{r r r r r r}
|
|
|
|
|
\toprule
|
|
|
|
|
\thead{Shared object} & \thead{Original \\ program size}
|
|
|
|
|