Review and reword end of §1, §3 and §4

Théophile Bastian 2018-08-08 14:01:55 +02:00
parent b761f360cc
commit b128ddd571
3 changed files with 176 additions and 110 deletions


@ -88,16 +88,19 @@ before returning). Those preserved registers are \reg{rbx}, \reg{rsp},
conventions}\label{fig:call_stack} conventions}\label{fig:call_stack}
\end{wrapfigure} \end{wrapfigure}
The register \reg{rsp} is supposed to always point just past the last used The register \reg{rsp} is supposed to always point to the last used memory cell
memory cell in the stack, thus, when the process just enters a new function, in the stack, thus, when the process just enters a new function, \reg{rsp}
\reg{rsp} points 8 bytes after the location of the return address. Then, the points right to the location of the return address\footnote{Remember that since
compiler might use \reg{rbp} (``base pointer'') to save this value of the stack grows \emph{downwards} in memory, the arrow of \reg{rsp} points
\reg{rip}, by writing the old value of \reg{rbp} just below the return address \emph{below} the RA cell in the figure, and yet the memory cell indexed is the
on the stack, then copying \reg{rsp} to \reg{rbp}. This makes it easy to find one \emph{above} in the drawing, that is, the RA.}. Then, the compiler might
the return address from anywhere within the function, and also allows for easy use \reg{rbp} (``base pointer'') to save this value of \reg{rip}, by writing
addressing of local variables. Yet, using \reg{rbp} to save \reg{rip} is not the old value of \reg{rbp} just below the return address on the stack, then
always done, since it somehow ``wastes'' a register. This decision is, on copying \reg{rsp} to \reg{rbp}. This makes it easy to find the return address
x86\_64 System V, up to the compiler. from anywhere within the function, and also allows for easy addressing of local
variables. Yet, using \reg{rbp} to save \reg{rip} is not always done, since it
somehow ``wastes'' a register. This decision is, on x86\_64 System V, up to the
compiler.
Often, a function will start by subtracting some value from \reg{rsp}, allocating Often, a function will start by subtracting some value from \reg{rsp}, allocating
some space in the stack frame for its local variables. Then, it will push on some space in the stack frame for its local variables. Then, it will push on
@ -242,52 +245,92 @@ when talking about DWARF, a register is merely a numerical identifier that is
often, but not necessarily, mapped to a real machine register by the ABI\@. often, but not necessarily, mapped to a real machine register by the ABI\@.
In practice, this data takes the form of a collection of tables, one table per In practice, this data takes the form of a collection of tables, one table per
Frame Description Entry (FDE), which most often corresponds to a function. Each Frame Description Entry (FDE). A FDE, in turn, is a DWARF entry describing such
column of the table is a register (\eg{} \reg{rsp}), with two additional a table, that has a range of IPs on which it has authority. Most often, but not
necessarily, it corresponds to a single function in the original source code.
Each column of the table is a register (\eg{} \reg{rsp}), with two additional
special registers, CFA (Canonical Frame Address) and RA (Return Address), special registers, CFA (Canonical Frame Address) and RA (Return Address),
containing respectively the base pointer of the current stack frame and the containing respectively the base pointer of the current stack
return address of the current function (\ie{} for x86\_64, the unwound value of frame\footnote{The CFA is most commonly thought of as the base pointer of the
\reg{rip}, the instruction pointer). Each row of the table is a particular frame, yet this is not enforced by DWARF\@. The CFA is used as an address from
instruction pointer, within the instruction pointer range of the tabulated FDE which other registers will be deduced as offsets, and although it is supposed
(assuming a FDE maps directly to a function, this range is simply the IP range to be the actual base pointer, it can be anything as long as it is close enough
of the given function in the \lstc{.text} section of the binary), a row being to the addresses that will be deduced from it.} and the return address of the
valid from its start IP to the start IP of the next row, or the end IP of the current function (\ie{} for x86\_64, the unwound value of \reg{rip}, the
FDE if it is the last row. instruction pointer). Each row has a certain validity interval, on which it
describes accurate unwinding data. This range starts at the instruction pointer
it is associated with, and ends at the start IP of the next table row (or the
end IP of the current FDE if it was the last row). In particular, there can be
no ``IP hole'' within a FDE --~unlike FDEs themselves, which can leave holes
between them.
\begin{minipage}{0.45\textwidth} \begin{figure}[h]
\lstinputlisting[language=C, firstline=3, lastline=12, \begin{minipage}{0.45\textwidth}
caption={Original C},label={lst:ex1_c}] \lstinputlisting[language=C, firstline=3, lastline=12,
{src/fib7/fib7.c} caption={Original C},label={lst:ex1_c}]
\end{minipage} \hfill \begin{minipage}{0.45\textwidth} {src/fib7/fib7.c}
\lstinputlisting[language=C,caption={Processed DWARF},label={lst:ex1_dw}] \end{minipage} \hfill \begin{minipage}{0.45\textwidth}
{src/fib7/fib7.fde} \lstinputlisting[language=C,caption={Processed DWARF},
\lstinputlisting[language=C,caption={Raw DWARF},label={lst:ex1_dwraw}] label={lst:ex1_dw}]
{src/fib7/fib7.raw_fde} {src/fib7/fib7.fde}
\end{minipage} \lstinputlisting[language=C,caption={Raw DWARF},label={lst:ex1_dwraw}]
{src/fib7/fib7.raw_fde}
\end{minipage}
\end{figure}
\begin{minipage}{0.45\textwidth} \begin{figure}[h]
\lstinputlisting[language={[x86masm]Assembler},lastline=11, \begin{minipage}{0.45\textwidth}
caption={Generated assembly},label={lst:ex1_asm}] \lstinputlisting[language={[x86masm]Assembler},lastline=11,
{src/fib7/fib7.s} caption={Generated assembly},label={lst:ex1_asm}]
\end{minipage} \hfill \begin{minipage}{0.45\textwidth} {src/fib7/fib7.s}
\lstinputlisting[language={[x86masm]Assembler},firstline=12, \end{minipage} \hfill \begin{minipage}{0.45\textwidth}
firstnumber=last] \lstinputlisting[language={[x86masm]Assembler},firstline=12,
{src/fib7/fib7.s} firstnumber=last]
\end{minipage} {src/fib7/fib7.s}
\end{minipage}
\end{figure}
\begin{table}[h]
\centering
\begin{tabular}{|c|c|c|c|c|c}
\stackfhead{+ \mhex{30}}
& \stackfhead{+ \mhex{28}}
& \stackfhead{+ \mhex{20}}
& \stackfhead{+ \mhex{1c}}
& \stackfhead{+ \mhex{4}}
& \stackfhead{}
\\
\hline{}
Return Address & \textit{Alignment space}
& \spaced{2ex}{\lstc{fibo[7]}}
& \spaced{4ex}{\ldots}
& \spaced{2ex}{\lstc{fibo[0]}}
& \textit{Next frame}
\\
\hline
\end{tabular}
\caption{Stack frame schema}\label{table:ex1_stack_schema}
\end{table}
For instance, the C source code in Listing~\ref{lst:ex1_c} above, when compiled For instance, the C source code in Listing~\ref{lst:ex1_c} above, when compiled
with \lstbash{gcc -O1 -fomit-frame-pointer -fno-stack-protector}, yields the with \lstbash{gcc -O1 -fomit-frame-pointer -fno-stack-protector}, yields the
assembly code in Listing~\ref{lst:ex1_asm}. When interpreting the generated assembly code in Listing~\ref{lst:ex1_asm}. The memory layout of the stack
\ehframe{} with \lstbash{readelf -wF}, we obtain the (slightly edited) frame is presented in Table~\ref{table:ex1_stack_schema}, to help understand
how the stack frame is constructed. When interpreting the generated \ehframe{}
with \lstbash{readelf -wF}, we obtain the (slightly edited)
Listing~\ref{lst:ex1_dw}. During the function prelude, \ie{} for $\mhex{615} Listing~\ref{lst:ex1_dw}. During the function prelude, \ie{} for $\mhex{615}
\leq \reg{rip} < \mhex{619}$, the stack frame only contains the return address, \leq \reg{rip} < \mhex{619}$, the stack frame only contains the return address,
thus the CFA is 8 bytes above \reg{rsp} (which was the value of \reg{rsp} thus the CFA is 8 bytes above \reg{rsp} (which was the value of \reg{rsp}
before the call), and the return address is precisely at \reg{rsp}. Then, 9 before the call, and is the topmost value of used space for this stack frame),
integers of 8 bytes each (8 for \lstc{fibo}, one for \lstc{pos}) are allocated and the return address is precisely at \reg{rsp} --~that is, stored between
on the stack, which puts the CFA 80 bytes above \reg{rsp}, and the return \reg{rsp} and $\reg{rsp} + 8$. Then, 8 integers of 4 bytes each (for
address still 8 bytes below the CFA\@. Then, by the end of the function, the \lstc{fibo}, \lstc{pos} being optimized out) are allocated on the stack, which
local variables are discarded and \reg{rsp} is reset to its value from the puts the CFA 32 bytes above \reg{rsp}, and the return address still 8 bytes
first row. below the CFA\@. Yet, \prog{gcc} decided to allocate a total space of 48 bytes
for the stack frame for memory alignment reasons, which means subtracting 40
bytes from \reg{rsp} (address $\mhex{615}$ in the assembly). Then, by the end of
the function, the local variables are discarded and \reg{rsp} is reset to its
value from the first row.
However, DWARF data isn't actually stored as a table in the binary files, but However, DWARF data isn't actually stored as a table in the binary files, but
is instead stored as in Listing~\ref{lst:ex1_dwraw}. The first row has the is instead stored as in Listing~\ref{lst:ex1_dwraw}. The first row has the
@ -295,12 +338,12 @@ location of the first IP in the FDE, and must define at least its CFA\@. Then,
when all relevant registers are defined, it is possible to define a new row by when all relevant registers are defined, it is possible to define a new row by
providing a location offset (\eg{} here $4$), and the new row is defined as a providing a location offset (\eg{} here $4$), and the new row is defined as a
clone of the previous one, which can then be altered (\eg{} here by setting clone of the previous one, which can then be altered (\eg{} here by setting
\lstc{CFA} to $\reg{rsp} + 80$). This means that every line is defined \wrt{} \lstc{CFA} to $\reg{rsp} + 48$). This means that every line is defined \wrt{}
the previous one, and that the IPs of the successive rows cannot be determined the previous one, and that the IPs of the successive rows cannot be determined
before evaluating every row before. Thus, unwinding a frame from an IP close to without evaluating every row that comes before in the first place. Thus,
the end of the frame will require evaluating pretty much every DWARF row in the unwinding a frame from an IP close to the end of the frame will require
table before reaching the relevant information, drastically slowing down the evaluating pretty much every DWARF row in the table before reaching the
unwinding process. relevant information, drastically slowing down the unwinding process.
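The sequential nature of this lookup can be illustrated by a minimal C sketch. It is not the format's actual decoder: it only models two hypothetical instruction kinds (advancing the row location and redefining the CFA offset from \reg{rsp}), and the type names, the function name and the third row's location delta (\lstc{0x4b}) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified instruction set: only two kinds are modelled. */
enum cfa_op { ADVANCE_LOC, DEF_CFA_OFFSET };

struct cfa_insn {
    enum cfa_op op;
    uint64_t arg; /* IP delta for ADVANCE_LOC, new CFA offset otherwise */
};

/* Rows loosely mirroring the example: CFA = rsp + 8, then rsp + 48,
 * then rsp + 8 again. The delta 0x4b is an assumed value. */
const struct cfa_insn example_prog[] = {
    { ADVANCE_LOC, 4 },
    { DEF_CFA_OFFSET, 48 },
    { ADVANCE_LOC, 0x4b },
    { DEF_CFA_OFFSET, 8 },
};

/* Return the CFA offset (relative to rsp) valid at `ip`. Every row located
 * before `ip` has to be replayed, since each row is defined with respect
 * to the previous one: the cost is linear in the number of rows. */
uint64_t cfa_offset_at(const struct cfa_insn *prog, size_t len,
                       uint64_t fde_start, uint64_t initial_offset,
                       uint64_t ip)
{
    uint64_t row_ip = fde_start;
    uint64_t offset = initial_offset;

    assert(ip >= fde_start);
    for (size_t i = 0; i < len; i++) {
        if (prog[i].op == ADVANCE_LOC) {
            if (row_ip + prog[i].arg > ip)
                break; /* the next row starts after ip: stop here */
            row_ip += prog[i].arg;
        } else {
            offset = prog[i].arg;
        }
    }
    return offset;
}
```

With these assumed rows, a query for an IP in the last row walks through all four instructions, whereas a query for the first row stops at the first one.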
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{How big are FDEs?} \subsection{How big are FDEs?}
@ -377,8 +420,8 @@ brevity and clarity. All these instructions are up to variants (most
instructions exist in multiple formats to handle various operands formatting, instructions exist in multiple formats to handle various operands formatting,
to optimize space). Since we won't be talking about the underlying file format to optimize space). Since we won't be talking about the underlying file format
here, those variations between \eg{} \dwcfa{advance\_loc1} and here, those variations between \eg{} \dwcfa{advance\_loc1} and
\dwcfa{advance\_loc2} ---~which differ only on the number of bytes of their \dwcfa{advance\_loc2} --~which differ only on the number of bytes of their
operand~--- are irrelevant and will be elided. operand~-- are irrelevant and will be elided.
\begin{itemize} \begin{itemize}
\item{} \dwcfa{set\_loc(loc)}~: \item{} \dwcfa{set\_loc(loc)}~:
@ -478,8 +521,8 @@ in the context of the program being unwound. In particular, it must be able to
dereference some pointer derived from DWARF instructions that will point to the dereference some pointer derived from DWARF instructions that will point to the
execution stack, or even the heap. execution stack, or even the heap.
This function takes as arguments an instruction pointer ---~supposedly This function takes as arguments an instruction pointer --~supposedly
extracted from $\reg{rip}$~--- and an array of register values; and returns a extracted from $\reg{rip}$~-- and an array of register values; and returns a
fresh array of register values after unwinding this call frame. The function is fresh array of register values after unwinding this call frame. The function is
compositional\footnote{up to technicalities: the IP obtained after unwinding the compositional\footnote{up to technicalities: the IP obtained after unwinding the
first frame might be handled in a different dynamically loaded object, and this first frame might be handled in a different dynamically loaded object, and this
@ -641,25 +684,33 @@ machine code on the x86\_64 platform.
The rough idea of the compilation is to produce, out of the \ehframe{} section The rough idea of the compilation is to produce, out of the \ehframe{} section
of a binary, C code that resembles the code shown in the DWARF semantics from of a binary, C code that resembles the code shown in the DWARF semantics from
Section~\ref{sec:semantics} above. This C code is then compiled by GCC, Section~\ref{sec:semantics} above. This C code is then compiled by GCC in
providing for free all the optimization passes of a modern compiler. \lstbash{-O2} mode\footnote{Compiling in \lstbash{-O3} takes way too much
time.}, providing for free all the optimization passes of a modern compiler.
The generated code consists of a single monolithic function, taking as The generated code consists of a single monolithic function, \lstc{_eh_elf},
arguments an instruction pointer and a memory context (\ie{} the value of the taking as arguments an instruction pointer and a memory context (\ie{} the
various machine registers) as defined in Listing~\ref{lst:unw_ctx}. The value of the various machine registers) as defined in
function will then return a fresh memory context, containing the values the Listing~\ref{lst:unw_ctx}. The function will then return a fresh memory
registers hold after unwinding this frame. context, containing the values the registers hold after unwinding this frame.
The body of the function itself is mostly a huge switch, taking advantage of The body of the function itself is mostly a huge switch, taking advantage of
the non-standard ---~yet widely implemented in C compilers~--- syntax for range the non-standard --~yet widely implemented in C compilers~-- syntax for range
switches, in which each \lstc{case} can refer to a range. All the FDEs are switches, in which each \lstinline{case} can refer to a range. All the FDEs are
merged together into this switch, each row of a FDE being a switch case. The merged together into this switch, each row of a FDE being a switch case.
cases then fill a context with unwound values, then return it. Separating the various FDEs in the C code --~other than with comments~-- is,
unlike what is done in DWARF, pointless, since accessing a ``row'' has a linear
cost, and the C code is not meant to be read, except maybe for debugging
purposes. The switch cases bodies then fill a context with unwound values, then
return it.
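As an illustration of the range switch described above, here is a minimal sketch of what such a function could look like for the example's first two rows. The context layout, the flag values, the function name \lstc{eh_elf_sketch}, its argument order and the upper bound \lstc{0x663} of the second range are all assumptions; the actual generated code and context structure may differ. The \lstc{case low ... high} syntax is the GNU C extension mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical context: a flags byte plus a few register values. */
struct unwind_ctx {
    uint8_t flags;
    uintptr_t rip, rsp, rbp, rbx;
};

/* Hypothetical flag bits: one per register, plus an error bit. */
enum { CTX_RIP = 1, CTX_RSP = 2, CTX_RBP = 4, CTX_RBX = 8, CTX_ERROR = 0x80 };

/* One switch case per FDE row, selected by a GNU C case range. */
struct unwind_ctx eh_elf_sketch(struct unwind_ctx ctx, uintptr_t pc)
{
    struct unwind_ctx out = {0};
    uintptr_t cfa;

    assert(sizeof(uintptr_t) == 8); /* x86_64 is assumed throughout */

    switch (pc) {
    case 0x615 ... 0x618: /* function prelude: CFA = rsp + 8 */
        cfa = ctx.rsp + 8;
        break;
    case 0x619 ... 0x663: /* frame allocated: CFA = rsp + 48 (assumed end) */
        cfa = ctx.rsp + 48;
        break;
    default: /* no row covers this pc */
        out.flags = CTX_ERROR;
        return out;
    }

    out.rip = *(uintptr_t *)(cfa - 8); /* RA is stored 8 bytes below the CFA */
    out.rsp = cfa;                     /* rsp before the call */
    out.flags = CTX_RIP | CTX_RSP;
    return out;
}
```

In this sketch the row-independent tail is shared after the switch for brevity; generated code that fills and returns the context inside each case is equivalent.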
An optionally enabled parameter can be used to pass a function pointer to a A setting of the compiler also optionally enables another parameter to the
dereferencing function, that conceptually does what the dereferencing \lstc{*} \lstc{_eh_elf} function, \lstc{deref}, which is a function pointer. This
operator does on a pointer, and is used to unwind a process that is not the \lstc{deref} function, when enabled, replaces everywhere the dereferencing
currently running process, and thus not sharing the same address space. A call \lstc{*} operator, and can be used to generate \ehelfs{} that will work on
remote address spaces (\ie{} whenever the unwinding is not done on the process
reading the \ehelf{} itself, but some other process, or even on a stack dump of
a long-terminated process).
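A possible shape for such a dereferencing function is sketched below, for the case of unwinding over a recorded stack dump. The \lstc{deref_func_t} signature, the \lstc{stack_dump} structure and the global \lstc{g_dump} are hypothetical names introduced for this illustration; the source only states that \lstc{deref} is a function pointer replacing the \lstc{*} operator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed signature: maps an address of the unwound process to the
 * machine word stored there. */
typedef uintptr_t (*deref_func_t)(uintptr_t addr);

/* Hypothetical stack dump: `len` bytes recorded from address `base`. */
struct stack_dump {
    uintptr_t base;
    const unsigned char *bytes;
    size_t len;
};

struct stack_dump g_dump;

/* Read a word from the dump instead of the current address space.
 * Returns 0 when the address falls outside the recorded range. */
uintptr_t deref_dump(uintptr_t addr)
{
    uintptr_t value = 0;

    if (addr < g_dump.base ||
        addr - g_dump.base + sizeof value > g_dump.len)
        return 0;
    memcpy(&value, g_dump.bytes + (addr - g_dump.base), sizeof value);
    return value;
}

/* Where local unwinding would write *(uintptr_t *)(cfa - 8), the remote
 * variant goes through the supplied function pointer. */
uintptr_t read_return_address(uintptr_t cfa, deref_func_t deref)
{
    assert(deref != NULL);
    return deref(cfa - 8);
}
```

A real implementation would likely need a way to report out-of-range reads rather than returning 0; this is left out of the sketch.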
Unlike in the \ehframe, and unlike what should be done in a release, Unlike in the \ehframe, and unlike what should be done in a release,
real-world-proof version of the \ehelfs, the choice was made to keep this real-world-proof version of the \ehelfs, the choice was made to keep this
@ -675,20 +726,24 @@ is not sufficient to analyze every stack frame as \prog{gdb} would do after a
In the unwind context from Listing~\ref{lst:unw_ctx}, the values of type In the unwind context from Listing~\ref{lst:unw_ctx}, the values of type
\lstc{uintptr_t} are the values of the corresponding registers, and \lstc{uintptr_t} are the values of the corresponding registers, and
\lstc{flags} is an 8-byte value, indicating for each register whether it is \lstc{flags} is an 8-bit value, indicating for each register whether it is
present or not in this context (\ie{} if the \lstc{rbx} bit is not set, the present or not in this context (\ie{} if the \lstc{rbx} bit is not set, the
value of \lstc{rbx} in the structure isn't meaningful), plus an error bit, value of \lstc{rbx} in the structure isn't meaningful), plus an error bit,
indicating whether an error occurred during unwinding. indicating whether an error occurred during unwinding (which can be due \eg{}
to an unsupported operation in the original DWARF, thus compiled to an error).
This generated data is stored in separate shared object files, which we call This generated data is stored in separate shared object files, which we call
\ehelfs. It would have been possible to alter the original ELF file to embed \ehelfs. It would have been possible to alter the original ELF file to embed
this data as a new section, but it getting it to be executed just as any this data as a new section, but getting it to be executed just as any
portion of the \lstc{.text} section would probably have been painful, and portion of the \lstc{.text} section would probably have been painful, and
keeping it separated during the experimental phase is quite convenient. It is keeping it separated during the experimental phase is quite convenient. It is
possible to have multiple versions of \ehelfs{} files in parallel, with various possible to have multiple versions of \ehelfs{} files in parallel, with various
options turned on or off, and it doesn't require altering the base system by options turned on or off, and it doesn't require altering the base system by
editing \eg{} \texttt{/usr/lib/libc-*.so}. Instead, when the \ehelf{} data is editing \eg{} \texttt{/usr/lib/libc-*.so}. Instead, when the \ehelf{} data is
required, those files can simply be \lstc{dlopen}'d. required, those files can simply be \lstc{dlopen}'d. It is also possible to
imagine, in a future production environment, packaging \ehelfs{} files
separately, so that people interested in heavy computation can have the choice
to install them.
\medskip \medskip
@ -705,15 +760,19 @@ generated for the C code in Listing~\ref{lst:ex1_c}.
Without any particular care to efficiency or compactness, it is already Without any particular care to efficiency or compactness, it is already
possible to produce a compiled version very close to the one described in possible to produce a compiled version very close to the one described in
Section~\ref{sec:semantics}. Although the unwinding speed cannot yet be Section~\ref{sec:semantics}. Although the unwinding speed cannot yet be
actually benchmarked, it is already possible to write in a few hundreds of line actually benchmarked, it is already possible to write in a few hundred lines of
of C a simple stack walker printing the functions traversed. It already works C code a simple stack walker printing the functions traversed. It already works
without any problem on the easily tested cases, since corner cases are mostly without any problem on the easily tested cases, since corner cases are mostly
found in standard and highly optimal libraries, and it is not that easy to get found in standard and highly optimized libraries, and it is not that easy to get
the program to stop and print a stack trace from within a system library the program to stop and print a stack trace from within a system library
without using a debugger. without using a debugger.
The major drawback of this approach, without any particular care taken, is the The major drawback of this approach, without any particular care taken, is the
space waste. space waste. The space taken by those tentative \ehelfs{} is analyzed in
Table~\ref{table:basic_eh_elf_space} for \prog{hackbench}, a small program
introduced later in Section~\ref{ssec:bench_perf}, and the libraries on which
it depends.
\begin{table}[h] \begin{table}[h]
\centering \centering
@ -736,11 +795,6 @@ space waste.
\caption{Basic \ehelfs{} space usage}\label{table:basic_eh_elf_space} \caption{Basic \ehelfs{} space usage}\label{table:basic_eh_elf_space}
\end{table} \end{table}
The space taken by those tentative \ehelfs{} is analyzed in
Table~\ref{table:basic_eh_elf_space} for \prog{hackbench}, a small program
introduced later in Section~\ref{ssec:bench_perf}, and the libraries on which
it depends.
The first column only includes the sizes of the ELF sections \lstc{.text} (the The first column only includes the sizes of the ELF sections \lstc{.text} (the
program itself) and \lstc{.rodata}, the read-only data (such as static strings, program itself) and \lstc{.rodata}, the read-only data (such as static strings,
etc.). Only the weight of the \lstc{.text} section of the generated \ehelfs{} etc.). Only the weight of the \lstc{.text} section of the generated \ehelfs{}
@ -764,16 +818,17 @@ made in order to shrink the \ehelfs.
The major optimization that most reduced the output size was to use an if/else The major optimization that most reduced the output size was to use an if/else
tree implementing a binary search on the program counter relevant intervals, tree implementing a binary search on the program counter relevant intervals,
instead of a huge switch. In the process, we also \emph{outline} a lot of code, instead of a huge switch. In the process, we also \emph{outline} a lot of code,
that is, find out identical code blocks, move them outside of the if/else tree, that is, find out identical ``switch cases'' bodies (which are not switch cases
identify them by a label, and jump to them using a \lstc{goto}, which anymore, but if bodies), move them outside of the if/else tree, identify them
de-duplicates a lot of code and contributes greatly to the shrinking. In the by a label, and jump to them using a \lstc{goto}, which de-duplicates a lot of
process, we noticed that the vast majority of FDE rows are actually taken among code and contributes greatly to the shrinking. In the process, we noticed that
very few ``common'' FDE rows. the vast majority of FDE rows are actually taken among very few ``common'' FDE
rows.
This makes this optimization really efficient, as seen later in This makes this optimization really efficient, as seen later in
Section~\ref{ssec:results_size}, but also makes it an interesting question --- Section~\ref{ssec:results_size}, but also makes it an interesting question
not investigated during this internship --- to find out whether standard DWARF --~not investigated during this internship~-- to find out whether standard
data could be efficiently compressed in this way. DWARF data could be efficiently compressed in this way.
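The following sketch illustrates the two ideas together, under assumptions: an if/else tree performing a binary search on the program counter, with identical row bodies outlined behind labels and reached through \lstc{goto}. The function name, the \lstc{ok} output parameter and the bounds \lstc{0x664} and \lstc{0x665} are hypothetical; only the two CFA rules ($\reg{rsp} + 8$ and $\reg{rsp} + 48$) come from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the CFA for `pc`, or set *ok to 0 if no row covers it. */
uintptr_t cfa_binary_search(uintptr_t pc, uintptr_t rsp, int *ok)
{
    assert(ok != NULL);
    *ok = 1;

    /* Binary search on the row boundaries instead of a huge switch. */
    if (pc < 0x619) {
        if (pc < 0x615)
            goto no_row;
        goto cfa_rsp_8;   /* first row */
    } else {
        if (pc < 0x664)
            goto cfa_rsp_48;
        if (pc < 0x665)
            goto cfa_rsp_8; /* last row reuses the first row's body */
        goto no_row;
    }

    /* Outlined bodies: each distinct row body is emitted only once. */
cfa_rsp_8:
    return rsp + 8;
cfa_rsp_48:
    return rsp + 48;
no_row:
    *ok = 0;
    return 0;
}
```

Here the first and last rows share the single \lstc{cfa_rsp_8} body, which is the de-duplication effect described above, on a very small scale.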
\begin{minipage}{0.45\textwidth} \begin{minipage}{0.45\textwidth}
\lstinputlisting[language=C, caption={\ehelf{} for the previous example}, \lstinputlisting[language=C, caption={\ehelf{} for the previous example},
@ -806,15 +861,16 @@ However, unwinding over and over again from the same program point would have
had no interest at all, since \prog{libunwind} would have simply cached the had no interest at all, since \prog{libunwind} would have simply cached the
relevant DWARF row. At the same time, making sure that the various unwindings relevant DWARF row. At the same time, making sure that the various unwindings
are made from different locations is somehow cheating, since it makes useless are made from different locations is somehow cheating, since it makes useless
\prog{libunwind}'s caching. All in all, the benchmarking method must have a \prog{libunwind}'s caching and does not reproduce ``real-world'' unwinding
``natural'' distribution of unwindings. distribution. All in all, the benchmarking method must have a ``natural''
distribution of unwindings.
Another requirement is to also distribute quite evenly the unwinding points Another requirement is to also distribute quite evenly the unwinding points
across the program: we would like to benchmark stack unwindings crossing some across the program: we would like to benchmark stack unwindings crossing some
standard library functions, starting from inside them, etc. standard library functions, starting from inside them, etc.
Finally, the unwound program must be interesting enough to enter and exit a lot Finally, the unwound program must be interesting enough to enter and exit a lot
of function, nest function calls, have FDEs that are not as simple as in of functions, nest function calls, have FDEs that are not as simple as in
Listing~\ref{lst:ex1_dw}, etc. Listing~\ref{lst:ex1_dw}, etc.
@ -864,19 +920,23 @@ system and process as much as possible, to be able to unwind in any context.
This very restricted information lacked a memory map (a table indicating which This very restricted information lacked a memory map (a table indicating which
shared object is mapped at which address in memory) in order to use \ehelfs. shared object is mapped at which address in memory) in order to use \ehelfs.
Apart from this, the modified version of \prog{libunwind} produced is entirely Apart from this, the modified version of \prog{libunwind} produced is entirely
compatible with the vanilla version. compatible with the vanilla version, meaning that the only modifications
required to use \ehelfs{} within any project using \prog{libunwind} should be
modifying one line of code (this function call, which is a setup function) and
linking against the modified version of \prog{libunwind} instead of the system
version.
Once this was done, plugging it in \prog{perf} was a matter of a few lines of Once this was done, plugging it in \prog{perf} was a matter of a few lines of
code only. The major problem encountered was to understand how \prog{perf} code only, apart from the benchmarking code. The major problem encountered was
works. In order to avoid perturbing the traced program, \prog{perf} does not to understand how \prog{perf} works. In order to avoid perturbing the traced
unwind at runtime, but rather records at regular interval the program's stack, program, \prog{perf} does not unwind at runtime, but rather records at regular
and all the auxiliary information that is needed to unwind later. This is done intervals the program's stack, and all the auxiliary information that is needed
when running \lstbash{perf record}. Then, \lstbash{perf report} unwinds the to unwind later. This is done when running \lstbash{perf record}. Then,
stack to analyze it; but at this point in time, the traced process is long \lstbash{perf report} unwinds the stack to analyze it; but at this point in
dead, thus any PID-based approach, or any approach using \texttt{/proc} time, the traced process is long dead, thus any PID-based approach, or any
information will fail. However, as this was the easiest method, this approach approach using \texttt{/proc} information will fail. However, as this was the
was chosen when implementing the first version of \ehelfs; thus requiring some easiest method, the first version of \ehelfs{} used those mechanisms; thus
code rewriting. requiring some code rewriting.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Other explored methods} \subsection{Other explored methods}
@ -884,15 +944,15 @@ code rewriting.
The first benchmarking approach tried was to create some specific C code The first benchmarking approach tried was to create some specific C code
that would meet the requirements from Section~\ref{ssec:bench_req}, while that would meet the requirements from Section~\ref{ssec:bench_req}, while
itself calling a benchmarking procedure from time to time. This was abandoned itself calling a benchmarking procedure from time to time. This was abandoned
quite fast, because generating C code interesting enough to be unwound turned quite quickly, because generating C code interesting enough to be unwound
out to be hard, and the generated FDEs invariably ended up uninteresting. It would turned out to be hard, and the generated FDEs invariably ended up uninteresting. It
also never have met the requirement of unwinding from fairly distributed would also never have met the requirement of unwinding from fairly distributed
locations anyway. locations anyway.
Another attempt was made using CSmith~\cite{csmith}, a random C code generator Another attempt was made using CSmith~\cite{csmith}, a random C code generator
initially made for C compilers random testing. The idea was still to craft an initially made for C compilers random testing. The idea was still to craft an
interesting C program that would unwind on its own frequently, but to integrate interesting C program that would unwind on its own frequently, but to integrate
randomly generated C code with CSmith to integrate interesting C snippets that CSmith-randomly generated C code within hand-written C snippets that
would generate large enough FDEs and nested calls. This was abandoned as well would generate large enough FDEs and nested calls. This was abandoned as well
as the call graph of a CSmith-generated code is often far too small, and the as the call graph of a CSmith-generated code is often far too small, and the
CSmith code is notoriously hard to understand and edit. CSmith code is notoriously hard to understand and edit.


@ -7,3 +7,6 @@
\newcommand{\set}[1]{\left\{ #1 \right\}} \newcommand{\set}[1]{\left\{ #1 \right\}}
\newcommand{\card}[1]{\left\vert{} #1 \right\vert} \newcommand{\card}[1]{\left\vert{} #1 \right\vert}
\newcommand{\abs}[1]{\left\vert{} #1 \right\vert} \newcommand{\abs}[1]{\left\vert{} #1 \right\vert}
\newcommand{\tnhead}[2]{\multicolumn{1}{#1}{#2}} % Table neutral head
\newcommand{\spaced}[2]{\hspace{#1} #2 \hspace{#1}}


@ -1,5 +1,8 @@
%% Specific commands for this project %% Specific commands for this project
\newcommand{\stackfhead}[1]
{\tnhead{l}{\hspace{-5ex}$\reg{rsp} #1$ \hspace{2em}}}
\newcommand{\prog}[1]{\texttt{#1}} \newcommand{\prog}[1]{\texttt{#1}}
\newcommand{\ehelf}{\texttt{eh\_elf}} \newcommand{\ehelf}{\texttt{eh\_elf}}
\newcommand{\ehelfs}{\texttt{eh\_elfs}} \newcommand{\ehelfs}{\texttt{eh\_elfs}}