diff --git a/report/fiche_synthese.tex b/report/fiche_synthese.tex
index 2952b0b..36a0206 100644
--- a/report/fiche_synthese.tex
+++ b/report/fiche_synthese.tex
@@ -8,12 +8,13 @@
 
 \subsection*{The general context}
 
-The standard debugging data format, DWARF, contains tables that, for a given
-instruction pointer (IP), permit to understand how the assembly instruction
-relates to the source code, where variables are currently allocated in memory
-or if they are stored in a register, what are their type and how to unwind the
-current stack frame. This information is generated when passing \eg{} the
-switch \lstbash{-g} to \prog{gcc} or equivalents.
+The standard debugging data format, DWARF (Debugging With Attributed Record
+Formats), contains tables permitting, for a given instruction pointer (IP), to
+understand how instructions from the assembly code relates to the original
+source code, where are variables currently allocated in memory or if they are
+stored in a register, what are their type and how to unwind the current stack
+frame. This information is generated when passing \eg{} the switch \lstbash{-g}
+to \prog{gcc} or equivalents.
 
 Even in stripped (non-debug) binaries, a small portion of DWARF data remains:
 the stack unwinding data. This information is necessary to unwind stack
@@ -28,7 +29,7 @@ Section~\ref{ssec:instr_cov}~\textendash, consisting in offsets from memory
 addresses stored in registers (such as \reg{rbp} or \reg{rsp}). Yet, the
 standard defines rules that take the form of a stack-machine expression that
 can access virtually all the process's memory and perform Turing-complete
-computation~\cite{oakley2011exploiting}.
+computations~\cite{oakley2011exploiting}.
 
 \subsection*{The research problem}
 
@@ -83,8 +84,8 @@ few samples (around $10\,\mu s$ per frame) to avoid statistical errors.
 Having enough samples for this purpose --~at least a few thousands~-- is not
 easy, since one must avoid unwinding the same frame over and over again, which
 would only benchmark the caching mechanism. The other problem is to distribute
-evenly the unwinding measures across the various IPs, including directly into
-the loaded libraries (\eg{} the \prog{libc}).
+evenly the unwinding measures across the various IPs, among which those
+directly located into the loaded libraries (\eg{} the \prog{libc}).
 The solution eventually chosen was to modify \prog{perf}, the standard
 profiling program for Linux, in order to gather statistics and benchmarks of
 its unwindings. Modifying \prog{perf} was an additional challenge that turned
@@ -131,7 +132,7 @@ the compiled DWARF version (see Section~\ref{ssec:timeperf}).
 The implementation, however, is not yet production-ready: it only supports the
 x86\_64 architecture, and relies to some extent on the Linux operating system.
 None of these pose a fundamental problem. Supporting other processor
-architectures and ABIs are only a matter of engineering,. The operating system
+architectures and ABIs are only a matter of engineering. The operating system
 dependency is only present in the libraries developed in order to interact with
 the compiled unwinding data, which can be developed for virtually any operating
 system.