From 8203502e9a57248b39709f4029e20d15de333efa Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Th=C3=A9ophile=20Bastian?=
Date: Fri, 3 Aug 2018 17:36:57 +0200
Subject: [PATCH] =?UTF-8?q?Rework=20fiche=20de=20synth=C3=A8se?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 report/fiche_synthese.tex | 97 ++++++++++++++++++++++-----------------
 shared/report.bib         |  9 ++++
 2 files changed, 64 insertions(+), 42 deletions(-)

diff --git a/report/fiche_synthese.tex b/report/fiche_synthese.tex
index b11ff43..b946eca 100644
--- a/report/fiche_synthese.tex
+++ b/report/fiche_synthese.tex
@@ -9,37 +9,54 @@
 \subsection*{The general context}
 
 The standard debugging data format for ELF binary files, DWARF, contains a lot
-of information. Among those are the stack unwinding data, which allows to
-unwind stack frames, restoring machine registers to their proper values, for
-instance within the context of a debugger.
+of information, which is generated mostly when passing \eg{} the switch
+\lstbash{-g} to \prog{gcc}. This information, essentially provided for
+debuggers, contains all that is needed to connect the generated assembly with
+the original code, information that can be used by sanitizers (\eg{} the type
+of each variable in the source language), etc.
+
+Even in stripped (non-debug) binaries, a small portion of DWARF data remains.
+Among this essential data that is never stripped is the stack unwinding data,
+which allows to unwind stack frames, restoring machine registers to the value
+they had in the previous frame, for instance within the context of a debugger
+or a profiler.
+
+This data is structured into tables, each row corresponding to an program
+counter (PC) range for which it describes valid unwinding data, and each column
+describing how to unwind a particular machine register (or virtual register
+used for various purposes). These rules are mostly basic, consisting in offsets
+from memory addresses stored in registers (such as \reg{rbp} or \reg{rsp}), but
+in some cases, they can take the form of a stack-machine expression that can
+access virtually all the process's memory and perform Turing-complete
+computation~\cite{oakley2011exploiting}.
+
+\subsection*{The research problem}
 
 As debugging data can easily get heavy beyond reasonable if stored carelessly,
 the DWARF standard pays a great attention to data compactness and compression,
-and succeeds particularly well at it. But this, as always, is at the expense
+and succeeds particularly well at it. But this, as always, is at the expense
 of efficiency: accessing stack unwinding data for a particular program point
 can be quite costly.
 
 This is often not a huge problem, as stack unwinding is mostly thought of as a
 debugging procedure: when something behaves unexpectedly, the programmer might
-be interested in exploring the stack, moving around between stack frames,
-tracing the program path leading to some bug, \ldots{} Yet, stack unwinding
-might, in some cases, be performance-critical: for instance, profiler programs
-needs to perform a whole lot of stack unwindings. Even worse, exception
-handling relies on stack unwinding in order to find a suitable catch-block!
+be interested in exploring the stack. Yet, stack unwinding might, in some
+cases, be performance-critical: for instance, profiler programs needs to
+perform a whole lot of stack unwindings. Even worse, exception handling relies
+on stack unwinding in order to find a suitable catch-block! For such
+applications, it might be desirable to find a different time/space trade-off,
+allowing a slightly space-heavier, but far more time-efficient unwinding
+procedure.
 
-The most widely used library used for stack unwinding,
+This different trade-off is the question that I explored during this
+internship: what good alternative trade-off is reachable when storing the stack
+unwinding data completely differently?
+
+It seems that the subject has not really been explored yet, and as of now, the
+most widely used library for stack unwinding,
 \prog{libunwind}~\cite{libunwind}, essentially makes use of aggressive but
 fine-tuned caching and optimized code to mitigate this problem.
 
-\subsection*{The research problem}
-
-\todo{Split the previous paragraph into two paragraphs, fitting this section as
-well}
-
-\note{I have trouble figuring out what is expected here, and what is expected
-in the previous section…}
-
-
 % What is the question that you studied?
 % Why is it important, what are the applications/consequences?
 % Is it a new problem?
@@ -48,11 +65,10 @@ in the previous section…}
 
 \subsection*{Your contribution}
 
-This internship explored the possibility to compile the standard ELF debugging
-information format, DWARF, directly into native assembly on the x86\_64
-architecture. Instead of parsing and interpreting at runtime the debug data,
-the stack unwinding data is accessed as a function of a dynamically-loaded
-shared library.
+This internship explored the possibility to compile DWARF's stack unwinding
+data directly into native assembly on the x86\_64 architecture. Instead of
+parsing and interpreting at runtime the debug data, the stack unwinding data is
+accessed as a function of a dynamically-loaded shared library.
 
 Multiple approaches have been tried, in order to determine which compilation
 process leads to the best time/space trade-off.
@@ -74,7 +90,7 @@ allowing to benchmark \prog{perf} with both the standard stack unwinding data
 and the alternative experimental compiled format. As a free and enjoyable
 side-effect, the experimental unwinding data is perfectly interfaced with
 \prog{libunwind}, and thus interfaceable at practically no cost with any
-existing project using the common library \prog{libunwind}.
+existing project using the \textit{de facto} standard library \prog{libunwind}.
 
 % What is your solution to the question described in the last paragraph?
 %
@@ -103,30 +119,27 @@ on around 200 cases more than \prog{libunwind} on a 27000 samples test (1099
 failures, against 885 for \prog{libunwind}).
 
 The prototype, unlike \prog{libunwind}, does not support $100\,\%$ of the DWARF
-instruction present in the DWARF5 standard~\cite{dwarf5std}. It is also limited
-to the x86\_64 architecture, and relies to some extent on the Linux operating
-system. But none of those limitations are real problems in practice. As argued
-later on, the vast majority of the DWARF instructions actually used in the wild
-are implemented; other processor architectures and ABIs are only a matter of
-time spent and engineering work; and the operating system dependency is only
-present in the libraries developed in order to interact with the compiled
-unwinding data, which can be developed for virtually any operating system.
+instructions present in the DWARF5 standard~\cite{dwarf5std}. It is also
+limited to the x86\_64 architecture, and relies to some extent on the Linux
+operating system. But none of those limitations are real problems in practice.
+As argued later on, the vast majority of the DWARF instruction set actually
+used in the wild is implemented; other processor architectures and ABIs are
+only a matter of time spent and engineering work; and the operating system
+dependency is only present in the libraries developed in order to interact with
+the compiled unwinding data, which can be developed for virtually any operating
+system.
 
 \subsection*{Summary and future work}
 
-In most cases of everyday's life, the slowness of stack unwinding is not a
-problem, or even an annoyance. Yet, having a 25 times speed-up on stack
-unwinding-heavy tasks, such as profiling, can be really useful to analyse heavy
-programs, particularly if one wants to profile many times in order to analyze
-the impact of multiple changes. It can also be useful for exception-heavy
+In most cases of everyday's life, a slow stack unwinding is not a problem, or
+even an annoyance. Yet, having a 25 times speed-up on stack unwinding-heavy
+tasks, such as profiling, can be really useful to profile heavy programs,
+particularly if one wants to profile many times in order to analyze the impact
+of multiple changes. It can also be useful for exception-heavy
 programs~\qtodo{cite Stephen's software?}. Thus, it might be interesting to
 implement a more stable version, and try to interface it cleanly with
 mainstream tools, such as \prog{perf}.
 
-It might also be interesting to investigate whether it is possible to reach
-even greater speeds by using some more complex compilation process that would
-have yet to be determined.
-
 Another question worth exploring might be whether it is possible to shrink even
 more the original DWARF unwinding data, which would be stored in a format not
 too far from the original standard, by applying techniques close to those
diff --git a/shared/report.bib b/shared/report.bib
index 1df40b4..db7b722 100644
--- a/shared/report.bib
+++ b/shared/report.bib
@@ -16,3 +16,12 @@
     title = {Libunwind webpage},
     url = {http://www.nongnu.org/libunwind/},
 }
+
+@inproceedings{oakley2011exploiting,
+  title={Exploiting the Hard-Working DWARF: Trojan and Exploit Techniques with No Native Executable Code.},
+  author={Oakley, James and Bratus, Sergey},
+  booktitle={WOOT},
+  pages={91--102},
+  year={2011}
+}
+