Multiple rephrasings

Théophile Bastian 2018-08-17 21:48:45 +02:00
parent 0c7350ff95
commit 4016b4f46c
2 changed files with 20 additions and 17 deletions


@@ -32,7 +32,7 @@ computation~\cite{oakley2011exploiting}.
\subsection*{The research problem}
As debugging data can easily get heavy beyond reasonable if stored carelessly,
As debugging data can easily take up an unreasonable amount of space if stored
carelessly, the DWARF standard pays great attention to data compactness and compression,
and succeeds particularly well at it. But this, as always, is at the expense
of efficiency: accessing stack unwinding data for a particular program point
@@ -118,7 +118,7 @@ below. Yet, it supports the vast majority --~around $99.9$\ \%~-- of the cases
seen in the wild, and is decently robust compared to \prog{libunwind}, the
reference implementation. Indeed, corner cases occur often, and on a
27000-sample test, 885 failures were observed for \prog{libunwind}, against 1099 for
the compiled DWARF version.
the compiled DWARF version (see Section~\ref{ssec:timeperf}).
The implementation, however, has a few other limitations. It only supports the
x86\_64 architecture, and relies to some extent on the Linux operating system.
@@ -132,7 +132,7 @@ virtually any operating system.
In most everyday cases, slow stack unwinding is not a problem, or
even an annoyance. Yet, having a 26 times speed-up on stack unwinding-heavy
tasks, such as profiling, can be really useful to profile heavy programs,
tasks, such as profiling, can be really useful to profile large programs,
particularly if one wants to profile many times in order to analyze the impact
of multiple changes. It can also be useful for exception-heavy programs. Thus,
it might be interesting to implement a more stable version, and try to


@@ -179,9 +179,9 @@ traced program at regular, short intervals, inspect their stack, and determine
which function is currently being run. They also often perform a stack
unwinding to determine the call path to this function, to determine which
function indirectly takes time: \eg, a function \lstc{fct_a} can call both
\lstc{fct_b} and \lstc{fct_c}, which are quite heavy; spend practically no time
directly in \lstc{fct_a}, but spend a lot of time in calls to the other two
functions that were made from \lstc{fct_a}.
\lstc{fct_b} and \lstc{fct_c}, which take a lot of time; the program then
spends practically no time directly in \lstc{fct_a}, but a lot of time in
calls to the other two functions that were made from \lstc{fct_a}.
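
The distinction between time spent directly in a function and time spent in
its callees can be sketched as follows. This is only an illustration, not the
way any particular profiler is implemented: the type \lstc{stack_sample_t} and
both counting functions are hypothetical names, and each sample is assumed to
be an already-unwound call stack, innermost frame first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8

/* One sampled call stack (hypothetical layout): innermost frame first,
 * unused slots left NULL. */
typedef struct {
    const char *frames[MAX_DEPTH];
} stack_sample_t;

/* Samples whose innermost frame is `fn`: time spent directly in `fn`. */
static int self_samples(const stack_sample_t *samples, int n, const char *fn) {
    assert(fn != NULL);
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (samples[i].frames[0] && strcmp(samples[i].frames[0], fn) == 0)
            count++;
    }
    return count;
}

/* Samples in which `fn` appears anywhere on the stack: time spent in `fn`
 * or in anything it called. Computing this requires the unwound stack. */
static int inclusive_samples(const stack_sample_t *samples, int n,
                             const char *fn) {
    assert(fn != NULL);
    int count = 0;
    for (int i = 0; i < n; i++) {
        for (int d = 0; d < MAX_DEPTH && samples[i].frames[d]; d++) {
            if (strcmp(samples[i].frames[d], fn) == 0) {
                count++;
                break; /* count each sample at most once */
            }
        }
    }
    return count;
}
```

With samples taken mostly inside \lstc{fct_b} and \lstc{fct_c}, the self count
of \lstc{fct_a} stays near zero while its inclusive count is high, which is the
situation described above.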
Exception handling also requires a stack unwinding mechanism in most languages.
Indeed, an exception is completely different from a \lstinline{return}: while the
@@ -375,13 +375,6 @@ distribution of FDE rows count. The histogram in
Figure~\ref{fig:fde_line_density} was generated on a random sample of around
2000 ELF files present on an ArchLinux system.
Most of the FDEs seem to be quite small, which only reflects that most
functions found in the wild are relatively small and do not particularly
allocate many times on the stack. Yet, the median value is at $8$ rows per FDE,
and the average is at $9.7$, which is already not that fast to unwind. Values
up to $50$ are not that uncommon, given some commonly used functions have such
large FDEs, and often end up in the call stack.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Unwinding state-of-the-art}
@@ -531,7 +524,7 @@ second frame will require loading the corresponding DWARF information.
The function is the following:
\lstinputlisting[language=C]{src/dw_semantics/c_context.c}
\lstinputlisting[language=C]{src/dw_semantics/c_context.c}\label{lst:sem_c_ctx}
The translation of $\intermedlang$, as produced by the later-defined function,
is then to be inserted in this context, where the comment states so.
@@ -743,7 +736,10 @@ In the unwind context from Listing~\ref{lst:unw_ctx}, the values of type
\lstc{flags} is an 8-bit value, indicating for each register whether it is
present or not in this context, plus an error bit, indicating whether an error
occurred during unwinding. Such errors can be due \eg{} to an unsupported
operation in the original DWARF\@.
operation in the original DWARF\@. This context differs from the one presented
in Section~\ref{lst:sem_c_ctx}, since the previous one was only an array of
values, and the one from the real implementation is more robust, in particular
by including an error flag for lack of a $\bot$ value.
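
As a minimal sketch of how such a \lstc{flags} field can be used, assuming a
hypothetical bit layout: the text only states that there is one presence bit
per register plus an error bit, so the bit positions, the register set and
every \lstc{sketch_} name below are assumptions made for illustration, not the
actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions and register set, chosen for illustration. */
enum {
    SKETCH_FLAG_RIP   = 1u << 0,
    SKETCH_FLAG_RSP   = 1u << 1,
    SKETCH_FLAG_RBP   = 1u << 2,
    SKETCH_FLAG_RBX   = 1u << 3,
    SKETCH_FLAG_ERROR = 1u << 7  /* set when unwinding failed */
};

typedef struct {
    uint8_t flags;                 /* presence bits plus the error bit */
    uintptr_t rip, rsp, rbp, rbx;  /* only meaningful if the bit is set */
} sketch_unwind_context_t;

/* Record that the register designated by `reg_flag` holds a valid value. */
static void sketch_mark_present(sketch_unwind_context_t *ctx,
                                uint8_t reg_flag) {
    assert(ctx != NULL);
    ctx->flags |= reg_flag;
}

/* Record an unwinding error, e.g. an unsupported DWARF operation. */
static void sketch_mark_error(sketch_unwind_context_t *ctx) {
    assert(ctx != NULL);
    ctx->flags |= SKETCH_FLAG_ERROR;
}

static bool sketch_has_error(const sketch_unwind_context_t *ctx) {
    return (ctx->flags & SKETCH_FLAG_ERROR) != 0;
}

/* A register may be read only if no error occurred and its bit is set. */
static bool sketch_is_usable(const sketch_unwind_context_t *ctx,
                             uint8_t reg_flag) {
    return !sketch_has_error(ctx) && (ctx->flags & reg_flag) != 0;
}
```

The error bit plays the role that a $\bot$ value would play in the formal
semantics: once it is set, no register of the context should be trusted.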
This generated data is stored in separate shared object files, which we call
\ehelfs. It would have been possible to alter the original ELF file to embed
@@ -997,7 +993,7 @@ computer has 32\,GB of RAM, and care was taken never to fill it and start
swapping.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Measured time performance}
\subsection{Measured time performance}\label{ssec:timeperf}
A benchmarking of \ehelfs{} against the vanilla \prog{libunwind} was made using
the exact same methodology as in Section~\ref{ssec:bench_perf}, only linking
@@ -1057,6 +1053,14 @@ without using multiple cores to compile, the various shared objects needed to
run \prog{hackbench} --~that is, \prog{hackbench}, \prog{libc}, \prog{ld} and
\prog{libpthread}~-- are compiled in an overall time of $25.28$ seconds.
The unwinding errors observed are hard to investigate, but are most probably
due to truncated stack records. Indeed, since \prog{perf} dumps the last $n$
bytes of the call stack (for a given $n$), and only keeps those for later
unwinding, large stacks lead to lost information when analyzing the results.
The difference between \ehelfs{} and the vanilla library could be due either to
unsupported DWARF instructions or registers, \prog{libdwarfpp} bugs or bugs in
the custom \prog{libunwind} implementation that were not spotted.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Measured compactness}\label{ssec:results_size}
@@ -1218,7 +1222,6 @@ It is also worth noting that of all the 4000 analyzed files, there are only 12
that contained all the unsupported expressions seen, and only 24 that contained
some unsupported instruction at all.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End main text content %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%