98 lines
5.1 KiB
TeX
98 lines
5.1 KiB
TeX
\section{A baseline: dynamic dependencies detection with \valgrind{}}
|
|
|
|
As we already saw, a dynamic analyzer, such as \gus{}, has direct access to
|
|
the actual data dependencies occurring throughout an execution. While such
|
|
analyzers are often too slow to use in practice, they can be used as a baseline
|
|
to evaluate static alternatives.
|
|
|
|
As it is a complex tool performing a wide range of analyses, \gus{} is,
|
|
however, unnecessarily complex to simply serve as a baseline. For the same
|
|
reason, it is also impractically slower than a simple dynamic analysis. For
|
|
this reason, we implement \depsim{}, a dynamic analysis tool based on top of
|
|
\valgrind{}, whose goal is to report dependencies encountered at runtime.
|
|
|
|
\subsection{Valgrind}
|
|
|
|
Most low-level developers and computer scientists know
|
|
\valgrind{}~\cite{tool:valgrind} as a memory analysis tool, reporting bad
|
|
memory accesses, memory leaks and the like. However, this is only a small part
|
|
of \valgrind{} ---~the \texttt{memcheck} tool. The whole program is actually a
|
|
binary instrumentation framework, upon which the famous \texttt{memcheck} is
|
|
built.
|
|
|
|
\valgrind{} supports a wide variety of platforms, including x86-64 and
|
|
ARM. However, at the time of the writing, it supports AVX2, but does not yet
|
|
support AVX-512 on x86-64~\cite{tool:valgrind_arch_support} ---~there is,
|
|
however, ongoing work in that direction~\cite{valgrind_avx512}. While its
|
|
documentation is seemingly sparse and not frequently maintained, its code
|
|
is well-commented and readable, with interfaces cleanly exposed; thus, once one
|
|
gets a good enough comprehension of how the project's code is structured, this
|
|
lack of documentation is not problematic. \valgrind{} is also a free and
|
|
open-source software, distributed under the GNU GPLv2 license.
|
|
|
|
\medskip{}
|
|
|
|
In order to instrument a binary file, \valgrind{} first lifts the original
|
|
assembly code to an intermediate representation. The instrumentation tool is
|
|
then able to modify this intermediate representation before passing it back to
|
|
Valgrind, which will re-compile it to a native binary before running it.
|
|
|
|
While this intermediate representation, called \vex{}, is convenient to
|
|
instrument a binary, it may further be used as a way to obtain \emph{semantics}
|
|
for some assembly code, independently of the Valgrind framework.
|
|
|
|
\subsection{Depsim}\label{ssec:depsim}
|
|
|
|
The tool we wrote to extract runtime-gathered dependencies, \depsim{}, is
|
|
able to extract dependencies through both registers, memory and temporary
|
|
variables ---~in its intermediate representation, Valgrind keeps some values
|
|
assigned to temporary variables in static single-assignment (SSA) form.
|
|
It however supports a flag to detect only memory-carried dependencies, as this
|
|
will be useful to evaluate our static algorithm later.
|
|
|
|
As a dynamic tool, the distinction between straight-line code and loop-carried
|
|
dependencies is irrelevant, as the analysis follows the actual program flow.
|
|
|
|
\medskip{}
|
|
|
|
In order to track dependencies, each basic block of the program is
|
|
instrumented. Dependencies are stored as a hash table and represented as a
|
|
pair of source and destination program counter; they are mapped to a number of
|
|
encountered occurrences.
|
|
|
|
Dependencies through temporaries are, by construction, resident to a single
|
|
basic block ---~they are thus statically detected at instrumentation time. At
|
|
runtime, the occurrence count of those dependencies is updated whenever the
|
|
basic block is executed.
|
|
|
|
For both register- and memory-carried dependencies, each write is instrumented
|
|
by adding a runtime write to a \emph{shadow} register file or memory, noting
|
|
that the written register or memory address was last written at the current
|
|
program counter. Each read, in turn, is instrumented by adding a fetch to this
|
|
shadow register file or memory, retrieving the last program counter at which
|
|
this location was written to; the dependency count between this program counter
|
|
and the current program counter is then incremented.
|
|
|
|
In practice, the shadow register file is simply implemented as an array
|
|
holding, for each register id, the last program counter that wrote at this
|
|
location. The shadow memory is instead implemented as a hash table.
|
|
|
|
At the end of the run, all the dependencies retrieved are reported. Care is
|
|
taken to translate back the runtime program counters to addresses in the
|
|
original ELF files, using the running process' memory map.
|
|
|
|
\medskip{}
|
|
|
|
Dependencies are mostly relevant if their source and destination are close
|
|
enough to be computationally meaningful. To this end, we also introduce in
|
|
\depsim{} a notion of \emph{dependency lifetime}. As we do not have access
|
|
without a heavy runtime slowdown to elapsed cycles in \valgrind{}, we define a
|
|
\emph{timestamp} as the number of instructions executed since beginning of the
|
|
program's execution; we increment this count at each branch instruction to
|
|
avoid excessive instrumentation slowdown.
|
|
|
|
We further annotate every write to the shadow memory with the timestamp at
|
|
which it occurred. Whenever a dependency should be added, we first check that
|
|
the dependency has not expired ---~that is, that it is not older than a given
|
|
threshold. This threshold is tunable for each run ---~and may be set to infinity
|
|
to keep every dependency.
|