\section{Staticdeps}

The static analyzer we present, \staticdeps{}, only aims to tackle the
difficulty~\ref{memcarried_difficulty_arith} mentioned above: tracking
dependencies across arbitrarily complex pointer arithmetic.

To do so, \staticdeps{} works on a single basic block, unrolled enough times
to fill the reorder buffer as detailed above; this way, every relevant
loop-carried dependency can be detected, however far it reaches.

This problem could be solved using symbolic calculus algorithms. However, those
algorithms are not straightforward to implement, and the equality test between
two arbitrary expressions can be costly.
\subsection{The \staticdeps{} heuristic}

Instead, we use a heuristic based on random values. We consider the set $\calR
= \left\{0, 1, \ldots, 2^{64}-1\right\}$ of values representable by a 64-bit
unsigned integer; we extend this set to $\bar\calR = \calR \cup \{\bot\}$,
where $\bot$ denotes an invalid value. We then proceed as previously for
register-carried dependencies, applying the following principles.

\smallskip{}
\begin{itemize}
    \item{} Whenever an unknown value is read, either from a register or from
        memory, generate a fresh value from $\calR$, uniformly sampled at
        random. This value is saved to a shadow register file or memory, and
        will be used again the next time this same data is accessed.

    \item{} Whenever an integer arithmetic operation is encountered, compute
        the result of the operation and save the result to the shadow register
        file or memory.

    \item{} Whenever another kind of operation, or an operation that is
        unsupported, is encountered, save the destination operand as $\bot$;
        this operation is assumed not to be valid pointer arithmetic.
        Operations on $\bot$ always yield $\bot$ as a result.

    \item{} Whenever writing to a memory location, compute the written address
        using the above principles, and proceed as with a dynamic analysis,
        keeping track of the instruction that last wrote to a memory address.

    \item{} Whenever reading from a memory location, compute the read address
        using the above principles, and generate a dependency from the current
        instruction to the instruction that last wrote to this address (if
        known).
\end{itemize}
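As an illustration only, the principles above can be sketched in a few lines of
Python. This is not the actual \staticdeps{} implementation: the toy
instruction encoding, the names \texttt{Shadow} and \texttt{find\_deps}, the
choice to ignore stores whose address is $\bot$, and the treatment of each
address as a single memory cell are all assumptions made for brevity.

\begin{verbatim}
import random

BOT = None              # plays the role of the invalid value "bottom"
MASK = (1 << 64) - 1    # values live in {0, ..., 2^64 - 1}


class Shadow:
    """Shadow register file and memory, filled lazily with random values."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.regs = {}          # register name -> value or BOT
        self.mem = {}           # address -> value or BOT
        self.last_writer = {}   # address -> index of the last store
        self.deps = []          # (writing insn index, reading insn index)

    def fresh(self):
        return self.rng.randrange(1 << 64)

    def reg(self, name):
        # Unknown register: draw a fresh value and remember it.
        if name not in self.regs:
            self.regs[name] = self.fresh()
        return self.regs[name]

    def step(self, idx, insn):
        op = insn[0]
        if op == "add":                     # dst = src + imm
            _, dst, src, imm = insn
            val = self.reg(src)
            self.regs[dst] = BOT if val is BOT else (val + imm) & MASK
        elif op == "store":                 # [addr_reg] = val_reg
            _, addr_reg, val_reg = insn
            addr = self.reg(addr_reg)
            if addr is not BOT:             # assumption: skip invalid addresses
                self.mem[addr] = self.reg(val_reg)
                self.last_writer[addr] = idx
        elif op == "load":                  # dst = [addr_reg]
            _, dst, addr_reg = insn
            addr = self.reg(addr_reg)
            if addr is BOT:
                self.regs[dst] = BOT
                return
            if addr in self.last_writer:
                self.deps.append((self.last_writer[addr], idx))
            if addr not in self.mem:
                self.mem[addr] = self.fresh()
            self.regs[dst] = self.mem[addr]
        else:                               # unsupported: destination is BOT
            self.regs[insn[1]] = BOT


def find_deps(kernel, unroll, seed=0):
    """Run the heuristic on `kernel` repeated `unroll` times."""
    shadow = Shadow(seed)
    for idx, insn in enumerate(kernel * unroll):
        shadow.step(idx, insn)
    return shadow.deps


# Each iteration stores to [rdi + 8], loads from [rdi], then advances rdi.
kernel = [
    ("add", "t", "rdi", 8),
    ("store", "t", "rax"),
    ("load", "rbx", "rdi"),
    ("add", "rdi", "rdi", 8),
]
print(find_deps(kernel, unroll=3))
\end{verbatim}

On this toy kernel, the load of each iteration reads the address written by the
store of the previous one, so the sketch prints \texttt{[(1, 6), (5, 10)]}:
the pairs are indices in the unrolled kernel, writing instruction first.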
\subsection{Practical implementation}

We implement \staticdeps{} in Python, using \texttt{pyelftools} and the
\texttt{capstone} disassembler (already introduced in
\autoref{sec:benchsuite_bb}) to extract and disassemble the targeted basic
block. The semantics needed to compute encountered operations are obtained by
lifting the kernel's assembly to \valgrind{}'s \vex{} intermediate
representation.
\medskip{}

The implementation of the heuristic detailed above provides us with a raw list
of dependencies across iterations of the considered basic block. We then
``re-roll'' the unrolled kernel by transcribing each dependency into a triplet
$(\texttt{source\_insn}, \texttt{dest\_insn}, \Delta{}k)$, where the first two
elements are the source and destination instructions of the dependency \emph{in
the original, non-unrolled kernel}, and $\Delta{}k$ is the number of iterations
of the kernel between the source and destination instructions of the dependency.

Finally, we filter out spurious dependencies: each dependency found should
occur for every kernel iteration $i$ for which $i + \Delta{}k$ is within the
bounds of the unrolled kernel. If the dependency is found for fewer than
$80\,\%$ of those iterations, it is declared spurious and dropped.
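The re-rolling and filtering steps above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the actual implementation:
the function names \texttt{reroll} and \texttt{filter\_spurious} are
hypothetical, raw dependencies are assumed to be pairs of instruction indices
in the unrolled kernel with the earlier (writing) instruction first, and the
unrolled kernel is assumed to be a plain repetition of the original one, so
that iteration and instruction numbers are recovered by Euclidean division.

\begin{verbatim}
from collections import Counter


def reroll(raw_deps, kernel_len):
    """Map unrolled (src, dst) index pairs to (src_insn, dst_insn, delta_k)."""
    triplets = []
    for src, dst in raw_deps:
        src_iter, src_insn = divmod(src, kernel_len)
        dst_iter, dst_insn = divmod(dst, kernel_len)
        triplets.append((src_insn, dst_insn, dst_iter - src_iter))
    return triplets


def filter_spurious(triplets, unroll, threshold=0.8):
    """Keep triplets found in at least `threshold` of eligible iterations."""
    kept = []
    for (src, dst, delta_k), found in sorted(Counter(triplets).items()):
        eligible = unroll - delta_k   # iterations i with i + delta_k < unroll
        if eligible > 0 and found >= threshold * eligible:
            kept.append((src, dst, delta_k))
    return kept


# Kernel of 4 instructions unrolled 4 times (unrolled indices 0..15).
raw = [(1, 6), (5, 10), (9, 14),   # insn 1 -> insn 2, one iteration later
       (0, 11)]                    # isolated pair, seen only once
triplets = reroll(raw, kernel_len=4)
print(filter_spurious(triplets, unroll=4))
\end{verbatim}

Here the triplet $(1, 2, 1)$ is found for all three eligible iterations and is
kept, while $(0, 3, 2)$ is found for only one of its two eligible iterations
and is dropped; the sketch prints \texttt{[(1, 2, 1)]}.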
\subsection{Limitations}\label{ssec:staticdeps_limits}

In \autoref{chap:CesASMe}, we argued that one of the shortcomings that most
crippled state-of-the-art tools was that analyses were conducted
out-of-context, considering only the basic block at hand. This criticism also
applies to \staticdeps{}, which still focuses on a single basic block in
isolation; in particular, any aliasing that stems from outside of the analyzed
basic block is not visible to \staticdeps{}.

Work towards a broader analysis range, \eg{} at the scale of a function, or at
least initializing values with gathered assertions (possibly based on abstract
interpretation techniques), could improve the quality of dependency detection.
\medskip{}

As \staticdeps{}'s heuristic is based on randomness in a Monte Carlo sense, it
may yield false positives: two registers could theoretically be assigned the
same value sampled at random, making them aliasing addresses. This is, however,
very improbable, as values are sampled from a set of cardinality $2^{64}$. If
necessary, the error can be reduced by amplification: running the algorithm
multiple times with different random seeds reduces the error exponentially.

Conversely, \staticdeps{} should not present false negatives due to randomness.
Dependencies may go undetected, \eg{} because of out-of-scope aliasing or
unsupported operations. However, no dependency that falls into the scope of
\depsim{}'s analysis should be missed because of random initialisations.
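To give a rough order of magnitude for the false-positive risk discussed
above, consider the following back-of-the-envelope bound. It rests on
simplifying assumptions of ours rather than on properties established above:
we assume that a spurious dependency can only arise when two addresses derived
from independent fresh values $x, y \in \calR$ by a fixed offset $c$ happen to
coincide, that at most $m$ such pairs of addresses are compared during one run,
and that amplification keeps only the dependencies found in each of $k$
independent runs. Writing $p$ for the probability that a single run reports at
least one false positive, and $p_k$ for the probability that a false positive
survives all $k$ runs, a union bound gives
\[
    \Pr\left[x + c \equiv y \pmod{2^{64}}\right] = 2^{-64},
    \qquad
    p \le m \cdot 2^{-64},
    \qquad
    p_k \le p^k \le \left(m \cdot 2^{-64}\right)^k.
\]
For instance, with $m = 2^{20}$ compared pairs, a single run satisfies $p \le
2^{-44}$, and two independent runs bring the bound down to $2^{-88}$.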