\section*{Conclusion}

In this chapter, we studied data dependencies within assembly kernels, and more
specifically, data dependencies occurring through memory accesses, which we
call \emph{memory-carried dependencies}. \cesasme{}'s analysis showed in
\autoref{chap:CesASMe} that this kind of dependency was responsible for a
significant portion of state-of-the-art analyzers' prediction errors.

We introduced \staticdeps{}, a heuristic approach based on random values as
representatives of abstract values. This approach is able to find data
dependencies, including memory-carried ones, loop-carried or not, leveraging
semantics of the assembly code provided by \valgrind{}'s \vex. It is, however,
still unable to find aliasing addresses whose source of aliasing is outside of
the studied block's scope ---~and, as such, suffers from the \emph{lack of
context} pointed out in the previous chapter.

\medskip{}

Our evaluation of \staticdeps{} against a dynamic analysis baseline,
\depsim{}, shows that it finds between 95\,\% and 98\,\% of the existing
dependencies, depending on the metric used, giving us good confidence in the
reliability of \staticdeps{}.
We further enrich \uica{} with \staticdeps{}, and find that it performs on the
full \cesasme{} dataset as well as \uica{} alone does on the pruned \cesasme{}
dataset, from which memory-carried bottlenecks were removed. From this, we
conclude that
\staticdeps{} is very successful at finding the data dependencies through
memory that actually matter from a performance analysis perspective. We also
find that, despite being written in pure Python, \staticdeps{} is at least
30$\times$ faster than its dynamic counterpart written in C, \depsim{}; as such, we expect
a compiled and optimized implementation of \staticdeps{} to be two to three
orders of magnitude faster than \depsim{}.