
\selectlanguage{french}
\begin{abstract}
    Qu'il s'agisse de calculs massifs distribués sur plusieurs racks, de
    calculs en environnement contraint comme de l'embarqué ou de l'\emph{edge
    computing} ou encore de tentatives de réduire l'empreinte écologique d'un
    programme fréquemment utilisé, de nombreux cas d'usage justifient
    l'optimisation poussée d'un programme. Celle-ci s'arrête souvent à
    l'optimisation de haut niveau (algorithmique, parallélisme, \ldots), mais
    il est possible de la pousser jusqu'à une optimisation bas niveau,
    s'intéressant à l'assembleur généré en regard de la microarchitecture du
    processeur précis utilisé.

    Une telle optimisation demande une compréhension fine des aspects à la fois
    logiciels et matériels en jeu, et est bien souvent cantonnée aux experts du
    domaine.  Les \emph{code analyzers} (analyseurs de code), cependant,
    permettent d'abaisser le niveau d'expertise nécessaire pour accomplir de
    telles optimisations, en automatisant une partie du travail de
    compréhension des problèmes de performance rencontrés.  Ces mêmes outils
    permettent également aux experts d'être plus efficaces dans leur travail.

    Dans ce manuscrit, nous étudions les principaux goulots d'étranglement de
    performance d'un processeur, sur lesquels l'état de l'art montre des
    performances inégales. Nous apportons, sur chacun de ces goulots
    d'étranglement, une contribution nouvelle~: automatisation de l'obtention
    d'un modèle du \emph{backend}, étude manuelle du \emph{frontend} en vue de
    l'automatisation de son modèle, et extraction automatique des dépendances
    \emph{à travers la mémoire} d'un noyau de calcul. Nous apportons également
    une étude systématique et automatisée des performances de prédiction de
    différents \emph{code analyzers} de l'état de l'art.
\end{abstract}
\clearpage

\selectlanguage{english}
\begin{abstract}
    Be it massively distributed computation over multiple server racks,
    constrained computation in embedded environments or \emph{edge computing},
    or an attempt to reduce the ecological footprint of a frequently-run
    program, many use cases justify optimising a program in depth. This
    optimisation often stops at the high level (choice of algorithms,
    parallel computing, \ldots). Yet it can be carried further, down to
    low-level optimisation: inspecting the generated assembly, and fine-tuning
    it, with respect to the microarchitecture of the specific processor used.

    Such optimisation requires a detailed understanding of both the software
    and hardware aspects involved, and is most often the realm of experts.
    \emph{Code analyzers}, however, are tools that help lower the expertise
    required to perform such optimisations, by automating part of the work
    needed to understand the source of the performance problems encountered.
    The same tools are also useful to experts, as they help them work more
    efficiently.

    In this manuscript, we study the main performance bottlenecks of a
    processor, on which the state of the art does not perform consistently. For
    each of these bottlenecks, we make a new contribution: we work on
    automatically obtaining a model of the processor's \emph{backend}; we
    manually study the processor's \emph{frontend}, hoping to set a milestone
    towards automatically obtaining such models; and we provide a tool that
    automatically extracts the \emph{memory-carried} dependencies of a
    computation kernel. We also provide a systematic, automated and fully
    tooled study of the prediction accuracy of various state-of-the-art code
    analyzers.
\end{abstract}