Wrapping up: minor rewordings
This commit is contained in:
parent
ad4d34bf1c
commit
da9d606325
1 changed file with 12 additions and 10 deletions
@@ -14,16 +14,16 @@ described below.
 \medskip{}
 
-To conclude this manuscript, we loosely combine those three models into a
-predictor, that we call \acombined{}, by taking the maximal prediction among
-the three models.
+To conclude this manuscript, we take a minimalist first approach at combining
+those three models into a predictor, that we call \acombined{}, by taking the
+maximal prediction among the three models.
 
 This method is clearly less precise than \eg{} \uica{} or \llvmmca{}'s
 methods, which simulate iterations of the kernel while accounting for each
 model. It however allows us to quickly and easily evaluate an \emph{upper
 bound} of the quality of our models: a more refined tool using our models
-should obtain results at least as good as this method ---~and hopefully
-significantly better.
+should obtain results at least as good as this method ---~but we could expect
+it to perform significantly better.
 
 \section{Critical path model}
 
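The hunk above describes the \acombined{} predictor as keeping the maximal per-iteration prediction among the three models, since the tightest bottleneck dictates the kernel's throughput. A minimal Python sketch of that combination rule (function and parameter names are illustrative, not taken from the manuscript):

```python
def combined_prediction(backend_cycles: float,
                        frontend_cycles: float,
                        critical_path_cycles: float) -> float:
    """Combine three per-iteration cycle predictions by taking the maximum.

    The slowest (maximal) prediction wins: whichever resource is the
    bottleneck determines the kernel's steady-state throughput.
    """
    return max(backend_cycles, frontend_cycles, critical_path_cycles)
```

As the surrounding text notes, this is only an upper bound on achievable quality: a simulator-style tool interleaving the models per iteration could be more precise.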
@@ -38,10 +38,12 @@ by \osaca{}~\cite{osaca2}.
 In our case, we use instructions' latencies inferred by \palmed{} and its
 backend \pipedream{} on the A72.
 
-However, this method fails to account for out-of-orderness: the latency of an
-instruction is hidden by other computations, independent of the former one's
-result. This instruction-level parallelism is limited by the reorder buffer's
-size.
+\medskip{}
+
+So far, however, this method would fail to account for out-of-orderness: the
+latency of an instruction is hidden by other computations, independent of the
+former one's result. This instruction-level parallelism is limited by the
+reorder buffer's size.
 
 We thus unroll the kernel as many times as fits in the reorder buffer
 ---~accounting for each instruction's \uop{} count, as we have a frontend model
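The hunk above sketches two steps: unroll the kernel as many times as its \uop{} count allows within the reorder buffer, then follow latencies through the dependency chains. A hypothetical Python sketch of both steps (ROB size, latencies and dependency edges below are illustrative placeholders, not A72 data):

```python
from typing import List, Tuple


def unroll_factor(uops_per_iter: int, rob_size: int) -> int:
    """Number of kernel iterations whose uops fit in the reorder buffer.

    Always at least 1, even if a single iteration overflows the buffer.
    """
    return max(1, rob_size // uops_per_iter)


def critical_path(latencies: List[int],
                  deps: List[Tuple[int, int]]) -> int:
    """Longest latency path through a dependency DAG.

    Instructions are given in topological order; deps holds
    (producer, consumer) index pairs with producer < consumer.
    """
    finish = [0] * len(latencies)
    for i, lat in enumerate(latencies):
        # An instruction starts once its slowest producer has finished.
        start = max((finish[p] for p, c in deps if c == i), default=0)
        finish[i] = start + lat
    return max(finish, default=0)
```

Under this sketch, the critical-path prediction for the unrolled kernel would be `critical_path(...)` divided by the unroll factor, giving cycles per original iteration.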
@@ -76,7 +78,7 @@ Osaca (crit. path) & 1773 & 3 & (0.17\,\%) & 84.02\,\% & 70.39\,\% & 40.37\,\% &
 model}\label{fig:a72_combined_stats_boxplot}
 \end{figure}
 
-We evaluate \acombined{} using \cesasme{} on the Raspberry Pi's Cortex A72,
+We evaluate \acombined{} with \cesasme{} on the Raspberry Pi's Cortex A72,
 using the same set of benchmarks as in \autoref{chap:CesASMe} recompiled for
 AArch64. As most of the code analyzers we studied are unable to run on the A72,
 we are only able to compare \acombined{} to the baseline \perf{} measure,