From ca430288b84c10c6a5fc424682cbf90c48eb65a3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Th=C3=A9ophile=20Bastian?=
Date: Sat, 23 Sep 2023 16:45:00 +0200
Subject: [PATCH] A72: minor changes

---
 .../40_A72-frontend/30_manual_frontend.tex | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/manuscrit/40_A72-frontend/30_manual_frontend.tex b/manuscrit/40_A72-frontend/30_manual_frontend.tex
index 2c8988f..4708525 100644
--- a/manuscrit/40_A72-frontend/30_manual_frontend.tex
+++ b/manuscrit/40_A72-frontend/30_manual_frontend.tex
@@ -78,14 +78,15 @@ throughput is a linear function of the number of \uops{} in the kernel~---, we
 then know that either $\cyc{\kerK} = \cycF{\kerK}$ or $\cyc{\kerK} =
 \cycB{\kerK}$.
 
-For a given instruction $i$, we then construct a sequence $\kerK_k$ of kernels
+For a given instruction $i$ and for a certain $k \in \nat$, we then construct a
+kernel $\kerK_k$
 such that:
 \begin{enumerate}[(i)]
-    \item\label{cnd:kerKk:compo} for all $k \in \nat$, $\kerK_k$ is composed of the instruction $i$,
+    \item\label{cnd:kerKk:compo} $\kerK_k$ is composed of the instruction $i$,
         followed by $k$ basic instructions;
-    \item\label{cnd:kerKk:linear} the kernels $\kerK_k$ are simple enough to exhibit this purely linear
+    \item\label{cnd:kerKk:linear} the kernel $\kerK_k$ is simple enough to exhibit this purely linear
         frontend behaviour;
-    \item\label{cnd:kerKk:fbound} after a certain rank, $\cycB{\kerK_k} \leq \cycF{\kerK_k}$.
+    \item\label{cnd:kerKk:fbound} $\cycB{\kerK_k} \leq \cycF{\kerK_k}$.
 \end{enumerate}
 
 We denote by $\mucount{}\kerK$ the number of \uops{} in kernel $\kerK$. Under
@@ -123,7 +124,7 @@ incorrect for this instruction, we should measure $\ceil{\cyc{\imath}} \leq
 \cyc{\kerK_{k_0}}$ and $\cyc{\kerK_{k_0}} + \sfrac{1}{3} =
 \cyc{\kerK_{k_0+1}}$. For instructions $i$ where it is not the case, increasing
 $k_0$ by 3 or using other basic instructions eventually
-yielded satisfying measures. Finally, we then obtain
+yielded satisfying measures. Finally, we obtain
 
 \[
     \mucount{}i = 3 \cyc{\kerK_{k_0}} - k_0
@@ -332,7 +333,7 @@ kernel, frontend-wise.
 \bigskip{}
 
 This model, however, is not satisfactory in many cases. For instance, the
-kernel $\kerK' = \texttt{ADDV} + 2x\basic{Int01}$ is predicted to run in $1.5$
-cycles, as depicted in \autoref{fig:frontend_nocross_addv_2add}; however, a
-\pipedream{} measure yields $\cyc{\kerK'} = 1.35 \simeq 1\,\sfrac{1}{3}$
+kernel $\kerK' = \texttt{ADDV} + 2\times\basic{Int01}$ is predicted to run in
+$1.5$ cycles, as depicted in \autoref{fig:frontend_nocross_addv_2add}; however,
+a \pipedream{} measure yields $\cyc{\kerK'} = 1.35 \simeq 1\,\sfrac{1}{3}$
 cycles.
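
Reviewer note (not part of the patch): the relation $\mucount{}i = 3 \cyc{\kerK_{k_0}} - k_0$ in the second hunk, together with the check $\cyc{\kerK_{k_0}} + \sfrac{1}{3} = \cyc{\kerK_{k_0+1}}$, can be tried out numerically. The Python sketch below is only an illustration under stated assumptions: the frontend is taken to deliver 3 \uops{} per cycle and each basic instruction to count as one \uop{}, as the formula suggests. The function names, the tolerance, and the sample cycle counts are hypothetical; they are not measurements from the manuscript.

```python
from fractions import Fraction

# Assumption: the frontend delivers 3 uops per cycle, which is what the
# 1/3-cycle step per extra basic instruction in the text suggests.
FRONTEND_WIDTH = 3


def uop_count(cycles_k0, k0):
    """Return 3 * C(K_k0) - k0, the formula quoted in the patch.

    Assumes each of the k0 basic instructions is a single uop.
    """
    return FRONTEND_WIDTH * Fraction(cycles_k0) - k0


def step_is_linear(cycles_k0, cycles_k0_plus_1, tol=Fraction(1, 20)):
    """Check C(K_k0) + 1/3 == C(K_{k0+1}) within a hypothetical tolerance."""
    step = Fraction(cycles_k0_plus_1) - Fraction(cycles_k0)
    return abs(step - Fraction(1, FRONTEND_WIDTH)) <= tol


# Illustrative values only: a kernel with k0 = 4 basic instructions taken
# to run in 2 cycles, and in 2 + 1/3 cycles with one more basic instruction.
c_k0, c_k1 = Fraction(2), Fraction(7, 3)
if step_is_linear(c_k0, c_k1):
    print(uop_count(c_k0, 4))  # prints 2
```

Exact fractions are used so that thirds of a cycle compare cleanly; real measurements would be floats and would need the tolerance tuned to the measurement noise.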