A72: minor changes
parent eef1889478
commit ca430288b8
1 changed file with 9 additions and 8 deletions
@@ -78,14 +78,15 @@ throughput is a linear function of the number of \uops{} in the kernel~---, we
 then know that either $\cyc{\kerK} = \cycF{\kerK}$ or $\cyc{\kerK} =
 \cycB{\kerK}$.
 
-For a given instruction $i$, we then construct a sequence $\kerK_k$ of kernels
+For a given instruction $i$ and for a certain $k \in \nat$, we then construct a
+kernel $\kerK_k$
 such that:
 \begin{enumerate}[(i)]
-\item\label{cnd:kerKk:compo} for all $k \in \nat$, $\kerK_k$ is composed of the instruction $i$,
+\item\label{cnd:kerKk:compo} $\kerK_k$ is composed of the instruction $i$,
 followed by $k$ basic instructions;
-\item\label{cnd:kerKk:linear} the kernels $\kerK_k$ are simple enough to exhibit this purely linear
+\item\label{cnd:kerKk:linear} the kernel $\kerK_k$ is simple enough to exhibit this purely linear
 frontend behaviour;
-\item\label{cnd:kerKk:fbound} after a certain rank, $\cycB{\kerK_k} \leq \cycF{\kerK_k}$.
+\item\label{cnd:kerKk:fbound} $\cycB{\kerK_k} \leq \cycF{\kerK_k}$.
 \end{enumerate}
 
 We denote by $\mucount{}\kerK$ the number of \uops{} in kernel $\kerK$. Under
@@ -123,7 +124,7 @@ incorrect for this instruction, we should measure
 $\ceil{\cyc{\imath}} \leq \cyc{\kerK_{k_0}}$ and $\cyc{\kerK_{k_0}} +
 \sfrac{1}{3} = \cyc{\kerK_{k_0+1}}$. For instructions $i$ where it is not the
 case, increasing $k_0$ by 3 or using other basic instructions eventually
-yielded satisfying measures. Finally, we then obtain
+yielded satisfying measures. Finally, we obtain
 
 \[
 \mucount{}i = 3 \cyc{\kerK_{k_0}} - k_0
@@ -332,7 +333,7 @@ kernel, frontend-wise.
 \bigskip{}
 
 This model, however, is not satisfactory in many cases. For instance, the
-kernel $\kerK' = \texttt{ADDV} + 2x\basic{Int01}$ is predicted to run in $1.5$
-cycles, as depicted in \autoref{fig:frontend_nocross_addv_2add}; however, a
-\pipedream{} measure yields $\cyc{\kerK'} = 1.35 \simeq 1\,\sfrac{1}{3}$
+kernel $\kerK' = \texttt{ADDV} + 2\times\basic{Int01}$ is predicted to run in
+$1.5$ cycles, as depicted in \autoref{fig:frontend_nocross_addv_2add}; however,
+a \pipedream{} measure yields $\cyc{\kerK'} = 1.35 \simeq 1\,\sfrac{1}{3}$
 cycles.
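
Note on the second hunk: the relation $\mucount{}i = 3 \cyc{\kerK_{k_0}} - k_0$ can be recovered from the conditions listed in the first hunk. The sketch below is only a plausible reading, not text from the commit. It assumes a frontend bandwidth of three \uops{} per cycle (suggested by the $\sfrac{1}{3}$-cycle step between $\kerK_{k_0}$ and $\kerK_{k_0+1}$) and that each basic instruction decodes to exactly one \uop{}; neither assumption is stated in the visible hunks. It reuses the macros already present in the source.

```latex
% Sketch under two assumptions not visible in the hunks:
%   (a) the frontend issues 3 \uops{} per cycle;
%   (b) every basic instruction decodes to exactly one \uop{}.
% Condition (iii) makes the kernel frontend-bound, so its measured
% throughput equals its frontend throughput.
\begin{align*}
  \cyc{\kerK_{k_0}}
    &= \cycF{\kerK_{k_0}}
     = \frac{\mucount{}\kerK_{k_0}}{3}
     = \frac{\mucount{}i + k_0}{3} \\
  \Longrightarrow\quad
  \mucount{}i &= 3 \cyc{\kerK_{k_0}} - k_0
\end{align*}
```

Under the same assumptions, appending one more basic instruction adds one \uop{}, hence $\sfrac{1}{3}$ cycle, which matches the check $\cyc{\kerK_{k_0}} + \sfrac{1}{3} = \cyc{\kerK_{k_0+1}}$ in the hunk's context lines.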