Minor fixes
This commit is contained in:
parent
d3b99be7a1
commit
103e6a0687
3 changed files with 9 additions and 5 deletions
manuscrit/60_staticdeps
|
@ -37,7 +37,9 @@ physical registers than what is exposed to the user through the ISA; a renaming
|
||||||
phrase will allocate those registers to avoid the effects of WaR and WaW
|
phrase will allocate those registers to avoid the effects of WaR and WaW
|
||||||
dependencies as much as possible.
|
dependencies as much as possible.
|
||||||
|
|
||||||
\smallskip{} For the present chapter, \textit{we only consider read-after-write
|
\smallskip{}
|
||||||
|
|
||||||
|
For the present chapter, \textit{we only consider read-after-write
|
||||||
dependencies}; however, all the techniques we present are applicable to other
|
dependencies}; however, all the techniques we present are applicable to other
|
||||||
types of dependencies if the considered architecture requires to take them into
|
types of dependencies if the considered architecture requires to take them into
|
||||||
account.
|
account.
|
||||||
|
|
|
@ -11,7 +11,7 @@ detected by keeping track of which instruction last wrote each register in a
|
||||||
Loop-carried dependencies can, to some extent, be detected the same way. As the
|
Loop-carried dependencies can, to some extent, be detected the same way. As the
|
||||||
basic block is always assumed to be the body of an infinite loop, a
|
basic block is always assumed to be the body of an infinite loop, a
|
||||||
straight-line analysis can be performed on a duplicated kernel. This strategy
|
straight-line analysis can be performed on a duplicated kernel. This strategy
|
||||||
is \eg{} adopted by \osaca{}~\cite{osaca2} (§II.D). When dealing only with
|
is \eg{} adopted by \osaca{}~\cite[§II.D]{osaca2}. When dealing only with
|
||||||
register accesses, this strategy is always sufficient: as each iteration always
|
register accesses, this strategy is always sufficient: as each iteration always
|
||||||
executes the same basic block, it is not possible for an instruction to depend
|
executes the same basic block, it is not possible for an instruction to depend
|
||||||
on another instruction two iterations earlier or more.
|
on another instruction two iterations earlier or more.
|
||||||
|
|
|
@ -20,10 +20,11 @@ performance}\label{ssec:staticdeps:rob_proof}
|
||||||
instructions $\left(I_{p+kn}, I_{q+kn}\right)_{k\in\nat}$.
|
instructions $\left(I_{p+kn}, I_{q+kn}\right)_{k\in\nat}$.
|
||||||
\end{theorem}
|
\end{theorem}
|
||||||
|
|
||||||
To prove this assertion, we require a few postulates describing the functioning
|
To prove this assertion, we require a few postulates to model the functioning
|
||||||
of a CPU and, in particular, how \uops{} transit in (decoded) and out (retired)
|
of a CPU and, in particular, how \uops{} transit in (decoded) and out
|
||||||
the reorder buffer.
|
(retired) the reorder buffer.
|
||||||
|
|
||||||
|
\needspace{10em}
|
||||||
\begin{postulate}[Reorder buffer as a circular buffer]
|
\begin{postulate}[Reorder buffer as a circular buffer]
|
||||||
The reorder buffer is a circular buffer of size $R \in \nat^+$.
|
The reorder buffer is a circular buffer of size $R \in \nat^+$.
|
||||||
It contains only decoded \uops{}.
|
It contains only decoded \uops{}.
|
||||||
|
@ -90,6 +91,7 @@ Consequently, if both $i_q$ and $i_{q'}$ are in-flight then
|
||||||
$|q'-q|<R$, and thus $\distance{I_p}{I_{p'}} < R - 1$.
|
$|q'-q|<R$, and thus $\distance{I_p}{I_{p'}} < R - 1$.
|
||||||
\end{proof}
|
\end{proof}
|
||||||
|
|
||||||
|
\needspace{15em}
|
||||||
\begin{postulate}[Issue delay]
|
\begin{postulate}[Issue delay]
|
||||||
Reasons why the issue of a \uop{} $i$ is delayed can be:
|
Reasons why the issue of a \uop{} $i$ is delayed can be:
|
||||||
\begin{enumerate}
|
\begin{enumerate}
|
||||||
|
|
Loading…
Add table
Reference in a new issue