Mention codebase location

This commit is contained in:
Théophile Bastian 2017-08-24 15:42:32 +02:00
parent dbbe189f7f
commit 5cafe06d2f
1 changed file with 63 additions and 41 deletions


@@ -66,8 +66,6 @@
\tableofcontents
\todo{Talk of the repo, somewhere}
\pagebreak
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -75,24 +73,26 @@
In recent years, verification and proved software have gathered increasing
interest in the computer science community, as people realised how
hard bugs are to track down, and how little confidence they had in their own
code. But hardware bugs are even more tedious to find and fix, and can easily
lead to disastrous consequences, as those cannot be patched on existing
hardware. For instance, the well-known Pentium ``\textsc{fdiv}''
bug~\cite{pratt1995fdiv}, which affected a large number of Pentium processors,
led to wrong results for some floating point divisions. Intel had to offer to
replace all the defective CPUs, leading to an announced loss of 475 million
dollars~\cite{nicely_fdiv}. More recently, the Skylake and Kaby Lake
hyperthreading bug had to be patched using microcode, at the cost of
performance and reliability.
To avoid such disasters, the industry nowadays uses a wide range of techniques
to catch bugs as early as possible --- which, hopefully, is before the
product's release date. Among those are \todo{list + cite}, but also proved
hardware. On circuits as complex as processors, usually, only sub-components
are proved correct with respect to a given specification of their behaviour,
while the circuit is kept in a specified context, \ie{} a set of valid inputs
and outputs, etc. --- which should, but is not proved to, be respected by the
other parts of the circuit. Yet, this trade-off between proved correctness and
engineering time already gives good confidence in the circuit.
In this context, Carl Seger was one of the main developers of fl at Intel, a
functional ML-inspired programming language integrating many features useful to
@@ -113,7 +113,7 @@ understand, and make it easier to understand to work more efficiently.
and Mary Sheeran, aiming to develop tools for proved design of FPGA circuits.
One of the keystones of this project is an open-sourced and publicly available
version of fl, used for the proving part, which is still under
heavy development.
My part of the work concerned this ``search and replace'' tool. More
specifically, I focused on writing a C++ library, \emph{isomatch}, which is
@@ -152,7 +152,7 @@ functions for this task.
\vert~&- & \textit{(sub)} \\
\vert~&\times & \textit{(times)} \\
\vert~&\div & \textit{(div)} \\
\vert~&\% & \textit{(mod)} \\ % chktex 35
\vert~&\lsl & \textit{(logical shift left)} \\
\vert~&\lsr & \textit{(logical shift right)} \\
\vert~&\asr & \textit{(arithmetic shift right)} \\
@@ -186,9 +186,31 @@ can be seen as the construction of a circuit by assembling smaller integrated
circuits (ICs), themselves built the same way, etc. A group is composed of
sub-circuits, input pins and output pins. Each level can of course contain
``leaf'' gates, like \textit{and} or \textit{delay} gates. This is important,
because it allows the program to work on smaller areas of the circuit (\eg{}
loading in memory only a part of the circuit, etc.).
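As a rough illustration of this hierarchical structure, the following C++
sketch shows one possible in-memory representation, in which a group owns its
pins and arbitrarily nested sub-circuits. The type names here are hypothetical
and do not necessarily match the actual \emph{isomatch} classes.
\begin{verbatim}
// Hypothetical sketch of a hierarchical circuit structure; the real
// isomatch class hierarchy may differ.
#include <memory>
#include <string>
#include <vector>

struct Pin {                      // an input or output pin of a group
    std::string name;
};

struct Circuit {                  // either a leaf gate or a group
    virtual ~Circuit() = default;
};

struct LeafGate : Circuit {       // e.g. an `and' or `delay' gate
    std::string kind;
};

struct Group : Circuit {          // a group of sub-circuits
    std::vector<Pin> inputs, outputs;
    std::vector<std::unique_ptr<Circuit>> children;  // nested groups and gates
};
\end{verbatim}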
\emph{Isomatch} comes along with a small parser for a toy ad-hoc language,
designed to allow one to quickly write a test circuit and run the
\emph{isomatch} functions on it. There was no apparent need for a better
language, or for a standard circuit description language (such as VHDL), since
the user will mostly use \emph{isomatch} through fl, and feed it directly with
data read from fl --- which is able to handle \eg{} VHDL\@.
\subsection{Codebases}
Carl Seger's new version of fl is currently being developed as \textbf{VossII},
and is yet to be open-sourced.
My contribution to the project, \textbf{isomatch}, is free software, and is
available on GitHub:
\begin{center}
\raisebox{-0.4\height}{
\includegraphics[height=2em]{../common/github32.png}}
\hspace{1em}
\url{https://github.com/tobast/circuit-isomatch/}
\end{center}
\subsection{Objective}
More precisely, the problems that \emph{isomatch} must solve are the following.
@@ -199,25 +221,25 @@ More precisely, the problems that \emph{isomatch} must solve are the following.
way, with possibly different names, etc.?
\item\label{prob:match} Given two circuits, \emph{needle} and
\emph{haystack}, find every (non-overlapping) occurrence of
\emph{needle} in \emph{haystack}. An occurrence is a set $S$ of
sub-circuits of \emph{haystack} such that there is a one-to-one mapping
between the circuits of $S$ and structurally equivalent circuits of
\emph{needle}, and those circuits are connected the same way in both.
\end{enumerate}
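For illustration, these two problems could be exposed through an interface
along the following lines; the signatures are purely hypothetical assumptions
and are not the actual \emph{isomatch} API.
\begin{verbatim}
// Hypothetical entry points for the two problems above; illustrative only,
// not the actual isomatch API.
#include <unordered_map>
#include <vector>

struct Circuit;   // leaf gate or group, as sketched above
struct Group;     // hierarchical circuit group

// Problem 1: are the two groups structurally equivalent?
bool structurallyEqual(const Group& left, const Group& right);

// Problem 2: all non-overlapping occurrences of `needle' in `haystack';
// each occurrence maps sub-circuits of the haystack onto structurally
// equivalent sub-circuits of the needle.
using Occurrence = std::unordered_map<const Circuit*, const Circuit*>;
std::vector<Occurrence> findOccurrences(const Group& needle,
                                        const Group& haystack);
\end{verbatim}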
Both problems are hard. The first one is an instance of the graph isomorphism
problem, as the actual question is whether there exists a one-to-one mapping
between sub-circuits of the two groups, such that every mapped circuit is equal
to the other (either directly if it is a leaf gate, or recursively with the
same procedure), and whether this mapping respects connections (edges) between
those circuits. Graph isomorphism is known to be in NP (given a candidate
mapping between the two graphs, it is polynomial to check whether the first is
equal to the second \wrt{} the mapping), but it is not known to be either in P
or NP-complete. Thus, since Babai's work on graph
isomorphism~\cite{babai2016graph} is only of theoretical interest (at the
moment), the known algorithms remain in worst-case exponential time, and
require ad-hoc heuristics for specific kinds of graphs to get maximum
efficiency.
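As a concrete illustration of why graph isomorphism lies in NP, the following
self-contained C++ check (unrelated to the \emph{isomatch} code itself)
verifies in polynomial time that a candidate vertex mapping is indeed an
isomorphism between two graphs given as adjacency sets.
\begin{verbatim}
// Polynomial-time check that a candidate mapping is a graph isomorphism.
// Illustrative only; assumes adjacency entries are valid vertex indices.
#include <cstddef>
#include <set>
#include <vector>

using Graph = std::vector<std::set<int>>;  // adjacency sets, vertices 0..n-1

// perm[u] is the vertex of g2 that vertex u of g1 is mapped to.
bool isIsomorphism(const Graph& g1, const Graph& g2,
                   const std::vector<int>& perm) {
    const std::size_t n = g1.size();
    if (g2.size() != n || perm.size() != n)
        return false;

    // The mapping must be a bijection onto the vertices 0..n-1.
    std::vector<bool> seen(n, false);
    for (int v : perm) {
        if (v < 0 || v >= static_cast<int>(n) || seen[v])
            return false;
        seen[v] = true;
    }

    // Every edge of g1 must map to an edge of g2; equal edge counts then
    // guarantee that no edge of g2 is left unmatched either.
    std::size_t edges1 = 0, edges2 = 0;
    for (std::size_t u = 0; u < n; ++u) {
        edges1 += g1[u].size();
        edges2 += g2[u].size();
        for (int v : g1[u])
            if (!g2[perm[u]].count(perm[v]))
                return false;
    }
    return edges1 == edges2;   // overall O((|V| + |E|) log |V|)
}
\end{verbatim}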
The second one is an instance of the subgraph isomorphism problem, which is known
to be NP-complete~\cite{cook1971complexity}. Even though a few algorithms
@@ -231,9 +253,9 @@ The goal of \textit{isomatch} is to be applied to large circuits on-the-fly,
during their design. Those circuits can (and probably will) be as large as
a full processor, and the software will be operated by a human working on
their circuit. Thus, \textit{isomatch} must be as fast as possible, since
matching operations will be executed quite often, and often multiple times in
a row. It must therefore remain fast enough for the user not to lose too much
time, and eventually their patience.
\todo{Mention clean codebase somewhere}