diff --git a/common/refs.bib b/common/refs.bib
index def4768..48230a0 100644
--- a/common/refs.bib
+++ b/common/refs.bib
@@ -61,7 +61,7 @@
 }
 
 @inproceedings{seger2006integrating,
-  title={Integrating design and verification-from simple idea to practical system},
+  title={Integrating design and verification-from simple idea to practical system (abstract only)},
   author={Seger, Carl},
   booktitle={Formal Methods and Models for Co-Design, 2006. MEMOCODE'06. Proceedings. Fourth ACM and IEEE International Conference on},
   pages={161--162},
@@ -70,7 +70,7 @@
 }
 
 @inproceedings{seger2006design,
-  title={The design of a floating point execution unit using the Integrated Design and Verification (IDV) system},
+  title={The design of a floating point execution unit using the Integrated Design and Verification (IDV) system (abstract only)},
   author={Seger, Carl},
   booktitle={Int. Workshop on Designing Correct Circuits},
   year={2006}
diff --git a/report/report.tex b/report/report.tex
index 7e79964..24fa8ce 100644
--- a/report/report.tex
+++ b/report/report.tex
@@ -84,17 +84,35 @@
 reliability.
 
 To avoid such disasters, the industry nowadays uses a wide range of techniques
 to catch bugs as early as possible --- which, hopefully, is before the
-product's release date. Among those are \todo{list + cite}, but also proved
-hardware. On circuits as complex as processors, usually, only sub-components
-are proved correct with respect to a given specification of its behaviour,
+product's release date. These techniques include, of course, extensive testing
+on simulated hardware or on FPGAs, since actually manufacturing a processor is
+extremely expensive. Test suites are routinely run on the current version of
+the hardware to catch bugs and notify the designers, as testing remains the
+easiest way to check the behaviour of a circuit. Symbolic trajectory
+evaluation also has its place in the domain: it runs a circuit for a few
+cycles (before it becomes too expensive) on symbolic values, \ie{} variables
+instead of zeroes, ones and $X$s (for ``not a value''). This kind of testing
+is far more powerful than plain testing, since a single symbolic run covers
+every possible assignment of the variables; yet it is also too expensive to
+run over a long number of cycles, so many bugs remain out of its reach.
+
+The previous methods are cheap and convenient ways to test a circuit, but they
+give only little confidence in its correctness: they merely show that every
+case that was actually tested yielded a correct behaviour. These limitations
+led the industry to develop proven hardware. On circuits as complex as
+processors, usually only sub-components are proved correct with respect to a
+given specification of their behaviour (typically source code that behaves as
+the processor is expected to, itself written against the documentation draft
+of the circuit).
 These proofs are typically valid only while the circuit is kept in a specified
 context, \ie{} a set of valid inputs and outputs, etc. --- that should, but is
 not proved to, be respected by the other parts of the circuit. Yet, this
 trade-off between proved correctness and engineer's work time already gives a
 pretty good confidence in the circuit.
 
-In this context, Carl Seger was one of the main developers of fl at Intel, a
-functional ml-inspired programming language integrating many features useful to
-get insights of a circuit, testing it and proving it. It mostly features
+In this context, Carl Seger was one of the main developers of fl at
+Intel~\cite{seger1993vos,seger2005industrially,seger2006design}, a functional
+ML-inspired programming language integrating many features useful to get
+insights into a circuit, test it and prove it. It mostly features
 symbolic trajectory evaluation based model checking and theorem proving, and
 is intended to be an all-purpose toolbox for the hardware prover.
@@ -291,6 +309,8 @@
 and eventually lose patience.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{General approach}
 
+\subsection{Theoretical solution}
+
 The global strategy used to solve efficiently the problem can be broken down to
 three main parts.
@@ -329,6 +349,19 @@
 This part mostly consists in an implementation of Ullmann's algorithm for
 subgraph isomorphism~\cite{ullmann1976algorithm}, a well-known algorithm for
 this problem, that uses the specificities of the graph to be a little faster.
 
+\subsection{Workflow}
+
+As a first step, to get the algorithms, heuristics and methods right, I
+designed a prototype in OCaml. This prototype was not intended to be --- and
+was not --- optimized, but it allowed me to find the places where the program
+took an unacceptable amount of time. For instance, I once left the prototype,
+which I thought fast enough, computing a group equality for a whole night,
+only to find out in the morning that it was not fast enough at all. This first
+version was also written in a strongly typed language, with a lot of static
+guarantees. It gave me enough confidence in my methods and strategies to move
+on to a much faster, optimized C++ version, the current version of
+\emph{isomatch}.
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Signatures}\label{sec:signatures}
@@ -670,7 +703,50 @@
 a given hierarchy group of the haystack. To avoid wasting computation time, we
 first check that every signature present in the needle is present at least as
 many times in the haystack. This simple check saved a lot of time.
 
-\todo{More stuff?}
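+The check itself amounts to a multiset inclusion test over signature counts.
+The following sketch illustrates the idea; \lstcpp{Signature} and
+\lstcpp{signaturePrecheck} are illustrative names, not the actual
+\emph{isomatch} interface:
+
+\begin{lstlisting}[language=C++]
+#include <map>
+#include <vector>
+
+// Illustrative stand-in: isomatch computes richer signatures than this.
+using Signature = unsigned long;
+
+// Count the occurrences of each signature value.
+static std::map<Signature, int> countSignatures(
+        const std::vector<Signature>& sigs) {
+    std::map<Signature, int> counts;
+    for (const Signature& sig : sigs)
+        ++counts[sig];
+    return counts;
+}
+
+// True iff every needle signature occurs at least as many times in the
+// haystack; if it does not, no subgraph isomorphism can exist.
+bool signaturePrecheck(const std::vector<Signature>& needle,
+                       const std::vector<Signature>& haystack) {
+    const std::map<Signature, int> available = countSignatures(haystack);
+    for (const auto& [sig, needed] : countSignatures(needle)) {
+        const auto it = available.find(sig);
+        if (it == available.end() || it->second < needed)
+            return false;
+    }
+    return true;
+}
+\end{lstlisting}
+
+Such a test is cheap compared to a full run of Ullmann's algorithm, which is
+why discarding hopeless candidates this way saves so much time.
+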
+\paragraph{Conversion to adjacency matrix.} The internal structures and graphs
+are represented as classes derived from \lstcpp{CircuitTree}, connected to
+various \lstcpp{WireId}s. There is thus no readily available adjacency matrix,
+and the vertices have no predefined IDs: we first need to assign an ID to
+every vertex, \ie{} every gate and wire.
+
+\paragraph{Order of rows and columns.} In his original paper, Ullmann
+recommends indexing the vertices in order of decreasing degree (\ie{} the
+vertices with the highest degree are topmost/leftmost in the matrix). This
+amplifies the effect of the refinement procedure: since vertices with a higher
+degree are connected to more vertices, they have a higher chance of having a
+neighbour with no matching neighbour in the haystack. This allows the
+algorithm to cut whole branches of the search tree very early, and speeds it
+up a lot. Yet, Ullmann also notes that in specific cases, a different order
+might be more appropriate.
+
+My idea was that it might be interesting to put all the wires first and then
+all the gates, or the other way around (the sketch below illustrates the
+degree-decreasing ordering itself).
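+
+As a minimal sketch, assuming every gate and wire has already received an ID
+(the \lstcpp{Vertex} record and \lstcpp{degreeDecreasingOrder} function are
+illustrative names, not the actual \emph{isomatch} interface):
+
+\begin{lstlisting}[language=C++]
+#include <algorithm>
+#include <vector>
+
+// Illustrative vertex record: the ID assigned to a gate or wire, plus
+// its degree (number of incident connections).
+struct Vertex {
+    int id;
+    int degree;
+};
+
+// Return the vertex IDs sorted by decreasing degree; this order indexes
+// the rows and columns of Ullmann's matching matrix, so that refinement
+// can prune branches of the search tree as early as possible.
+std::vector<int> degreeDecreasingOrder(std::vector<Vertex> vertices) {
+    std::stable_sort(vertices.begin(), vertices.end(),
+                     [](const Vertex& lhs, const Vertex& rhs) {
+                         return lhs.degree > rhs.degree;
+                     });
+    std::vector<int> order;
+    order.reserve(vertices.size());
+    for (const Vertex& v : vertices)
+        order.push_back(v.id);
+    return order;
+}
+\end{lstlisting}
+
+The benchmarked variants below differ only in how the vertex vector is split
+(wires first, gates first, or all mixed) and in whether each part is sorted by
+degree or taken as it comes.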
+
+To decide between these orderings, I ran a few benchmarks. The measures were
+made on a system with a nearly-constant load during the experiments. The
+machine had an i7-6700 processor (3.6GHz reported % chktex 8
+frequency). All the measures are averaged over 100 runs, and were taken on the
+usual test set.
+
+\vspace{1em}
+\begin{tabular}{l r r}
+    \textbf{Ordering method} & \textbf{Run time (ms)} & \textbf{Loss (\%)} \\
+    Wires by decreasing degree, then gates as they come & 48.8 & --- \\
+    As they come, gates then wires & 49.1 & 0.6\% \\
+    Wires by decreasing degree, then gates by decreasing degree & 49.3
+        & 1.0\% \\
+    As they come, wires then gates & 49.3 & 1.0\% \\
+    Gates as they come, then wires by decreasing degree & 49.5 & 1.4\% \\
+    By decreasing degree, all mixed & 49.5 & 1.4\% \\
+\end{tabular}
+\vspace{1em}
+
+The differences are nearly insignificant, and can be explained by constant
+costs: to sort the vertices, the vector must be copied and then sorted, which
+can cost more than taking its elements as they come when the gain from a
+better ordering is too small.
+
+I nevertheless chose to implement the fastest option with respect to this
+benchmark: since the gap is nearly insignificant, the choice cannot be
+drastically harmful in the end.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Performance}