More progress
intro: cite techniques, talk about ml; 1st version; benchs on order of indexation in Ullmann
parent ca21ecf0ff · commit 9f3120f0aa
2 changed files with 85 additions and 9 deletions

@@ -61,7 +61,7 @@
}

@inproceedings{seger2006integrating,
  title={Integrating design and verification-from simple idea to practical system (abstract only)},
  author={Seger, Carl},
  booktitle={Formal Methods and Models for Co-Design, 2006. MEMOCODE'06. Proceedings. Fourth ACM and IEEE International Conference on},
  pages={161--162},

@@ -70,7 +70,7 @@
}

@inproceedings{seger2006design,
  title={The design of a floating point execution unit using the Integrated Design and Verification (IDV) system (abstract only)},
  author={Seger, Carl},
  booktitle={Int. Workshop on Designing Correct Circuits},
  year={2006}

@@ -84,17 +84,35 @@ reliability.

To avoid such disasters, the industry nowadays uses a wide range of techniques
to catch bugs as early as possible --- which, hopefully, is before the
product's release date. These techniques include, of course, a lot of testing
on simulated hardware or on FPGAs, since an actual processor is extremely
expensive to manufacture. A lot of testing is run routinely on the current
version of the hardware to catch bugs and notify the designers, since this
remains the easiest way to check the behaviour of a circuit. Symbolic
trajectory evaluation also has its place in the domain, allowing one to run a
circuit for a few cycles (before it becomes too expensive) on symbolic values,
\ie{} variables instead of zeroes, ones and $X$s (for ``not a value''). This
kind of testing is far more powerful than plain testing, since a single
symbolic run covers a whole class of concrete runs; yet it is also too
expensive to run for a significant number of cycles, and therefore a lot of
bugs cannot be caught this way.

The previous methods are cheap ways to test a circuit, but give only little
confidence in its correctness --- they only establish that every case actually
tested yielded a correct behaviour. These reasons led to the development of
proved hardware in the industry. On circuits as complex as processors, usually
only sub-components are proved correct with respect to a given specification of
their behaviour (usually source code that should behave as the processor is
expected to behave, itself written against the documentation draft of the
circuit). These proofs are typically valid only while the circuit is kept in a
specified context, \ie{} a set of valid inputs and outputs, etc. --- that
should, but is not proved to, be respected by the other parts of the circuit.
Yet, this trade-off between proved correctness and engineers' work time already
gives pretty good confidence in the circuit.

In this context, Carl Seger was one of the main developers of fl at
Intel~\cite{seger1993vos,seger2005industrially,seger2006design}, a functional
ML-inspired programming language integrating many features useful for gaining
insight into a circuit, testing it and proving it. It mostly features symbolic
trajectory evaluation based model checking and theorem proving, and is intended
to be an all-purpose toolbox for the hardware prover.

@@ -291,6 +309,8 @@ and eventually lose patience.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{General approach}

\subsection{Theoretical solution}

The global strategy used to solve the problem efficiently can be broken down
into three main parts.

@@ -329,6 +349,19 @@ This part mostly consists in an implementation of Ullmann's algorithm for
subgraph isomorphism~\cite{ullmann1976algorithm}, a well-known algorithm for
this problem, which exploits the specificities of the graphs at hand to be a
little faster.
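
For reference, the core of the search can be sketched as a plain backtracking
procedure over a candidate matrix --- a minimal sketch, leaving out the
refinement procedure discussed later, and with illustrative names that are not
\emph{isomatch}'s:

\begin{lstlisting}[language=C++]
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<bool>>;

// M0[i][j]: needle vertex i may a priori be mapped to haystack vertex j
// (initialised from degrees, signatures, ...). assign[i] is the haystack
// vertex currently chosen for needle vertex i.
static bool search(std::size_t row, const Matrix& nAdj, const Matrix& hAdj,
                   const Matrix& M0, std::vector<std::size_t>& assign,
                   std::vector<bool>& used) {
    if (row == nAdj.size()) return true;     // every needle vertex is mapped
    for (std::size_t col = 0; col < hAdj.size(); ++col) {
        if (!M0[row][col] || used[col]) continue;
        // Every needle edge (row, k) with k already assigned must be
        // matched by a haystack edge (col, assign[k]).
        bool ok = true;
        for (std::size_t k = 0; k < row && ok; ++k)
            if (nAdj[row][k] && !hAdj[col][assign[k]]) ok = false;
        if (!ok) continue;
        assign[row] = col; used[col] = true;
        if (search(row + 1, nAdj, hAdj, M0, assign, used)) return true;
        used[col] = false;                   // backtrack
    }
    return false;
}

bool subgraphIsomorphism(const Matrix& nAdj, const Matrix& hAdj,
                         const Matrix& M0) {
    std::vector<std::size_t> assign(nAdj.size());
    std::vector<bool> used(hAdj.size(), false);
    return search(0, nAdj, hAdj, M0, assign, used);
}
\end{lstlisting}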

\subsection{Workflow}

To get the algorithms, heuristics and methods right, I first designed a
prototype in OCaml. This prototype was not intended to be --- and was not ---
optimized, but it allowed me to find the places where the program took an
unacceptable amount of time. For instance, I once left the prototype, which I
thought fast enough, computing a group equality for a whole night, only to find
out in the morning that it was not fast enough at all. This first version was
also written in a strongly typed language, with a lot of static guarantees. It
gave me enough confidence in my methods and strategies to move on to a much
faster, optimized C++ version, the current version of \emph{isomatch}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Signatures}\label{sec:signatures}

@@ -670,7 +703,50 @@ a given hierarchy group of the haystack. To avoid wasting computation time, we
first check that every signature present in the needle is present at least as
many times in the haystack. This simple check saved a lot of time.
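
A minimal sketch of this pre-check, assuming signatures are plain hashable
values (the type alias and function name here are hypothetical):

\begin{lstlisting}[language=C++]
#include <cstdint>
#include <unordered_map>
#include <vector>

using sig_t = std::uint64_t;   // assumed: a signature fits in a hash key

bool signaturesCompatible(const std::vector<sig_t>& needleSigs,
                          const std::vector<sig_t>& haystackSigs) {
    std::unordered_map<sig_t, long> counts;
    for (sig_t s : haystackSigs)
        ++counts[s];               // occurrences available in the haystack
    for (sig_t s : needleSigs)
        if (--counts[s] < 0)       // the needle needs more than available
            return false;
    return true;
}
\end{lstlisting}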

\paragraph{Conversion to adjacency matrix.} The internal structures and graphs
are represented as inherited classes of \lstcpp{CircuitTree}, connected to
various \lstcpp{WireId}s. Thus, there is no adjacency matrix readily available,
and the various vertices have no predefined IDs: we first need to assign an ID
to every vertex, \ie{} every gate and wire.
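
A sketch of this conversion, with the circuit abstracted down to ``each gate
knows the wires it is plugged into'' (the real code walks \lstcpp{CircuitTree}
objects and \lstcpp{WireId}s; everything else here is illustrative):

\begin{lstlisting}[language=C++]
#include <cstddef>
#include <unordered_map>
#include <vector>

using Matrix = std::vector<std::vector<bool>>;

// gateWires[g] lists the (arbitrary) identifiers of the wires plugged
// into gate g. Wires get dense IDs 0..nWires-1, gates come after.
Matrix buildAdjacency(const std::vector<std::vector<int>>& gateWires) {
    std::unordered_map<int, std::size_t> wireIds;
    std::size_t nextWire = 0;
    for (const auto& wires : gateWires)
        for (int w : wires)
            if (wireIds.emplace(w, nextWire).second)
                ++nextWire;

    std::size_t nGates = gateWires.size();
    Matrix adj(nextWire + nGates,
               std::vector<bool>(nextWire + nGates, false));
    for (std::size_t g = 0; g < nGates; ++g)
        for (int w : gateWires[g]) {
            std::size_t gv = nextWire + g;   // gate IDs follow the wires
            adj[gv][wireIds[w]] = adj[wireIds[w]][gv] = true;
        }
    return adj;
}
\end{lstlisting}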

\paragraph{Order of rows and columns.} In his original paper, Ullmann
recommends indexing the vertices in order of decreasing degree (\ie{} the
vertices with highest degree are topmost/leftmost in the matrix). This
amplifies the effect of the refinement procedure: vertices with higher degree
are connected to more vertices, and thus have a higher chance of having a
neighbour with no matching neighbour in the haystack. This allows the algorithm
to cut whole branches of the search tree very early, and speeds it up a lot.
Yet, Ullmann also states that in specific cases, a different order might be
more appropriate.
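
The refinement procedure itself can be sketched as the following fixpoint
computation (simplified from Ullmann's paper, not \emph{isomatch}'s actual
code): clear $M[i][j]$ whenever some needle neighbour of $i$ has no remaining
candidate among the haystack neighbours of $j$, until nothing changes.

\begin{lstlisting}[language=C++]
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<bool>>;

// Returns false when some needle vertex loses all its candidates,
// i.e. the current branch of the search tree can be cut.
bool refine(Matrix& M, const Matrix& nAdj, const Matrix& hAdj) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < M.size(); ++i) {
            bool rowAlive = false;
            for (std::size_t j = 0; j < M[i].size(); ++j) {
                if (!M[i][j]) continue;
                for (std::size_t k = 0; k < nAdj.size(); ++k) {
                    if (!nAdj[i][k]) continue;   // k: needle neighbour of i
                    bool hasCandidate = false;
                    for (std::size_t l = 0; l < hAdj.size(); ++l)
                        if (hAdj[j][l] && M[k][l]) { hasCandidate = true; break; }
                    if (!hasCandidate) {         // j cannot host i any more
                        M[i][j] = false; changed = true; break;
                    }
                }
                rowAlive = rowAlive || M[i][j];
            }
            if (!rowAlive) return false;
        }
    }
    return true;
}
\end{lstlisting}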

My idea was that it might be interesting to put all the wires first and then
all the gates, or the other way around. To settle the question, I ran a few
benchmarks. The measures were made on a system with a nearly-constant load
during the experiments. The machine had an i7-6700 processor (3.6GHz reported % chktex 8
frequency). All the measures are averaged over 100 runs, made on the usual
test set.

\vspace{1em}
\begin{tabular}{l r r}
  \textbf{Ordering method} & \textbf{Run time (ms)} & \textbf{Loss (\%)} \\
  Wires by degree decreasing, then gates as they come & 48.8 & --- \\
  As they come, gates then wires & 49.1 & 0.6\% \\
  Wires by degree decreasing, then gates by degree decreasing & 49.3 & 1.0\% \\
  As they come, wires then gates & 49.3 & 1.0\% \\
  Gates as they come, then wires by degree decreasing & 49.5 & 1.4\% \\
  By degree decreasing, all mixed & 49.5 & 1.4\% \\
\end{tabular}
\vspace{1em}

The time loss is nearly insignificant, and can be explained by constant costs:
when we want to sort the vertices, the vector must be copied and then sorted,
which can be more expensive than just taking its elements as they come if the
gain is not high enough.

Nevertheless, I chose to implement the fastest option with respect to this
benchmark: since the gap is nearly insignificant, the choice cannot be
drastically harmful in the end.
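
The retained ordering --- wires sorted by decreasing degree, then gates as they
come --- boils down to building a permutation of the vertex IDs (an
illustrative sketch, not the actual implementation):

\begin{lstlisting}[language=C++]
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// wireDegrees[w]: degree of wire w. Wires are assumed numbered 0..nWires-1
// and gates after them, as in the matrix conversion above.
std::vector<std::size_t> vertexOrder(const std::vector<std::size_t>& wireDegrees,
                                     std::size_t nGates) {
    std::vector<std::size_t> order(wireDegrees.size());
    std::iota(order.begin(), order.end(), 0);  // wire IDs 0..nWires-1
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) {
                  return wireDegrees[a] > wireDegrees[b];  // decreasing degree
              });
    for (std::size_t g = 0; g < nGates; ++g)   // then gates, as they come
        order.push_back(wireDegrees.size() + g);
    return order;
}
\end{lstlisting}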
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\section{Performance}
|
\section{Performance}
|
||||||