Add code quality subsection
parent
5cafe06d2f
commit
4f5b0e8784
1 changed files with 93 additions and 34 deletions
|
@ -53,8 +53,8 @@
|
||||||
transformations to a circuit.
|
transformations to a circuit.
|
||||||
|
|
||||||
This problem turns out to be more or less the \emph{subgraph isomorphism
|
This problem turns out to be more or less the \emph{subgraph isomorphism
|
||||||
problem}, which is NP-complete, and must nevertheless be solved fast on
|
problem}, which is NP-complete, and must nevertheless be solved efficiently
|
||||||
processor-sized circuits on this particular case.
|
on processor-sized circuits in this particular case.
|
||||||
|
|
||||||
During my internship, I developed a C++ library to perform this task that
|
During my internship, I developed a C++ library to perform this task that
|
||||||
will be integrated in VossII, based on a few well-known algorithms as well
|
will be integrated in VossII, based on a few well-known algorithms as well
|
||||||
|
@ -211,7 +211,7 @@ available on GitHub:
|
||||||
\url{https://github.com/tobast/circuit-isomatch/}
|
\url{https://github.com/tobast/circuit-isomatch/}
|
||||||
\end{center}
|
\end{center}
|
||||||
|
|
||||||
\subsection{Objective}
|
\subsection{Problems}
|
||||||
|
|
||||||
More precisely, the problems that \emph{isomatch} must solve are the following.
|
More precisely, the problems that \emph{isomatch} must solve are the following.
|
||||||
|
|
||||||
|
@ -247,6 +247,39 @@ to be NP-complete~\cite{cook1971complexity}. Even though a few algorithms
|
||||||
is nevertheless necessary to implement them the right way, and with the right
|
is nevertheless necessary to implement them the right way, and with the right
|
||||||
heuristics, to get the desired efficiency for the given problem.
|
heuristics, to get the desired efficiency for the given problem.
|
||||||
|
|
||||||
|
\subsection{Code quality}
|
||||||
|
|
||||||
|
Another prominent objective was to keep the codebase as clean as possible.
|
||||||
|
Indeed, this code will probably have to be maintained for quite some time, and
|
||||||
|
most probably by people other than me. This means that the code and all its
|
||||||
|
surroundings must be really clean, readable and reusable. I tried to put a lot
|
||||||
|
of effort into making the code idiomatic and easy to use, through \eg{} the
|
||||||
|
implementation of iterators over my data structures when needed, idiomatic
|
||||||
|
C++14, etc.
|
||||||
|
|
||||||
|
This also means that the code has to be well-documented: the git history had to
|
||||||
|
be kept clean and understandable, and clean documentation can be generated
|
||||||
|
from the code, using \texttt{doxygen}. The latest documentation is also
|
||||||
|
compiled as HTML pages here:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\raisebox{-0.4\height}{
|
||||||
|
\includegraphics[height=2.3em]{../common/docs.png}}
|
||||||
|
\hspace{1em}
|
||||||
|
\url{https://tobast.fr/m1/isomatch}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Since the code is written in C++, it is prone to a wide range of bugs. While I did not
|
||||||
|
take the time to integrate unit tests --- which would have been a great
|
||||||
|
addition ---, I used a sequence of tests that can be run using \lstc{make
|
||||||
|
test}, and tests a lot of features of isomatch.
|
||||||
|
|
||||||
|
The code is also tested regularly and on a wide variety of cases with
|
||||||
|
\lstbash{valgrind} to ensure that there are no memory errors ---
|
||||||
|
use-after-free, unallocated memory, memory leaks, bad pointer
|
||||||
|
arithmetic,~\ldots{} In every tested case, no memory at all was lost, and no
|
||||||
|
invalid read was reported.
|
||||||
|
|
||||||
\subsection{Sought efficiency}
|
\subsection{Sought efficiency}
|
||||||
|
|
||||||
The goal of \textit{isomatch} is to be applied to large circuits on-the-fly,
|
The goal of \textit{isomatch} is to be applied to large circuits on-the-fly,
|
||||||
|
@ -257,10 +290,6 @@ matching operations will be executed quite often, and often multiple times in a
|
||||||
row. It must then remain fast enough for the human not to lose too much time,
|
row. It must then remain fast enough for the human not to lose too much time,
|
||||||
and eventually lose patience.
|
and eventually lose patience.
|
||||||
|
|
||||||
\todo{Mention clean codebase somewhere}
|
|
||||||
|
|
||||||
\todo{Mention VossII somewhere}
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\section{General approach}
|
\section{General approach}
|
||||||
|
|
||||||
|
@ -268,7 +297,7 @@ The global strategy used to solve efficiently the problem can be broken down to
|
||||||
three main parts.
|
three main parts.
|
||||||
|
|
||||||
\paragraph{Signatures.} The initial idea to make the computation fast is to
|
\paragraph{Signatures.} The initial idea to make the computation fast is to
|
||||||
aggregate the inner data of a gate --- be it a leaf gate or a group --- in a
|
aggregate the inner data of a gate~---~be it a leaf gate or a group~---~in a
|
||||||
kind of hash, a 64-bit unsigned integer. This approach is directly inspired
|
kind of hash, a 64-bit unsigned integer. This approach is directly inspired
|
||||||
by what was done in fl, back at Intel. This hash must be easy to compute,
|
by what was done in fl, back at Intel. This hash must be easy to compute,
|
||||||
and must be based only on the structure of the graph --- that is, must be
|
and must be based only on the structure of the graph --- that is, must be
|
||||||
|
@ -307,8 +336,10 @@ this problem, that uses the specificities of the graph to be a little faster.
|
||||||
|
|
||||||
The signature is computed as a simple hash of the element, and is defined for
|
The signature is computed as a simple hash of the element, and is defined for
|
||||||
every type of expression and circuit. It could probably be enhanced with a bit
|
every type of expression and circuit. It could probably be enhanced with a bit
|
||||||
more work to cover more uniformly the hash space, but no collision was observed
|
more work to cover the hash space more uniformly, but no illegitimate collision
|
||||||
on the examples tested.
|
(that is, a collision that could be avoided with a better hash function, as
|
||||||
|
opposed to collisions due to an equal local graph structure) was observed on
|
||||||
|
the examples tested.
|
||||||
|
|
||||||
\paragraph{Signature constants.} Signature constants are used all around the
|
\paragraph{Signature constants.} Signature constants are used all around the
|
||||||
signing process; each one is a 5-tuple $\sigconst{} = (a, x_l, x_h, d_l, d_h)$ of 32
|
signing process; each one is a 5-tuple $\sigconst{} = (a, x_l, x_h, d_l, d_h)$ of 32
|
||||||
|
@ -316,13 +347,14 @@ bits unsigned numbers. All of $x_l$, $x_h$, $d_l$ and $d_h$ are picked as prime
|
||||||
numbers between $10^8$ and $10^9$ (which just fits in a 32-bit unsigned
|
numbers between $10^8$ and $10^9$ (which just fits in a 32-bit unsigned
|
||||||
integer); while $a$ is a random integer uniformly picked between $2^{16}$ and
|
integer); while $a$ is a random integer uniformly picked between $2^{16}$ and
|
||||||
$2^{32}$. These constants are generated by a small Python script,
|
$2^{32}$. These constants are generated by a small Python script,
|
||||||
\path{util/primegen/pickPrimes.py}.
|
\path{util/primegen/pickPrimes.py} in the repository.
|
||||||
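The constraints on a signature constant can be illustrated by a minimal C++ sketch. This is \emph{not} the \path{pickPrimes.py} script from the repository, whose contents are not shown here; the names \texttt{SigConst}, \texttt{isPrime}, \texttt{pickPrime} and \texttt{pickSigConst} are hypothetical, and the upper bound for $a$ is assumed to be $2^{32} - 1$ so that it fits in 32 bits.

```cpp
#include <cstdint>
#include <random>

// Hypothetical container for one signature constant (a, x_l, x_h, d_l, d_h).
struct SigConst {
    std::uint32_t a, xl, xh, dl, dh;
};

// Trial-division primality test; cheap enough for values below 10^9.
bool isPrime(std::uint32_t n) {
    if (n < 2) return false;
    for (std::uint64_t d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Draws candidates uniformly in [10^8, 10^9] until one is prime.
std::uint32_t pickPrime(std::mt19937_64& rng) {
    std::uniform_int_distribution<std::uint32_t> dist(100000000u, 1000000000u);
    for (;;) {
        const std::uint32_t candidate = dist(rng);
        if (isPrime(candidate)) return candidate;
    }
}

// Builds one constant: a is uniform in [2^16, 2^32 - 1] (assumed bound),
// the four other fields are primes between 10^8 and 10^9.
SigConst pickSigConst(std::mt19937_64& rng) {
    std::uniform_int_distribution<std::uint32_t> aDist(1u << 16, 0xFFFFFFFFu);
    SigConst c{};
    c.a = aDist(rng);
    c.xl = pickPrime(rng);
    c.xh = pickPrime(rng);
    c.dl = pickPrime(rng);
    c.dh = pickPrime(rng);
    return c;
}
```

Seeding the generator with a fixed value (\eg{} \texttt{std::mt19937\_64 rng(12345)}) makes the generated constants reproducible on a given standard library, which matters since signatures must stay comparable across runs.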
|
|
||||||
Those constants are used to produce a 64-bit unsigned value out of another
|
Those constants are used to produce a 64-bit unsigned value out of another
|
||||||
64-bit unsigned value, called $v$ hereafter, through an operator $\sigop$,
|
64-bit unsigned value, called $v$ hereafter, through an operator $\sigop$,
|
||||||
computed as follows (with all computations done on 64-bit unsigned integers).
|
computed as follows (with all computations done on 64-bit unsigned integers).
|
||||||
|
|
||||||
\vspace{1em}
|
\vspace{1em}
|
||||||
|
\begin{center}
|
||||||
\begin{algorithmic}
|
\begin{algorithmic}
|
||||||
\Function{$\sigop$}{$\sigconst{}, v$}
|
\Function{$\sigop$}{$\sigconst{}, v$}
|
||||||
\State{} $out1 \gets (v + a) \cdot x_l$
|
\State{} $out1 \gets (v + a) \cdot x_l$
|
||||||
|
@ -332,6 +364,7 @@ computed as follows (with all computations done on 64 bits unsigned integers).
|
||||||
\State{} \Return{} $low + 2^{32} \cdot high$
|
\State{} \Return{} $low + 2^{32} \cdot high$
|
||||||
\EndFunction{}
|
\EndFunction{}
|
||||||
\end{algorithmic}
|
\end{algorithmic}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
\paragraph{Expressions.} Each type of expression (or, in the case of
|
\paragraph{Expressions.} Each type of expression (or, in the case of
|
||||||
expression with operator, each type of operator) has its signature constant,
|
expression with operator, each type of operator) has its signature constant,
|
||||||
|
@ -358,7 +391,7 @@ capture at all the \emph{structure} of the graph. An information we can capture
|
||||||
without breaking the signature's independence towards the order of description
|
without breaking the signature's independence towards the order of description
|
||||||
of the graph, is the set of its neighbours. Yet, we cannot ``label'' the gates
|
of the graph, is the set of its neighbours. Yet, we cannot ``label'' the gates
|
||||||
without breaking this rule; thus, we represent the set of neighbours by the set
|
without breaking this rule; thus, we represent the set of neighbours by the set
|
||||||
of our \emph{neighbours' signatures}.
|
of the \emph{neighbours' signatures}.
|
||||||
|
|
||||||
At this point, we can define the \emph{signature of order $n$} ($n \in
|
At this point, we can define the \emph{signature of order $n$} ($n \in
|
||||||
\natset$) of a circuit $C$ as follows:
|
\natset$) of a circuit $C$ as follows:
|
||||||
|
@ -375,13 +408,13 @@ At this point, we can define the \emph{signature of order $n$} ($n \in
|
||||||
|
|
||||||
The ``IO adjacency'' term is an additional term in the signatures of order
|
The ``IO adjacency'' term is an additional term in the signatures of order
|
||||||
above $0$, indicating which input and output pins of the circuit group
|
above $0$, indicating which input and output pins of the circuit group
|
||||||
containing the current gate are adjacent to it.
|
containing the current gate are adjacent to it. Adding this information to the
|
||||||
|
signature was necessary, since a lot of gates can be signed differently using
|
||||||
|
this information (see Corner cases in Section~\ref{ssec:corner_cases}).
|
||||||
|
|
||||||
Unless a higher order proves useful, the default signature order used in all
|
Unless a higher order proves useful, the default signature order used in all
|
||||||
computations is 2, a value chosen after a few benchmarks.
|
computations is 2, a value chosen after a few benchmarks.
|
||||||
|
|
||||||
\todo{explain range of $n$}
|
|
||||||
|
|
||||||
\paragraph{Efficiency.} Every circuit memoizes all it can concerning its
|
\paragraph{Efficiency.} Every circuit memoizes all it can concerning its
|
||||||
signature: the inner signature, the IO adjacency, the signatures of order $n$
|
signature: the inner signature, the IO adjacency, the signatures of order $n$
|
||||||
already computed, etc.
|
already computed, etc.
|
||||||
|
@ -400,8 +433,10 @@ or its children are modified. A memoized data is always stored alongside with a
|
||||||
timestamp of computation, which invalidates a previous result when needed.
|
timestamp of computation, which invalidates a previous result when needed.
|
||||||
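The timestamp-based invalidation described above can be sketched as follows. This is only an illustration under stated assumptions, not the code of \emph{isomatch}: the class \texttt{MemoizedSignature} and its members are hypothetical, and the ``timestamp'' is modelled as a plain modification counter that the owning circuit would increment whenever it or one of its children changes.

```cpp
#include <cstdint>

// Hypothetical helper: caches a 64-bit signature together with the
// timestamp at which it was computed.
class MemoizedSignature {
public:
    // `lastModified` is the current modification counter of the circuit.
    // The cached value is reused as long as it was computed at or after
    // the last modification; otherwise `compute` is called again.
    template <typename Compute>
    std::uint64_t get(std::uint64_t lastModified, Compute&& compute) {
        if (!hasValue_ || computedAt_ < lastModified) {
            value_ = compute();
            computedAt_ = lastModified;
            hasValue_ = true;
        }
        return value_;
    }

private:
    bool hasValue_ = false;         // nothing cached yet
    std::uint64_t value_ = 0;       // cached signature
    std::uint64_t computedAt_ = 0;  // timestamp of the cached computation
};
```

With this layout, invalidation costs nothing when a circuit is modified: only the counter changes, and stale entries are detected lazily on the next access.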
|
|
||||||
One possible path of investigation for future work, if the computation turns
|
One possible path of investigation for future work, if the computation turns
|
||||||
out to be still too slow in real-world cases --- which looks unlikely ---,
|
out to be still too slow in real-world cases --- which looks unlikely, unless
|
||||||
would be to try to multithread this computation.
|
fl's substitution is run on a regular basis for a huge number of cases using
|
||||||
|
\eg{} a crontab for automated testing ---, would be to try to multithread this
|
||||||
|
computation.
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\section{Group equality}\label{sec:group_equality}
|
\section{Group equality}\label{sec:group_equality}
|
||||||
|
@ -428,7 +463,7 @@ the number of permutations examined to no more than $4$ in studied cases.
|
||||||
|
|
||||||
Once a permutation is judged worth examining, the group equality is run
|
Once a permutation is judged worth examining, the group equality is run
|
||||||
recursively on all its matched gates. If this step succeeds, the graph
|
recursively on all its matched gates. If this step succeeds, the graph
|
||||||
structure is then checked. If both steps succeeds, the permutation is correct
|
structure is then checked. If both steps succeed, the permutation is correct
|
||||||
and an isomorphism has been found; if not, we move on to the next permutation.
|
and an isomorphism has been found; if not, we move on to the next permutation.
|
||||||
|
|
||||||
\todo{Anything more to tell here?}
|
\todo{Anything more to tell here?}
|
||||||
|
@ -449,7 +484,10 @@ was first described by Julian R Ullmann in 1976~\cite{ullmann1976algorithm}.
|
||||||
Another, more recent algorithm to deal with this problem is Luigi P Cordella's
|
Another, more recent algorithm to deal with this problem is Luigi P Cordella's
|
||||||
VF2 algorithm~\cite{cordella2004sub}, published in 2004. This algorithm is
|
VF2 algorithm~\cite{cordella2004sub}, published in 2004. This algorithm is
|
||||||
mostly Ullmann's algorithm, rewritten in a recursive form, with the
|
mostly Ullmann's algorithm, rewritten in a recursive form, with the
|
||||||
addition of five heuristics. \qtodo{Why not use it then?}
|
addition of five heuristics. I originally planned to implement both algorithms
|
||||||
|
and benchmark them, but had no time to do so in the end; still, Ullmann with
|
||||||
|
the few additional heuristics applicable in our very specific case turned out
|
||||||
|
to be fast enough.
|
||||||
|
|
||||||
Ullmann is a widely used and fast algorithm for this problem. It makes
|
Ullmann is a widely used and fast algorithm for this problem. It makes
|
||||||
extensive use of the adjacency-matrix description of the graph, and the initial
|
extensive use of the adjacency-matrix description of the graph, and the initial
|
||||||
|
@ -461,8 +499,9 @@ matrix. Each $1$ in a cell $(i, j)$ indicates that the $i$-th needle part is a
|
||||||
possible match with the $j$-th haystack part. This matrix is called $perm$
|
possible match with the $j$-th haystack part. This matrix is called $perm$
|
||||||
thereafter.
|
thereafter.
|
||||||
|
|
||||||
The algorithm, left apart the \textsc{refine} function (detailed just after),
|
The algorithm, apart from the \textsc{refine} function, which is detailed just
|
||||||
is described in Figure~\ref{alg:ullmann}.
|
after and can be omitted for a (much) slower version of the algorithm, is
|
||||||
|
described in Figure~\ref{alg:ullmann}.
|
||||||
|
|
||||||
\begin{figure}[h]
|
\begin{figure}[h]
|
||||||
\begin{algorithmic}
|
\begin{algorithmic}
|
||||||
|
@ -505,7 +544,7 @@ is described in Figure~\ref{alg:ullmann}.
|
||||||
|
|
||||||
The refining process is the actual keystone of the algorithm. It is the
|
The refining process is the actual keystone of the algorithm. It is the
|
||||||
mechanism allowing the algorithm to cut down many exploration branches, by
|
mechanism allowing the algorithm to cut down many exploration branches, by
|
||||||
removing ones from the matrix.
|
changing ones to zeroes in the matrix being built.
|
||||||
|
|
||||||
The idea is that a match between a needle's vertex $i$ and a haystack's vertex
|
The idea is that a match between a needle's vertex $i$ and a haystack's vertex
|
||||||
$j$ is only possible if, for each neighbour $k$ of $i$, $j$ has a neighbour
|
$j$ is only possible if, for each neighbour $k$ of $i$, $j$ has a neighbour
|
||||||
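The neighbour condition behind the refining process can be sketched in C++ as below. This is the textbook refinement of Ullmann's algorithm on plain undirected graphs stored as adjacency lists, which is an assumption: the actual \emph{isomatch} code works on gates and wires and adds its own heuristics. The names \texttt{Graph}, \texttt{PermMatrix} and \texttt{refine} are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<std::size_t>>;  // adjacency lists
using PermMatrix = std::vector<std::vector<bool>>;    // perm[i][j]

// Clears perm[i][j] whenever some neighbour k of needle vertex i has no
// candidate among the neighbours of haystack vertex j, and repeats until
// nothing changes. Returns false as soon as a needle vertex has no
// candidate left, which means the current branch can be abandoned.
bool refine(const Graph& needle, const Graph& haystack, PermMatrix& perm) {
    assert(perm.size() == needle.size());
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < needle.size(); ++i) {
            bool rowHasCandidate = false;
            for (std::size_t j = 0; j < haystack.size(); ++j) {
                if (!perm[i][j]) continue;
                bool compatible = true;
                for (std::size_t k : needle[i]) {
                    bool found = false;
                    for (std::size_t l : haystack[j]) {
                        if (perm[k][l]) { found = true; break; }
                    }
                    if (!found) { compatible = false; break; }
                }
                if (compatible) {
                    rowHasCandidate = true;
                } else {
                    perm[i][j] = false;  // change a one to a zero
                    changed = true;
                }
            }
            if (!rowHasCandidate) return false;
        }
    }
    return true;
}
```

For instance, with a needle made of a single edge and a haystack made of one edge plus an isolated vertex, the isolated vertex is removed from every row of the matrix, since it has no neighbour able to host the other end of the edge.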
|
@ -583,7 +622,12 @@ occur).
|
||||||
\subsection{Implementation optimisations}
|
\subsection{Implementation optimisations}
|
||||||
|
|
||||||
\paragraph{Initial permutation matrix.} The matrix is first filled according to
|
\paragraph{Initial permutation matrix.} The matrix is first filled according to
|
||||||
the signatures matches. It is then refined a bit more, by making sure that for
|
the signature matches. Note that only signatures of order 0 --- \ie{} the
|
||||||
|
inner data of a vertex --- can be used here: indeed, we cannot rely on the
|
||||||
|
context, since there can be some context in the haystack that is absent
|
||||||
|
from the needle, and we cannot check for ``context inclusion'' with our
|
||||||
|
definition of signatures: \emph{all} the context must be exactly the same for
|
||||||
|
two signatures to match. It is then refined a bit more, by making sure that for
|
||||||
every match, every potentially matching gate has the same ``wire kinds''.
|
every match, every potentially matching gate has the same ``wire kinds''.
|
||||||
Indeed, a gate needle's wire must have at least the same inbound adjacent
|
Indeed, a gate needle's wire must have at least the same inbound adjacent
|
||||||
signatures as its matching haystack wire, and same goes for outbound adjacent
|
signatures as its matching haystack wire, and same goes for outbound adjacent
|
||||||
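The first filling step, based on order-0 signatures only, can be sketched as follows. This is a simplified illustration, not the library's code: the function \texttt{initialPermMatrix} and the type aliases are hypothetical, signatures are assumed to be already computed and passed as plain vectors, and the later ``wire kinds'' refinement is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Signature = std::uint64_t;
using PermMatrix = std::vector<std::vector<bool>>;

// perm[i][j] is true when the i-th needle part and the j-th haystack part
// have the same order-0 signature, i.e. the same inner data. Higher orders
// cannot be used here: the haystack may hold context absent from the needle.
PermMatrix initialPermMatrix(const std::vector<Signature>& needleSig0,
                             const std::vector<Signature>& haystackSig0) {
    PermMatrix perm(needleSig0.size(),
                    std::vector<bool>(haystackSig0.size(), false));
    for (std::size_t i = 0; i < needleSig0.size(); ++i) {
        for (std::size_t j = 0; j < haystackSig0.size(); ++j) {
            perm[i][j] = (needleSig0[i] == haystackSig0[j]);
        }
    }
    return perm;
}
```

The fewer ones this matrix contains, the fewer branches the search has to explore, which is why it is worth pruning it further before the main algorithm starts.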
|
@ -665,6 +709,13 @@ for a single run) and measured by the command \texttt{time}.
|
||||||
\end{tikzpicture}
|
\end{tikzpicture}
|
||||||
\end{center}
|
\end{center}
|
||||||
|
|
||||||
|
The computation time is more or less linear in the order of signature
|
||||||
|
required, which is consistent with the implementation. In practice, only small
|
||||||
|
portions of a circuit will be signed with a high order, which means that we can
|
||||||
|
afford really high order signatures (\eg{} 40 or 50, which already means that
|
||||||
|
the diameter of the group is 40 or 50) without having a real impact on the
|
||||||
|
computation time.
|
||||||
|
|
||||||
|
|
||||||
\paragraph{Equality.} To test the circuit group equality, a small piece of
|
\paragraph{Equality.} To test the circuit group equality, a small piece of
|
||||||
code takes a circuit, scrambles it as much as possible
|
code takes a circuit, scrambles it as much as possible
|
||||||
|
@ -680,13 +731,15 @@ considerably speeding it up: the same program proving only one way takes about
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\subsection{Corner cases}
|
\subsection{Corner cases}\label{ssec:corner_cases}
|
||||||
|
|
||||||
There were a few observed cases where the algorithm tends to be slower on
|
There were a few observed cases where the algorithm tends to be slower on
|
||||||
certain configurations.
|
certain configurations, and a few other such cases that could be fixed.
|
||||||
|
|
||||||
\todo{More corner cases}
|
\todo{More corner cases}
|
||||||
|
|
||||||
|
\todo{Corner case: io pins, io adjacency}
|
||||||
|
|
||||||
\paragraph{Split/merge trees.} A common pattern that tends to slow down the
|
\paragraph{Split/merge trees.} A common pattern that tends to slow down the
|
||||||
algorithm is split/merge trees. Those patterns occur when one wants to merge
|
algorithm is split/merge trees. Those patterns occur when one wants to merge
|
||||||
$n$ one-bit wires into a single $n$-bit wire, or the other way around.
|
$n$ one-bit wires into a single $n$-bit wire, or the other way around.
|
||||||
|
@ -712,6 +765,12 @@ nodes on the layer below cannot be freely exchanged.
|
||||||
|
|
||||||
\todo{Figure describing the problem}
|
\todo{Figure describing the problem}
|
||||||
|
|
||||||
|
|
||||||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
\section*{Conclusion}
|
||||||
|
|
||||||
|
\todo{}
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
|
||||||
\printbibliography{}
|
\printbibliography{}
|
||||||
|
|