Add code quality subsection

2017-08-24 17:07:02 +02:00 · 2017-08-24 17:07:02 +02:00 · 4f5b0e8784
commit 4f5b0e8784
parent 5cafe06d2f
1 changed files with 93 additions and 34 deletions
--- a/report/report.tex
+++ b/report/report.tex
@ -53,8 +53,8 @@
    transformations to a circuit.
    This problem turns out to be more or less the \emph{subgraph isomorphism
-    problem}, which is NP-complete, and must nevertheless be solved fast on
+    problem}, which is NP-complete, and must nevertheless be solved efficiently
-    processor-sized circuits on this particular case.
+    on processor-sized circuits in this particular case.
    During my internship, I developed a C++ library to perform this task that
    will be integrated in VossII, based on a few well-known algorithms as well
@ -211,7 +211,7 @@ available on GitHub:
    \url{https://github.com/tobast/circuit-isomatch/}
 \end{center}
-\subsection{Objective}
+\subsection{Problems}
 More precisely, the problems that \emph{isomatch} must solve are the following.
@ -247,6 +247,39 @@ to be NP-complete~\cite{cook1971complexity}. Even though a few algorithms
 is nevertheless necessary to implement them the right way, and with the right
 heuristics, to get the desired efficiency for the given problem.
 \subsection{Code quality}
 Another prominent objective was to keep the codebase as clean as possible.
 Indeed, this code will probably have to be maintained for quite some time, and
 most probably by people other than me. This means that the code and all its
 surroundings must be clean, readable and reusable. I put a lot of effort into
 making the code idiomatic and easy to use, \eg{} by implementing iterators
 over my data structures where needed and by writing idiomatic C++14.
 This also means that the code has to be well-documented: the git history is
 kept clean and understandable, and documentation can be generated from the
 code using \texttt{doxygen}. The latest documentation is also available as
 HTML pages here:
 \begin{center}
    \raisebox{-0.4\height}{
        \includegraphics[height=2.3em]{../common/docs.png}}
    \hspace{1em}
    \url{https://tobast.fr/m1/isomatch}
 \end{center}
 Since the code is written in C++, it is also prone to a variety of bugs. While
 I did not take the time to integrate unit tests (which would have been a great
 addition), I used a sequence of tests that can be run using
 \lstc{make test} and that exercise a lot of features of isomatch.
 The code is also tested regularly and on a wide variety of cases with
 \lstbash{valgrind} to ensure that there are no memory errors:
 use-after-free, unallocated memory, memory leaks, bad pointer
 arithmetic,~\ldots{} In every tested case, strictly no memory is lost, and no
 invalid read is reported.
 \subsection{Sought efficiency}
 The goal of \textit{isomatch} is to be applied to large circuits on-the-fly,
@ -257,10 +290,6 @@ matching operations will be executed quite often, and often multiple times in a
 row.  It must then remain fast enough for the user not to lose too much time
 and, eventually, patience.
 \todo{Mention clean codebase somewhere}
 \todo{Mention VossII somewhere}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{General approach}
@ -268,7 +297,7 @@ The global strategy used to solve efficiently the problem can be broken down to
 three main parts.
 \paragraph{Signatures.} The initial idea to make the computation fast is to
-aggregate the inner data of a gate --- be it a leaf gate or a group --- in a
+aggregate the inner data of a gate~---~be it a leaf gate or a group~---~in a
 kind of hash, a 64-bit unsigned integer. This approach is directly inspired
 by what was done in fl, back at Intel. This hash must be easy to compute,
 and must be based only on the structure of the graph --- that is, must be
@ -307,8 +336,10 @@ this problem, that uses the specificities of the graph to be a little faster.
 The signature is computed as a simple hash of the element, and is defined for
 every type of expression and circuit. It could probably be enhanced with a bit
-more work to cover more uniformly the hash space, but no collision was observed
+more work to cover the hash space more uniformly, but no illegitimate collision
-on the examples tested.
+(that is, a collision that could be avoided with a better hash function, as
 opposed to collisions due to an equal local graph structure) was observed on
 the examples tested.
 \paragraph{Signature constants.} Signature constants are used throughout the
 signing process; each one is a 5-tuple $\sigconst{} = (a, x_l, x_h, d_l, d_h)$ of 32
@ -316,13 +347,14 @@ bits unsigned numbers. All of $x_l$, $x_h$, $d_l$ and $d_h$ are picked as prime
 numbers between $10^8$ and $10^9$ (which just fits in a 32-bit unsigned
 integer), while $a$ is a random integer picked uniformly between $2^{16}$ and
 $2^{32}$.  These constants are generated by a small Python script,
-\path{util/primegen/pickPrimes.py}.
+\path{util/primegen/pickPrimes.py} in the repository.
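The selection rule above can be illustrated with a short sketch. This is not the repository's Python script: it is a C++ transcription of the rule as stated in the text, and the names \texttt{SigConst}, \texttt{isPrime}, \texttt{pickPrime} and \texttt{pickSigConst} are hypothetical. It also assumes that the upper bound $2^{32}$ on $a$ is exclusive, so that $a$ fits in 32 bits.

```cpp
#include <cstdint>
#include <random>

// Hypothetical container for one signature constant (a, x_l, x_h, d_l, d_h).
struct SigConst {
    std::uint32_t a, xl, xh, dl, dh;
};

// Trial division; sufficient for values up to 10^9 (about 31623 iterations at most).
bool isPrime(std::uint64_t n)
{
    if (n < 2) return false;
    for (std::uint64_t d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Draws uniformly in [10^8, 10^9] until the candidate is prime.
template <typename Rng>
std::uint32_t pickPrime(Rng& rng)
{
    std::uniform_int_distribution<std::uint32_t> dist(100000000u, 1000000000u);
    for (;;) {
        const std::uint32_t candidate = dist(rng);
        if (isPrime(candidate)) return candidate;
    }
}

// Builds one constant: a uniform in [2^16, 2^32 - 1], the four others prime.
template <typename Rng>
SigConst pickSigConst(Rng& rng)
{
    // Assumption: 2^32 is an exclusive bound, so the maximum is 2^32 - 1.
    std::uniform_int_distribution<std::uint32_t> distA(1u << 16, 0xFFFFFFFFu);
    const std::uint32_t a = distA(rng);
    // Braced initialisation evaluates its elements from left to right.
    return SigConst{a, pickPrime(rng), pickPrime(rng), pickPrime(rng), pickPrime(rng)};
}
```

Any standard random engine can be passed as \texttt{rng}, for instance a seeded \texttt{std::mt19937}.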
 Those constants are used to produce a 64-bit unsigned value out of another
 64-bit unsigned value, called $v$ hereafter, through an operator $\sigop$,
 computed as follows (with all computations done on 64-bit unsigned integers).
 \vspace{1em}
 \begin{center}
    \begin{algorithmic}
        \Function{$\sigop$}{$\sigconst{}, v$}
            \State{} $out1 \gets (v + a) \cdot x_l$
@ -332,6 +364,7 @@ computed as follows (with all computations done on 64 bits unsigned integers).
            \State{} \Return{} $low + 2^{32} \cdot high$
        \EndFunction{}
    \end{algorithmic}
 \end{center}
 \paragraph{Expressions.} Each type of expression (or, in the case of
 an expression with an operator, each type of operator) has its own signature constant,
@ -358,7 +391,7 @@ capture at all the \emph{structure} of the graph. An information we can capture
 without breaking the signature's independence towards the order of description
 of the graph, is the set of its neighbours. Yet, we cannot ``label'' the gates
 without breaking this rule; thus, we represent the set of neighbours by the set
-of our \emph{neighbours' signatures}.
+of the \emph{neighbours' signatures}.
 At this point, we can define the \emph{signature of order $n$} ($n \in
 \natset$) of a circuit $C$ as follows:
@ -375,13 +408,13 @@ At this point, we can define the \emph{signature of order $n$} ($n \in
 The ``IO adjacency'' term is an additional term in the signatures of order
 above $0$, indicating what input and output pins of the circuit group
-containing the current gate are adjacent to it.
+containing the current gate are adjacent to it. Adding this information to the
 signature was necessary, since a lot of gates can be signed differently using
 this information (see Corner cases in Section~\ref{ssec:corner_cases}).
 The default signature order used in all computations, unless a higher one is
 useful, is 2, a value chosen after a few benchmarks.
 \todo{explain range of $n$}
 \paragraph{Efficiency.} Every circuit memoizes all it can concerning its
 signature: the inner signature, the IO adjacency, the signatures of order $n$
 already computed, etc.
@ -400,8 +433,10 @@ or its children are modified. A memoized data is always stored alongside with a
 timestamp of computation, which invalidates a previous result when needed.
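A minimal sketch of this timestamp-based invalidation is given below. It is not the actual isomatch code: the \texttt{Memo} class and its interface are hypothetical, and the timestamps are assumed to come from a logical clock that is incremented on every modification of the circuit.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper: caches a value together with the timestamp of its
// computation. T must be default-constructible.
template <typename T>
class Memo {
public:
    // Returns the cached value if it was computed at or after `lastModified`;
    // otherwise recomputes it with `compute` and stamps it with `now`.
    const T& get(std::uint64_t lastModified, std::uint64_t now,
                 const std::function<T()>& compute)
    {
        assert(lastModified <= now); // a modification cannot lie in the future
        if (!valid_ || stamp_ < lastModified) {
            value_ = compute();
            stamp_ = now;
            valid_ = true;
        }
        return value_;
    }

private:
    T value_{};
    std::uint64_t stamp_ = 0;
    bool valid_ = false;
};
```

With such a helper, a circuit would only need to record the time of its last modification (or that of its children); every memoized quantity older than this time is recomputed lazily on its next access.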
 One possible path of investigation for future work, if the computation turns
-out to be still too slow in real-world cases --- which looks unlikely ---,
+out to be still too slow in real-world cases --- which looks unlikely, unless
-would be to try to multithread this computation.
+fl's substitution is run on a regular basis for a huge number of cases using
 \eg{} a crontab for automated testing ---, would be to try to multithread this
 computation.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Group equality}\label{sec:group_equality}
@ -428,7 +463,7 @@ the number of permutations examined to no more than $4$ in studied cases.
 Once a permutation is judged worth examining, the group equality is run
 recursively on all its matched gates. If this step succeeds, the graph
-structure is then checked. If both steps succeeds, the permutation is correct
+structure is then checked. If both steps succeed, the permutation is correct
 and an isomorphism has been found; if not, we move on to the next permutation.
 \todo{Anything more to tell here?}
@ -449,7 +484,10 @@ was first described by Julian R Ullmann in 1976~\cite{ullmann1976algorithm}.
 Another, more recent algorithm to deal with this problem is Luigi P Cordella's
 VF2 algorithm~\cite{cordella2004sub}, published in 2004. This algorithm is
 mostly Ullmann's algorithm, rewritten in a recursive form, with the
-addition of five heuristics. \qtodo{Why not use it then?}
+addition of five heuristics. I originally planned to implement and benchmark
 both algorithms, but had no time to do so in the end; however, Ullmann's
 algorithm with the few additional heuristics applicable in our very specific
 case turned out to be fast enough.
 Ullmann is a widely used and fast algorithm for this problem. It makes
 extensive use of the adjacency matrix description of the graph, and the initial
@ -461,8 +499,9 @@ matrix. Each $1$ in a cell $(i, j)$ indicates that the $i$-th needle part is a
 possible match with the $j$-th haystack part. This matrix is called $perm$
 hereafter.
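As an illustration, the matrix $perm$ can be sketched as follows. This is a simplified sketch and not the library's actual data structure: it assumes that each needle or haystack part is summarised by a single 64-bit signature, and the names \texttt{Signature}, \texttt{PermMatrix}, \texttt{initialPerm} and \texttt{candidateCount} are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types: one 64-bit signature per part of the graph.
using Signature = std::uint64_t;
using PermMatrix = std::vector<std::vector<bool>>;

// perm[i][j] is true iff the i-th needle part is a possible match for the
// j-th haystack part, here simply because their signatures are equal.
PermMatrix initialPerm(const std::vector<Signature>& needle,
                       const std::vector<Signature>& haystack)
{
    PermMatrix perm(needle.size(), std::vector<bool>(haystack.size(), false));
    for (std::size_t i = 0; i < needle.size(); ++i)
        for (std::size_t j = 0; j < haystack.size(); ++j)
            perm[i][j] = (needle[i] == haystack[j]);
    return perm;
}

// Number of haystack candidates left for the i-th needle part.
std::size_t candidateCount(const PermMatrix& perm, std::size_t i)
{
    assert(i < perm.size());
    std::size_t count = 0;
    for (bool cell : perm[i])
        if (cell) ++count;
    return count;
}
```

A row without any candidate means that the corresponding needle part cannot be matched at all, so the search can stop immediately.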
-The algorithm, left apart the \textsc{refine} function (detailed just after),
+The algorithm, apart from the \textsc{refine} function, which is detailed just
-is described in Figure~\ref{alg:ullmann}.
+after and can be omitted for a much slower version of the algorithm, is
 described in Figure~\ref{alg:ullmann}.
 \begin{figure}[h]
 \begin{algorithmic}
@ -505,7 +544,7 @@ is described in Figure~\ref{alg:ullmann}.
 The refining process is the actual keystone of the algorithm. It is the
 mechanism allowing the algorithm to cut down many exploration branches, by
-removing ones from the matrix.
+changing ones to zeroes in the matrix being built.
 The idea is that a match between a needle's vertex $i$ and a haystack's vertex
 $j$ is only possible if, for each neighbour $k$ of $i$, $j$ has a neighbour
@ -583,7 +622,12 @@ occur).
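The refining step can be sketched as follows, using the standard condition of Ullmann's algorithm: a candidate pair $(i, j)$ is kept only if every neighbour $k$ of $i$ still has at least one candidate among the neighbours of $j$. This is a simplified sketch, not the isomatch implementation: it uses adjacency lists instead of adjacency matrices, ignores wire directions, and the names \texttt{Adjacency}, \texttt{PermMatrix} and \texttt{refine} are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types: adjacency lists and the candidate matrix perm.
using Adjacency = std::vector<std::vector<std::size_t>>;
using PermMatrix = std::vector<std::vector<bool>>;

// Changes ones to zeroes in perm until a fixed point is reached.
// Returns false as soon as some needle vertex has no candidate left.
bool refine(const Adjacency& needle, const Adjacency& haystack, PermMatrix& perm)
{
    assert(perm.size() == needle.size());
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < needle.size(); ++i) {
            assert(perm[i].size() == haystack.size());
            bool rowAlive = false;
            for (std::size_t j = 0; j < haystack.size(); ++j) {
                if (!perm[i][j]) continue;
                // Every neighbour k of i needs a candidate l adjacent to j.
                bool supported = true;
                for (std::size_t k : needle[i]) {
                    bool found = false;
                    for (std::size_t l : haystack[j])
                        if (perm[k][l]) { found = true; break; }
                    if (!found) { supported = false; break; }
                }
                if (supported) {
                    rowAlive = true;
                } else {
                    perm[i][j] = false;
                    changed = true;
                }
            }
            if (!rowAlive) return false; // vertex i cannot be matched
        }
    }
    return true;
}
```

For instance, with a needle made of two connected vertices and a haystack containing an isolated vertex, the column of the isolated vertex is cleared, since it cannot support any neighbour.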
 \subsection{Implementation optimisations}
 \paragraph{Initial permutation matrix.} The matrix is first filled according to
-the signatures matches. It is then refined a bit more, by making sure that for
+the signature matches. Note that only signatures of order 0 (\ie{} the
 inner data of a vertex) can be used here: we cannot rely on the context,
 since there can be some context in the haystack that is absent from the
 needle, and we cannot check for ``context inclusion'' with our definition
 of signatures: \emph{all} the context must be exactly the same for two
 signatures to match. It is then refined a bit more, by making sure that for
 every match, every potentially matching gate has the same ``wire kinds''.
 Indeed, a gate needle's wire must have at least the same inbound adjacent
 signatures as its matching haystack wire, and same goes for outbound adjacent
@ -665,6 +709,13 @@ for a single run) and measured by the command \texttt{time}.
    \end{tikzpicture}
 \end{center}
 The computation time is more or less linear in the level of signature
 required, which is consistent with the implementation. In practice, only small
 portions of a circuit will be signed with a high order, which means that we can
 afford really high order signatures (\eg{} 40 or 50, which already means that
 the diameter of the group is 40 or 50) without having a real impact on the
 computation time.
 \paragraph{Equality.} To test the circuit group equality, a small piece of
 code takes a circuit, scrambles it as much as possible
@ -680,13 +731,15 @@ considerably speeding it up: the same program proving only one way takes about
-\subsection{Corner cases}
+\subsection{Corner cases}\label{ssec:corner_cases}
 There were a few observed cases where the algorithm tends to be slower on
-certain configurations.
+certain configurations, and a few other such cases that could be fixed.
 \todo{More corner cases}
 \todo{Corner case: io pins, io adjacency}
 \paragraph{Split/merge trees.} A common pattern that tends to slow down the
 algorithm is split/merge trees. Those patterns occur when one wants to merge
 $n$ one-bit wires into a single $n$-bit wire, or the other way around.
@ -712,6 +765,12 @@ nodes on the layer below cannot be freely exchanged.
 \todo{Figure describing the problem}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section*{Conclusion}
 \todo{}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \printbibliography{}