\section*{Conclusion and future work}

In this chapter, we have presented a fully-tooled approach that enables:
\begin{itemize}
    \item the generation of a wide variety of microbenchmarks, reflecting both
        the expertise contained in an initial benchmark suite and the
        diversity of code transformations that stress different aspects of a
        performance model ---~or even a measurement environment, \eg{}
        \bhive; and
    \item the comparability of various measurements and analyses applied to
        each of these microbenchmarks.
\end{itemize}

Thanks to this tooling, we were able to show the limits and strengths of
various performance models in relation to the expertise contained in the
Polybench suite. We discuss throughput results in
Section~\ref{ssec:overall_results} and bottleneck prediction in
Section~\ref{ssec:bottleneck_pred_analysis}.

We were also able to demonstrate the difficulties of reasoning at the level of
a basic block isolated from its context. We specifically study those
difficulties in the case of \bhive{} in Section~\ref{ssec:bhive_errors}.
Indeed, the actual values ---~both from registers and memory~--- involved in a
basic block's computation determine not only its functional properties
(\ie{} the result of the calculation), but also some of its non-functional
properties (\eg{} latency, throughput).

We were also able to show in Section~\ref{ssec:memlatbound} that
state-of-the-art static analyzers struggle to account for memory-carried
dependencies, a weakness that significantly affects their overall results on
our benchmarks. We believe that detecting and accounting for these
dependencies is an important direction for future work.

Moreover, we present this work in the form of a modular software package, each
component of which exposes numerous adjustable parameters.
These components can also be replaced by others fulfilling the same abstract
function: another initial benchmark suite in place of Polybench, other loop
nest optimizers in place of PLUTO and PoCC, other code analyzers, and so on.
This software modularity reflects the fact that our contribution is about
interfacing and communication between distinct issues.

\medskip

Furthermore, we believe that the contributions we made in the course of this
work may eventually be used to address different, yet neighbouring issues.
These perspectives can also be seen as future work:

\smallskip
\paragraph{Program optimization.}
The whole program-processing pipeline we have designed can be used not only to
evaluate the performance model underlying a static analyzer, but also to guide
program optimization itself. In such a perspective, we would generate
different versions of the same program using the transformations discussed in
Section~\ref{sec:bench_gen} and colored blue in Figure~\ref{fig:contrib}.
These different versions would then feed the execution and measurement
environment outlined in Section~\ref{sec:bench_harness} and colored orange in
Figure~\ref{fig:contrib}. Indeed, thanks to our previous work, we know that the
results of these comparable analyses and measurements would make it possible
to identify which version is the most efficient, and even to reconstruct
information indicating why (which bottlenecks, etc.).

However, this approach would require that these different versions of the same
program be functionally equivalent, \ie{} that they compute the same result
from the same inputs; yet we saw in Section~\ref{sec:bench_harness} that, as it
stands, the transformations we apply make no attempt to preserve the semantics
of the input codes. To recover this semantic preservation property, it
suffices to abandon the kernelification pass we have presented; this, however,
would require controlling L1-residence by other means.
\smallskip
\paragraph{Dataset building.}
Our microbenchmark generation phase outputs a large, diverse and
representative dataset of microkernels. We believe that, beyond our own
harness, such a dataset could be used to improve existing data-dependent
solutions.
%the measurement and execution environment we
%propose is not the only type of tool whose function is to process a large
%dataset (\ie{} the microbenchmarks generated earlier) to automatically
%abstract its characteristics. We can also think of:

Inductive methods, for instance in \anica, strive to preserve the properties
of a basic block through successive abstractions of the instructions it
contains, so as to draw the most general conclusions possible from a
particular experiment. Currently, \anica{} starts off from randomly generated
basic blocks. This approach guarantees a certain variety and avoids
over-specialization, which would prevent it from finding interesting cases too
far from an initial dataset. However, it may well lead to the sample under
consideration being systematically outside the relevant area of the search
space ---~\ie{} having no relation to real-life programs or those in the
user's field.

On the other hand, machine learning methods based on neural networks, for
instance in \ithemal, seek to correlate the result of a function with the
characteristics of its input ---~in this case to correlate a throughput
prediction with the instructions making up a basic block~--- by
backpropagating the gradient of a cost function. \ithemal{} itself is trained
on benchmarks originating from a data suite. As opposed to random generation,
this approach offers representative samples, but carries a risk of
insufficient variety and of over-specialization.

By comparison, our microbenchmark generation method is designed from the
outset to produce a representative, varied and large dataset.
We believe that enriching the datasets used by the above-mentioned methods
with our benchmarks might extend their results and reach.