\chapter*{Conclusion}
\addcontentsline{toc}{chapter}{Conclusion}

Throughout this manuscript, we explored the main bottlenecks that arise when analyzing the low-level performance of a microkernel:
\begin{itemize}
    \item frontend bottlenecks ---~the processor's frontend is unable to saturate the backend with instructions (\autoref{chap:frontend});
    \item backend bottlenecks ---~the backend is saturated with instructions and processes them as fast as possible (\autoref{chap:palmed});
    \item dependency bottlenecks ---~data dependencies between instructions prevent the backend from being saturated; the latter stalls awaiting previous results (\autoref{chap:staticdeps}).
\end{itemize}
We also conducted, in \autoref{chap:CesASMe}, a systematic comparative study of a variety of state-of-the-art code analyzers.

\bigskip{}

State-of-the-art code analyzers such as \llvmmca{} or \uica{} already boast good accuracy. Both of these tools ---~and most of the others as well~--- are, however, based on models obtained through various degrees of manual investigation, and are unable to scale without further manual effort to future or uncharted microprocessors.

The field of microarchitectural models for code analysis emerged with fundamentally manual methods, such as Agner Fog's tables. Such tables may now be produced in a more automated way using \uopsinfo{} ---~at least for certain microarchitectures. \pmevo{} pushes further in this direction by automatically computing a port mapping from benchmarks, but still has trouble scaling to a full instruction set. In its own way, \ithemal{}, a machine-learning-based approach, could also be considered automated ---~yet it still requires a large training set for the intended processor, which must be at least partially crafted manually. This trend towards model automation seems only natural as new microarchitectures keep appearing, while new ISAs such as ARM reach the supercomputer arena.

\medskip{}

We investigate this direction by exploring the three major bottlenecks mentioned above, with the aim of providing fully automated, benchmark-based models for each of them. Ideally, these models should be generated simply by executing a program on a machine featuring the targeted microarchitecture.
\begin{itemize}
    \item We contribute to \palmed{}, a framework able to extract a port mapping of a processor, serving as a backend model.
    \item We manually extract a frontend model for the Cortex A72 processor. We believe that the foundation of our methodology applies to most processors. The main characteristics of a frontend, apart from its instructions' \uops{} decomposition and issue width, must however still be investigated, and their relative importance evaluated.
    \item We provide with \staticdeps{} a method to extract data dependencies between instructions. It is able to detect \textit{loop-carried} dependencies (dependencies that span multiple loop iterations), as well as \textit{memory-carried} dependencies (dependencies arising when an instruction reads a memory address written by another instruction); a minimal example follows this list. While the former is widely implemented, the latter is, to the best of our knowledge, an original contribution. We bundle this method in a processor-independent tool, based on the ISA semantics provided by \valgrind{}, which supports a variety of ISAs.
\end{itemize}
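As a minimal illustration (this kernel is ours, written for exposition, and is not taken from the manuscript's benchmarks), consider the following C loop:

\begin{verbatim}
#include <stddef.h>

/* Hypothetical kernel, for illustration only (not one of the
 * manuscript's benchmarks). The load of a[i-1] reads the value
 * stored by the previous iteration: this dependency is loop-carried
 * (it spans two iterations) and memory-carried (it flows through
 * memory, not through a register). */
void running_sum(float *a, const float *b, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}
\end{verbatim}

When the compiler does not keep \texttt{a[i-1]} in a register across iterations, this dependency materializes at the assembly level only as a load whose address was written by an earlier store; detecting such dependencies from the binary alone is what \staticdeps{} targets.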
\bigskip{}

We evaluated these three models independently, and each of them yields satisfactory results: \palmed{} is competitive with the state of the art, with the advantage of being automatic; our frontend model significantly improves a backend model's accuracy; and our dependency model significantly improves \uica{}'s results, while remaining consistent with a dynamic dependency analysis.

These models, however, will only become truly meaningful when combined ---~or, better still, when each of them can be combined with any other model of the other parts. To the best of our knowledge, however, no such modular tool exists; nor is there any standardized approach to interact with such models. The usual approach in the domain to try a new idea is instead to create a full analyzer implementing this idea, as we did with \palmed{} for backend models, or as \uica{}'s implementation does.

In hindsight, we advocate for the emergence of such a modular code analyzer. It might not be as convenient or well-packaged as ``production-ready'' code analyzers such as \llvmmca{} ---~which is packaged for Debian. It could, however, greatly simplify the academic process of trying a new idea on any of the three main models, by decoupling them. It would also ease the comparative evaluation of those ideas, while eliminating many of the discrepancies between experimental setups that make an actual comparison difficult ---~the very reason that prompted us to create \cesasme{} in \autoref{chap:CesASMe}. Indeed, with such a modular tool, it would be easy to run the same experiment, in the same conditions, while changing only, \eg{}, the frontend model and keeping a well-tried backend model.

\bigskip{}

We also identified multiple weaknesses in the current state of the art through our comparative experiments with \cesasme{}.

\smallskip{}