Refine plan for foundations

Théophile Bastian 2023-10-16 21:41:08 +02:00
parent 82e9418a3d
commit 2701c81409
2 changed files with 62 additions and 18 deletions


@ -1,11 +1,52 @@
# State of the Art
# Foundations
In no particular order:
## A dive into processors' microarchitecture
Tools used:
* Pipedream
* Valgrind
* QEMU
### High-level abstraction
* Highest level of abstraction: instructions -> CPU => modify internal state
* instructions => ISA, asm
* state: registers, memory hierarchy, (external effects)
### Microarchitecture
* [big picture figure]
* Roughly speaking:
* frontend (decoder, renamer, etc.)
* backend (execution ports, execution units)
* register file
* caches
* Instruction --[frontend]--> Mop, μop
* μop --[backend port]--> retired [side effects]
* vast majority of cases: execution units are fully pipelined
* Dependencies break the pipeline!
* Renamer: helps up to a point
* out of order CPUs:
* Frontend in order up to some point
* ROB
* backend out-of-order
* ROB: execution window. ILP limited to this window.
* Hardware counters
* SIMD
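
To illustrate why dependencies break the pipeline, here is a minimal sketch assuming a single, fully pipelined execution unit that accepts one operation per cycle. The function names and the latency value are hypothetical, chosen for illustration only; real CPUs have several ports and a finite ROB window.

```python
def cycles_independent(n_ops: int, latency: int) -> int:
    """Cycles to complete n_ops independent operations on one fully
    pipelined unit: one operation enters per cycle, and the last one
    finishes `latency` cycles after it entered."""
    if n_ops == 0:
        return 0
    return latency + n_ops - 1


def cycles_dependent(n_ops: int, latency: int) -> int:
    """Cycles when each operation needs the result of the previous one:
    nothing overlaps, so every operation pays the full latency."""
    return n_ops * latency


# Hypothetical example: 100 operations with a 4-cycle latency.
independent = cycles_independent(100, 4)  # 103 cycles
dependent = cycles_dependent(100, 4)      # 400 cycles
```

In this toy model, a dependency chain makes the same work roughly `latency` times slower, which is the effect that renaming and out-of-order execution try to hide within the ROB window.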
## Prerequisites on code analyzers
* Code analyzers: given a program that is assumed to be the
body of a hot loop, derive performance metrics and any information that might
help towards performance debugging.
* Usually static (vs dynamic); focus on static.
* Very close to the machine: assembly, assembled bytes
* Examples with llvm-mca
* Resides in an object or executable file: ELF on Linux and most Unix-based
platforms
* Assembly: straight-line instructions, with (possibly conditional) jumps
* Instruction identified by its program counter
* Notion of basic block
* Regions of interest: hottest basic blocks
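
The notion of basic block can be sketched as follows, assuming a toy instruction representation (`Insn`, with hypothetical fields) rather than real disassembly. Only direct jumps are handled; calls, returns and indirect jumps are ignored in this sketch.

```python
from typing import List, NamedTuple, Optional


class Insn(NamedTuple):
    pc: int                            # program counter identifying the instruction
    mnemonic: str
    jump_target: Optional[int] = None  # set only for (possibly conditional) jumps


def basic_blocks(insns: List[Insn]) -> List[List[Insn]]:
    """Split straight-line code into basic blocks.

    A block starts at a "leader": the first instruction, any jump target,
    or any instruction immediately following a jump.
    """
    if not insns:
        return []
    leaders = {insns[0].pc}
    for i, insn in enumerate(insns):
        if insn.jump_target is not None:
            leaders.add(insn.jump_target)
            if i + 1 < len(insns):
                leaders.add(insns[i + 1].pc)

    blocks: List[List[Insn]] = []
    for insn in insns:
        if insn.pc in leaders:
            blocks.append([])
        blocks[-1].append(insn)
    return blocks


# Hypothetical program: a prologue, a loop body, and an epilogue.
program = [
    Insn(0x00, "mov"),
    Insn(0x03, "add"),
    Insn(0x06, "cmp"),
    Insn(0x09, "jne", jump_target=0x03),
    Insn(0x0B, "ret"),
]
blocks = basic_blocks(program)
```

Here the loop body (`add`, `cmp`, `jne`) forms its own basic block, which is the kind of region a code analyzer would be given as a hot loop.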
## State of the art
Throughput pred. :
* Agner Fog
@ -17,10 +58,7 @@ Throughput pred. :
* OSACA
* uiCA
Benchmark suites:
* Polybench
* SPEC
## Maybe put this somewhere
Backend models:
* To predict the throughput of a kernel, a precise model of the CPU backend is
required
@ -29,4 +67,3 @@ Backend models:
* but this is often incomplete, sometimes even wrong
* Agner Fog
* uops.info
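
As an illustration of how a backend model can feed a throughput prediction, here is a minimal sketch assuming that each μop may execute on any port of a given set and that ports are the only bottleneck. The port names and the μop-to-port mapping are hypothetical, not taken from any real CPU or from the tools listed above.

```python
from itertools import combinations
from typing import FrozenSet, List, Sequence


def port_pressure_bound(uops: List[FrozenSet[str]], ports: Sequence[str]) -> float:
    """Lower bound on cycles per iteration due to port pressure alone.

    For every non-empty subset S of ports, the μops that can only run on
    ports of S need at least (their count / |S|) cycles. The bound is the
    maximum of this ratio over all subsets (exponential in the number of
    ports, which is fine for a handful of ports).
    """
    best = 0.0
    for size in range(1, len(ports) + 1):
        for subset in combinations(ports, size):
            allowed = set(subset)
            confined = sum(1 for uop in uops if uop <= allowed)
            best = max(best, confined / size)
    return best


# Hypothetical kernel: two μops on p0 or p1, one on p0 only, one on p5 only.
kernel = [
    frozenset({"p0", "p1"}),
    frozenset({"p0", "p1"}),
    frozenset({"p0"}),
    frozenset({"p5"}),
]
bound = port_pressure_bound(kernel, ["p0", "p1", "p5"])  # 1.5 cycles per iteration
```

In this example, three μops compete for ports `p0` and `p1`, hence the bound of 1.5 cycles per iteration. An incomplete or wrong port mapping directly skews such a bound, which is why the precision of the backend model matters.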


@ -1,12 +1,10 @@
# Stuff that must be introduced early (intro/foundations)
* Static vs. dynamic
* PC
* ELF
## Intro to CPUs
* ISA
* Assembly
* SIMD
* Basic block
* μarch:
* frontend
* ports
@ -18,6 +16,18 @@
* ROB
* L1-residence
* HW counters
## Foundations on code analyzers
* Define Cycles(K): retired instructions
* Define notion of bottleneck
* Static vs. dynamic
* PC
* ELF
* Basic block
## State of the art
* Tools:
* IACA
* llvm-mca
@ -25,6 +35,3 @@
* uops.info
* uiCA
* PMEvo
* Define Cycles(K): retired instructions
* Define notion of bottleneck