Refine plan for foundations
parent 82e9418a3d
commit 2701c81409
2 changed files with 62 additions and 18 deletions
@ -1,11 +1,52 @@
# State of the Art
# Foundations

In no particular order:
## A dive into processors' microarchitecture

Tools used:
* Pipedream
* Valgrind
* QEMU
### High-level abstraction

* Highest level of abstraction: the CPU executes instructions, which modify its internal state
* Instructions: defined by the ISA, written in assembly
* State: registers, memory hierarchy, (external effects)
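
As an illustration of this high-level view, here is a minimal sketch of a CPU seen purely as "instructions modify state". The three-instruction ISA (`mov`, `add`, `store`), the `State` class and the register names are hypothetical, chosen only for the example; the memory is flat, with no cache hierarchy modeled.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Architectural state: registers and a flat memory (hypothetical toy model)."""
    regs: dict[str, int] = field(default_factory=dict)
    mem: dict[int, int] = field(default_factory=dict)


def step(state: State, instr: tuple) -> None:
    """Apply one instruction of the toy ISA to the state, in place."""
    op, *args = instr
    if op == "mov":      # mov dst, imm    (dst = imm)
        dst, imm = args
        state.regs[dst] = imm
    elif op == "add":    # add dst, src    (dst += src)
        dst, src = args
        state.regs[dst] = state.regs[dst] + state.regs[src]
    elif op == "store":  # store addr, src (mem[addr] = src)
        addr, src = args
        state.mem[addr] = state.regs[src]
    else:
        raise ValueError(f"unknown instruction: {op}")


def run(program: list[tuple]) -> State:
    """Execute a straight-line program from an empty state."""
    state = State()
    for instr in program:
        step(state, instr)
    return state


final = run([
    ("mov", "r0", 2),
    ("mov", "r1", 40),
    ("add", "r0", "r1"),
    ("store", 0x10, "r0"),
])
print(final.regs, final.mem)  # {'r0': 42, 'r1': 40} {16: 42}
```

At this level nothing is said about how long each instruction takes; that is what the microarchitectural view adds.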

### Microarchitecture

* [big picture figure]
* Roughly speaking:
* frontend (decoder, renamer, etc.)
* backend (execution ports, execution units)
* register file
* caches
* Instruction --[frontend]--> macro-op (Mop), micro-op (μop)
* μop --[backend port]--> retired [side effects]
* In the vast majority of cases, execution units are fully pipelined
* Dependencies stall the pipeline!
* Renamer: helps up to a point (removes false dependencies, not true ones)
* Out-of-order CPUs:
* frontend in order up to some point
* ROB (reorder buffer)
* backend out of order
* ROB: execution window. Instruction-level parallelism (ILP) is limited to this window.

* Hardware counters

* SIMD
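
To make the interplay between pipelining and dependencies concrete, here is a small sketch under strong simplifying assumptions: a single, fully pipelined execution unit that accepts one operation per cycle and delivers each result `latency` cycles later, with no other limit (frontend, ROB size, memory). The function names are hypothetical.

```python
def cycles_independent(n_ops: int, latency: int) -> int:
    """Cycles to complete n independent ops on one fully pipelined unit.

    One op is issued per cycle; the last one completes `latency` cycles
    after it is issued.
    """
    if n_ops == 0:
        return 0
    return n_ops + latency - 1


def cycles_chained(n_ops: int, latency: int) -> int:
    """Cycles to complete n ops where each op needs the previous result.

    The next op cannot be issued before the previous one has completed,
    so the pipeline never overlaps two ops.
    """
    return n_ops * latency


print(cycles_independent(100, 4))  # 103
print(cycles_chained(100, 4))      # 400
```

With a latency of 4 cycles, the dependency chain makes the same 100 operations almost four times slower, even though the unit itself is fully pipelined.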

## Prerequisites on code analyzers

* Code analyzers: given a code fragment that is assumed to be the
  body of a hot loop, derive performance metrics and any information that might
  help with performance debugging.
* Usually static (vs. dynamic); we focus on static analyzers.
* Very close to the machine: assembly, assembled bytes
* Examples with llvm-mca
* The code resides in an object or executable file: ELF on Linux and most Unix-based
  platforms
* Assembly: straight-line instructions, with (possibly conditional) jumps
* Instruction identified by its program counter
* Notion of basic block
* Regions of interest: hottest basic blocks
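
The notion of basic block can be sketched with the classic "leaders" approach: a block starts at the first instruction, at every jump target, and right after every jump. The `Instr` representation below (program counter, mnemonic, optional jump target) is a hypothetical simplification; returns and indirect jumps are not modeled.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    pc: int                       # program counter identifying the instruction
    mnemonic: str
    target: Optional[int] = None  # jump target PC, if the instruction is a jump


def basic_blocks(instrs: list[Instr]) -> list[list[Instr]]:
    """Split a list of instructions (sorted by PC) into basic blocks."""
    leaders = set()
    if instrs:
        leaders.add(instrs[0].pc)
    for i, ins in enumerate(instrs):
        if ins.target is not None:
            leaders.add(ins.target)              # a jump target starts a block
            if i + 1 < len(instrs):
                leaders.add(instrs[i + 1].pc)    # so does the fall-through
    blocks: list[list[Instr]] = []
    for ins in instrs:
        if ins.pc in leaders or not blocks:
            blocks.append([])
        blocks[-1].append(ins)
    return blocks


code = [
    Instr(0, "mov"),
    Instr(4, "add"),             # loop head: target of the jump at PC 12
    Instr(8, "cmp"),
    Instr(12, "jne", target=4),
    Instr(16, "ret"),
]
blocks = basic_blocks(code)
print([[ins.pc for ins in block] for block in blocks])  # [[0], [4, 8, 12], [16]]
```

Here the middle block is the loop body, which is the kind of region a code analyzer would typically be asked about.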

## State of the art

Throughput pred. :
* Agner Fog
@ -17,10 +58,7 @@ Throughput pred. :
* OSACA
* uiCA

Benchmark suites:
* Polybench
* SPEC

## Maybe put this somewhere
Backend models:
* To predict the throughput of a kernel, a precise model of the CPU backend is
  required
@ -29,4 +67,3 @@ Backend models:
* but this is often incomplete, sometimes even wrong
* Agner Fog
* uops.info
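
To illustrate why a backend model matters for throughput prediction, here is a minimal sketch assuming a very simplified port-mapping model: each μop occupies exactly one port for one cycle and may run on any port of a given set. The port names (`p0`, `p1`) and the function name are hypothetical. For every subset of ports, the μops that can only run inside that subset must share it, which gives a lower bound on the cycles per iteration.

```python
from itertools import combinations


def throughput_bound(uops: list[frozenset[str]]) -> tuple[float, tuple[str, ...]]:
    """Lower bound on cycles per iteration, and the port subset that causes it.

    `uops` lists, for each μop of the kernel, the set of ports it may use.
    """
    ports = sorted(set().union(*uops)) if uops else []
    best, bottleneck = 0.0, ()
    for k in range(1, len(ports) + 1):
        for subset in combinations(ports, k):
            allowed = set(subset)
            # μops that have no choice but to execute on this subset of ports
            confined = sum(1 for u in uops if u <= allowed)
            if confined / k > best:
                best, bottleneck = confined / k, subset
    return best, bottleneck


kernel = [
    frozenset({"p0", "p1"}),
    frozenset({"p0", "p1"}),
    frozenset({"p0"}),
]
print(throughput_bound(kernel))  # (1.5, ('p0', 'p1'))
```

The enumeration is exponential in the number of ports, which is acceptable for a sketch with a handful of ports. If the port sets fed to such a model are incomplete or wrong, the resulting bound is wrong as well.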
@ -1,12 +1,10 @@
# Stuff that must be introduced early (intro/foundations)

* Static vs. dynamic
* PC
* ELF
## Intro to CPUs

* ISA
* Assembly
* SIMD
* Basic block
* μarch:
* frontend
* ports
@ -18,6 +16,18 @@
* ROB
* L1-residence
* HW counters

## Foundations on code analyzers

* Define Cycles(K): retired instructions
* Define notion of bottleneck
* Static vs. dynamic
* PC
* ELF
* Basic block
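
The notes do not spell out what Cycles(K) is. One common reading, assumed here and not taken from the source, is the steady-state number of cycles per iteration of a kernel K. The sketch below estimates it from two hypothetical counter readings taken with different iteration counts, so that fixed warm-up costs cancel out, and derives the instructions per cycle (IPC) from the number of retired instructions per iteration. All names and numbers are illustrative.

```python
def steady_state(cycles_short: int, cycles_long: int,
                 iters_short: int, iters_long: int,
                 instrs_per_iter: int) -> tuple[float, float]:
    """Return (cycles per iteration, IPC) in steady state.

    Taking the difference between two runs removes constant overheads
    (loop setup, warm-up) from the estimate.
    """
    per_iter = (cycles_long - cycles_short) / (iters_long - iters_short)
    ipc = instrs_per_iter / per_iter
    return per_iter, ipc


# Hypothetical readings: 1250 cycles for 100 iterations, 10250 for 1000,
# with 20 instructions retired per iteration.
print(steady_state(1_250, 10_250, 100, 1_000, 20))  # (10.0, 2.0)
```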

## State of the art

* Tools:
* IACA
* llvm-mca
@ -25,6 +35,3 @@
* uops.info
* uiCA
* PMEvo

* Define Cycles(K): retired instructions
* Define notion of bottleneck