Benching: evaluate hackbench clearly, improve tools

2019-06-10 12:04:52 +02:00
commit ceeec6ca5d
parent a0f58b592d
5 changed files with 277 additions and 0 deletions
--- a/benching/README.md
+++ b/benching/README.md
@ -0,0 +1,92 @@
+# Benching `eh_elfs`
+
+## Benchmark setup
+
+Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.
+
+### Generate the `eh_elfs`
+
+```bash
+../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
+  --keep-holes -O2 --global-switch --enable-deref-arg "$BENCHED_BINARY"
+```
+
+### Record a `perf` session
+
+```bash
+perf record --call-graph dwarf,4096 "$BENCHED_BINARY" [args]
+```
+
+### Set up the environment
+
+```bash
+source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
+```
+
+The first value selects the version of libunwind you will be running; the
+second selects whether you run in debug or release mode (use release to get
+readings, debug to check for errors).
+
+You can reset your environment to its previous state by running `deactivate`.
+
+If you pick the `eh_elf` flavour, you will also have to add your `eh_elfs`
+directory to the library path:
+
+```bash
+export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
+```
+
+## Extract results
+
+### Base readings
+
+**In release mode** (faster), run
+
+```bash
+perf report 2>&1 >/dev/null
+```
+
+in both an `eh_elf` and a `vanilla` shell, then compare the average times.
+
+### Getting debug output
+
+```bash
+UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null
+```
+
+### Total number of calls to `unw_step`
+
+```bash
+UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning"
+```
+
+### Total number of vanilla errors
+
+With the `vanilla` context,
+
+```bash
+UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning -"
+```
+
+### Total number of fallbacks to original DWARF
+
+With the `eh_elf` context,
+
+```bash
+UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* falling back"
+```
+
+### Total number of fallbacks to original DWARF that actually used DWARF
+
+With the `eh_elf` context,
+
+```bash
+UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* fallback with"
+```
+
+### Locations of successful fallbacks
+
+```bash
+UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null \
+  | grep "step: .* fallback with" -B 15 \
+  | grep "In memory map" | sort | uniq -c
+```
--- a/benching/hackbench/EVALUATION.md
+++ b/benching/hackbench/EVALUATION.md
@ -0,0 +1,48 @@
+# Hackbench - evaluation
+
+The artifacts are saved in `evaluation_artifacts`.
+
+## Performance
+
+Using the command line
+
+```bash
+for i in $(seq 1 100); do
+  perf report 2>&1 >/dev/null | tail -n 1 \
+    | python3 to_report_fmt.py | sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
+done
+```
+
+we save a sequence of 100 performance readings to some file.
+
+Samples:
+* `eh_elf`:  135251 unw/exec
+* `vanilla`: 138233 unw/exec
+
+Average time/unw:
+* `eh_elf`:   102 ns
+* `vanilla`: 2443 ns
+
+Standard deviation:
+* `eh_elf`:  2 ns
+* `vanilla`: 47 ns
+
+* Average ratio: 24
+* Ratio uncertainty: 1.0
+
+## Distribution of `unw_step` issues
+
+### `eh_elf` case
+
+* success:                              135251 (97.7%)
+* fallback to DWARF:                      1467  (1.1%)
+* fallback to libunwind heuristics:        329  (0.2%)
+* fail to unwind:                         1410  (1.0%)
+* total:                                138457
+
+### `vanilla` case
+
+* success:                              138201 (99.0%)
+* fallback to libunwind heuristics:         32  (0.0%)
+* fail to unwind:                         1411  (1.0%)
+* total:                                139644
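
Note on `benching/hackbench/EVALUATION.md`: the totals and percentages in the distribution lists above follow directly from the per-category counts. The sketch below recomputes them for the `eh_elf` case; the counts are copied from the list above, while the helper name `shares` is hypothetical and not part of the repository.

```python
# Counts copied from the `eh_elf` case in EVALUATION.md.
EH_ELF_COUNTS = {
    "success": 135251,
    "fallback to DWARF": 1467,
    "fallback to libunwind heuristics": 329,
    "fail to unwind": 1410,
}


def shares(counts):
    """Return the total and each category's share of it, in percent."""
    total = sum(counts.values())
    return total, {label: 100 * n / total for label, n in counts.items()}


total, percentages = shares(EH_ELF_COUNTS)
print("total: {}".format(total))
for label, pct in percentages.items():
    print("* {}: {:.1f}%".format(label, pct))
```

Swapping in the `vanilla` counts gives the second list in the same way.
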
--- a/benching/hackbench/README.md
+++ b/benching/hackbench/README.md
@ -0,0 +1,44 @@
+# Running the benchmarks
+
+Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.
+
+## Generate the `eh_elfs`
+
+```bash
+../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
+  --keep-holes -O2 --global-switch --enable-deref-arg hackbench
+```
+
+## Record a `perf` session
+
+```bash
+perf record --call-graph dwarf,4096 ./hackbench 10 process 100
+```
+
+To get a longer session, you can increase the first number (up to ~100) as
+well as the second. While it runs, this will most probably use all of your
+computer's resources.
+
+## Set up the environment
+
+```bash
+source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
+```
+
+The first value selects the version of libunwind you will be running; the
+second selects whether you run in debug or release mode (use release to get
+readings, debug to check for errors).
+
+You can reset your environment to its previous state by running `deactivate`.
+
+If you pick the `eh_elf` flavour, you will also have to add your `eh_elfs`
+directory to the library path:
+
+```bash
+export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
+```
+
+## Get the readings
+
+```bash
+perf report 2>&1 >/dev/null
+```
--- a/benching/hackbench/to_report_fmt.py
+++ b/benching/hackbench/to_report_fmt.py
@ -0,0 +1,21 @@
+#!/usr/bin/env python3
+
+import re
+import sys
+
+line = input()
+# Use `+` rather than `*` so that an empty field cannot reach int().
+regex = \
+    re.compile(r'Total unwind time: ([0-9]+) s ([0-9]+) ns, ([0-9]+) calls')
+
+match = regex.match(line.strip())
+if not match:
+    print('Badly formatted line', file=sys.stderr)
+    sys.exit(1)
+
+sec = int(match.group(1))
+ns = int(match.group(2))
+calls = int(match.group(3))
+
+time = sec * 10**9 + ns
+
+print("{} & {} & {} & ??".format(calls, time, time // calls))
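
Note on `benching/hackbench/to_report_fmt.py`: the script above reads one line from standard input and exits on a mismatch, which makes it awkward to reuse or test. A possible function-based variant is sketched below. The name `parse_unwind_line` is hypothetical, the line format is the one matched by the regular expression above, and the guard against zero calls is an addition that the original script does not have.

```python
import re

# Same line format as in to_report_fmt.py.
LINE_RE = re.compile(
    r"Total unwind time: ([0-9]+) s ([0-9]+) ns, ([0-9]+) calls"
)


def parse_unwind_line(line):
    """Return (calls, total time in ns, average ns per call) for one line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("Badly formatted line: {!r}".format(line))
    sec, ns, calls = (int(group) for group in match.groups())
    if calls == 0:
        # Assumption: a session without unwinding has no meaningful average.
        raise ValueError("No calls recorded, cannot compute an average")
    total_ns = sec * 10**9 + ns
    return calls, total_ns, total_ns // calls
```

The third element of the returned tuple is the value that the `sed` expression in the evaluation loop extracts from the formatted output.
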
--- a/benching/tools/gen_perf_stats.py
+++ b/benching/tools/gen_perf_stats.py
@ -0,0 +1,72 @@
+#!/usr/bin/env python3
+
+""" Generates performance statistics for the eh_elf vs vanilla libunwind unwinding,
+based on time series generated beforehand
+
+First run
+```bash
+for i in $(seq 1 100); do
+  perf report 2>&1 >/dev/null | tail -n 1 \
+      | python3 ../hackbench/to_report_fmt.py \
+          | sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
+done > "$SOME_PLACE/${FLAVOUR}_times"
+```
+
+for each flavour (eh_elf, vanilla)
+
+Then run this script, with `$SOME_PLACE` as argument.
+"""
+
+import numpy as np
+import sys
+import os
+
+
+def read_series(path):
+    with open(path, "r") as handle:
+        for line in handle:
+            yield int(line.strip())
+
+
+FLAVOURS = ["eh_elf", "vanilla"]
+
+if len(sys.argv) != 2:
+    sys.exit("Usage: {} SERIES_DIRECTORY".format(sys.argv[0]))
+
+path_format = os.path.join(sys.argv[1], "{}_times")
+times = {}
+avgs = {}
+std_deviations = {}
+
+for flv in FLAVOURS:
+    times[flv] = list(read_series(path_format.format(flv)))
+    avgs[flv] = sum(times[flv]) / len(times[flv])
+    std_deviations[flv] = np.std(times[flv])
+
+avg_ratio = avgs["vanilla"] / avgs["eh_elf"]
+ratio_uncertainty = (
+    1
+    / avgs["eh_elf"]
+    * (
+        std_deviations["vanilla"]
+        + avgs["vanilla"] / avgs["eh_elf"] * std_deviations["eh_elf"]
+    )
+)
+
+
+def format_flv(flv_dict, formatter):
+    out = ""
+    for flv in FLAVOURS:
+        val = flv_dict[flv]
+        out += "* {}: {}\n".format(flv, formatter.format(val))
+    return out
+
+
+print(
+    "Average time:\n{}\n"
+    "Standard deviation:\n{}\n"
+    "Average ratio: {}\n"
+    "Ratio uncertainty: {}".format(
+        format_flv(avgs, "{} ns"),
+        format_flv(std_deviations, "{} ns"),
+        avg_ratio,
+        ratio_uncertainty,
+    )
+)
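
Note on `benching/tools/gen_perf_stats.py`: the `ratio_uncertainty` expression above appears to be the usual first-order (worst-case) error propagation for a quotient, where for r = a / b the uncertainty is (δa + r · δb) / b. The sketch below isolates that step and applies it to the rounded figures reported in `EVALUATION.md`. The function name `ratio_with_uncertainty` is hypothetical, and because the inputs are rounded the result only approximates what the script computes from the full series.

```python
def ratio_with_uncertainty(avg_num, std_num, avg_den, std_den):
    """Return avg_num / avg_den and its first-order uncertainty.

    Mirrors the expression in gen_perf_stats.py:
    (std_num + ratio * std_den) / avg_den
    """
    ratio = avg_num / avg_den
    uncertainty = (std_num + ratio * std_den) / avg_den
    return ratio, uncertainty


# Rounded values from EVALUATION.md: vanilla is the numerator,
# eh_elf the denominator.
ratio, uncertainty = ratio_with_uncertainty(2443, 47, 102, 2)
print("Average ratio: {:.1f}".format(ratio))
print("Ratio uncertainty: {:.2f}".format(uncertainty))
```
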