Benching: evaluate hackbench clearly, improve tools
parent a0f58b592d
commit ceeec6ca5d
5 changed files with 277 additions and 0 deletions
92 benching/README.md Normal file
@@ -0,0 +1,92 @@
# Benching `eh_elfs`

## Benchmark setup

Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.

### Generate the `eh_elfs`

```bash
../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
    --keep-holes -O2 --global-switch --enable-deref-arg "$BENCHED_BINARY"
```

### Record a `perf` session

```bash
perf record --call-graph dwarf,4096 "$BENCHED_BINARY" [args]
```

### Set up the environment

```bash
source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
```

The first value selects the version of libunwind you will be running; the
second selects whether you want to run in debug or release mode (use release to
get readings, debug to check for errors).

You can reset your environment to its previous state by running `deactivate`.

If you pick the `eh_elf` flavour, you will also have to run

```bash
export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
```

## Extract results

### Base readings

**In release mode** (faster), run

```bash
perf report 2>&1 >/dev/null
```

with both the `eh_elf` and `vanilla` shells, then compare the average times.

### Getting debug output

```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null
```

### Total number of calls to `unw_step`

```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning"
```

### Total number of vanilla errors

With the `vanilla` context,

```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning -"
```

### Total number of fallbacks to original DWARF

With the `eh_elf` context,

```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* falling back"
```

### Total number of fallbacks to original DWARF that actually used DWARF

With the `eh_elf` context,

```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* fallback with"
```

### Get succeeded fallback locations

```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null \
    | grep "step: .* fallback with" -B 15 \
    | grep "In memory map" | sort | uniq -c
```
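The per-pattern counters above each re-run `perf report`; they can also be collected in one pass by saving the debug log once and grepping it for every pattern. A minimal sketch — the fabricated log lines below stand in for a real `UNW_DEBUG_LEVEL=5 perf report 2>unw_debug.log >/dev/null` capture:

```shell
# Sketch: count each unw_step outcome from a saved libunwind debug log.
# A tiny fabricated log stands in for a real capture here.
log=$(mktemp)
printf '%s\n' \
    "step: returning 1" \
    "step: returning -1" \
    "step: falling back" \
    "step: eh_elf fallback with DWARF" \
    "step: returning 0" > "$log"

printf 'total steps: %s\n'     "$(grep -c 'step:.* returning' "$log")"
printf 'errors: %s\n'          "$(grep -c 'step:.* returning -' "$log")"
printf 'fallbacks: %s\n'       "$(grep -c 'step:.* falling back' "$log")"
printf 'dwarf fallbacks: %s\n' "$(grep -c 'step:.* fallback with' "$log")"
rm -f "$log"
```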
48 benching/hackbench/EVALUATION.md Normal file
@@ -0,0 +1,48 @@
# Hackbench - evaluation

Artifacts saved in `evaluation_artifacts`.

## Performance

Using the command line

```bash
for i in $(seq 1 100); do
    perf report 2>&1 >/dev/null | tail -n 1 \
        | python to_report_fmt.py | sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done
```

we save a sequence of 100 performance readings to some file.

Samples:
* `eh_elf`: 135251 unw/exec
* `vanilla`: 138233 unw/exec

Average time/unw:
* `eh_elf`: 102 ns
* `vanilla`: 2443 ns

Standard deviation:
* `eh_elf`: 2 ns
* `vanilla`: 47 ns

Average ratio: 24
Ratio uncertainty: 1.0

## Distribution of `unw_step` issues

### `eh_elf` case

* success: 135251 (97.7%)
* fallback to DWARF: 1467 (1.0%)
* fallback to libunwind heuristics: 329 (0.2%)
* fail to unwind: 1410 (1.0%)
* total: 138457

### `vanilla` case

* success: 138201 (98.9%)
* fallback to libunwind heuristics: 32 (0.0%)
* fail to unwind: 1411 (1.0%)
* total: 139644
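Since the quoted averages and deviations are rounded, the ratio figures can be sanity-checked by first-order error propagation. Recomputing from the tables above gives a ratio of ~24 and an uncertainty of ~0.93, consistent (up to rounding) with the reported 24 and 1.0:

```python
# Recompute the ratio and its first-order uncertainty from the rounded
# per-flavour figures above (all times in ns).
avg = {"eh_elf": 102, "vanilla": 2443}
std = {"eh_elf": 2, "vanilla": 47}

avg_ratio = avg["vanilla"] / avg["eh_elf"]
# d(v/e) ~= sigma_v / e + (v/e) * sigma_e / e, taking worst-case signs.
ratio_uncertainty = (std["vanilla"] + avg_ratio * std["eh_elf"]) / avg["eh_elf"]

print(round(avg_ratio))             # 24
print(round(ratio_uncertainty, 2))  # 0.93
```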
44 benching/hackbench/README.md Normal file
@@ -0,0 +1,44 @@
# Running the benchmarks

Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.

## Generate the `eh_elfs`

```bash
../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
    --keep-holes -O2 --global-switch --enable-deref-arg hackbench
```

## Record a `perf` session

```bash
perf record --call-graph dwarf,4096 ./hackbench 10 process 100
```

You can increase the first number up to ~100, and the second arbitrarily, to
get a longer session. This will most probably take all your computer's
resources while it is running.

## Set up the environment

```bash
source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
```

The first value selects the version of libunwind you will be running; the
second selects whether you want to run in debug or release mode (use release to
get readings, debug to check for errors).

You can reset your environment to its previous state by running `deactivate`.

If you pick the `eh_elf` flavour, you will also have to run

```bash
export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
```

### Actually get readings

```bash
perf report 2>&1 >/dev/null
```
21 benching/hackbench/to_report_fmt.py Executable file
@@ -0,0 +1,21 @@
#!/usr/bin/env python3

import re
import sys

line = input()
regex = \
    re.compile(r'Total unwind time: ([0-9]*) s ([0-9]*) ns, ([0-9]*) calls')

match = regex.match(line.strip())
if not match:
    print('Badly formatted line', file=sys.stderr)
    sys.exit(1)

sec = int(match.group(1))
ns = int(match.group(2))
calls = int(match.group(3))

time = sec * 10**9 + ns

print("{} & {} & {} & ??".format(calls, time, time // calls))
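The parsing in `to_report_fmt.py` can be exercised on a hand-written summary line. The sample numbers below are illustrative (chosen to match the hackbench averages), not real output:

```python
import re

# Same pattern as to_report_fmt.py: parses libunwind's total-unwind-time line.
regex = re.compile(r'Total unwind time: ([0-9]*) s ([0-9]*) ns, ([0-9]*) calls')

# Hypothetical summary line: 135251 calls taking 13795602 ns in total.
line = "Total unwind time: 0 s 13795602 ns, 135251 calls"
match = regex.match(line.strip())
sec, ns, calls = (int(g) for g in match.groups())
time = sec * 10**9 + ns

print("{} & {} & {} & ??".format(calls, time, time // calls))
# -> 135251 & 13795602 & 102 & ??
```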
72 benching/tools/gen_perf_stats.py Normal file
@@ -0,0 +1,72 @@
#!/usr/bin/env python3

"""Generates performance statistics for the eh_elf vs vanilla libunwind
unwinding, based on time series generated beforehand.

First run

```bash
for i in $(seq 1 100); do
    perf report 2>&1 >/dev/null | tail -n 1 \
        | python ../hackbench/to_report_fmt.py \
        | sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done > "$SOME_PLACE/${FLAVOUR}_times"
```

for each flavour (eh_elf, vanilla).

Then run this script with `$SOME_PLACE` as argument.
"""

import numpy as np
import sys
import os


def read_series(path):
    with open(path, "r") as handle:
        for line in handle:
            yield int(line.strip())


FLAVOURS = ["eh_elf", "vanilla"]

path_format = os.path.join(sys.argv[1], "{}_times")
times = {}
avgs = {}
std_deviations = {}

for flv in FLAVOURS:
    times[flv] = list(read_series(path_format.format(flv)))
    avgs[flv] = sum(times[flv]) / len(times[flv])
    std_deviations[flv] = np.sqrt(np.var(times[flv]))

avg_ratio = avgs["vanilla"] / avgs["eh_elf"]
ratio_uncertainty = (
    1
    / avgs["eh_elf"]
    * (
        std_deviations["vanilla"]
        + avgs["vanilla"] / avgs["eh_elf"] * std_deviations["eh_elf"]
    )
)


def format_flv(flv_dict, formatter):
    out = ""
    for flv in FLAVOURS:
        val = flv_dict[flv]
        out += "* {}: {}\n".format(flv, formatter.format(val))
    return out


print(
    "Average time:\n{}\n"
    "Standard deviation:\n{}\n"
    "Average ratio: {}\n"
    "Ratio uncertainty: {}".format(
        format_flv(avgs, "{} ns"),
        format_flv(std_deviations, "{}"),
        avg_ratio,
        ratio_uncertainty,
    )
)
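The `ratio_uncertainty` expression in the script above is first-order error propagation through the ratio, taking worst-case signs on both contributions. Writing `v = avgs["vanilla"]`, `e = avgs["eh_elf"]` and the standard deviations as sigma:

```latex
% Worst-case first-order propagation for r = v / e:
\Delta r \approx \frac{\sigma_v}{e} + \frac{v}{e^2}\,\sigma_e
             = \frac{1}{e}\left(\sigma_v + \frac{v}{e}\,\sigma_e\right)
```

which matches the code term by term.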