Compare commits

...

3 commits

13 changed files with 505 additions and 48 deletions

92
benching/README.md Normal file

@ -0,0 +1,92 @@
# Benching `eh_elfs`
## Benchmark setup
Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.
### Generate the `eh_elfs`
```bash
../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
--keep-holes -O2 --global-switch --enable-deref-arg "$BENCHED_BINARY"
```
### Record a `perf` session
```bash
perf record --call-graph dwarf,4096 "$BENCHED_BINARY" [args]
```
### Set up the environment
```bash
source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
```
The first value selects the version of libunwind you will be running; the
second selects whether to run in debug or release mode (use release to
get readings, debug to check for errors).
You can reset your environment to its previous state by running `deactivate`.
If you pick the `eh_elf` flavour, you will also have to
```bash
export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
```
## Extract results
### Base readings
**In release mode** (faster), run
```bash
perf report 2>&1 >/dev/null
```
in both the `eh_elf` and `vanilla` shells, then compare the average times.
### Getting debug output
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null
```
### Total number of calls to `unw_step`
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning"
```
### Total number of vanilla errors
With the `vanilla` context,
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning -"
```
### Total number of fallbacks to original DWARF
With the `eh_elf` context,
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* falling back"
```
### Total number of fallbacks to original DWARF that actually used DWARF
With the `eh_elf` context,
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* fallback with"
```
### Get the locations of successful fallbacks
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null \
| grep "step: .* fallback with" -B 15 \
| grep "In memory map" | sort | uniq -c
```
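
The counting commands above can also be run in a single pass over the debug output. The following is a minimal sketch, not part of the repository: the patterns are copied from the `grep` commands above, while the labels, the function name `count_step_events` and the sample lines are hypothetical and only illustrate the matching.

```python
import re

# Patterns copied from the grep commands above; the labels are illustrative.
PATTERNS = {
    "unw_step calls": re.compile(r"step:.* returning"),
    "negative returns": re.compile(r"step:.* returning -"),
    "fallbacks": re.compile(r"step:.* falling back"),
    "fallbacks using DWARF": re.compile(r"step:.* fallback with"),
}


def count_step_events(lines):
    """Count matching lines per pattern, like `grep -c` does for each one."""
    counts = {label: 0 for label in PATTERNS}
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Hypothetical sample lines, made up for illustration only; the real
# libunwind debug output has a different, richer format.
SAMPLE_LOG = [
    "step: returning 1",
    "step: returning -1",
    "step: falling back",
    "step: fallback with 3",
    "unrelated line",
]

if __name__ == "__main__":
    for label, count in count_step_events(SAMPLE_LOG).items():
        print(f"{label}: {count}")
```

To use it on real data, pass an iterable of lines such as `sys.stdin` to `count_step_events` instead of `SAMPLE_LOG`. Keep in mind that each count is only meaningful in the context named in the corresponding section (`vanilla` or `eh_elf`).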

3
benching/gzip/.gitignore vendored Normal file

@ -0,0 +1,3 @@
gzip
gzip-1.10
perf.data


@ -0,0 +1,49 @@
# gzip - evaluation
Artifacts saved in `evaluation_artifacts`.
## Performance
Using the command line
```bash
for i in $(seq 1 100); do
perf report 2>&1 >/dev/null | tail -n 1 \
| python ../hackbench/to_report_fmt.py \
| sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done
```
we save a sequence of 100 performance readings to some file.
Samples:
* `eh_elf`: 331134 unw/exec
* `vanilla`: 331144 unw/exec
Average time/unw:
* `eh_elf`: 83 ns
* `vanilla`: 1304 ns
Standard deviation:
* `eh_elf`: 2 ns
* `vanilla`: 24 ns
Average ratio: 15.7
Ratio uncertainty: 0.8
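
For reference, here is a minimal sketch of how the ratio and its uncertainty can be derived from the averages and standard deviations. It follows the formula used by the statistics script added alongside these results; the function name `ratio_with_uncertainty` is hypothetical. Feeding it the rounded figures listed above will not necessarily reproduce the reported digits, since those were computed from the full series.

```python
def ratio_with_uncertainty(avg_num, std_num, avg_den, std_den):
    """Return (ratio, uncertainty) for avg_num / avg_den.

    The uncertainty is (std_num + ratio * std_den) / avg_den, i.e. the
    standard deviations propagated linearly through the quotient.
    """
    ratio = avg_num / avg_den
    uncertainty = (std_num + ratio * std_den) / avg_den
    return ratio, uncertainty


if __name__ == "__main__":
    # Rounded values from the lists above: vanilla over eh_elf.
    ratio, uncertainty = ratio_with_uncertainty(1304, 24, 83, 2)
    print(f"ratio = {ratio:.1f}, uncertainty = {uncertainty:.2f}")
```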
## Distribution of `unw_step` issues
### `eh_elf` case
* success: 331134 (99.9%)
* fallback to DWARF: 2 (0.0%)
* fallback to libunwind heuristics: 8 (0.0%)
* fail to unwind: 379 (0.1%)
* total: 331523
### `vanilla` case
* success: 331136 (99.9%)
* fallback to libunwind heuristics: 8 (0.0%)
* fail to unwind: 379 (0.1%)
* total: 331523
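
The percentages above are each count divided by the total, rounded to one decimal place. A minimal sketch of that arithmetic, using the `eh_elf` counts from this section (the function name `distribution` is hypothetical):

```python
def distribution(counts):
    """Return ({label: percentage rounded to 0.1}, total) for a dict of counts."""
    total = sum(counts.values())
    shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
    return shares, total


# Counts copied from the `eh_elf` case above.
EH_ELF_COUNTS = {
    "success": 331134,
    "fallback to DWARF": 2,
    "fallback to libunwind heuristics": 8,
    "fail to unwind": 379,
}

if __name__ == "__main__":
    shares, total = distribution(EH_ELF_COUNTS)
    for label, share in shares.items():
        print(f"* {label}: {EH_ELF_COUNTS[label]} ({share}%)")
    print(f"* total: {total}")
```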


@ -0,0 +1,48 @@
# Hackbench - evaluation
Artifacts saved in `evaluation_artifacts`.
## Performance
Using the command line
```bash
for i in $(seq 1 100); do
perf report 2>&1 >/dev/null | tail -n 1 \
| python to_report_fmt.py | sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done
```
we save a sequence of 100 performance readings to some file.
Samples:
* `eh_elf`: 135251 unw/exec
* `vanilla`: 138233 unw/exec
Average time/unw:
* `eh_elf`: 102 ns
* `vanilla`: 2443 ns
Standard deviation:
* `eh_elf`: 2 ns
* `vanilla`: 47 ns
Average ratio: 24
Ratio uncertainty: 1.0
## Distribution of `unw_step` issues
### `eh_elf` case
* success: 135251 (97.7%)
* fallback to DWARF: 1467 (1.0%)
* fallback to libunwind heuristics: 329 (0.2%)
* fail to unwind: 1410 (1.0%)
* total: 138457
### `vanilla` case
* success: 138201 (98.9%)
* fallback to libunwind heuristics: 32 (0.0%)
* fail to unwind: 1411 (1.0%)
* total: 139644


@ -0,0 +1,44 @@
# Running the benchmarks
Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.
## Generate the `eh_elfs`
```bash
../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
--keep-holes -O2 --global-switch --enable-deref-arg hackbench
```
## Record a `perf` session
```bash
perf record --call-graph dwarf,4096 ./hackbench 10 process 100
```
You can increase the first number (up to about 100) and the second number to
get a longer session. While it runs, this will most likely use all of your
computer's resources.
## Set up the environment
```bash
source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
```
The first value selects the version of libunwind you will be running; the
second selects whether to run in debug or release mode (use release to
get readings, debug to check for errors).
You can reset your environment to its previous state by running `deactivate`.
If you pick the `eh_elf` flavour, you will also have to
```bash
export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
```
### Actually get readings
```bash
perf report 2>&1 >/dev/null
```


@ -0,0 +1,21 @@
#!/usr/bin/env python3
import re
import sys
line = input()
regex = \
    re.compile(r'Total unwind time: ([0-9]*) s ([0-9]*) ns, ([0-9]*) calls')
match = regex.match(line.strip())
if not match:
    print('Badly formatted line', file=sys.stderr)
    sys.exit(1)
sec = int(match.group(1))
ns = int(match.group(2))
calls = int(match.group(3))
time = sec * 10**9 + ns
print("{} & {} & {} & ??".format(calls, time, time // calls))


@ -0,0 +1,72 @@
#!/usr/bin/env python3
r""" Generates performance statistics for the eh_elf vs vanilla libunwind unwinding,
based on time series generated beforehand
First run
```bash
for i in $(seq 1 100); do
    perf report 2>&1 >/dev/null | tail -n 1 \
        | python ../hackbench/to_report_fmt.py \
        | sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done > "$SOME_PLACE/${FLAVOUR}_times"
```
for each flavour (eh_elf, vanilla)
Then run this script, with `$SOME_PLACE` as argument.
"""
import numpy as np
import sys
import os

def read_series(path):
    with open(path, "r") as handle:
        for line in handle:
            yield int(line.strip())

FLAVOURS = ["eh_elf", "vanilla"]
path_format = os.path.join(sys.argv[1], "{}_times")
times = {}
avgs = {}
std_deviations = {}
for flv in FLAVOURS:
    times[flv] = list(read_series(path_format.format(flv)))
    avgs[flv] = sum(times[flv]) / len(times[flv])
    std_deviations[flv] = np.sqrt(np.var(times[flv]))
avg_ratio = avgs["vanilla"] / avgs["eh_elf"]
ratio_uncertainty = (
    1
    / avgs["eh_elf"]
    * (
        std_deviations["vanilla"]
        + avgs["vanilla"] / avgs["eh_elf"] * std_deviations["eh_elf"]
    )
)

def format_flv(flv_dict, formatter):
    out = ""
    for flv in FLAVOURS:
        val = flv_dict[flv]
        out += "* {}: {}\n".format(flv, formatter.format(val))
    return out

print(
    "Average time:\n{}\n"
    "Standard deviation:\n{}\n"
    "Average ratio: {}\n"
    "Ratio uncertainty: {}".format(
        format_flv(avgs, "{} ns"),
        format_flv(std_deviations, "{}"),
        avg_ratio,
        ratio_uncertainty,
    )
)


@ -9,9 +9,6 @@
using namespace std;
using namespace dwarf;
typedef std::set<std::pair<int, core::FrameSection::register_def> >
dwarfpp_row_t;
DwarfReader::DwarfReader(const string& path):
root(fileno(ifstream(path)))
{}
@ -30,7 +27,7 @@ static void dump_expr(const core::FrameSection::register_def& reg) {
fprintf(stderr, "\n");
}
SimpleDwarf DwarfReader::read() const {
SimpleDwarf DwarfReader::read() {
const core::FrameSection& fs = root.get_frame_section();
SimpleDwarf output;
@ -42,57 +39,119 @@ SimpleDwarf DwarfReader::read() const {
return output;
}
SimpleDwarf::Fde DwarfReader::read_fde(const core::Fde& fde) const {
void DwarfReader::add_cell_to_row(
const dwarf::core::FrameSection::register_def& reg,
int reg_id,
int ra_reg,
SimpleDwarf::DwRow& cur_row)
{
if(reg_id == DW_FRAME_CFA_COL3) {
cur_row.cfa = read_register(reg);
}
else {
try {
SimpleDwarf::MachineRegister reg_type =
from_dwarfpp_reg(reg_id, ra_reg);
switch(reg_type) {
case SimpleDwarf::REG_RBP:
cur_row.rbp = read_register(reg);
break;
case SimpleDwarf::REG_RBX:
cur_row.rbx = read_register(reg);
break;
case SimpleDwarf::REG_RA:
cur_row.ra = read_register(reg);
break;
default:
break;
}
}
catch(const UnsupportedRegister&) {} // Just ignore it.
}
}
void DwarfReader::append_row_to_fde(
const dwarfpp_row_t& row,
uintptr_t row_addr,
int ra_reg,
SimpleDwarf::Fde& output)
{
SimpleDwarf::DwRow cur_row;
cur_row.ip = row_addr;
for(const auto& cell: row) {
add_cell_to_row(cell.second, cell.first, ra_reg, cur_row);
}
if(cur_row.cfa.type == SimpleDwarf::DwRegister::REG_UNDEFINED)
{
// Not set
throw InvalidDwarf();
}
output.rows.push_back(cur_row);
}
template<typename Key, typename Value>
static std::set<std::pair<Key, Value> > map_to_setpair(
const std::map<Key, Value>& src_map)
{
std::set<std::pair<Key, Value> > out;
for(const auto map_it: src_map) {
out.insert(map_it);
}
return out;
}
void DwarfReader::append_results_to_fde(
const dwarf::core::FrameSection::instrs_results& results,
int ra_reg,
SimpleDwarf::Fde& output)
{
for(const auto row_pair: results.rows) {
append_row_to_fde(
row_pair.second,
row_pair.first.lower(),
ra_reg,
output);
}
if(results.unfinished_row.size() > 0) {
try {
append_row_to_fde(
map_to_setpair(results.unfinished_row),
results.unfinished_row_addr,
ra_reg,
output);
} catch(const InvalidDwarf&) {
// Ignore: the unfinished_row can be undefined
}
}
}
SimpleDwarf::Fde DwarfReader::read_fde(const core::Fde& fde) {
SimpleDwarf::Fde output;
output.fde_offset = fde.get_fde_offset();
output.beg_ip = fde.get_low_pc();
output.end_ip = fde.get_low_pc() + fde.get_func_length();
auto rows = fde.decode().rows;
const core::Cie& cie = *fde.find_cie();
int ra_reg = cie.get_return_address_register_rule();
for(const auto row_pair: rows) {
SimpleDwarf::DwRow cur_row;
// CIE rows
core::FrameSection cie_fs(root.get_dbg(), true);
auto cie_rows = cie_fs.interpret_instructions(
cie,
fde.get_low_pc(),
cie.get_initial_instructions(),
cie.get_initial_instructions_length());
cur_row.ip = row_pair.first.lower();
// FDE rows
auto fde_rows = fde.decode();
const dwarfpp_row_t& row = row_pair.second;
for(const auto& cell: row) {
if(cell.first == DW_FRAME_CFA_COL3) {
cur_row.cfa = read_register(cell.second);
}
else {
try {
SimpleDwarf::MachineRegister reg_type =
from_dwarfpp_reg(cell.first, ra_reg);
switch(reg_type) {
case SimpleDwarf::REG_RBP:
cur_row.rbp = read_register(cell.second);
break;
case SimpleDwarf::REG_RBX:
cur_row.rbx = read_register(cell.second);
break;
case SimpleDwarf::REG_RA:
cur_row.ra = read_register(cell.second);
break;
default:
break;
}
}
catch(const UnsupportedRegister&) {} // Just ignore it.
}
}
if(cur_row.cfa.type == SimpleDwarf::DwRegister::REG_UNDEFINED)
{
// Not set
throw InvalidDwarf();
}
output.rows.push_back(cur_row);
}
// instrs
append_results_to_fde(cie_rows, ra_reg, output);
append_results_to_fde(fde_rows, ra_reg, output);
return output;
}


@ -13,6 +13,9 @@
#include "SimpleDwarf.hpp"
typedef std::set<std::pair<int, dwarf::core::FrameSection::register_def> >
dwarfpp_row_t;
class DwarfReader {
public:
class InvalidDwarf: public std::exception {};
@ -21,14 +24,31 @@ class DwarfReader {
DwarfReader(const std::string& path);
/** Actually read the ELF file, generating a `SimpleDwarf` output. */
SimpleDwarf read() const;
SimpleDwarf read();
private: //meth
SimpleDwarf::Fde read_fde(const dwarf::core::Fde& fde) const;
SimpleDwarf::Fde read_fde(const dwarf::core::Fde& fde);
void append_results_to_fde(
const dwarf::core::FrameSection::instrs_results& results,
int ra_reg,
SimpleDwarf::Fde& output);
SimpleDwarf::DwRegister read_register(
const dwarf::core::FrameSection::register_def& reg) const;
void add_cell_to_row(
const dwarf::core::FrameSection::register_def& reg,
int reg_id,
int ra_reg,
SimpleDwarf::DwRow& cur_row);
void append_row_to_fde(
const dwarfpp_row_t& row,
uintptr_t row_addr,
int ra_reg,
SimpleDwarf::Fde& output);
SimpleDwarf::MachineRegister from_dwarfpp_reg(
int reg_id,
int ra_reg=-1


@ -14,6 +14,7 @@ OBJS=\
PcHoleFiller.o \
EmptyFdeDeleter.o \
ConseqEquivFilter.o \
OverriddenRowFilter.o \
SwitchStatement.o \
NativeSwitchCompiler.o \
FactoredSwitchCompiler.o \


@ -0,0 +1,31 @@
#include "OverriddenRowFilter.hpp"
OverriddenRowFilter::OverriddenRowFilter(bool enable)
: SimpleDwarfFilter(enable)
{}
SimpleDwarf OverriddenRowFilter::do_apply(const SimpleDwarf& dw) const {
SimpleDwarf out;
for(const auto& fde: dw.fde_list) {
out.fde_list.push_back(SimpleDwarf::Fde());
SimpleDwarf::Fde& cur_fde = out.fde_list.back();
cur_fde.fde_offset = fde.fde_offset;
cur_fde.beg_ip = fde.beg_ip;
cur_fde.end_ip = fde.end_ip;
if(fde.rows.empty())
continue;
for(size_t pos=0; pos < fde.rows.size(); ++pos) {
const auto& row = fde.rows[pos];
if(pos == fde.rows.size() - 1
|| row.ip != fde.rows[pos+1].ip)
{
cur_fde.rows.push_back(row);
}
}
}
return out;
}


@ -0,0 +1,15 @@
/** SimpleDwarfFilter to remove the first `n-1` rows of a block of `n`
* contiguous rows that have the exact same address. */
#pragma once
#include "SimpleDwarf.hpp"
#include "SimpleDwarfFilter.hpp"
class OverriddenRowFilter: public SimpleDwarfFilter {
public:
OverriddenRowFilter(bool enable=true);
private:
SimpleDwarf do_apply(const SimpleDwarf& dw) const;
};


@ -13,6 +13,7 @@
#include "PcHoleFiller.hpp"
#include "EmptyFdeDeleter.hpp"
#include "ConseqEquivFilter.hpp"
#include "OverriddenRowFilter.hpp"
#include "settings.hpp"
@ -106,8 +107,9 @@ int main(int argc, char** argv) {
SimpleDwarf filtered_dwarf =
PcHoleFiller(!settings::keep_holes)(
EmptyFdeDeleter()(
OverriddenRowFilter()(
ConseqEquivFilter()(
parsed_dwarf)));
parsed_dwarf))));
FactoredSwitchCompiler* sw_compiler = new FactoredSwitchCompiler(1);
CodeGenerator code_gen(