Compare commits

..

2 commits

74 changed files with 343 additions and 4104 deletions

1
.gitignore vendored
View file

@ -33,4 +33,3 @@
dwarf-assembly
__pycache__
platform.info

View file

@ -1,9 +1,10 @@
# Dwarf Assembly
A compiler from DWARF unwinding data to native x86\_64 binaries.
Some experiments around compiling the most used Dwarf informations (ELF debug
data) directly into assembly.
This repository also contains various experiments, tools, benchmarking scripts,
stats scripts, etc. to work on this compiler.
This project is a big work in progress, don't expect anything to be stable for
now.
## Dependencies
@ -16,8 +17,7 @@ As of now, this project relies on the following libraries:
- [libsrk31cxx](https://github.com/stephenrkell/libsrk31cxx)
These libraries are expected to be installed somewhere your compiler can find
them. If you are using Archlinux, you can check
[these `PKGBUILD`s](https://git.tobast.fr/m2-internship/pkgbuilds).
them.
## Scripts and directories
@ -26,40 +26,4 @@ them. If you are using Archlinux, you can check
* `./compare_sizes.py`: compare the sizes of the `.eh_frame` of a binary (and
its dependencies) with the sizes of the `.text` of the generated ELFs.
* `./extract_pc.py`: extracts a list of valid program counters of an ELF and
produce a file as read by `dwarf-assembly`, **deprecated**.
* `benching`: all about benchmarking
* `env`: environment variables manager to ease the use of various `eh_elf`s in
parallel, for experiments.
* `shared`: code shared between various subprojects
* `src`: the compiler code itself
* `stack_walker`: a primitive stack walker using `eh_elf`s
* `stack_walker_libunwind`: a primitive stack walker using vanilla `libunwind`
* `stats`: a statistics gathering module
* `tests`: some tests regarding `eh_elf`s, **deprecated**.
## How to use
To compile `eh_elf`s for a given ELF file, say `foo.bin`, it is recommended to
use `generate_eh_elf.py`. Help can be obtained with `--help`. A standard
command is
```bash
./generate_eh_elf.py --deps --enable-deref-arg --global-switch -o eh_elfs foo.bin
```
This will compile `foo.bin` and all the shared objects it relies on into
`eh_elf`s, in the directory `./eh_elfs`, using a dereferencing argument (which
is necessary for `perf-eh_elfs`).
## Generate the intermediary C file
If you're curious about the intermediary C file generated for a given ELF file
`foo.bin`, you must call `dwarf-assembly` directly. A parameter `--help` can be
passed; a standard command is
```bash
./dwarf-assembly --global-switch --enable-deref-arg foo.bin
```
**Beware**! This will generate the C code on the standard output.
produce a file as read by `dwarf-assembly`

View file

@ -1,92 +0,0 @@
# Benching `eh_elfs`
## Benchmark setup
Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.
### Generate the `eh_elfs`
```bash
../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
--keep-holes -O2 --global-switch --enable-deref-arg "$BENCHED_BINARY"
```
### Record a `perf` session
```bash
perf record --call-graph dwarf,4096 "$BENCHED_BINARY" [args]
```
### Set up the environment
```bash
source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
```
The first value selects the version of libunwind you will be running, the
second selects whether you want to run in debug or release mode (use release to
get readings, debug to check for errors).
You can reset your environment to its previous state by running `deactivate`.
If you pick the `eh_elf` flavour, you will also have to
```bash
export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
```
## Extract results
### Base readings
**In release mode** (faster), run
```bash
perf report 2>&1 >/dev/null
```
with both `eh_elf` and `vanilla` shells. Compare average time.
### Getting debug output
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null
```
### Total number of calls to `unw_step`
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning"
```
### Total number of vanilla errors
With the `vanilla` context,
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* returning -"
```
### Total number of fallbacks to original DWARF
With the `eh_elf` context,
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* falling back"
```
### Total number of fallbacks to original DWARF that actually used DWARF
With the `eh_elf` context,
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null | grep -c "step:.* fallback with"
```
### Get succeeded fallback locations
```bash
UNW_DEBUG_LEVEL=5 perf report 2>&1 >/dev/null \
| grep "step: .* fallback with" -B 15 \
| grep "In memory map" | sort | uniq -c
```

View file

@ -1 +0,0 @@
/tests

View file

@ -1,5 +0,0 @@
#!/bin/bash
csmith "$@" | \
sed 's/#include "csmith.h"/#include <bench.h>\n#include <csmith.h>/g' |\
sed 's/return /bench_unwinding(); return /g'

View file

@ -1,124 +0,0 @@
#!/usr/bin/env python3
""" Generates the call graph (in dot format) of a C code generated by CSmith.
This does not parse C code, but performs only string lookups. In particular, it
assumes functions are named `func_[0-9]+` or `main`, and that a function
implementation is of the form (whitespaces meaningful)
(static)? RETURN_TYPE func_[0-9]+(.*)
{
...
}
"""
import sys
import re
def build_graph(prog):
func_declare_re = re.compile(r'(?:static )?\S.* (func_\d+|main) ?\(.*\)$')
func_call_re = re.compile(r'func_\d+')
graph = {}
cur_function = None
for line in prog:
func_declare_groups = func_declare_re.match(line)
if func_declare_groups:
func_name = func_declare_groups.group(1)
cur_function = func_name
graph[func_name] = []
elif line == '}':
cur_function = None
else:
if cur_function is None:
continue # Not interresting outside of functions
last_find_pos = 0
call_match = func_call_re.search(line, pos=last_find_pos)
while call_match is not None:
graph[cur_function].append(call_match.group(0))
last_find_pos = call_match.end()
call_match = func_call_re.search(line, pos=last_find_pos)
reachable = set()
def mark_reachable(node):
nonlocal reachable, graph
if node in reachable:
return
reachable.add(node)
for child in graph[node]:
mark_reachable(child)
mark_reachable('main')
delete = []
for node in graph:
if node not in reachable:
delete.append(node)
for node in delete:
print('> Deleted: {}'.format(node), file=sys.stderr)
graph.pop(node)
return graph
def dump_graph(graph):
print('digraph prog {')
for node in graph:
for call in graph[node]:
if call in graph:
print('\t{} -> {}'.format(node, call))
else:
print('Wtf is {} (called from {})?'.format(node, call),
file=sys.stderr)
print('}')
def dump_stats(graph, out_file):
entry_point = 'main'
depth_of = {}
def find_depth(node):
nonlocal depth_of
if node in depth_of:
return depth_of[node]
callees = graph[node]
if callees:
depth = max(map(find_depth, callees)) + 1
else:
depth = 1
depth_of[node] = depth
return depth
print("Call chain max depth: {}".format(find_depth(entry_point)),
file=out_file)
def get_prog_lines():
def do_read(handle):
return handle.readlines()
if len(sys.argv) > 1:
with open(sys.argv[1], 'r') as handle:
return do_read(handle)
else:
return do_read(sys.stdin)
def main():
prog = get_prog_lines()
graph = build_graph(prog)
dump_graph(graph)
dump_stats(graph, out_file=sys.stderr)
if __name__ == '__main__':
main()

View file

@ -1,3 +0,0 @@
gzip
gzip-1.10
perf.data

View file

@ -1,49 +0,0 @@
# gzip - evaluation
Artifacts saved in `evaluation_artifacts`.
## Performance
Using the command line
```bash
for i in $(seq 1 100); do
perf report 2>&1 >/dev/null | tail -n 1 \
| python ../hackbench/to_report_fmt.py \
| sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done
```
we save a sequence of 100 performance readings to some file.
Samples:
* `eh_elf`: 331134 unw/exec
* `vanilla`: 331144 unw/exec
Average time/unw:
* `eh_elf`: 83 ns
* `vanilla`: 1304 ns
Standard deviation:
* `eh_elf`: 2 ns
* `vanilla`: 24 ns
Average ratio: 15.7
Ratio uncertainty: 0.8
## Distibution of `unw_step` issues
### `eh_elf` case
* success: 331134 (99.9%)
* fallback to DWARF: 2 (0.0%)
* fallback to libunwind heuristics: 8 (0.0%)
* fail to unwind: 379 (0.1%)
* total: 331523
### `vanilla` case
* success: 331136 (99.9%)
* fallback to libunwind heuristics: 8 (0.0%)
* fail to unwind: 379 (0.1%)
* total: 331523

View file

@ -1,5 +0,0 @@
/eh_elfs*
/bench*
/perf.data*
/perfperf.data*
/hackbench

View file

@ -1,48 +0,0 @@
# Hackbench - evaluation
Artifacts saved in `evaluation_artifacts`.
## Performance
Using the command line
```bash
for i in $(seq 1 100); do
perf report 2>&1 >/dev/null | tail -n 1 \
| python to_report_fmt.py | sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done
```
we save a sequence of 100 performance readings to some file.
Samples:
* `eh_elf`: 135251 unw/exec
* `vanilla`: 138233 unw/exec
Average time/unw:
* `eh_elf`: 102 ns
* `vanilla`: 2443 ns
Standard deviation:
* `eh_elf`: 2 ns
* `vanilla`: 47 ns
Average ratio: 24
Ratio uncertainty: 1.0
## Distibution of `unw_step` issues
### `eh_elf` case
* success: 135251 (97.7%)
* fallback to DWARF: 1467 (1.0%)
* fallback to libunwind heuristics: 329 (0.2%)
* fail to unwind: 1410 (1.0%)
* total: 138457
### `vanilla` case
* success: 138201 (98.9%)
* fallback to libunwind heuristics: 32 (0.0%)
* fail to unwind: 1411 (1.0%)
* total: 139644

View file

@ -1,4 +0,0 @@
all: hackbench
hackbench: hackbench.c
gcc -Wall -Wextra -O2 -std=c11 -lpthread -o $@ $<

View file

@ -1,44 +0,0 @@
# Running the benchmarks
Pick some name for your `eh_elfs` directory. We will call it `$EH_ELF_DIR`.
## Generate the `eh_elfs`
```bash
../../generate_eh_elf.py --deps -o "$EH_ELF_DIR" \
--keep-holes -O2 --global-switch --enable-deref-arg hackbench
```
## Record a `perf` session
```bash
perf record --call-graph dwarf,4096 ./hackbench 10 process 100
```
You can arbitrarily increase the first number up to ~100 and the second to get
a longer session. This will most probably take all your computer's resources
while it is running.
## Set up the environment
```bash
source ../../env/apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
```
The first value selects the version of libunwind you will be running, the
second selects whether you want to run in debug or release mode (use release to
get readings, debug to check for errors).
You can reset your environment to its previous state by running `deactivate`.
If you pick the `eh_elf` flavour, you will also have to
```bash
export LD_LIBRARY_PATH="$EH_ELF_DIR:$LD_LIBRARY_PATH"
```
### Actually get readings
```bash
perf report 2>&1 >/dev/null
```

View file

@ -1,399 +0,0 @@
/*
* This is the latest version of hackbench.c, that tests scheduler and
* unix-socket (or pipe) performance.
*
* Usage: hackbench [-pipe] <num groups> [process|thread] [loops]
*
* Build it with:
* gcc -g -Wall -O2 -o hackbench hackbench.c -lpthread
*/
#if 0
Date: Fri, 04 Jan 2008 14:06:26 +0800
From: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
To: LKML <linux-kernel@vger.kernel.org>
Subject: Improve hackbench
Cc: Ingo Molnar <mingo@elte.hu>, Arjan van de Ven <arjan@infradead.org>
hackbench tests the Linux scheduler. The original program is at
http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c
Based on this multi-process version, a nice person created a multi-thread
version. Pls. see
http://www.bullopensource.org/posix/pi-futex/hackbench_pth.c
When I integrated them into my automation testing system, I found
a couple of issues and did some improvements.
1) Merge hackbench: I integrated hackbench_pth.c into hackbench and added a
new parameter which can be used to choose process mode or thread mode. The
default mode is process.
2) It runs too fast and ends in a couple of seconds. Sometimes it's too hard to debug
the issues. On my ia64 Montecito machines, the result looks weird when comparing
process mode and thread mode.
I want a stable result and hope the testing could run for a stable longer time, so I
might use performance tools to debug issues.
I added another new parameter,`loops`, which can be used to change variable loops,
so more messages will be passed from writers to receivers. Parameter 'loops' is equal to
100 by default.
For example on my 8-core x86_64:
[ymzhang@lkp-st01-x8664 hackbench]$ uname -a
Linux lkp-st01-x8664 2.6.24-rc6 #1 SMP Fri Dec 21 08:32:31 CST 2007 x86_64 x86_64 x86_64 GNU/Linux
[ymzhang@lkp-st01-x8664 hackbench]$ ./hackbench
Usage: hackbench [-pipe] <num groups> [process|thread] [loops]
[ymzhang@lkp-st01-x8664 hackbench]$ ./hackbench 150 process 1000
Time: 151.533
[ymzhang@lkp-st01-x8664 hackbench]$ ./hackbench 150 thread 1000
Time: 153.666
With the same new parameters, I did captured the SLUB issue discussed on LKML recently.
3) hackbench_pth.c will fail on ia64 machine because pthread_attr_setstacksize always
fails if the stack size is less than 196*1024. I moved this statement within a __ia64__ check.
This new program could be compiled with command line:
#gcc -g -Wall -o hackbench hackbench.c -lpthread
Thank Ingo for his great comments!
-yanmin
---
* Nathan Lynch <ntl@pobox.com> wrote:
> Here's a fixlet for the hackbench program found at
>
> http://people.redhat.com/mingo/cfs-scheduler/tools/hackbench.c
>
> When redirecting hackbench output I am seeing multiple copies of the
> "Running with %d*40 (== %d) tasks" line. Need to flush the buffered
> output before forking.
#endif
/* Test groups of 20 processes spraying to 20 receivers */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <sys/poll.h>
#include <limits.h>
#define DATASIZE 100
static unsigned int loops = 100;
/*
* 0 means thread mode and others mean process (default)
*/
static unsigned int process_mode = 1;
static int use_pipes = 0;
struct sender_context {
unsigned int num_fds;
int ready_out;
int wakefd;
int out_fds[0];
};
struct receiver_context {
unsigned int num_packets;
int in_fds[2];
int ready_out;
int wakefd;
};
static void barf(const char *msg)
{
fprintf(stderr, "%s (error: %s)\n", msg, strerror(errno));
exit(1);
}
static void print_usage_exit()
{
printf("Usage: hackbench [-pipe] <num groups> [process|thread] [loops]\n");
exit(1);
}
static void fdpair(int fds[2])
{
if (use_pipes) {
if (pipe(fds) == 0)
return;
} else {
if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == 0)
return;
}
barf("Creating fdpair");
}
/* Block until we're ready to go */
static void ready(int ready_out, int wakefd)
{
char dummy;
struct pollfd pollfd = { .fd = wakefd, .events = POLLIN };
/* Tell them we're ready. */
if (write(ready_out, &dummy, 1) != 1)
barf("CLIENT: ready write");
/* Wait for "GO" signal */
if (poll(&pollfd, 1, -1) != 1)
barf("poll");
}
/* Sender sprays loops messages down each file descriptor */
static void *sender(struct sender_context *ctx)
{
char data[DATASIZE];
unsigned int i, j;
ready(ctx->ready_out, ctx->wakefd);
/* Now pump to every receiver. */
for (i = 0; i < loops; i++) {
for (j = 0; j < ctx->num_fds; j++) {
int ret, done = 0;
again:
ret = write(ctx->out_fds[j], data + done, sizeof(data)-done);
if (ret < 0)
barf("SENDER: write");
done += ret;
if (done < sizeof(data))
goto again;
}
}
return NULL;
}
/* One receiver per fd */
static void *receiver(struct receiver_context* ctx)
{
unsigned int i;
if (process_mode)
close(ctx->in_fds[1]);
/* Wait for start... */
ready(ctx->ready_out, ctx->wakefd);
/* Receive them all */
for (i = 0; i < ctx->num_packets; i++) {
char data[DATASIZE];
int ret, done = 0;
again:
ret = read(ctx->in_fds[0], data + done, DATASIZE - done);
if (ret < 0)
barf("SERVER: read");
done += ret;
if (done < DATASIZE)
goto again;
}
return NULL;
}
pthread_t create_worker(void *ctx, void *(*func)(void *))
{
pthread_attr_t attr;
pthread_t childid;
int err;
if (process_mode) {
/* process mode */
/* Fork the receiver. */
switch (fork()) {
case -1: barf("fork()");
case 0:
(*func) (ctx);
exit(0);
}
return (pthread_t) 0;
}
if (pthread_attr_init(&attr) != 0)
barf("pthread_attr_init:");
#ifndef __ia64__
if (pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN) != 0)
barf("pthread_attr_setstacksize");
#endif
if ((err=pthread_create(&childid, &attr, func, ctx)) != 0) {
fprintf(stderr, "pthread_create failed: %s (%d)\n", strerror(err), err);
exit(-1);
}
return (childid);
}
void reap_worker(pthread_t id)
{
int status;
if (process_mode) {
/* process mode */
wait(&status);
if (!WIFEXITED(status))
exit(1);
} else {
void *status;
pthread_join(id, &status);
}
}
/* One group of senders and receivers */
static unsigned int group(pthread_t *pth,
unsigned int num_fds,
int ready_out,
int wakefd)
{
unsigned int i;
struct sender_context* snd_ctx = malloc (sizeof(struct sender_context)
+num_fds*sizeof(int));
for (i = 0; i < num_fds; i++) {
int fds[2];
struct receiver_context* ctx = malloc (sizeof(*ctx));
if (!ctx)
barf("malloc()");
/* Create the pipe between client and server */
fdpair(fds);
ctx->num_packets = num_fds*loops;
ctx->in_fds[0] = fds[0];
ctx->in_fds[1] = fds[1];
ctx->ready_out = ready_out;
ctx->wakefd = wakefd;
pth[i] = create_worker(ctx, (void *)(void *)receiver);
snd_ctx->out_fds[i] = fds[1];
if (process_mode)
close(fds[0]);
}
/* Now we have all the fds, fork the senders */
for (i = 0; i < num_fds; i++) {
snd_ctx->ready_out = ready_out;
snd_ctx->wakefd = wakefd;
snd_ctx->num_fds = num_fds;
pth[num_fds+i] = create_worker(snd_ctx, (void *)(void *)sender);
}
/* Close the fds we have left */
if (process_mode)
for (i = 0; i < num_fds; i++)
close(snd_ctx->out_fds[i]);
/* Return number of children to reap */
return num_fds * 2;
}
int main(int argc, char *argv[])
{
unsigned int i, num_groups = 10, total_children;
struct timeval start, stop, diff;
unsigned int num_fds = 20;
int readyfds[2], wakefds[2];
char dummy;
pthread_t *pth_tab;
if (argv[1] && strcmp(argv[1], "-pipe") == 0) {
use_pipes = 1;
argc--;
argv++;
}
if (argc >= 2 && (num_groups = atoi(argv[1])) == 0)
print_usage_exit();
printf("Running with %d*40 (== %d) tasks.\n",
num_groups, num_groups*40);
#ifdef MMAP
{
printf("Memory map:\n");
FILE* mmap = fopen("/proc/self/maps", "r");
char* line = (char*) malloc(512);
size_t line_size = 512;
while(getline(&line, &line_size, mmap) >= 0) {
printf(">> %s", line);
}
free(line);
puts("");
fclose(mmap);
}
#endif // MMAP
fflush(NULL);
if (argc > 2) {
if ( !strcmp(argv[2], "process") )
process_mode = 1;
else if ( !strcmp(argv[2], "thread") )
process_mode = 0;
else
print_usage_exit();
}
if (argc > 3)
loops = atoi(argv[3]);
pth_tab = malloc(num_fds * 2 * num_groups * sizeof(pthread_t));
if (!pth_tab)
barf("main:malloc()");
fdpair(readyfds);
fdpair(wakefds);
total_children = 0;
for (i = 0; i < num_groups; i++)
total_children += group(pth_tab+total_children, num_fds, readyfds[1], wakefds[0]);
/* Wait for everyone to be ready */
for (i = 0; i < total_children; i++)
if (read(readyfds[0], &dummy, 1) != 1)
barf("Reading for readyfds");
gettimeofday(&start, NULL);
/* Kick them off */
if (write(wakefds[1], &dummy, 1) != 1)
barf("Writing to start them");
/* Reap them all */
for (i = 0; i < total_children; i++)
reap_worker(pth_tab[i]);
gettimeofday(&stop, NULL);
/* Print time... */
timersub(&stop, &start, &diff);
printf("Time: %lu.%03lu\n", diff.tv_sec, diff.tv_usec/1000);
exit(0);
}

View file

@ -1,21 +0,0 @@
#!/usr/bin/env python3
import re
import sys
line = input()
regex = \
re.compile(r'Total unwind time: ([0-9]*) s ([0-9]*) ns, ([0-9]*) calls')
match = regex.match(line.strip())
if not match:
print('Badly formatted line', file=sys.stderr)
sys.exit(1)
sec = int(match.group(1))
ns = int(match.group(2))
calls = int(match.group(3))
time = sec * 10**9 + ns
print("{} & {} & {} & ??".format(calls, time, time // calls))

View file

@ -1,8 +0,0 @@
def slow_fibo(n):
if n <= 1:
return 1
return slow_fibo(n - 1) + slow_fibo(n - 2)
if __name__ == "__main__":
slow_fibo(35)

View file

@ -1,18 +0,0 @@
#!/bin/bash
if [ "$#" -lt 1 ] ; then
>&2 echo "Missing argument: directory"
exit 1
fi
BENCH_DIR="$(echo $1 | sed 's@/$@@g')"
ENV_APPLY="$(readlink -f "$(dirname $0)/../../env/apply")"
if ! [ -f "$ENV_APPLY" ] ; then
>&2 echo "Cannot find helper scripts. Abort."
exit 1
fi
function status_report {
echo -e "\e[33;1m[$BENCH_DIR]\e[0m $1"
}

View file

@ -1,101 +0,0 @@
#!/bin/bash
source "$(dirname $0)/common.sh"
TMP_FILE=$(mktemp)
if [ -z "$EH_ELFS_NAME" ]; then
EH_ELFS_NAME="eh_elfs"
fi
function get_perf_output {
envtype=$1
source $ENV_APPLY "$envtype" "dbg"
LD_LIBRARY_PATH="$BENCH_DIR/$EH_ELFS_NAME:$LD_LIBRARY_PATH" \
UNW_DEBUG_LEVEL=15 \
perf report -i "$BENCH_DIR/perf.data" 2>$TMP_FILE >/dev/null
deactivate
}
function count_successes {
cat $TMP_FILE | tail -n 1 | sed 's/^.*, \([0-9]*\) calls.*$/\1/g'
}
function count_total_calls {
cat $TMP_FILE | grep -c "^ >.*step:.* returning"
}
function count_errors {
cat $TMP_FILE | grep -c "^ >.*step:.* returning -"
}
function count_eh_fallbacks {
cat $TMP_FILE | grep -c "step:.* falling back"
}
function count_vanilla_fallbacks {
cat $TMP_FILE | grep -c "step:.* frame-chain"
}
function count_fallbacks_to_dwarf {
cat $TMP_FILE | grep -c "step:.* fallback with"
}
function count_fallbacks_failed {
cat $TMP_FILE | grep -c "step:.* dwarf_step also failed"
}
function count_fail_after_fallback_to_dwarf {
cat $TMP_FILE \
| "$(dirname $0)/line_patterns.py" \
"fallback with" \
"step:.* unw_step called" \
~"step:.* unw_step called" \
"step:.* returning -" \
| grep Complete -c
}
function report {
flavour="$1"
status_report "$flavour issues distribution"
successes=$(count_successes)
failures=$(count_errors)
total=$(count_total_calls)
if [ "$flavour" = "eh_elf" ]; then
fallbacks=$(count_eh_fallbacks)
fallbacks_to_dwarf=$(count_fallbacks_to_dwarf)
fallbacks_to_dwarf_failed_after=$(count_fail_after_fallback_to_dwarf)
fallbacks_failed=$(count_fallbacks_failed)
fallbacks_to_heuristics="$(( $fallbacks \
- $fallbacks_to_dwarf \
- $fallbacks_failed))"
echo -e "* success:\t\t\t\t$successes"
echo -e "* fallback to DWARF:\t\t\t$fallbacks_to_dwarf"
echo -e "* …of which failed at next step:\t$fallbacks_to_dwarf_failed_after"
echo -e "* fallback to libunwind heuristics:\t$fallbacks_to_heuristics"
computed_sum=$(( $successes + $fallbacks - $fallbacks_failed + $failures ))
else
fallbacks=$(count_vanilla_fallbacks)
successes=$(( $successes - $fallbacks ))
echo -e "* success:\t\t\t\t$successes"
echo -e "* fallback to libunwind heuristics:\t$fallbacks"
computed_sum=$(( $successes + $fallbacks + $failures ))
fi
echo -e "* fail to unwind:\t\t\t$failures"
echo -e "* total:\t\t\t\t$total"
if [ "$computed_sum" -ne "$total" ] ; then
echo "-- WARNING: missing cases (computed sum $computed_sum != $total) --"
fi
}
# eh_elf stats
get_perf_output "eh_elf"
report "eh_elf"
# Vanilla stats
get_perf_output "vanilla"
report "vanilla"
rm "$TMP_FILE"

View file

@ -1,86 +0,0 @@
#!/bin/bash
source "$(dirname $0)/common.sh"
TMP_FILE=$(mktemp)
function get_perf_output {
envtype=$1
source $ENV_APPLY "$envtype" "dbg"
LD_LIBRARY_PATH="$BENCH_DIR/eh_elfs:$LD_LIBRARY_PATH" \
UNW_DEBUG_LEVEL=15 \
perf report -i "$BENCH_DIR/perf.data" 2>$TMP_FILE >/dev/null
deactivate
}
function count_successes {
cat $TMP_FILE | tail -n 1 | sed 's/^.*, \([0-9]*\) calls.*$/\1/g'
}
function count_total_calls {
cat $TMP_FILE | grep -c "^ >.*step:.* returning"
}
function count_errors {
cat $TMP_FILE | grep -c "^ >.*step:.* returning -"
}
function count_eh_fallbacks {
cat $TMP_FILE | grep -c "step:.* falling back"
}
function count_vanilla_fallbacks {
cat $TMP_FILE | grep -c "step:.* frame-chain"
}
function count_fallbacks_to_dwarf {
cat $TMP_FILE | grep -c "step:.* fallback with"
}
function count_fallbacks_failed {
cat $TMP_FILE | grep -c "step:.* dwarf_step also failed"
}
function report {
flavour="$1"
status_report "$flavour issues distribution"
successes=$(count_successes)
failures=$(count_errors)
total=$(count_total_calls)
if [ "$flavour" = "eh_elf" ]; then
fallbacks=$(count_eh_fallbacks)
fallbacks_to_dwarf=$(count_fallbacks_to_dwarf)
fallbacks_failed=$(count_fallbacks_failed)
fallbacks_to_heuristics="$(( $fallbacks \
- $fallbacks_to_dwarf \
- $fallbacks_failed))"
echo -e "* success:\t\t\t\t$successes"
echo -e "* fallback to DWARF:\t\t\t$fallbacks_to_dwarf"
echo -e "* fallback to libunwind heuristics:\t$fallbacks_to_heuristics"
computed_sum=$(( $successes + $fallbacks - $fallbacks_failed + $failures ))
else
fallbacks=$(count_vanilla_fallbacks)
successes=$(( $successes - $fallbacks ))
echo -e "* success:\t\t\t\t$successes"
echo -e "* fallback to libunwind heuristics:\t$fallbacks"
computed_sum=$(( $successes + $fallbacks + $failures ))
fi
echo -e "* fail to unwind:\t\t\t$failures"
echo -e "* total:\t\t\t\t$total"
if [ "$computed_sum" -ne "$total" ] ; then
echo "-- WARNING: missing cases (computed sum $computed_sum != $total) --"
fi
}
# eh_elf stats
get_perf_output "eh_elf"
report "eh_elf"
# Vanilla stats
get_perf_output "vanilla"
report "vanilla"
rm "$TMP_FILE"

View file

@ -1,27 +0,0 @@
OUTPUT="$1"
NB_ITER=10
if [ "$#" -lt 1 ] ; then
>&2 echo "Missing argument: output directory."
exit 1
fi
if [ -z "$EH_ELFS" ]; then
>&2 echo "Missing environment: EH_ELFS. Aborting."
exit 1
fi
mkdir -p "$OUTPUT"
for flavour in 'eh_elf' 'vanilla' 'vanilla-nocache'; do
>&2 echo "$flavour..."
source "$(dirname "$0")/../../env/apply" "$flavour" release
for iter in $(seq 1 $NB_ITER); do
>&2 echo -e "\t$iter..."
LD_LIBRARY_PATH="$EH_ELFS:$LD_LIBRARY_PATH" \
perf report 2>&1 >/dev/null | tail -n 1 \
| python "$(dirname $0)/to_report_fmt.py" \
| sed 's/^.* & .* & \([0-9]*\) & .*$/\1/g'
done > "$OUTPUT/${flavour}_times"
deactivate
done

View file

@ -1,106 +0,0 @@
#!/usr/bin/env python3
""" Generates performance statistics for the eh_elf vs vanilla libunwind unwinding,
based on time series generated beforehand
Intended to be run from `statistics.sh`
"""
from collections import namedtuple
import numpy as np
import sys
import os
Datapoint = namedtuple("Datapoint", ["nb_frames", "total_time", "avg_time"])
def read_series(path):
with open(path, "r") as handle:
for line in handle:
nb_frames, total_time, avg_time = map(int, line.strip().split())
yield Datapoint(nb_frames, total_time, avg_time)
FLAVOURS = ["eh_elf", "vanilla"]
WITH_NOCACHE = False
if "WITH_NOCACHE" in os.environ:
WITH_NOCACHE = True
FLAVOURS.append("vanilla-nocache")
path_format = os.path.join(sys.argv[1], "{}_times")
datapoints = {}
avg_times = {}
total_times = {}
avgs_total = {}
avgs = {}
std_deviations = {}
unwound_frames = {}
for flv in FLAVOURS:
datapoints[flv] = list(read_series(path_format.format(flv)))
avg_times[flv] = list(map(lambda x: x.avg_time, datapoints[flv]))
total_times[flv] = list(map(lambda x: x.total_time, datapoints[flv]))
avgs[flv] = sum(avg_times[flv]) / len(avg_times[flv])
avgs_total[flv] = sum(total_times[flv]) / len(total_times[flv])
std_deviations[flv] = np.sqrt(np.var(avg_times[flv]))
cur_unwound_frames = list(map(lambda x: x.nb_frames, datapoints[flv]))
unwound_frames[flv] = cur_unwound_frames[0]
for run_id, unw_frames in enumerate(cur_unwound_frames[1:]):
if unw_frames != unwound_frames[flv]:
print(
"{}, run {}: unwound {} frames, reference unwound {}".format(
flv, run_id + 1, unw_frames, unwound_frames[flv]
),
file=sys.stderr,
)
avg_ratio = avgs["vanilla"] / avgs["eh_elf"]
ratio_uncertainty = (
1
/ avgs["eh_elf"]
* (
std_deviations["vanilla"]
+ avgs["vanilla"] / avgs["eh_elf"] * std_deviations["eh_elf"]
)
)
def format_flv(flv_dict, formatter, alterator=None):
out = ""
for flv in FLAVOURS:
val = flv_dict[flv]
altered = alterator(val) if alterator else val
out += "* {}: {}\n".format(flv, formatter.format(altered))
return out
def get_ratios(avgs):
def avg_of(flavour):
return avgs[flavour] / avgs["eh_elf"]
if WITH_NOCACHE:
return "\n\tcached: {}\n\tuncached: {}".format(
avg_of("vanilla"), avg_of("vanilla-nocache")
)
else:
return avg_of("vanilla")
print(
"Unwound frames:\n{}\n"
"Average whole unwinding time (one run):\n{}\n"
"Average time to unwind one frame:\n{}\n"
"Standard deviation:\n{}\n"
"Average ratio: {}\n"
"Ratio uncertainty: {}".format(
format_flv(unwound_frames, "{}"),
format_flv(avgs_total, "{} μs", alterator=lambda x: x // 1000),
format_flv(avgs, "{} ns"),
format_flv(std_deviations, "{}"),
get_ratios(avgs),
ratio_uncertainty,
)
)

View file

@ -1,69 +0,0 @@
#!/usr/bin/env python3
import sys
import re
class Match:
def __init__(self, re_str, negate=False):
self.re = re.compile(re_str)
self.negate = negate
def matches(self, line):
return self.re.search(line) is not None
class Matcher:
def __init__(self, match_objs):
self.match_objs = match_objs
self.match_pos = 0
self.matches = 0
if not self.match_objs:
raise Exception("No match expressions provided")
if self.match_objs[-1].negate:
raise Exception("The last match object must be a positive expression")
def feed(self, line):
for cur_pos, exp in enumerate(self.match_objs[self.match_pos :]):
cur_pos = cur_pos + self.match_pos
if not exp.negate: # Stops the for here, whether matching or not
if exp.matches(line):
self.match_pos = cur_pos + 1
print(
"Passing positive {}, advance to {}".format(
cur_pos, self.match_pos
)
)
if self.match_pos >= len(self.match_objs):
print("> Complete match, reset.")
self.matches += 1
self.match_pos = 0
return
else:
if exp.matches(line):
print("Failing negative [{}] {}, reset".format(exp.negate, cur_pos))
old_match_pos = self.match_pos
self.match_pos = 0
if old_match_pos != 0:
print("> Refeed: ", end="")
self.feed(line)
return
def get_args(args):
out_args = []
for arg in args:
negate = False
if arg[0] == "~":
negate = True
arg = arg[1:]
out_args.append(Match(arg, negate))
return out_args
if __name__ == "__main__":
matcher = Matcher(get_args(sys.argv[1:]))
for line in sys.stdin:
matcher.feed(line)
print(matcher.matches)

View file

@ -1,43 +0,0 @@
#!/bin/bash
source "$(dirname $0)/common.sh"
TEMP_DIR="$(mktemp -d)"
NB_RUNS=10
function collect_perf_time_data {
envtype=$1
source $ENV_APPLY "$envtype" "release"
LD_LIBRARY_PATH="$BENCH_DIR/eh_elfs:$LD_LIBRARY_PATH" \
perf report -i "$BENCH_DIR/perf.data" 2>&1 >/dev/null \
| tail -n 1 \
| python "$(dirname $0)/to_report_fmt.py" \
| sed 's/^\([0-9]*\) & \([0-9]*\) & \([0-9]*\) & .*$/\1 \2 \3/g'
deactivate
}
function collect_perf_time_data_runs {
envtype=$1
outfile=$2
status_report "Collecting $envtype data over $NB_RUNS runs"
rm -f "$outfile"
for run in $(seq 1 $NB_RUNS); do
collect_perf_time_data "$envtype" >> "$outfile"
done
}
eh_elf_data="$TEMP_DIR/eh_elf_times"
vanilla_data="$TEMP_DIR/vanilla_times"
collect_perf_time_data_runs "eh_elf" "$eh_elf_data"
collect_perf_time_data_runs "vanilla" "$vanilla_data"
if [ -n "$WITH_NOCACHE" ]; then
vanilla_nocache_data="$TEMP_DIR/vanilla-nocache_times"
collect_perf_time_data_runs "vanilla-nocache" "$vanilla_nocache_data"
fi
status_report "benchmark statistics"
python "$(dirname "$0")/gen_perf_stats.py" "$TEMP_DIR"
rm -rf "$TEMP_DIR"

View file

@ -1,21 +0,0 @@
#!/usr/bin/env python3
import re
import sys
line = input()
regex = \
re.compile(r'Total unwind time: ([0-9]*) s ([0-9]*) ns, ([0-9]*) calls')
match = regex.match(line.strip())
if not match:
print('Badly formatted line', file=sys.stderr)
sys.exit(1)
sec = int(match.group(1))
ns = int(match.group(2))
calls = int(match.group(3))
time = sec * 10**9 + ns
print("{} & {} & {} & ??".format(calls, time, time // calls))

View file

@ -9,7 +9,7 @@ import os
import subprocess
from collections import namedtuple
from shared_python import elf_so_deps, readlink_rec, DEFAULT_AUX_DIRS
from shared_python import elf_so_deps
''' An ELF object, including the path to the ELF itself, and the path to its
@ -83,11 +83,6 @@ def objects_list(args):
out = []
eh_elfs_dirs = (
args.eh_elfs
+ ([] if args.no_dft_aux else DEFAULT_AUX_DIRS)
)
if args.deps:
objects = set(args.object)
for obj in args.object:
@ -97,10 +92,8 @@ def objects_list(args):
else:
objects = args.object
objects = list(map(readlink_rec, objects))
for obj in objects:
out.append(ElfObject(obj, matching_eh_elf(eh_elfs_dirs, obj)))
out.append(ElfObject(obj, matching_eh_elf(args.eh_elfs, obj)))
return out
@ -120,33 +113,20 @@ def process_args():
parser.add_argument('--eh-elfs', required=True, action='append',
help=("Indicate the directory in which eh_elfs are "
"located"))
parser.add_argument('-A', '--no-dft-aux', action='store_true',
help=("Do not use the default eh_elf locations"))
parser.add_argument('object', nargs='+',
help="The ELF object(s) to process")
return parser.parse_args()
def get_or_default(obj, field, default=None):
''' Access a field of a subscriptable, returning a default if there is no
such field '''
if field not in obj:
return default
return obj[field]
def main():
args = process_args()
objs = objects_list(args)
col_names = [
'Shared object',
'Orig prog size',
'Orig eh_frame',
'Gen eh_elf .text',
'+ .rodata',
'% of prog size',
'Growth',
]
@ -163,27 +143,20 @@ def main():
'{:<' + col_len[1] + '} '
'{:<' + col_len[2] + '} '
'{:<' + col_len[3] + '} '
'{:<' + col_len[4] + '} '
'{:<' + col_len[5] + '} '
'{:<' + col_len[6] + '}')
'{:<' + col_len[4] + '}')
row_format = ('{:>' + col_len[0] + '} '
'{:>' + col_len[1] + '} '
'{:>' + col_len[2] + '} '
'{:>' + col_len[3] + '} '
'{:>' + col_len[4] + '} '
'{:>' + col_len[5] + '} '
'{:>' + col_len[6] + '}')
'{:>' + col_len[4] + '}')
print(header_format.format(
col_names[0],
col_names[1],
col_names[2],
col_names[3],
col_names[4],
col_names[5],
col_names[6],
))
total_program_size = 0
total_eh_frame_size = 0
total_eh_elf_text_size = 0
total_eh_elf_size = 0
@ -192,32 +165,19 @@ def main():
elf_sections = get_elf_sections(obj.elf)
eh_elf_sections = get_elf_sections(obj.eh_elf)
text_size = get_or_default(
elf_sections, '.text', {'size': 0})['size']
rodata_size = get_or_default(
elf_sections, '.rodata', {'size': 0})['size']
eh_frame_size = get_or_default(
elf_sections, '.eh_frame', {'size': 0})['size']
eh_elf_text_size = get_or_default(
eh_elf_sections, '.text', {'size': 0})['size']
eh_elf_size = eh_elf_text_size + \
get_or_default(
eh_elf_sections, '.rodata', {'size': 0})['size']
eh_frame_size = elf_sections['.eh_frame']['size']
eh_elf_text_size = eh_elf_sections['.text']['size']
eh_elf_size = eh_elf_text_size + eh_elf_sections['.rodata']['size']
program_size = text_size + rodata_size
total_program_size += program_size
total_eh_frame_size += eh_frame_size
total_eh_elf_text_size += eh_elf_text_size
total_eh_elf_size += eh_elf_size
print(row_format.format(
displayed_name_filter(obj),
format_size(program_size),
format_size(eh_frame_size),
format_size(eh_elf_text_size),
format_size(eh_elf_size),
'{:.2f}'.format(eh_elf_size / program_size * 100),
'{:.2f}'.format(eh_elf_size / eh_frame_size)))
# Checking for missed big sections
@ -230,11 +190,9 @@ def main():
print(row_format.format(
'Total',
format_size(total_program_size),
format_size(total_eh_frame_size),
format_size(total_eh_elf_size),
format_size(total_eh_elf_text_size),
'{:.2f}'.format(total_eh_elf_size / total_program_size * 100),
'{:.2f}'.format(total_eh_elf_size / total_eh_frame_size)))

118
env/apply vendored
View file

@ -1,118 +0,0 @@
#!/bin/bash
## Source this file.
## Usage: apply [vanilla | vanilla-nocache | *eh_elf] [dbg | *release]
# ==== INPUT ACQUISITION ====
flavour="eh_elf"
dbg="release"
while [ "$#" -gt 0 ] ; do
case "$1" in
"vanilla" | "vanilla-nocache" | "eh_elf")
flavour="$1"
;;
"dbg" | "release")
dbg="$1"
;;
*)
>&2 echo "Unknown argument: $1"
exit 1
;;
esac
shift
done
# ==== UNSET PREVIOUS ENVIRONMENT ====
type -t deactivate
[ -n "$(type -t deactivate)" ] && deactivate
# ==== DEFINE DEACTIVATE ====
function deactivate {
export CPATH="$CPATH_EHELFSAVE"
export LIBRARY_PATH="$LIBRARY_PATH_EHELFSAVE"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH_EHELFSAVE"
export PS1="$PS1_EHELFSAVE"
export PATH="$PATH_EHELFSAVE"
unset CPATH_EHELFSAVE
unset LIBRARY_PATH_EHELFSAVE
unset LD_LIBRARY_PATH_EHELFSAVE
unset PS1_EHELFSAVE
unset PATH_EHELFSAVE
unset deactivate
}
# ==== PREFIX ====
export PERF_PREFIX="$HOME/local/perf-$flavour"
LIBUNWIND_PREFIX="$HOME/local/libunwind"
case "$flavour" in
"vanilla" | "vanilla-nocache" )
LIBUNWIND_PREFIX="${LIBUNWIND_PREFIX}-vanilla"
;;
"eh_elf" )
LIBUNWIND_PREFIX="${LIBUNWIND_PREFIX}-eh_elf"
;;
* )
>&2 echo "$flavour: unknown flavour"
exit 1
;;
esac
case "$dbg" in
"dbg" )
LIBUNWIND_PREFIX="${LIBUNWIND_PREFIX}-dbg"
;;
"release" )
LIBUNWIND_PREFIX="${LIBUNWIND_PREFIX}-release"
;;
* )
>&2 echo "$dbg: unknown debug mode (release | dbg)"
exit 1
;;
esac
export LIBUNWIND_PREFIX
# ==== EXPORTING ENV VARS ====
function colon_prepend {
if [ -z "$2" ]; then
echo "$1"
elif [ -z "$1" ] ; then
echo "$2"
else
echo "$1:$2"
fi
}
function ifpath {
if [ -e "$1" ] ; then
echo "$1"
fi
}
export CPATH_EHELFSAVE="$CPATH"
export LIBRARY_PATH_EHELFSAVE="$LIBRARY_PATH"
export LD_LIBRARY_PATH_EHELFSAVE="$LD_LIBRARY_PATH"
export PATH_EHELFSAVE="$PATH"
export PS1_EHELFSAVE="$PS1"
export CPATH="$(colon_prepend \
"$LIBUNWIND_PREFIX/include/:$PERF_PREFIX/include" "$CPATH")"
export LIBRARY_PATH="$(colon_prepend \
"$LIBUNWIND_PREFIX/lib/:$PERF_PREFIX/lib" "$LIBRARY_PATH")"
export LD_LIBRARY_PATH="$(colon_prepend \
"$LIBUNWIND_PREFIX/lib/:$PERF_PREFIX/lib" "$LD_LIBRARY_PATH")"
export PATH="$(colon_prepend \
"$(colon_prepend \
"$(ifpath "$LIBUNWIND_PREFIX/bin")" \
"$(ifpath "$PERF_PREFIX/bin")")" \
"$PATH")"
export PS1="($flavour $dbg) $PS1"
unset ifpath
unset colon_prepend

View file

@ -11,427 +11,195 @@ import sys
import subprocess
import tempfile
import argparse
from enum import Enum
from shared_python import (
elf_so_deps,
do_remote,
is_newer,
to_eh_elf_path,
find_eh_elf_dir,
DEFAULT_AUX_DIRS,
)
from shared_python import elf_so_deps, do_remote, is_newer
from extract_pc import generate_pc_list
DWARF_ASSEMBLY_BIN = os.path.join(
os.path.dirname(os.path.abspath(sys.argv[0])), "dwarf-assembly"
)
C_BIN = "gcc" if "C" not in os.environ else os.environ["C"]
os.path.dirname(os.path.abspath(sys.argv[0])),
'dwarf-assembly')
C_BIN = (
'gcc' if 'C' not in os.environ
else os.environ['C'])
class SwitchGenPolicy(Enum):
""" The various switch generation policies possible """
SWITCH_PER_FUNC = "--switch-per-func"
GLOBAL_SWITCH = "--global-switch"
class Config:
""" Holds the run's settings """
default_aux = DEFAULT_AUX_DIRS
def __init__(
self,
output,
aux,
no_dft_aux,
objects,
sw_gen_policy=SwitchGenPolicy.GLOBAL_SWITCH,
force=False,
use_pc_list=False,
c_opt_level="3",
enable_deref_arg=False,
keep_holes=False,
cc_debug=False,
remote=None,
):
self.output = "." if output is None else output
self.aux = aux + ([] if no_dft_aux else self.default_aux)
self.objects = objects
self.sw_gen_policy = sw_gen_policy
self.force = force
self.use_pc_list = use_pc_list
self.c_opt_level = c_opt_level
self.enable_deref_arg = enable_deref_arg
self.keep_holes = keep_holes
self.cc_debug = cc_debug
self.remote = remote
@staticmethod
def default_aux_str():
return ", ".join(Config.default_aux)
def dwarf_assembly_args(self):
""" Arguments to `dwarf_assembly` """
out = []
out.append(self.sw_gen_policy.value)
if self.enable_deref_arg:
out.append("--enable-deref-arg")
if self.keep_holes:
out.append("--keep-holes")
return out
def cc_opts(self):
""" Options to pass to the C compiler """
out = ["-fPIC"]
if self.cc_debug:
out.append("-g")
out.append(self.opt_level())
return out
def opt_level(self):
""" The optimization level to pass to gcc """
return "-O{}".format(self.c_opt_level)
def aux_dirs(self):
""" Get the list of auxiliary directories """
return self.aux
def gen_dw_asm_c(obj_path, out_path, config, pc_list_path=None):
""" Generate the C code produced by dwarf-assembly from `obj_path`, saving
it as `out_path` """
dw_assembly_args = config.dwarf_assembly_args()
if pc_list_path is not None:
dw_assembly_args += ["--pc-list", pc_list_path]
def gen_dw_asm_c(obj_path, out_path, dwarf_assembly_args):
''' Generate the C code produced by dwarf-assembly from `obj_path`, saving
it as `out_path` '''
try:
with open(out_path, "w") as out_handle:
with open(out_path, 'w') as out_handle:
# TODO enhance error handling
command_args = [DWARF_ASSEMBLY_BIN, obj_path] + dw_assembly_args
dw_asm_output = subprocess.check_output(command_args).decode("utf-8")
dw_asm_output = subprocess.check_output(
[DWARF_ASSEMBLY_BIN, obj_path] + dwarf_assembly_args) \
.decode('utf-8')
out_handle.write(dw_asm_output)
except subprocess.CalledProcessError as exn:
raise Exception(
(
"Cannot generate C code from object file {} using {}: process "
"terminated with exit code {}."
).format(obj_path, DWARF_ASSEMBLY_BIN, exn.returncode)
)
("Cannot generate C code from object file {} using {}: process "
"terminated with exit code {}.").format(
obj_path,
DWARF_ASSEMBLY_BIN,
exn.returncode))
def resolve_symlink_chain(objpath):
""" Resolves a symlink chain. This returns a pair `(new_obj, chain)`,
`new_obj` being the canonical path for `objpath`, and `chain` being a list
representing the path followed, eg. `[(objpath, a), (a, b), (b, new_obj)]`.
The goal of this function is to allow reproducing symlink architectures at
the eh_elf level. """
chain = []
out_path = objpath
while os.path.islink(out_path):
new_path = os.readlink(out_path)
if not os.path.isabs(new_path):
new_path = os.path.join(os.path.dirname(out_path), new_path)
chain.append((out_path, new_path))
out_path = new_path
return (out_path, chain)
def find_out_dir(obj_path, config):
""" Find the directory in which the eh_elf corresponding to `obj_path` will
be outputted, among the output directory and the aux directories """
return find_eh_elf_dir(obj_path, config.aux_dirs(), config.output)
def gen_eh_elf(obj_path, config):
""" Generate the eh_elf corresponding to `obj_path`, saving it as
def gen_eh_elf(obj_path, args, dwarf_assembly_args=None):
''' Generate the eh_elf corresponding to `obj_path`, saving it as
`out_dir/$(basename obj_path).eh_elf.so` (or in the current working
directory if out_dir is None) """
directory if out_dir is None) '''
out_dir = find_out_dir(obj_path, config)
obj_path, link_chain = resolve_symlink_chain(obj_path)
if args.output is None:
out_dir = '.'
else:
out_dir = args.output
if dwarf_assembly_args is None:
dwarf_assembly_args = []
print("> {}...".format(os.path.basename(obj_path)))
link_chain = map(
lambda elt: (
to_eh_elf_path(elt[0], out_dir),
os.path.basename(to_eh_elf_path(elt[1], out_dir)),
),
link_chain,
)
out_base_name = os.path.basename(obj_path) + '.eh_elf'
out_so_path = os.path.join(out_dir, (out_base_name + '.so'))
pc_list_dir = os.path.join(out_dir, 'pc_list')
out_base_name = to_eh_elf_path(obj_path, out_dir, base=True)
out_so_path = to_eh_elf_path(obj_path, out_dir, base=False)
pc_list_dir = os.path.join(out_dir, "pc_list")
if is_newer(out_so_path, obj_path) and not config.force:
if is_newer(out_so_path, obj_path) and not args.force:
return # The object is recent enough, no need to recreate it
if os.path.exists(out_dir) and not os.path.isdir(out_dir):
raise Exception("The output path {} is not a directory.".format(out_dir))
if not os.path.exists(out_dir):
os.makedirs(out_dir, exist_ok=True)
with tempfile.TemporaryDirectory() as compile_dir:
# Generate PC list
pc_list_path = None
if config.use_pc_list:
pc_list_path = os.path.join(pc_list_dir, out_base_name + ".pc_list")
if args.use_pc_list:
pc_list_path = \
os.path.join(pc_list_dir, out_base_name + '.pc_list')
os.makedirs(pc_list_dir, exist_ok=True)
print("\tGenerating PC list…")
print('\tGenerating PC list…')
generate_pc_list(obj_path, pc_list_path)
dwarf_assembly_args += ['--pc-list', pc_list_path]
# Generate the C source file
print("\tGenerating C…")
c_path = os.path.join(compile_dir, (out_base_name + ".c"))
gen_dw_asm_c(obj_path, c_path, config, pc_list_path)
c_path = os.path.join(compile_dir, (out_base_name + '.c'))
gen_dw_asm_c(obj_path, c_path, dwarf_assembly_args)
# Compile it into a .o
print("\tCompiling into .o…")
o_path = os.path.join(compile_dir, (out_base_name + ".o"))
if config.remote:
o_path = os.path.join(compile_dir, (out_base_name + '.o'))
opt_level = args.c_opt_level
if opt_level is None:
opt_level = '-O3'
if args.remote:
remote_out = do_remote(
config.remote,
[C_BIN, "-o", out_base_name + ".o", "-c", out_base_name + ".c"]
+ config.cc_opts(),
args.remote,
[C_BIN,
'-o', out_base_name + '.o',
'-c', out_base_name + '.c',
opt_level, '-fPIC'],
send_files=[c_path],
retr_files=[(out_base_name + ".o", o_path)],
)
retr_files=[(out_base_name + '.o', o_path)])
call_rc = 1 if remote_out is None else 0
else:
call_rc = subprocess.call(
[C_BIN, "-o", o_path, "-c", c_path, config.opt_level(), "-fPIC"]
)
[C_BIN, '-o', o_path, '-c', c_path, opt_level, '-fPIC'])
if call_rc != 0:
raise Exception("Failed to compile to a .o file")
# Compile it into a .so
print("\tCompiling into .so…")
call_rc = subprocess.call([C_BIN, "-o", out_so_path, "-shared", o_path])
call_rc = subprocess.call(
[C_BIN, '-o', out_so_path, '-shared', o_path])
if call_rc != 0:
raise Exception("Failed to compile to a .so file")
# Re-create symlinks
for elt in link_chain:
if os.path.exists(elt[0]):
if not os.path.islink(elt[0]):
raise Exception(
"{}: file already exists and is not a symlink.".format(elt[0])
)
os.remove(elt[0])
os.symlink(elt[1], elt[0])
def gen_all_eh_elf(obj_path, args, dwarf_assembly_args=None):
''' Call `gen_eh_elf` on obj_path and all its dependencies '''
if dwarf_assembly_args is None:
dwarf_assembly_args = []
def gen_all_eh_elf(obj_path, config):
""" Call `gen_eh_elf` on obj_path and all its dependencies """
deps = elf_so_deps(obj_path)
deps.append(obj_path)
for dep in deps:
gen_eh_elf(dep, config)
def gen_eh_elfs(obj_path, out_dir, global_switch=True, deps=True, remote=None):
""" Call gen{_all,}_eh_elf with args setup accordingly with the given
options """
switch_gen_policy = (
SwitchGenPolicy.GLOBAL_SWITCH
if global_switch
else SwitchGenPolicy.SWITCH_PER_FUNC
)
config = Config(
out_dir, [], False, [obj_path], sw_gen_policy=switch_gen_policy, remote=remote
)
if deps:
return gen_all_eh_elf([obj_path], config)
return gen_eh_elf([obj_path], config)
gen_eh_elf(dep, args, dwarf_assembly_args)
def process_args():
""" Process `sys.argv` arguments """
''' Process `sys.argv` arguments '''
parser = argparse.ArgumentParser(
description="Compile ELFs into their related eh_elfs"
description="Compile ELFs into their related eh_elfs",
)
parser.add_argument(
"--deps",
action="store_const",
const=gen_all_eh_elf,
default=gen_eh_elf,
dest="gen_func",
help=("Also generate eh_elfs for the shared objects " "this object depends on"),
)
parser.add_argument(
"-o",
"--output",
metavar="path",
help=(
"Save the generated objects at the given path "
"instead of the current working directory"
),
)
parser.add_argument(
"-a",
"--aux",
action="append",
default=[],
help=(
"Alternative output directories. These "
"directories are searched for existing matching "
"eh_elfs, and if found, these files are updated "
"instead of creating new files in the --output "
"directory. By default, some aux directories "
"are always considered, unless -A is passed: "
"{}."
).format(Config.default_aux_str()),
)
parser.add_argument(
"-A",
"--no-dft-aux",
action="store_true",
help=("Do not use the default auxiliary output " "directories: {}.").format(
Config.default_aux_str()
),
)
parser.add_argument(
"--remote",
metavar="ssh_args",
help=(
"Execute the heavyweight commands on the remote "
"machine, using `ssh ssh_args`."
),
)
parser.add_argument(
"--use-pc-list",
action="store_true",
help=(
"Generate a PC list using `extract_pc.py` for "
parser.add_argument('--deps', action='store_const',
const=gen_all_eh_elf, default=gen_eh_elf,
dest='gen_func',
help=("Also generate eh_elfs for the shared objects "
"this object depends on"))
parser.add_argument('-o', '--output', metavar="path",
help=("Save the generated objects at the given path "
"instead of the current working directory"))
parser.add_argument('--remote', metavar='ssh_args',
help=("Execute the heavyweight commands on the remote "
"machine, using `ssh ssh_args`."))
parser.add_argument('--use-pc-list', action='store_true',
help=("Generate a PC list using `extract_pc.py` for "
"each processed ELF file, and call "
"dwarf-assembly accordingly."
),
)
parser.add_argument(
"--force",
"-f",
action="store_true",
help=(
"Force re-generation of the output files, even "
"dwarf-assembly accordingly."))
parser.add_argument('--force', '-f', action='store_true',
help=("Force re-generation of the output files, even "
"when those files are newer than the target "
"ELF."
),
)
parser.add_argument(
"--enable-deref-arg",
action="store_true",
help=(
"Pass the `--enable-deref-arg` to "
"dwarf-assembly, enabling an extra `deref` "
"argument for each lookup function, allowing "
"to work on remote address spaces."
),
)
parser.add_argument(
"--keep-holes",
action="store_true",
help=(
"Keep holes between FDEs instead of filling "
"them with junk. More accurate, less compact."
),
)
parser.add_argument(
"-g",
"--cc-debug",
action="store_true",
help=("Compile the source file with -g for easy " "debugging"),
)
"ELF."))
# c_opt_level
opt_level_grp = parser.add_mutually_exclusive_group()
opt_level_grp.add_argument(
"-O0",
action="store_const",
const="0",
dest="c_opt_level",
help=("Compile C file with this optimization " "level."),
)
opt_level_grp.add_argument(
"-O1",
action="store_const",
const="1",
dest="c_opt_level",
help=("Compile C file with this optimization " "level."),
)
opt_level_grp.add_argument(
"-O2",
action="store_const",
const="2",
dest="c_opt_level",
help=("Compile C file with this optimization " "level."),
)
opt_level_grp.add_argument(
"-O3",
action="store_const",
const="3",
dest="c_opt_level",
help=("Compile C file with this optimization " "level."),
)
opt_level_grp.add_argument(
"-Os",
action="store_const",
const="s",
dest="c_opt_level",
help=("Compile C file with this optimization " "level."),
)
opt_level_grp.set_defaults(c_opt_level="3")
opt_level_grp.add_argument('-O0', action='store_const', const='-O0',
dest='c_opt_level',
help=("Compile C file with this optimization "
"level."))
opt_level_grp.add_argument('-O1', action='store_const', const='-O1',
dest='c_opt_level',
help=("Compile C file with this optimization "
"level."))
opt_level_grp.add_argument('-O2', action='store_const', const='-O2',
dest='c_opt_level',
help=("Compile C file with this optimization "
"level."))
opt_level_grp.add_argument('-O3', action='store_const', const='-O3',
dest='c_opt_level',
help=("Compile C file with this optimization "
"level."))
opt_level_grp.add_argument('-Os', action='store_const', const='-Os',
dest='c_opt_level',
help=("Compile C file with this optimization "
"level."))
switch_gen_policy = parser.add_mutually_exclusive_group(required=True)
switch_gen_policy.add_argument(
"--switch-per-func",
dest="sw_gen_policy",
action="store_const",
const=SwitchGenPolicy.SWITCH_PER_FUNC,
help=("Passed to dwarf-assembly."),
)
switch_gen_policy.add_argument(
"--global-switch",
dest="sw_gen_policy",
action="store_const",
const=SwitchGenPolicy.GLOBAL_SWITCH,
help=("Passed to dwarf-assembly."),
)
parser.add_argument("object", nargs="+", help="The ELF object(s) to process")
switch_generation_policy = \
parser.add_mutually_exclusive_group(required=True)
switch_generation_policy.add_argument('--switch-per-func',
action='store_const', const='',
help=("Passed to dwarf-assembly."))
switch_generation_policy.add_argument('--global-switch',
action='store_const', const='',
help=("Passed to dwarf-assembly."))
parser.add_argument('object', nargs='+',
help="The ELF object(s) to process")
return parser.parse_args()
def main():
args = process_args()
config = Config(
output=args.output,
aux=args.aux,
no_dft_aux=args.no_dft_aux,
objects=args.object,
sw_gen_policy=args.sw_gen_policy,
force=args.force,
use_pc_list=args.use_pc_list,
c_opt_level=args.c_opt_level,
enable_deref_arg=args.enable_deref_arg,
keep_holes=args.keep_holes,
cc_debug=args.cc_debug,
remote=args.remote,
)
DW_ASSEMBLY_OPTS = {
'switch_per_func': '--switch-per-func',
'global_switch': '--global-switch',
}
dwarf_assembly_opts = []
args_dict = vars(args)
for opt in DW_ASSEMBLY_OPTS:
if opt in args and args_dict[opt] is not None:
dwarf_assembly_opts.append(DW_ASSEMBLY_OPTS[opt])
for obj in args.object:
args.gen_func(obj, config)
args.gen_func(obj, args, dwarf_assembly_opts)
if __name__ == "__main__":

29
libelfunwind/README.md Normal file
View file

@ -0,0 +1,29 @@
# libelfunwind
Reproduces the interface of libunwind, and exports a `libunwind.so` binary, so
that changing `LD_LIBRARY_PATH` makes it possible to use `dwarf-assembly`
instead of `dwarf` as an unwinding process in eg. `perf`.
## `perf` currently used libunwind features
```
# Types
unw_accessors_t
unw_word_t
unw_regnum_t
unw_proc_info_t
unw_fpreg_t
unw_dyn_info_t
unw_addr_space_t
unw_cursor_t
# Functions
unw_create_addr_space
unw_destroy_addr_space
unw_flush_cache
unw_get_reg
unw_init_remote
unw_is_signal_frame
unw_set_caching_policy
unw_step
```

40
libelfunwind/libunwind.h Normal file
View file

@ -0,0 +1,40 @@
/**
* This is a custom version of libunwind making use of dwarf-assembly and its
* eh_elfs as an unwinding procedure.
**/
#include <stdint.h>
struct unw_context_t {};
struct unw_cursor_t {};
typedef int unw_regnum_t;
typedef uint64_t unw_word_t;
int unw_getcontext(unw_context_t *);
int unw_init_local(unw_cursor_t *, unw_context_t *);
// int unw_init_remote(unw_cursor_t *, unw_addr_space_t, void *);
int unw_step(unw_cursor_t *);
int unw_get_reg(unw_cursor_t *, unw_regnum_t, unw_word_t *);
int unw_get_fpreg(unw_cursor_t *, unw_regnum_t, unw_fpreg_t *);
int unw_set_reg(unw_cursor_t *, unw_regnum_t, unw_word_t);
int unw_set_fpreg(unw_cursor_t *, unw_regnum_t, unw_fpreg_t);
int unw_resume(unw_cursor_t *);
unw_addr_space_t unw_local_addr_space;
unw_addr_space_t unw_create_addr_space(unw_accessors_t, int);
void unw_destroy_addr_space(unw_addr_space_t);
unw_accessors_t unw_get_accessors(unw_addr_space_t);
void unw_flush_cache(unw_addr_space_t, unw_word_t, unw_word_t);
int unw_set_caching_policy(unw_addr_space_t, unw_caching_policy_t);
const char *unw_regname(unw_regnum_t);
int unw_get_proc_info(unw_cursor_t *, unw_proc_info_t *);
int unw_get_save_loc(unw_cursor_t *, int, unw_save_loc_t *);
int unw_is_fpreg(unw_regnum_t);
int unw_is_signal_frame(unw_cursor_t *);
int unw_get_proc_name(unw_cursor_t *, char *, size_t, unw_word_t *);
void _U_dyn_register(unw_dyn_info_t *);
void _U_dyn_cancel(unw_dyn_info_t *);

View file

@ -1,23 +1,7 @@
#include <stdint.h>
typedef enum {
UNWF_RIP=0,
UNWF_RSP=1,
UNWF_RBP=2,
UNWF_RBX=3,
UNWF_ERROR=7
} unwind_flags_t;
typedef struct {
uint8_t flags;
uintptr_t rip, rsp, rbp, rbx;
uintptr_t rip, rsp, rbp;
} unwind_context_t;
typedef uintptr_t (*deref_func_t)(uintptr_t);
typedef unwind_context_t (*_fde_func_t)(unwind_context_t, uintptr_t);
typedef unwind_context_t (*_fde_func_with_deref_t)(
unwind_context_t,
uintptr_t,
deref_func_t);

View file

@ -6,42 +6,6 @@ import os
from collections import namedtuple
DEFAULT_AUX_DIRS = [
'~/.cache/eh_elfs',
]
def to_eh_elf_path(so_path, out_dir, base=False):
''' Transform a library path into its eh_elf counterpart '''
base_path = os.path.basename(so_path) + '.eh_elf'
if base:
return base_path
return os.path.join(out_dir, base_path + '.so')
def find_eh_elf_dir(obj_path, aux_dirs, out_dir):
''' Find the directory in which the eh_elf corresponding to `obj_path` will
be outputted, among the output directory and the aux directories '''
for candidate in aux_dirs:
eh_elf_path = to_eh_elf_path(obj_path, candidate)
if os.path.exists(eh_elf_path):
return candidate
# No match among the aux dirs
return out_dir
def readlink_rec(path):
''' Returns the canonical path of `path`, resolving multiple layers of
symlinks '''
while os.path.islink(path):
path = os.path.join(
os.path.dirname(path),
os.readlink(path))
return path
def is_newer(file1, file2):
''' Returns True iff file1 is newer than file2 '''
try:
@ -88,9 +52,6 @@ def do_remote(remote, command, send_files=None, retr_files=None):
The command is executed on the machine described by `remote` (see ssh(1)).
If `preload` is set, then the remote file at this path will be sourced
before running any command, allowing to set PATH and other variables.
send_files is a list of file paths that must be first copied at the root of
a temporary directory on `remote` before running the command. Consider
yourself jailed in that directory.
@ -103,11 +64,6 @@ def do_remote(remote, command, send_files=None, retr_files=None):
otherwise, on the local machine.
'''
if send_files is None:
send_files = []
if retr_files is None:
retr_files = []
def ssh_do(cmd_args, working_directory=None):
try:
cmd = ['ssh', remote]

View file

@ -1,54 +1,22 @@
#include "CodeGenerator.hpp"
#include "gen_context_struct.hpp"
#include "settings.hpp"
#include "../shared/context_struct.h"
#include <algorithm>
#include <limits>
#include <exception>
#include <sstream>
using namespace std;
class UnhandledRegister: public std::exception {};
static const char* PRELUDE =
"#include <assert.h>\n"
"\n"
;
struct UnwFlags {
UnwFlags():
error(false), rip(false), rsp(false), rbp(false), rbx(false) {}
bool error, rip, rsp, rbp, rbx;
uint8_t to_uint8() const {
uint8_t out = 0;
if(rip)
out |= (1 << UNWF_RIP);
if(rsp)
out |= (1 << UNWF_RSP);
if(rbp)
out |= (1 << UNWF_RBP);
if(rbx)
out |= (1 << UNWF_RBX);
if(error)
out |= (1 << UNWF_ERROR);
return out;
}
};
CodeGenerator::CodeGenerator(
const SimpleDwarf& dwarf,
std::ostream& os,
NamingScheme naming_scheme,
AbstractSwitchCompiler* sw_compiler) :
dwarf(dwarf), os(os), pc_list(nullptr),
naming_scheme(naming_scheme), switch_compiler(sw_compiler)
NamingScheme naming_scheme):
dwarf(dwarf), os(os), pc_list(nullptr), naming_scheme(naming_scheme)
{
if(!settings::pc_list.empty()) {
pc_list = make_unique<PcListReader>(settings::pc_list);
@ -60,41 +28,6 @@ void CodeGenerator::generate() {
gen_of_dwarf();
}
SwitchStatement CodeGenerator::gen_fresh_switch() const {
SwitchStatement out;
out.switch_var = "pc";
ostringstream default_oss;
UnwFlags flags;
flags.error = true;
default_oss
<< "out_ctx.flags = " << (int) flags.to_uint8() << "u;\n"
<< "return out_ctx;\n";
out.default_case = default_oss.str();
return out;
}
void CodeGenerator::switch_append_fde(
SwitchStatement& sw,
const SimpleDwarf::Fde& fde) const
{
for(size_t fde_row_id=0; fde_row_id < fde.rows.size(); ++fde_row_id)
{
SwitchStatement::SwitchCase sw_case;
uintptr_t up_bound = fde.end_ip - 1;
if(fde_row_id != fde.rows.size() - 1)
up_bound = fde.rows[fde_row_id + 1].ip - 1;
sw_case.low_bound = fde.rows[fde_row_id].ip;
sw_case.high_bound = up_bound;
ostringstream case_oss;
gen_of_row_content(fde.rows[fde_row_id], case_oss);
sw_case.content.code = case_oss.str();
sw.cases.push_back(sw_case);
}
}
void CodeGenerator::gen_of_dwarf() {
os << CONTEXT_STRUCT_STR << '\n'
<< PRELUDE << '\n' << endl;
@ -122,10 +55,9 @@ void CodeGenerator::gen_of_dwarf() {
case settings::SGP_GlobalSwitch:
{
gen_unwind_func_header("_eh_elf");
SwitchStatement sw_stmt = gen_fresh_switch();
for(const auto& fde: dwarf.fde_list)
switch_append_fde(sw_stmt, fde);
(*switch_compiler)(os, sw_stmt);
for(const auto& fde: dwarf.fde_list) {
gen_switchpart_of_fde(fde);
}
gen_unwind_func_footer();
break;
}
@ -133,85 +65,84 @@ void CodeGenerator::gen_of_dwarf() {
}
void CodeGenerator::gen_unwind_func_header(const std::string& name) {
string deref_arg;
if(settings::enable_deref_arg)
deref_arg = ", deref_func_t deref";
os << "unwind_context_t "
<< name
<< "(unwind_context_t ctx, uintptr_t pc" << deref_arg << ") {\n"
<< "\tunwind_context_t out_ctx;" << endl;
<< "(unwind_context_t ctx, uintptr_t pc) {\n"
<< "\tunwind_context_t out_ctx;\n"
<< "\tswitch(pc) {" << endl;
}
void CodeGenerator::gen_unwind_func_footer() {
os << "}" << endl;
os << "\t\tdefault: assert(0);\n"
<< "\t}\n"
<< "}" << endl;
}
void CodeGenerator::gen_function_of_fde(const SimpleDwarf::Fde& fde) {
gen_unwind_func_header(naming_scheme(fde));
SwitchStatement sw_stmt = gen_fresh_switch();
switch_append_fde(sw_stmt, fde);
(*switch_compiler)(os, sw_stmt);
gen_switchpart_of_fde(fde);
gen_unwind_func_footer();
}
void CodeGenerator::gen_of_row_content(
const SimpleDwarf::DwRow& row,
std::ostream& stream) const
void CodeGenerator::gen_switchpart_of_fde(const SimpleDwarf::Fde& fde) {
os << "\t\t/********** FDE: 0x" << std::hex << fde.fde_offset
<< ", PC = 0x" << fde.beg_ip << std::dec << " */" << std::endl;
for(size_t fde_row_id=0; fde_row_id < fde.rows.size(); ++fde_row_id)
{
UnwFlags flags;
uintptr_t up_bound = fde.end_ip - 1;
if(fde_row_id != fde.rows.size() - 1)
up_bound = fde.rows[fde_row_id + 1].ip - 1;
try {
if(!check_reg_valid(row.ra)) {
// RA might be undefined (last frame), but if it is defined and we
// don't implement it (eg. EXPR), it is an error
flags.error = true;
goto write_flags;
gen_of_row(fde.rows[fde_row_id], up_bound);
}
}
if(check_reg_valid(row.cfa)) {
flags.rsp = true;
stream << "out_ctx.rsp = ";
gen_of_reg(row.cfa, stream);
stream << ';' << endl;
}
else { // rsp is required (CFA)
flags.error = true;
goto write_flags;
void CodeGenerator::gen_of_row(
const SimpleDwarf::DwRow& row,
uintptr_t row_end)
{
gen_case(row.ip, row_end);
os << "\t\t\t" << "out_ctx.rsp = ";
gen_of_reg(row.cfa);
os << ';' << endl;
os << "\t\t\t" << "out_ctx.rbp = ";
gen_of_reg(row.rbp);
os << ';' << endl;
os << "\t\t\t" << "out_ctx.rip = ";
gen_of_reg(row.ra);
os << ';' << endl;
os << "\t\t\treturn " << "out_ctx" << ";" << endl;
}
if(check_reg_defined(row.rbp)) {
flags.rbp = true;
stream << "out_ctx.rbp = ";
gen_of_reg(row.rbp, stream);
stream << ';' << endl;
void CodeGenerator::gen_case(uintptr_t low_bound, uintptr_t high_bound) {
if(pc_list == nullptr) {
os << "\t\tcase " << std::hex << "0x" << low_bound
<< " ... 0x" << high_bound << ":" << std::dec << endl;
}
else {
const auto& first_it = lower_bound(
pc_list->get_list().begin(),
pc_list->get_list().end(),
low_bound);
const auto& last_it = upper_bound(
pc_list->get_list().begin(),
pc_list->get_list().end(),
high_bound);
if(check_reg_defined(row.ra)) {
flags.rip = true;
stream << "out_ctx.rip = ";
gen_of_reg(row.ra, stream);
stream << ';' << endl;
if(first_it == pc_list->get_list().end())
throw CodeGenerator::InvalidPcList();
os << std::hex;
for(auto it = first_it; it != last_it; ++it)
os << "\t\tcase 0x" << *it << ":\n";
os << std::dec;
}
if(check_reg_defined(row.rbx)) {
flags.rbx = true;
stream << "out_ctx.rbx = ";
gen_of_reg(row.rbx, stream);
stream << ';' << endl;
}
} catch(const UnhandledRegister& exn) {
// This should not happen, since we check_reg_*, but heh.
flags.error = true;
stream << ";\n";
}
write_flags:
stream << "out_ctx.flags = " << (int)flags.to_uint8() << "u;" << endl;
stream << "return " << "out_ctx" << ";" << endl;
}
static const char* ctx_of_dw_name(SimpleDwarf::MachineRegister reg) {
@ -222,68 +153,28 @@ static const char* ctx_of_dw_name(SimpleDwarf::MachineRegister reg) {
return "ctx.rsp";
case SimpleDwarf::REG_RBP:
return "ctx.rbp";
case SimpleDwarf::REG_RBX:
return "ctx.rbx";
case SimpleDwarf::REG_RA:
throw CodeGenerator::NotImplementedCase();
}
return "";
}
bool CodeGenerator::check_reg_defined(
const SimpleDwarf::DwRegister& reg) const
{
void CodeGenerator::gen_of_reg(const SimpleDwarf::DwRegister& reg) {
switch(reg.type) {
case SimpleDwarf::DwRegister::REG_UNDEFINED:
case SimpleDwarf::DwRegister::REG_NOT_IMPLEMENTED:
return false;
default:
return true;
}
}
bool CodeGenerator::check_reg_valid(const SimpleDwarf::DwRegister& reg) const {
return reg.type != SimpleDwarf::DwRegister::REG_NOT_IMPLEMENTED;
}
void CodeGenerator::gen_of_reg(const SimpleDwarf::DwRegister& reg,
ostream& stream) const
{
switch(reg.type) {
case SimpleDwarf::DwRegister::REG_UNDEFINED:
// This function is not supposed to be called on an undefined
// register
throw UnhandledRegister();
os << std::numeric_limits<uintptr_t>::max() << "ull";
break;
case SimpleDwarf::DwRegister::REG_REGISTER:
stream << ctx_of_dw_name(reg.reg)
os << ctx_of_dw_name(reg.reg)
<< " + (" << reg.offset << ")";
break;
case SimpleDwarf::DwRegister::REG_CFA_OFFSET: {
if(settings::enable_deref_arg) {
stream << "deref(out_ctx.rsp + ("
<< reg.offset
<< "))";
}
else {
stream << "*((uintptr_t*)(out_ctx.rsp + ("
case SimpleDwarf::DwRegister::REG_CFA_OFFSET:
os << "*((uintptr_t*)(out_ctx.rsp + ("
<< reg.offset
<< ")))";
}
break;
}
case SimpleDwarf::DwRegister::REG_PLT_EXPR: {
/*
if(settings::enable_deref_arg)
stream << "(deref(";
else
stream << "*((uintptr_t*)(";
*/
stream << "(((ctx.rip & 15) >= 11) ? 8 : 0) + ctx.rsp";
break;
}
case SimpleDwarf::DwRegister::REG_NOT_IMPLEMENTED:
stream << "0";
throw UnhandledRegister();
os << "0; assert(0)";
break;
}
}

View file

@ -8,7 +8,6 @@
#include "SimpleDwarf.hpp"
#include "PcListReader.hpp"
#include "SwitchStatement.hpp"
class CodeGenerator {
public:
@ -24,8 +23,7 @@ class CodeGenerator {
/** Create a CodeGenerator to generate code for the given dwarf, on the
* given std::ostream object (eg. cout). */
CodeGenerator(const SimpleDwarf& dwarf, std::ostream& os,
NamingScheme naming_scheme,
AbstractSwitchCompiler* sw_compiler);
NamingScheme naming_scheme);
/// Actually generate the code on the given stream
void generate();
@ -36,33 +34,24 @@ class CodeGenerator {
uintptr_t beg, end;
};
SwitchStatement gen_fresh_switch() const;
void switch_append_fde(
SwitchStatement& sw,
const SimpleDwarf::Fde& fde) const;
void gen_of_dwarf();
void gen_unwind_func_header(const std::string& name);
void gen_unwind_func_footer();
void gen_function_of_fde(const SimpleDwarf::Fde& fde);
void gen_of_row_content(
void gen_switchpart_of_fde(const SimpleDwarf::Fde& fde);
void gen_of_row(
const SimpleDwarf::DwRow& row,
std::ostream& stream) const;
uintptr_t row_end);
void gen_case(uintptr_t low_bound, uintptr_t high_bound);
void gen_of_reg(
const SimpleDwarf::DwRegister& reg,
std::ostream& stream) const;
const SimpleDwarf::DwRegister& reg);
void gen_lookup(const std::vector<LookupEntry>& entries);
bool check_reg_defined(const SimpleDwarf::DwRegister& reg) const;
bool check_reg_valid(const SimpleDwarf::DwRegister& reg) const;
private:
SimpleDwarf dwarf;
std::ostream& os;
std::unique_ptr<PcListReader> pc_list;
NamingScheme naming_scheme;
std::unique_ptr<AbstractSwitchCompiler> switch_compiler;
};

View file

@ -1,50 +0,0 @@
#include "ConseqEquivFilter.hpp"
using namespace std;
ConseqEquivFilter::ConseqEquivFilter(bool enable): SimpleDwarfFilter(enable) {}
static bool equiv_reg(
const SimpleDwarf::DwRegister& r1,
const SimpleDwarf::DwRegister& r2)
{
return r1.type == r2.type
&& r1.offset == r2.offset
&& r1.reg == r2.reg;
}
static bool equiv_row(
const SimpleDwarf::DwRow& r1,
const SimpleDwarf::DwRow& r2)
{
return r1.ip == r2.ip
&& equiv_reg(r1.cfa, r2.cfa)
&& equiv_reg(r1.rbp, r2.rbp)
&& equiv_reg(r1.rbx, r2.rbx)
&& equiv_reg(r1.ra, r2.ra);
}
SimpleDwarf ConseqEquivFilter::do_apply(const SimpleDwarf& dw) const {
SimpleDwarf out;
for(const auto& fde: dw.fde_list) {
out.fde_list.push_back(SimpleDwarf::Fde());
SimpleDwarf::Fde& cur_fde = out.fde_list.back();
cur_fde.fde_offset = fde.fde_offset;
cur_fde.beg_ip = fde.beg_ip;
cur_fde.end_ip = fde.end_ip;
if(fde.rows.empty())
continue;
cur_fde.rows.push_back(fde.rows.front());
for(size_t pos=1; pos < fde.rows.size(); ++pos) {
const auto& row = fde.rows[pos];
if(!equiv_row(row, cur_fde.rows.back())) {
cur_fde.rows.push_back(row);
}
}
}
return out;
}

View file

@ -1,16 +0,0 @@
/** SimpleDwarfFilter to keep a unique Dwarf row for each group of consecutive
* lines that are equivalent, that is, that share the same considered
* registers' values. */
#pragma once
#include "SimpleDwarf.hpp"
#include "SimpleDwarfFilter.hpp"
class ConseqEquivFilter: public SimpleDwarfFilter {
public:
ConseqEquivFilter(bool enable=true);
private:
SimpleDwarf do_apply(const SimpleDwarf& dw) const;
};

View file

@ -1,7 +1,5 @@
#include "DwarfReader.hpp"
#include "plt_std_expr.hpp"
#include <fstream>
#include <fileno.hpp>
#include <set>
@ -9,25 +7,14 @@
using namespace std;
using namespace dwarf;
typedef std::set<std::pair<int, core::FrameSection::register_def> >
dwarfpp_row_t;
DwarfReader::DwarfReader(const string& path):
root(fileno(ifstream(path)))
{}
// Debug function -- dumps an expression
static void dump_expr(const core::FrameSection::register_def& reg) {
assert(reg.k == core::FrameSection::register_def::SAVED_AT_EXPR
|| reg.k == core::FrameSection::register_def::VAL_OF_EXPR);
const encap::loc_expr& expr = reg.saved_at_expr_r();
for(const auto& elt: expr) {
fprintf(stderr, "(%02x, %02llx, %02llx, %02llx) :: ",
elt.lr_atom, elt.lr_number, elt.lr_number2, elt.lr_offset);
}
fprintf(stderr, "\n");
}
SimpleDwarf DwarfReader::read() {
SimpleDwarf DwarfReader::read() const {
const core::FrameSection& fs = root.get_frame_section();
SimpleDwarf output;
@ -39,51 +26,46 @@ SimpleDwarf DwarfReader::read() {
return output;
}
void DwarfReader::add_cell_to_row(
const dwarf::core::FrameSection::register_def& reg,
int reg_id,
int ra_reg,
SimpleDwarf::DwRow& cur_row)
{
if(reg_id == DW_FRAME_CFA_COL3) {
cur_row.cfa = read_register(reg);
SimpleDwarf::Fde DwarfReader::read_fde(const core::Fde& fde) const {
SimpleDwarf::Fde output;
output.fde_offset = fde.get_fde_offset();
output.beg_ip = fde.get_low_pc();
output.end_ip = fde.get_low_pc() + fde.get_func_length();
auto rows = fde.decode().rows;
const core::Cie& cie = *fde.find_cie();
int ra_reg = cie.get_return_address_register_rule();
for(const auto row_pair: rows) {
SimpleDwarf::DwRow cur_row;
cur_row.ip = row_pair.first.lower();
const dwarfpp_row_t& row = row_pair.second;
for(const auto& cell: row) {
if(cell.first == DW_FRAME_CFA_COL3) {
cur_row.cfa = read_register(cell.second);
}
else {
try {
SimpleDwarf::MachineRegister reg_type =
from_dwarfpp_reg(reg_id, ra_reg);
from_dwarfpp_reg(cell.first, ra_reg);
switch(reg_type) {
case SimpleDwarf::REG_RBP:
cur_row.rbp = read_register(reg);
break;
case SimpleDwarf::REG_RBX:
cur_row.rbx = read_register(reg);
cur_row.rbp = read_register(cell.second);
break;
case SimpleDwarf::REG_RA:
cur_row.ra = read_register(reg);
cur_row.ra = read_register(cell.second);
break;
default:
break;
}
}
catch(const UnsupportedRegister&) {} // Just ignore it.
catch(UnsupportedRegister) {} // Just ignore it.
}
}
void DwarfReader::append_row_to_fde(
const dwarfpp_row_t& row,
uintptr_t row_addr,
int ra_reg,
SimpleDwarf::Fde& output)
{
SimpleDwarf::DwRow cur_row;
cur_row.ip = row_addr;
for(const auto& cell: row) {
add_cell_to_row(cell.second, cell.first, ra_reg, cur_row);
}
if(cur_row.cfa.type == SimpleDwarf::DwRegister::REG_UNDEFINED)
{
// Not set
@ -93,66 +75,6 @@ void DwarfReader::append_row_to_fde(
output.rows.push_back(cur_row);
}
template<typename Key, typename Value>
static std::set<std::pair<Key, Value> > map_to_setpair(
const std::map<Key, Value>& src_map)
{
std::set<std::pair<Key, Value> > out;
for(const auto map_it: src_map) {
out.insert(map_it);
}
return out;
}
void DwarfReader::append_results_to_fde(
const dwarf::core::FrameSection::instrs_results& results,
int ra_reg,
SimpleDwarf::Fde& output)
{
for(const auto row_pair: results.rows) {
append_row_to_fde(
row_pair.second,
row_pair.first.lower(),
ra_reg,
output);
}
if(results.unfinished_row.size() > 0) {
try {
append_row_to_fde(
map_to_setpair(results.unfinished_row),
results.unfinished_row_addr,
ra_reg,
output);
} catch(const InvalidDwarf&) {
// Ignore: the unfinished_row can be undefined
}
}
}
SimpleDwarf::Fde DwarfReader::read_fde(const core::Fde& fde) {
SimpleDwarf::Fde output;
output.fde_offset = fde.get_fde_offset();
output.beg_ip = fde.get_low_pc();
output.end_ip = fde.get_low_pc() + fde.get_func_length();
const core::Cie& cie = *fde.find_cie();
int ra_reg = cie.get_return_address_register_rule();
// CIE rows
core::FrameSection cie_fs(root.get_dbg(), true);
auto cie_rows = cie_fs.interpret_instructions(
cie,
fde.get_low_pc(),
cie.get_initial_instructions(),
cie.get_initial_instructions_length());
// FDE rows
auto fde_rows = fde.decode();
// instrs
append_results_to_fde(cie_rows, ra_reg, output);
append_results_to_fde(fde_rows, ra_reg, output);
return output;
}
@ -182,13 +104,6 @@ SimpleDwarf::DwRegister DwarfReader::read_register(
output.type = SimpleDwarf::DwRegister::REG_UNDEFINED;
break;
case core::FrameSection::register_def::SAVED_AT_EXPR:
if(is_plt_expr(reg))
output.type = SimpleDwarf::DwRegister::REG_PLT_EXPR;
else if(!interpret_simple_expr(reg, output))
output.type = SimpleDwarf::DwRegister::REG_NOT_IMPLEMENTED;
break;
default:
output.type = SimpleDwarf::DwRegister::REG_NOT_IMPLEMENTED;
break;
@ -215,76 +130,7 @@ SimpleDwarf::MachineRegister DwarfReader::from_dwarfpp_reg(
return SimpleDwarf::REG_RSP;
case lib::DWARF_X86_64_RBP:
return SimpleDwarf::REG_RBP;
case lib::DWARF_X86_64_RBX:
return SimpleDwarf::REG_RBX;
default:
throw UnsupportedRegister();
}
}
static bool compare_dw_expr(
const encap::loc_expr& e1,
const encap::loc_expr& e2)
{
const std::vector<encap::expr_instr>& e1_vec =
static_cast<const vector<encap::expr_instr>&>(e1);
const std::vector<encap::expr_instr>& e2_vec =
static_cast<const vector<encap::expr_instr>&>(e2);
return e1_vec == e2_vec;
}
bool DwarfReader::is_plt_expr(
const core::FrameSection::register_def& reg) const
{
if(reg.k != core::FrameSection::register_def::SAVED_AT_EXPR)
return false;
const encap::loc_expr& expr = reg.saved_at_expr_r();
bool res = compare_dw_expr(expr, REFERENCE_PLT_EXPR);
return res;
}
bool DwarfReader::interpret_simple_expr(
const dwarf::core::FrameSection::register_def& reg,
SimpleDwarf::DwRegister& output
) const
{
bool deref = false;
if(reg.k == core::FrameSection::register_def::SAVED_AT_EXPR)
deref = true;
else if(reg.k == core::FrameSection::register_def::VAL_OF_EXPR)
deref = false;
else
return false;
const encap::loc_expr& expr = reg.saved_at_expr_r();
if(expr.size() > 2 || expr.empty())
return false;
const auto& exp_reg = expr[0];
if(0x70 <= exp_reg.lr_atom && exp_reg.lr_atom <= 0x8f) { // DW_OP_breg<n>
int reg_id = exp_reg.lr_atom - 0x70;
try {
output.reg = from_dwarfpp_reg(reg_id, -1); // Cannot be CFA anyway
output.offset = exp_reg.lr_number;
} catch(const UnsupportedRegister& /* exn */) {
return false; // Unsupported register
}
}
if(expr.size() == 2) { // OK if deref
if(expr[1].lr_atom == 0x06) { // deref
if(deref)
return false;
deref = true;
}
else
return false;
}
if(deref)
return false; // TODO try stats? Mabye it's worth implementing
output.type = SimpleDwarf::DwRegister::REG_REGISTER;
return true;
}

View file

@ -13,9 +13,6 @@
#include "SimpleDwarf.hpp"
typedef std::set<std::pair<int, dwarf::core::FrameSection::register_def> >
dwarfpp_row_t;
class DwarfReader {
public:
class InvalidDwarf: public std::exception {};
@ -24,44 +21,19 @@ class DwarfReader {
DwarfReader(const std::string& path);
/** Actually read the ELF file, generating a `SimpleDwarf` output. */
SimpleDwarf read();
SimpleDwarf read() const;
private: //meth
SimpleDwarf::Fde read_fde(const dwarf::core::Fde& fde);
void append_results_to_fde(
const dwarf::core::FrameSection::instrs_results& results,
int ra_reg,
SimpleDwarf::Fde& output);
SimpleDwarf::Fde read_fde(const dwarf::core::Fde& fde) const;
SimpleDwarf::DwRegister read_register(
const dwarf::core::FrameSection::register_def& reg) const;
void add_cell_to_row(
const dwarf::core::FrameSection::register_def& reg,
int reg_id,
int ra_reg,
SimpleDwarf::DwRow& cur_row);
void append_row_to_fde(
const dwarfpp_row_t& row,
uintptr_t row_addr,
int ra_reg,
SimpleDwarf::Fde& output);
SimpleDwarf::MachineRegister from_dwarfpp_reg(
int reg_id,
int ra_reg=-1
) const;
bool is_plt_expr(
const dwarf::core::FrameSection::register_def& reg) const;
bool interpret_simple_expr(
const dwarf::core::FrameSection::register_def& reg,
SimpleDwarf::DwRegister& output
) const;
class UnsupportedRegister: public std::exception {};
private:

View file

@ -1,21 +0,0 @@
#include "EmptyFdeDeleter.hpp"
#include <algorithm>
#include <cstdio>
using namespace std;
EmptyFdeDeleter::EmptyFdeDeleter(bool enable): SimpleDwarfFilter(enable) {}
SimpleDwarf EmptyFdeDeleter::do_apply(const SimpleDwarf& dw) const {
SimpleDwarf out(dw);
auto fde = out.fde_list.begin();
while(fde != out.fde_list.end()) {
if(fde->rows.empty())
fde = out.fde_list.erase(fde);
else
++fde;
}
return out;
}

View file

@ -1,15 +0,0 @@
/** Deletes empty FDEs (that is, FDEs with no rows) from the FDEs collection.
* This is used to ensure they do not interfere with PcHoleFiller and such. */
#pragma once
#include "SimpleDwarf.hpp"
#include "SimpleDwarfFilter.hpp"
class EmptyFdeDeleter: public SimpleDwarfFilter {
public:
EmptyFdeDeleter(bool enable=true);
private:
SimpleDwarf do_apply(const SimpleDwarf& dw) const;
};

View file

@ -1,133 +0,0 @@
#include "FactoredSwitchCompiler.hpp"
#include <sstream>
#include <string>
#include <iostream>
using namespace std;
FactoredSwitchCompiler::FactoredSwitchCompiler(int indent):
AbstractSwitchCompiler(indent), cur_label_id(0)
{
}
void FactoredSwitchCompiler::to_stream(
std::ostream& os, const SwitchStatement& sw)
{
if(sw.cases.empty()) {
std::cerr << "WARNING: empty unwinding data!\n";
os
<< indent_str("/* WARNING: empty unwinding data! */\n")
<< indent_str(sw.default_case) << "\n";
return;
}
JumpPointMap jump_points;
uintptr_t low_bound = sw.cases.front().low_bound,
high_bound = sw.cases.back().high_bound;
os << indent() << "if("
<< "0x" << hex << low_bound << " <= " << sw.switch_var
<< " && " << sw.switch_var << " <= 0x" << high_bound << dec << ") {\n";
indent_count++;
gen_binsearch_tree(os, jump_points, sw.switch_var,
sw.cases.begin(), sw.cases.end(),
make_pair(low_bound, high_bound));
indent_count--;
os << indent() << "}\n";
os << indent() << "_factor_default:\n"
<< indent_str(sw.default_case) << "\n"
<< indent() << "/* ===== LABELS ============================== */\n\n";
gen_jump_points_code(os, jump_points);
}
FactoredSwitchCompiler::FactorJumpPoint
FactoredSwitchCompiler::get_jump_point(
FactoredSwitchCompiler::JumpPointMap& jump_map,
const SwitchStatement::SwitchCaseContent& sw_case)
{
#ifdef STATS
stats.refer_count++;
#endif//STATS
auto pregen = jump_map.find(sw_case);
if(pregen != jump_map.end()) // Was previously generated
return pregen->second;
#ifdef STATS
stats.generated_count++;
#endif//STATS
// Wasn't generated previously -- we'll generate it here
size_t label_id = cur_label_id++;
ostringstream label_ss;
label_ss << "_factor_" << label_id;
FactorJumpPoint label_name = label_ss.str();
jump_map.insert(make_pair(sw_case, label_name));
return label_name;
}
void FactoredSwitchCompiler::gen_jump_points_code(std::ostream& os,
const FactoredSwitchCompiler::JumpPointMap& jump_map)
{
for(const auto& block: jump_map) {
os << indent() << block.second << ":\n"
<< indent_str(block.first.code) << "\n\n";
}
os << indent() << "assert(0);\n";
}
void FactoredSwitchCompiler::gen_binsearch_tree(
std::ostream& os,
FactoredSwitchCompiler::JumpPointMap& jump_map,
const std::string& sw_var,
const FactoredSwitchCompiler::case_iterator_t& begin,
const FactoredSwitchCompiler::case_iterator_t& end,
const loc_range_t& loc_range)
{
size_t iter_delta = end - begin;
if(iter_delta == 0)
os << indent() << "assert(0);\n";
else if(iter_delta == 1) {
FactorJumpPoint jump_point = get_jump_point(
jump_map, begin->content);
if(loc_range.first < begin->low_bound) {
os << indent() << "if(" << sw_var << " < 0x"
<< hex << begin->low_bound << dec
<< ") goto _factor_default; "
<< "// IP=0x" << hex << loc_range.first << " ... 0x"
<< begin->low_bound - 1 << "\n";
}
if(begin->high_bound + 1 < loc_range.second) {
os << indent() << "if(0x" << hex << begin->high_bound << dec
<< " < " << sw_var << ") goto _factor_default; "
<< "// IP=0x" << hex << begin->high_bound + 1 << " ... 0x"
<< loc_range.second - 1 << "\n";
}
os << indent() << "// IP=0x" << hex << begin->low_bound
<< " ... 0x" << begin->high_bound << dec << "\n"
<< indent() << "goto " << jump_point << ";\n";
}
else {
const case_iterator_t mid = begin + iter_delta / 2;
os << indent() << "if(" << sw_var << " < 0x"
<< hex << mid->low_bound << dec << ") {\n";
indent_count++;
gen_binsearch_tree(
os, jump_map, sw_var, begin, mid,
make_pair(loc_range.first, mid->low_bound));
indent_count--;
os << indent() << "} else {\n";
indent_count++;
gen_binsearch_tree(
os, jump_map, sw_var, mid, end,
make_pair(mid->low_bound, loc_range.second));
indent_count--;
os << indent() << "}\n";
}
}

View file

@ -1,52 +0,0 @@
/** A switch generator that tries to factor out most of the redundancy between
* switch blocks, generating manually a switch-like template */
#pragma once
#include "SwitchStatement.hpp"
#include <map>
class FactoredSwitchCompiler: public AbstractSwitchCompiler {
public:
#ifdef STATS
struct Stats {
Stats(): generated_count(0), refer_count(0) {}
int generated_count, refer_count;
};
const Stats& get_stats() const { return stats; }
#endif
FactoredSwitchCompiler(int indent=0);
private:
typedef std::string FactorJumpPoint;
typedef std::map<SwitchStatement::SwitchCaseContent, FactorJumpPoint>
JumpPointMap;
typedef std::vector<SwitchStatement::SwitchCase>::const_iterator
case_iterator_t;
typedef std::pair<uintptr_t, uintptr_t> loc_range_t;
private:
virtual void to_stream(std::ostream& os, const SwitchStatement& sw);
FactorJumpPoint get_jump_point(JumpPointMap& jump_map,
const SwitchStatement::SwitchCaseContent& sw_case);
void gen_jump_points_code(std::ostream& os,
const JumpPointMap& jump_map);
void gen_binsearch_tree(
std::ostream& os,
JumpPointMap& jump_map,
const std::string& sw_var,
const case_iterator_t& begin,
const case_iterator_t& end,
const loc_range_t& loc_range // [beg, end[
);
size_t cur_label_id;
#ifdef STATS
Stats stats;
#endif//STATS
};

View file

@ -1,8 +1,7 @@
CXX=g++
CXXLOCS?=-L. -I.
CXXFL?=
CXXFLAGS=$(CXXLOCS) -Wall -Wextra -std=c++14 -O2 -g $(CXXFL)
CXXLIBS=-ldwarf -ldwarfpp -lsrk31c++ -lc++fileno -lelf
CXXFLAGS=$(CXXLOCS) -Wall -Wextra -std=c++14 -O2 -g
CXXLIBS=-lelf -ldwarf -ldwarfpp -lsrk31c++ -lc++fileno
TARGET=dwarf-assembly
OBJS=\
@ -10,14 +9,6 @@ OBJS=\
SimpleDwarf.o \
CodeGenerator.o \
PcListReader.o \
SimpleDwarfFilter.o \
PcHoleFiller.o \
EmptyFdeDeleter.o \
ConseqEquivFilter.o \
OverriddenRowFilter.o \
SwitchStatement.o \
NativeSwitchCompiler.o \
FactoredSwitchCompiler.o \
settings.o \
main.o

View file

@ -1,29 +0,0 @@
#include "NativeSwitchCompiler.hpp"
using namespace std;
NativeSwitchCompiler::NativeSwitchCompiler(
int indent):
AbstractSwitchCompiler(indent)
{}
void NativeSwitchCompiler::to_stream(ostream& os, const SwitchStatement& sw) {
os << indent() << "switch(" << sw.switch_var << ") {\n";
indent_count++;
for(const auto& cur_case: sw.cases) {
os << indent() << "case 0x"
<< hex << cur_case.low_bound << " ... 0x" << cur_case.high_bound
<< dec << ":\n";
indent_count++;
os << indent_str(cur_case.content.code);
indent_count--;
}
os << indent() << "default:\n";
indent_count++;
os << indent_str(sw.default_case);
indent_count--;
os << indent() << "}\n";
indent_count--;
}

View file

@ -1,12 +0,0 @@
/** Compiles a SwitchStatement to a native C switch */
#pragma once
#include "SwitchStatement.hpp"
class NativeSwitchCompiler: public AbstractSwitchCompiler {
public:
NativeSwitchCompiler(int indent=0);
private:
virtual void to_stream(std::ostream& os, const SwitchStatement& sw);
};

View file

@ -1,31 +0,0 @@
#include "OverriddenRowFilter.hpp"
OverriddenRowFilter::OverriddenRowFilter(bool enable)
: SimpleDwarfFilter(enable)
{}
SimpleDwarf OverriddenRowFilter::do_apply(const SimpleDwarf& dw) const {
SimpleDwarf out;
for(const auto& fde: dw.fde_list) {
out.fde_list.push_back(SimpleDwarf::Fde());
SimpleDwarf::Fde& cur_fde = out.fde_list.back();
cur_fde.fde_offset = fde.fde_offset;
cur_fde.beg_ip = fde.beg_ip;
cur_fde.end_ip = fde.end_ip;
if(fde.rows.empty())
continue;
for(size_t pos=0; pos < fde.rows.size(); ++pos) {
const auto& row = fde.rows[pos];
if(pos == fde.rows.size() - 1
|| row.ip != fde.rows[pos+1].ip)
{
cur_fde.rows.push_back(row);
}
}
}
return out;
}

View file

@ -1,15 +0,0 @@
/** SimpleDwarfFilter to remove the first `n-1` rows of a block of `n`
* contiguous rows that have the exact same address. */
#pragma once
#include "SimpleDwarf.hpp"
#include "SimpleDwarfFilter.hpp"
class OverriddenRowFilter: public SimpleDwarfFilter {
public:
OverriddenRowFilter(bool enable=true);
private:
SimpleDwarf do_apply(const SimpleDwarf& dw) const;
};

View file

@ -1,26 +0,0 @@
#include "PcHoleFiller.hpp"
#include <algorithm>
#include <cstdio>
using namespace std;
PcHoleFiller::PcHoleFiller(bool enable): SimpleDwarfFilter(enable) {}
SimpleDwarf PcHoleFiller::do_apply(const SimpleDwarf& dw) const {
SimpleDwarf out(dw);
sort(out.fde_list.begin(), out.fde_list.end(),
[](const SimpleDwarf::Fde& a, const SimpleDwarf::Fde& b) {
return a.beg_ip < b.beg_ip;
});
for(size_t pos=0; pos < out.fde_list.size() - 1; ++pos) {
if(out.fde_list[pos].end_ip > out.fde_list[pos + 1].beg_ip) {
fprintf(stderr, "WARNING: FDE %016lx-%016lx and %016lx-%016lx\n",
out.fde_list[pos].beg_ip, out.fde_list[pos].end_ip,
out.fde_list[pos + 1].beg_ip, out.fde_list[pos + 1].end_ip);
}
out.fde_list[pos].end_ip = out.fde_list[pos + 1].beg_ip;
}
return out;
}

View file

@ -1,15 +0,0 @@
/** Ensures there is no "hole" between two consecutive PC ranges, to optimize
* generated code size. */
#pragma once
#include "SimpleDwarf.hpp"
#include "SimpleDwarfFilter.hpp"
class PcHoleFiller: public SimpleDwarfFilter {
public:
PcHoleFiller(bool enable=true);
private:
SimpleDwarf do_apply(const SimpleDwarf& dw) const;
};

View file

@ -23,20 +23,3 @@ a list of all PCs in the ELF. The file contains one 8-bytes chunk per PC,
which is the PC in little endian.
`--pc-list PC_LIST_FILE_PATH`
### Dereferencing function
The lookup functions can also take an additional argument, a pointer to a
function of prototype
```C
uintptr_t deref(uintptr_t address)
```
that will, in spirit, contain a `return *((uintptr_t*)address);`.
This argument can be used to work on remote address spaces instead of local
address spaces, eg. to work with `libunwind`.
To enable the presence of this argument, you must pass the option
`--enable-deref-arg`

View file

@ -1,15 +1,4 @@
#include "SimpleDwarf.hpp"
#include "../shared/context_struct.h"
uint8_t SimpleDwarf::to_shared_flag(SimpleDwarf::MachineRegister mreg) {
switch(mreg) {
case REG_RIP: return (1 << UNWF_RIP);
case REG_RSP: return (1 << UNWF_RSP);
case REG_RBP: return (1 << UNWF_RBP);
case REG_RBX: return (1 << UNWF_RBX);
default: return 0;
}
}
static std::ostream& operator<<(
std::ostream& out,
@ -25,9 +14,6 @@ static std::ostream& operator<<(
case SimpleDwarf::REG_RBP:
out << "rbp";
break;
case SimpleDwarf::REG_RBX:
out << "rbx";
break;
case SimpleDwarf::REG_RA:
out << "RA";
break;
@ -63,7 +49,6 @@ std::ostream& operator<<(std::ostream& out, const SimpleDwarf::DwRow& row) {
out << std::hex << row.ip << std::dec
<< '\t' << row.cfa
<< '\t' << row.rbp
<< '\t' << row.rbx
<< '\t' << row.ra;
out << std::endl;
return out;

View file

@ -13,14 +13,12 @@
struct SimpleDwarf {
/** A machine register (eg. %rip) among the supported ones (x86_64 only
* for now) */
static const std::size_t HANDLED_REGISTERS_COUNT = 5;
static const std::size_t HANDLED_REGISTERS_COUNT = 4;
enum MachineRegister {
REG_RIP, REG_RSP, REG_RBP, REG_RBX,
REG_RIP, REG_RSP, REG_RBP,
REG_RA ///< A bit of cheating: not a machine register
};
static uint8_t to_shared_flag(MachineRegister mreg);
struct DwRegister {
/** Holds a single Dwarf register value */
@ -32,10 +30,6 @@ struct SimpleDwarf {
defined at some later IP in the same DIE) */
REG_REGISTER, ///< Value of a machine register plus offset
REG_CFA_OFFSET, ///< Value stored at some offset from CFA
REG_PLT_EXPR, /**< Value is the evaluation of the standard PLT
expression, ie `((rip & 15) >= 11) >> 3 + rsp`
This is hardcoded because it's the only expression
found so far, thus worth implementing. */
REG_NOT_IMPLEMENTED ///< This type of register is not supported
};
@ -50,7 +44,6 @@ struct SimpleDwarf {
uintptr_t ip; ///< Instruction pointer
DwRegister cfa; ///< Canonical Frame Address
DwRegister rbp; ///< Base pointer register
DwRegister rbx; ///< RBX, sometimes used for unwinding
DwRegister ra; ///< Return address
friend std::ostream& operator<<(std::ostream &, const DwRow&);
@ -58,8 +51,8 @@ struct SimpleDwarf {
struct Fde {
uintptr_t fde_offset; ///< This FDE's offset in the original DWARF
uintptr_t beg_ip, ///< This FDE's start instruction pointer incl.
end_ip; ///< This FDE's end instruction pointer excl.
uintptr_t beg_ip, ///< This FDE's start instruction pointer
end_ip; ///< This FDE's end instruction pointer
std::vector<DwRow> rows; ///< Dwarf rows for this FDE
friend std::ostream& operator<<(std::ostream &, const Fde&);

View file

@ -1,14 +0,0 @@
#include "SimpleDwarfFilter.hpp"
SimpleDwarfFilter::SimpleDwarfFilter(bool enable): enable(enable)
{}
SimpleDwarf SimpleDwarfFilter::apply(const SimpleDwarf& dw) const {
if(!enable)
return dw;
return do_apply(dw);
}
SimpleDwarf SimpleDwarfFilter::operator()(const SimpleDwarf& dw) {
return apply(dw);
}

View file

@ -1,28 +0,0 @@
/** An abstract parent class for any SimpleDwarf filter, that is. a class that
* transforms some SimpleDwarf into some other SimpleDwarf. */
#pragma once
#include <vector>
#include "SimpleDwarf.hpp"
class SimpleDwarfFilter {
public:
/** Constructor
*
* @param apply set to false to disable this filter. This setting is
* convenient for compact filter-chaining code. */
SimpleDwarfFilter(bool enable=true);
/// Applies the filter
SimpleDwarf apply(const SimpleDwarf& dw) const;
/// Same as apply()
SimpleDwarf operator()(const SimpleDwarf& dw);
private:
virtual SimpleDwarf do_apply(const SimpleDwarf& dw) const = 0;
bool enable;
};

View file

@ -1,48 +0,0 @@
#include "SwitchStatement.hpp"
#include <sstream>
using namespace std;
AbstractSwitchCompiler::AbstractSwitchCompiler(
int indent)
: indent_count(indent)
{
}
void AbstractSwitchCompiler::operator()(
ostream& os, const SwitchStatement& sw)
{
to_stream(os, sw);
}
string AbstractSwitchCompiler::operator()(const SwitchStatement& sw) {
ostringstream os;
(*this)(os, sw);
return os.str();
}
std::string AbstractSwitchCompiler::indent_str(const std::string& str) {
ostringstream out;
int last_find = -1;
size_t find_pos;
while((find_pos = str.find('\n', last_find + 1)) != string::npos) {
out << indent()
<< str.substr(last_find + 1, find_pos - last_find); // includes \n
last_find = find_pos;
}
if(last_find + 1 < (ssize_t)str.size()) {
out << indent()
<< str.substr(last_find + 1)
<< '\n';
}
return out.str();
}
std::string AbstractSwitchCompiler::indent() const {
return string(indent_count, '\t');
}
std::string AbstractSwitchCompiler::endcl() const {
return string("\n") + indent();
}

View file

@ -1,46 +0,0 @@
/** Contains an abstract switch statement, which can be turned to C code later
* on. */
#pragma once
#include <string>
#include <vector>
#include <ostream>
#include <memory>
struct SwitchStatement {
struct SwitchCaseContent {
std::string code;
bool operator==(const SwitchCaseContent& oth) const {
return code == oth.code;
}
bool operator<(const SwitchCaseContent& oth) const {
return code < oth.code;
}
};
struct SwitchCase {
uintptr_t low_bound, high_bound;
SwitchCaseContent content;
};
std::string switch_var;
std::string default_case;
std::vector<SwitchCase> cases;
};
class AbstractSwitchCompiler {
public:
AbstractSwitchCompiler(int indent=0);
void operator()(std::ostream& os, const SwitchStatement& sw);
std::string operator()(const SwitchStatement& sw);
protected:
virtual void to_stream(
std::ostream& os, const SwitchStatement& sw) = 0;
std::string indent_str(const std::string& str) ;
std::string indent() const;
std::string endcl() const;
int indent_count;
};

View file

@ -7,13 +7,6 @@
#include "SimpleDwarf.hpp"
#include "DwarfReader.hpp"
#include "CodeGenerator.hpp"
#include "SwitchStatement.hpp"
#include "NativeSwitchCompiler.hpp"
#include "FactoredSwitchCompiler.hpp"
#include "PcHoleFiller.hpp"
#include "EmptyFdeDeleter.hpp"
#include "ConseqEquivFilter.hpp"
#include "OverriddenRowFilter.hpp"
#include "settings.hpp"
@ -63,14 +56,6 @@ MainOptions options_parse(int argc, char** argv) {
settings::pc_list = argv[option_pos];
}
}
else if(option == "--enable-deref-arg") {
settings::enable_deref_arg = true;
}
else if(option == "--keep-holes") {
settings::keep_holes = true;
}
}
if(!seen_switch_gen_policy) {
@ -89,8 +74,6 @@ MainOptions options_parse(int argc, char** argv) {
cerr << "Usage: "
<< argv[0]
<< " [--switch-per-func | --global-switch] "
<< " [--enable-deref-arg]"
<< " [--keep-holes]"
<< "[--pc-list PC_LIST_FILE] elf_path"
<< endl;
}
@ -104,38 +87,15 @@ int main(int argc, char** argv) {
MainOptions opts = options_parse(argc, argv);
SimpleDwarf parsed_dwarf = DwarfReader(opts.elf_path).read();
SimpleDwarf filtered_dwarf =
PcHoleFiller(!settings::keep_holes)(
EmptyFdeDeleter()(
OverriddenRowFilter()(
ConseqEquivFilter()(
parsed_dwarf))));
FactoredSwitchCompiler* sw_compiler = new FactoredSwitchCompiler(1);
CodeGenerator code_gen(
filtered_dwarf,
parsed_dwarf,
cout,
[](const SimpleDwarf::Fde& fde) {
std::ostringstream ss;
ss << "_fde_" << fde.beg_ip;
return ss.str();
},
//new NativeSwitchCompiler()
sw_compiler
);
});
code_gen.generate();
#ifdef STATS
cerr << "Factoring stats:\nRefers: "
<< sw_compiler->get_stats().refer_count
<< "\nGenerated: "
<< sw_compiler->get_stats().generated_count
<< "\nAvoided: "
<< sw_compiler->get_stats().refer_count
- sw_compiler->get_stats().generated_count
<< "\n";
#endif
return 0;
}

View file

@ -1,63 +0,0 @@
#pragma once
#include <dwarfpp/expr.hpp>
static const dwarf::encap::loc_expr REFERENCE_PLT_EXPR(
std::vector<dwarf::encap::expr_instr> {
{
{
.lr_atom = 0x77,
.lr_number = 8,
.lr_number2 = 0,
.lr_offset = 0
},
{
.lr_atom = 0x80,
.lr_number = 0,
.lr_number2 = 0,
.lr_offset = 2
},
{
.lr_atom = 0x3f,
.lr_number = 15,
.lr_number2 = 0,
.lr_offset = 4
},
{
.lr_atom = 0x1a,
.lr_number = 0,
.lr_number2 = 0,
.lr_offset = 5
},
{
.lr_atom = 0x3b,
.lr_number = 11,
.lr_number2 = 0,
.lr_offset = 6
},
{
.lr_atom = 0x2a,
.lr_number = 0,
.lr_number2 = 0,
.lr_offset = 7
},
{
.lr_atom = 0x33,
.lr_number = 3,
.lr_number2 = 0,
.lr_offset = 8
},
{
.lr_atom = 0x24,
.lr_number = 0,
.lr_number2 = 0,
.lr_offset = 9
},
{
.lr_atom = 0x22,
.lr_number = 0,
.lr_number2 = 0,
.lr_offset = 10
}
}
});

View file

@ -3,6 +3,4 @@
namespace settings {
SwitchGenerationPolicy switch_generation_policy = SGP_SwitchPerFunc;
std::string pc_list = "";
bool enable_deref_arg = false;
bool keep_holes = false;
}

View file

@ -15,7 +15,4 @@ namespace settings {
extern SwitchGenerationPolicy switch_generation_policy;
extern std::string pc_list;
extern bool enable_deref_arg;
extern bool keep_holes; /**< Keep holes between FDEs. Larger eh_elf files,
but more accurate unwinding. */
}

View file

@ -214,9 +214,6 @@ _fde_func_t fde_handler_for_pc(uintptr_t pc, MemoryMapEntry& mmap_entry) {
}
bool unwind_context(unwind_context_t& ctx) {
if(ctx.rbp == 0 || ctx.rip + 1 == 0)
return false;
MemoryMapEntry* mmap_entry = get_mmap_entry(ctx.rip);
if(mmap_entry == nullptr)
return false;
@ -228,9 +225,8 @@ bool unwind_context(unwind_context_t& ctx) {
uintptr_t tr_pc = ctx.rip - mmap_entry->beg;
ctx = fde_func(ctx, tr_pc);
if(ctx.rip + 1 == 0 && ctx.rsp + 1 == 0 && ctx.rbp + 1 == 0) // no entry
if(ctx.rbp == 0 || ctx.rip + 1 == 0)
return false;
return true;
}
@ -240,15 +236,3 @@ void walk_stack(const std::function<void(const unwind_context_t&)>& mapped) {
mapped(ctx);
} while(unwind_context(ctx));
}
uintptr_t get_register(const unwind_context_t& ctx, StackWalkerRegisters reg) {
switch(reg) {
case SW_REG_RIP:
return ctx.rip;
case SW_REG_RSP:
return ctx.rsp;
case SW_REG_RBP:
return ctx.rbp;
}
assert(0);
}

View file

@ -9,12 +9,6 @@
#include "../shared/context_struct.h"
/** Handled registers list */
enum StackWalkerRegisters {
SW_REG_RIP,
SW_REG_RSP,
SW_REG_RBP
};
/** Initialize the stack walker. This must be called only once.
*
@ -41,7 +35,3 @@ bool unwind_context(unwind_context_t& ctx);
/** Call the passed function once per frame in the call stack, most recent
* frame first, with the current context as its sole argument. */
void walk_stack(const std::function<void(const unwind_context_t&)>& mapped);
/** Get a register's value on an unwind_context_t. This is useful for other
* implementations of stack_walker that use different unwind_context_t */
uintptr_t get_register(const unwind_context_t& ctx, StackWalkerRegisters reg);

View file

@ -1,18 +0,0 @@
CXX=g++
CXXLIBS=-lunwind -lunwind-x86_64
CXXFLAGS=-O2 -fPIC -Wall -Wextra -rdynamic
TARGET=libstack_walker.so
OBJS=stack_walker.o
all: $(TARGET)
$(TARGET): $(OBJS)
$(CXX) $(CXXFLAGS) -shared $< $(CXXLIBS) -o $@
%.o: %.cpp
$(CXX) $(CXXFLAGS) -c $< -o $@
.PHONY: clean
clean:
rm -f $(OBJS) $(TARGET_BASE)*.so

View file

@ -1,96 +0,0 @@
#include "../stack_walker/stack_walker.hpp"
#include <libunwind.h>
#include <vector>
#include <cassert>
struct UnwPtrs {
unw_context_t* ctx;
unw_cursor_t* cursor;
};
static std::vector<UnwPtrs> to_delete;
bool stack_walker_init() {
return true;
}
void stack_walker_close() {
for(auto& del: to_delete) {
delete del.ctx;
delete del.cursor;
}
}
static unw_cursor_t* get_cursor(const unwind_context_t& context) {
// This is subtly dirty.
// unwind_context_t::rip is uint64_t, thus suitable for a pointer.
return (unw_cursor_t*) context.rip;
}
static void set_cursor(unwind_context_t& context, unw_cursor_t* unw_cursor) {
// This is subtly dirty.
// unwind_context_t::rip is uint64_t, thus suitable for a pointer.
context.rip = (uintptr_t) unw_cursor;
}
unwind_context_t get_context() {
int rc;
unw_context_t* unw_context = new unw_context_t;
rc = unw_getcontext(unw_context);
if(rc < 0)
assert(0);
unw_cursor_t* cursor = new unw_cursor_t;
rc = unw_init_local(cursor, unw_context);
if(rc < 0)
assert(0);
rc = unw_step(cursor);
if(rc < 0)
assert(0);
unwind_context_t out;
set_cursor(out, cursor);
to_delete.push_back(UnwPtrs({unw_context, cursor}));
return out;
}
bool unwind_context(unwind_context_t& ctx) {
int rc = unw_step(get_cursor(ctx));
return rc > 0;
}
void walk_stack(const std::function<void(const unwind_context_t&)>& mapped) {
unwind_context_t context = get_context();
do {
mapped(context);
} while(unwind_context(context));
}
uintptr_t get_register(const unwind_context_t& ctx, StackWalkerRegisters reg) {
unw_cursor_t* cursor = get_cursor(ctx);
unw_regnum_t regnum = 0;
switch(reg) {
case SW_REG_RIP:
regnum = UNW_X86_64_RIP;
break;
case SW_REG_RSP:
regnum = UNW_X86_64_RSP;
break;
case SW_REG_RBP:
regnum = UNW_X86_64_RBP;
break;
}
uintptr_t out;
int rc = unw_get_reg(cursor, regnum, &out);
if(rc < 0)
assert(false);
return out;
}

View file

@ -1,37 +0,0 @@
/** Provides stack walking facilities exploiting the eh_elf objects, that can
* be loaded at runtime. */
#pragma once
#include <cstdint>
#include <functional>
#include "../shared/context_struct.h"
/** Initialize the stack walker. This must be called only once.
*
* \return true iff everything was correctly initialized.
*/
bool stack_walker_init();
/** Deallocate everything that was allocated by the stack walker */
void stack_walker_close();
/** Get the unwind context of the point from which this function was called.
*
* This context must then be exploited straight away: it is unsafe to alter the
* call stack before using it, in particular by returning from the calling
* function. */
unwind_context_t get_context();
/** Unwind the passed context once, in place.
*
* Returns `true` if the context was actually unwinded, or `false` if the end
* of the call stack was reached. */
bool unwind_context(unwind_context_t& ctx);
/** Call the passed function once per frame in the call stack, most recent
* frame first, with the current context as its sole argument. */
void walk_stack(const std::function<void(const unwind_context_t&)>& mapped);

3
stats/.gitignore vendored
View file

@ -1,3 +0,0 @@
venv
elf_data*
gathered

View file

@ -1,11 +0,0 @@
# Statistical scripts
Computes stats about a whole lot of stuff.
## Setup
```sh
virtualenv -p python3 venv # Do this only once
source venv/bin/activate # Do this for every new shell working running the script
pip install -r requirements.txt # Do this only once
```

View file

View file

@ -1,106 +0,0 @@
#!/usr/bin/env python3
from stats_accu import StatsAccumulator
import gather_stats
import argparse
import sys
class Config:
def __init__(self):
args = self.parse_args()
self._cores = args.cores
self.feature = args.feature
if args.feature == 'gather':
self.output = args.output
elif args.feature == 'sample':
self.size = int(args.size)
self.output = args.output
elif args.feature == 'analyze':
self.data_file = args.data_file
@property
def cores(self):
if self._cores <= 0:
return None
return self._cores
def parse_args(self):
parser = argparse.ArgumentParser(
description="Gather statistics about system-related ELFs")
parser.add_argument('--cores', '-j', default=1, type=int,
help=("Use N cores for processing. Defaults to "
"1. 0 to use up all cores."))
subparsers = parser.add_subparsers(help='Subcommands')
# Sample stats
parser_sample = subparsers.add_parser(
'sample',
help='Same as gather, but for a random subset of files')
parser_sample.set_defaults(feature='sample')
parser_sample.add_argument('--size', '-n',
default=1000,
help=('Pick this number of files'))
parser_sample.add_argument('--output', '-o',
default='elf_data',
help=('Output data to this file. Defaults '
'to "elf_data"'))
# Gather stats
parser_gather = subparsers.add_parser(
'gather',
help=('Gather system data into a file, to allow multiple '
'analyses without re-scanning the whole system.'))
parser_gather.set_defaults(feature='gather')
parser_gather.add_argument('--output', '-o',
default='elf_data',
help=('Output data to this file. Defaults '
'to "elf_data"'))
# Analyze stats
parser_analyze = subparsers.add_parser(
'analyze',
help='Analyze data gathered by a previous run.')
parser_analyze.set_defaults(feature='analyze')
parser_analyze.add_argument('data_file',
default='elf_data',
help=('Analyze this data file. Defaults '
'to "elf_data".'))
# TODO histogram?
out = parser.parse_args()
if 'feature' not in out:
print("No subcommand specified.", file=sys.stderr)
parser.print_usage(file=sys.stderr)
sys.exit(1)
return out
def main():
config = Config()
if config.feature == 'gather':
stats_accu = gather_stats.gather_system_files(config)
stats_accu.dump(config.output)
elif config.feature == 'sample':
stats_accu = gather_stats.gather_system_files(
config,
sample_size=config.size)
stats_accu.dump(config.output)
elif config.feature == 'analyze':
print("Not implemented", file=sys.stderr)
stats_accu = StatsAccumulator.load(config.data_file)
sys.exit(1)
if __name__ == '__main__':
main()

View file

@ -1,147 +0,0 @@
from elftools.common.exceptions import DWARFError
from pyelftools_overlay import system_elfs, get_cfi
from elftools.dwarf import callframe
import concurrent.futures
import random
from stats_accu import \
StatsAccumulator, SingleFdeData, FdeData, DwarfInstr
class ProcessWrapper:
def __init__(self, fct):
self._fct = fct
def __call__(self, elf_descr):
try:
path, elftype = elf_descr
print("Processing {}".format(path))
cfi = get_cfi(path)
if not cfi:
return None
return self._fct(path, elftype, cfi)
except DWARFError:
return None
def process_wrapper(fct):
return ProcessWrapper(fct)
@process_wrapper
def process_elf(path, elftype, cfi):
''' Process a single file '''
data = FdeData()
for entry in cfi:
if isinstance(entry, callframe.CIE): # Is a CIE
process_cie(entry, data)
elif isinstance(entry, callframe.FDE): # Is a FDE
process_fde(entry, data)
return SingleFdeData(path, elftype, data)
def incr_cell(table, key):
''' Increments table[key], or sets it to 1 if unset '''
if key in table:
table[key] += 1
else:
table[key] = 1
def process_cie(cie, data):
''' Process a CIE '''
pass # Nothing needed from a CIE
def process_fde(fde, data):
''' Process a FDE '''
data.fde_count += 1
decoded = fde.get_decoded()
row_count = len(decoded.table)
incr_cell(data.fde_with_lines, row_count)
for row in decoded.table:
process_reg(data.regs.cfa, row['cfa'])
for entry in row:
if isinstance(entry, int):
process_reg(data.regs.regs[entry], row[entry])
def process_reg(out_reg, reg_def):
''' Process a register '''
if isinstance(reg_def, callframe.CFARule):
if reg_def.reg is not None:
out_reg.regs[reg_def.reg] += 1
else:
pass # TODO exprs
else:
incr_cell(out_reg.instrs, DwarfInstr.of_pyelf(reg_def.type))
if reg_def.type == callframe.RegisterRule.REGISTER:
out_reg.regs[reg_def.arg] += 1
elif (reg_def.type == callframe.RegisterRule.EXPRESSION) \
or (reg_def.type == callframe.RegisterRule.VAL_EXPRESSION):
pass # TODO exprs
def gather_system_files(config, sample_size=None):
stats_accu = StatsAccumulator()
elf_list = []
for elf_path in system_elfs():
elf_list.append(elf_path)
if sample_size is not None:
elf_list_sampled = random.sample(elf_list, sample_size)
elf_list = elf_list_sampled
if config.cores > 1:
with concurrent.futures.ProcessPoolExecutor(max_workers=config.cores)\
as executor:
for fde in executor.map(process_elf, elf_list):
stats_accu.add_fde(fde)
else:
for elf in elf_list:
stats_accu.add_fde(process_elf(elf))
return stats_accu
def map_system_files(mapper, sample_size=None, cores=None, include=None,
elflist=None):
''' `mapper` must take (path, elf_type, cfi) '''
if cores is None:
cores = 1
if include is None:
include = []
mapper = process_wrapper(mapper)
if elflist is None:
elf_list = []
for elf_path in system_elfs():
elf_list.append(elf_path)
if sample_size is not None:
elf_list_sampled = random.sample(elf_list, sample_size)
elf_list = elf_list_sampled
elf_list += list(map(lambda x: (x, None), include))
else:
elf_list = elflist
if cores > 1:
with concurrent.futures.ProcessPoolExecutor(max_workers=cores)\
as executor:
out = executor.map(mapper, elf_list)
else:
out = map(mapper, elf_list)
return out, elf_list

View file

@ -1,228 +0,0 @@
from elftools.dwarf import callframe
import gather_stats
import itertools
import functools
REGS_IDS = {
'RAX': 0,
'RDX': 1,
'RCX': 2,
'RBX': 3,
'RSI': 4,
'RDI': 5,
'RBP': 6,
'RSP': 7,
'R8': 8,
'R9': 9,
'R10': 10,
'R11': 11,
'R12': 12,
'R13': 13,
'R14': 14,
'R15': 15,
'RIP': 16
}
ID_TO_REG = [
'RAX',
'RDX',
'RCX',
'RBX',
'RSI',
'RDI',
'RBP',
'RSP',
'R8',
'R9',
'R10',
'R11',
'R12',
'R13',
'R14',
'R15',
'RIP',
]
HANDLED_REGS = list(map(lambda x: REGS_IDS[x], [
'RIP',
'RSP',
'RBP',
'RBX',
]))
ONLY_HANDLED_REGS = True # only analyzed handled regs columns
PLT_EXPR = [119, 8, 128, 0, 63, 26, 59, 42, 51, 36, 34] # Handled exp
def accumulate_regs(reg_list):
out = [0] * 17
for lst in reg_list:
for pos in range(len(lst)):
out[pos] += lst[pos]
return out
def filter_none(lst):
for x in lst:
if x:
yield x
def deco_filter_none(fct):
def wrap(lst):
return fct(filter_none(lst))
return wrap
class FdeProcessor:
def __init__(self, fct, reducer=None):
self._fct = fct
self._reducer = reducer
def __call__(self, path, elftype, cfi):
out = []
for entry in cfi:
if isinstance(entry, callframe.FDE):
decoded = entry.get_decoded()
out.append(self._fct(path, entry, decoded))
if self._reducer is not None and len(out) >= 2:
out = [self._reducer(out)]
return out
class FdeProcessorReduced:
def __init__(self, reducer):
self._reducer = reducer
def __call__(self, fct):
return FdeProcessor(fct, self._reducer)
def fde_processor(fct):
return FdeProcessor(fct)
def fde_processor_reduced(reducer):
return FdeProcessorReduced(reducer)
def is_handled_expr(expr):
if expr == PLT_EXPR:
return True
if len(expr) == 2 and 0x70 <= expr[0] <= 0x89:
if expr[0] - 0x70 in HANDLED_REGS:
return True
return False
# @fde_processor
def find_non_cfa(path, fde, decoded):
regs_seen = 0
non_handled_regs = 0
non_handled_exp = 0
cfa_dat = [0, 0] # Seen, expr
rule_type = {
callframe.RegisterRule.UNDEFINED: 0,
callframe.RegisterRule.SAME_VALUE: 0,
callframe.RegisterRule.OFFSET: 0,
callframe.RegisterRule.VAL_OFFSET: 0,
callframe.RegisterRule.REGISTER: 0,
callframe.RegisterRule.EXPRESSION: 0,
callframe.RegisterRule.VAL_EXPRESSION: 0,
callframe.RegisterRule.ARCHITECTURAL: 0,
}
problematic_paths = set()
for row in decoded.table:
for entry in row:
reg_def = row[entry]
if entry == 'cfa':
cfa_dat[0] += 1
if reg_def.expr:
cfa_dat[1] += 1
if not is_handled_expr(reg_def.expr):
non_handled_exp += 1
problematic_paths.add(path)
elif reg_def:
if reg_def.reg not in HANDLED_REGS:
non_handled_regs += 1
problematic_paths.add(path)
if not isinstance(entry, int): # CFA or PC
continue
if ONLY_HANDLED_REGS and entry not in HANDLED_REGS:
continue
rule_type[reg_def.type] += 1
reg_rule = reg_def.type
if reg_rule in [callframe.RegisterRule.OFFSET,
callframe.RegisterRule.VAL_OFFSET]:
regs_seen += 1 # CFA
elif reg_rule == callframe.RegisterRule.REGISTER:
regs_seen += 1
if reg_def.arg not in HANDLED_REGS:
problematic_paths.add(path)
non_handled_regs += 1
elif reg_rule in [callframe.RegisterRule.EXPRESSION,
callframe.RegisterRule.VAL_EXPRESSION]:
expr = reg_def.arg
if not is_handled_expr(reg_def.arg):
problematic_paths.add(path)
with open('/tmp/exprs', 'a') as handle:
handle.write('[{} - {}] {}\n'.format(
path, fde.offset,
', '.join(map(lambda x: hex(x), expr))))
non_handled_exp += 1
return (regs_seen, non_handled_regs, non_handled_exp, rule_type, cfa_dat,
problematic_paths)
def reduce_non_cfa(lst):
def merge_dict(d1, d2):
for x in d1:
d1[x] += d2[x]
return d1
def merge_list(l1, l2):
out = []
for pos in range(len(l1)): # Implicit assumption len(l1) == len(l2)
out.append(l1[pos] + l2[pos])
return out
def merge_elts(accu, elt):
accu_regs, accu_nh, accu_exp, accu_rt, accu_cfa, accu_paths = accu
elt_regs, elt_nh, elt_exp, elt_rt, elt_cfa, elf_paths = elt
return (
accu_regs + elt_regs,
accu_nh + elt_nh,
accu_exp + elt_exp,
merge_dict(accu_rt, elt_rt),
merge_list(accu_cfa, elt_cfa),
accu_paths.union(elf_paths),
)
return functools.reduce(merge_elts, lst)
@deco_filter_none
def flatten_non_cfa(result):
flat = itertools.chain.from_iterable(result)
out = reduce_non_cfa(flat)
out_cfa = {
'seen': out[4][0],
'expr': out[4][1],
'offset': out[4][0] - out[4][1],
}
out = (out[0],
(out[1], out[0] + out_cfa['offset']),
(out[2], out[3]['EXPRESSION'] + out_cfa['expr']),
out[3],
out_cfa,
out[5])
return out

View file

@ -1,110 +0,0 @@
""" Overlay of PyElfTools for quick access to what we want here """
from elftools.elf.elffile import ELFFile
from elftools.common.exceptions import ELFError, DWARFError
from stats_accu import ElfType
import os
ELF_BLACKLIST = [
'/usr/lib/libavcodec.so',
]
def get_cfi(path):
''' Get the CFI entries from the ELF at the provided path '''
try:
with open(path, 'rb') as file_handle:
elf_file = ELFFile(file_handle)
if not elf_file.has_dwarf_info():
print("No DWARF")
return None
dw_info = elf_file.get_dwarf_info()
if dw_info.has_CFI():
cfis = dw_info.CFI_entries()
elif dw_info.has_EH_CFI():
cfis = dw_info.EH_CFI_entries()
else:
print("No CFI")
return None
except ELFError:
print("ELF Error")
return None
except DWARFError:
print("DWARF Error")
return None
except PermissionError:
print("Permission Error")
return None
except KeyError:
print("Key Error")
return None
return cfis
def system_elfs():
''' Iterator over system libraries '''
def readlink_rec(path):
if not os.path.islink(path):
return path
return readlink_rec(
os.path.join(os.path.dirname(path),
os.readlink(path)))
sysbin_dirs = [
('/lib', ElfType.ELF_LIB),
('/usr/lib', ElfType.ELF_LIB),
('/usr/local/lib', ElfType.ELF_LIB),
('/bin', ElfType.ELF_BINARY),
('/usr/bin', ElfType.ELF_BINARY),
('/usr/local/bin', ElfType.ELF_BINARY),
('/sbin', ElfType.ELF_BINARY),
]
to_explore = sysbin_dirs
seen_elfs = set()
while to_explore:
bindir, elftype = to_explore.pop()
if not os.path.isdir(bindir):
continue
for direntry in os.scandir(bindir):
if not direntry.is_file():
if direntry.is_dir():
to_explore.append((direntry.path, elftype))
continue
canonical_name = readlink_rec(direntry.path)
for blacked in ELF_BLACKLIST:
if canonical_name.startswith(blacked):
continue
if canonical_name in seen_elfs:
continue
valid_elf = True
try:
with open(canonical_name, 'rb') as handle:
magic_bytes = handle.read(4)
if magic_bytes != b'\x7fELF':
valid_elf = False
elf_class = handle.read(1)
if elf_class != b'\x02': # ELF64
valid_elf = False
except Exception:
continue
if not valid_elf:
continue
if not os.path.isfile(canonical_name):
continue
seen_elfs.add(canonical_name)
yield (canonical_name, elftype)

View file

@ -1 +0,0 @@
git+https://github.com/eliben/pyelftools

View file

@ -1,263 +0,0 @@
from elftools.dwarf import callframe
import enum
import subprocess
import re
import json
import collections
from math import ceil
class ProportionFinder:
''' Finds figures such as median, etc. on the original structure of a
dictionnary mapping a value to its occurrence count '''
def __init__(self, count_per_value):
self.cumulative = []
prev_count = 0
for key in sorted(count_per_value.keys()):
n_count = prev_count + count_per_value[key]
self.cumulative.append(
(key, n_count))
prev_count = n_count
self.elem_count = prev_count
def find_at_proportion(self, proportion):
if not self.cumulative: # Empty list
return None
low_bound = ceil(self.elem_count * proportion)
def binsearch(beg, end):
med = ceil((beg + end) / 2)
if beg + 1 == end:
return self.cumulative[beg][0]
if self.cumulative[med - 1][1] < low_bound:
return binsearch(med, end)
return binsearch(beg, med)
return binsearch(0, len(self.cumulative))
def elf_so_deps(path):
''' Get the list of shared objects dependencies of the given ELF object.
This is obtained by running `ldd`. '''
deps_list = []
try:
ldd_output = subprocess.check_output(['/usr/bin/ldd', path]) \
.decode('utf-8')
ldd_re = re.compile(r'^.* => (.*) \(0x[0-9a-fA-F]*\)$')
ldd_lines = ldd_output.strip().split('\n')
for line in ldd_lines:
line = line.strip()
match = ldd_re.match(line)
if match is None:
continue # Just ignore that line — it might be eg. linux-vdso
deps_list.append(match.group(1))
return deps_list
except subprocess.CalledProcessError as exn:
raise Exception(
("Cannot get dependencies for {}: ldd terminated with exit code "
"{}.").format(path, exn.returncode))
class ElfType(enum.Enum):
ELF_LIB = enum.auto()
ELF_BINARY = enum.auto()
class DwarfInstr(enum.Enum):
@staticmethod
def of_pyelf(val):
_table = {
callframe.RegisterRule.UNDEFINED: DwarfInstr.INSTR_UNDEF,
callframe.RegisterRule.SAME_VALUE: DwarfInstr.INSTR_SAME_VALUE,
callframe.RegisterRule.OFFSET: DwarfInstr.INSTR_OFFSET,
callframe.RegisterRule.VAL_OFFSET: DwarfInstr.INSTR_VAL_OFFSET,
callframe.RegisterRule.REGISTER: DwarfInstr.INSTR_REGISTER,
callframe.RegisterRule.EXPRESSION: DwarfInstr.INSTR_EXPRESSION,
callframe.RegisterRule.VAL_EXPRESSION:
DwarfInstr.INSTR_VAL_EXPRESSION,
callframe.RegisterRule.ARCHITECTURAL:
DwarfInstr.INSTR_ARCHITECTURAL,
}
return _table[val]
INSTR_UNDEF = enum.auto()
INSTR_SAME_VALUE = enum.auto()
INSTR_OFFSET = enum.auto()
INSTR_VAL_OFFSET = enum.auto()
INSTR_REGISTER = enum.auto()
INSTR_EXPRESSION = enum.auto()
INSTR_VAL_EXPRESSION = enum.auto()
INSTR_ARCHITECTURAL = enum.auto()
def intify_dict(d):
out = {}
for key in d:
try:
nKey = int(key)
except Exception:
nKey = key
try:
out[nKey] = int(d[key])
except ValueError:
out[nKey] = d[key]
return out
class RegData:
def __init__(self, instrs=None, regs=None, exprs=None):
if instrs is None:
instrs = {}
if regs is None:
regs = [0]*17
if exprs is None:
exprs = {}
self.instrs = intify_dict(instrs)
self.regs = regs
self.exprs = intify_dict(exprs)
@staticmethod
def map_dict_keys(fnc, dic):
out = {}
for key in dic:
out[fnc(key)] = dic[key]
return out
def dump(self):
return {
'instrs': RegData.map_dict_keys(lambda x: x.value, self.instrs),
'regs': self.regs,
'exprs': self.exprs,
}
@staticmethod
def load(data):
return RegData(
instrs=RegData.map_dict_keys(
lambda x: DwarfInstr(int(x)),
data['instrs']),
regs=data['regs'],
exprs=data['exprs'],
)
class RegsList:
def __init__(self, cfa=None, regs=None):
if cfa is None:
cfa = RegsList.fresh_reg()
if regs is None:
regs = [RegsList.fresh_reg() for _ in range(17)]
self.cfa = cfa
self.regs = regs
@staticmethod
def fresh_reg():
return RegData()
def dump(self):
return {
'cfa': RegData.dump(self.cfa),
'regs': [RegData.dump(r) for r in self.regs],
}
@staticmethod
def load(data):
return RegsList(
cfa=RegData.load(data['cfa']),
regs=[RegData.load(r) for r in data['regs']],
)
class FdeData:
def __init__(self, fde_count=0, fde_with_lines=None, regs=None):
if fde_with_lines is None:
fde_with_lines = {}
if regs is None:
regs = RegsList()
self.fde_count = fde_count
self.fde_with_lines = intify_dict(fde_with_lines)
self.regs = regs
def dump(self):
return {
'fde_count': self.fde_count,
'fde_with_lines': self.fde_with_lines,
'regs': self.regs.dump(),
}
@staticmethod
def load(data):
return FdeData(
fde_count=int(data['fde_count']),
fde_with_lines=data['fde_with_lines'],
regs=RegsList.load(data['regs']))
class SingleFdeData:
def __init__(self, path, elf_type, data):
self.path = path
self.elf_type = elf_type
self.data = data # < of type FdeData
self.gather_deps()
def gather_deps(self):
""" Collect ldd data on the binary """
# self.deps = elf_so_deps(self.path)
self.deps = []
def dump(self):
return {
'path': self.path,
'elf_type': self.elf_type.value,
'data': self.data.dump()
}
@staticmethod
def load(data):
return SingleFdeData(
data['path'],
ElfType(int(data['elf_type'])),
FdeData.load(data['data']))
class StatsAccumulator:
def __init__(self):
self.fdes = []
def add_fde(self, fde_data):
if fde_data:
self.fdes.append(fde_data)
def get_fdes(self):
return self.fdes
def add_stats_accu(self, stats_accu):
for fde in stats_accu.get_fdes():
self.add_fde(fde)
def dump(self, path):
dict_form = [fde.dump() for fde in self.fdes]
with open(path, 'w') as handle:
handle.write(json.dumps(dict_form))
@staticmethod
def load(path):
with open(path, 'r') as handle:
text = handle.read()
out = StatsAccumulator()
out.fdes = [SingleFdeData.load(data) for data in json.loads(text)]
return out

View file

@ -1,22 +1,10 @@
CXX=g++
CXXFLAGS=-Wall -Wextra -O0 -g -I../stack_walker -rdynamic
TARGET_BASE=stack_walked
TARGETS= \
$(TARGET_BASE).per_func.bin \
$(TARGET_BASE).global.bin \
$(TARGET_BASE).libunwind.bin
all: $(TARGETS)
stack_walked.libunwind.bin: stack_walked.cpp
LD_RUN_PATH=../stack_walker_libunwind \
$(CXX) $(CXXFLAGS) -o $@ $^ \
-L../stack_walker_libunwind -ldl -lstack_walker
all: stack_walked.per_func.bin stack_walked.global.bin
stack_walked.%.bin: stack_walked.cpp
LD_RUN_PATH=eh_elfs.$*:../stack_walker \
$(CXX) $(CXXFLAGS) -o $@ $^ \
LD_RUN_PATH=eh_elfs.$* $(CXX) $(CXXFLAGS) -o $@ $^ \
-L../stack_walker -ldl -lstack_walker.$*
eh_elfs.per_func: stack_walked.per_func.bin

View file

@ -14,9 +14,9 @@ void stack_filled() {
printf("#%d - %s — %%rip = 0x%lx, %%rbp = 0x%lx, %%rsp = 0x%lx\n",
++frame_id,
(dl_rc != 0) ? func_info.dli_sname : "[Unknown func]",
get_register(ctx, SW_REG_RIP),
get_register(ctx, SW_REG_RBP),
get_register(ctx, SW_REG_RSP));
ctx.rip,
ctx.rbp,
ctx.rsp);
});
}