dwarf: If the dwarf_readu8 call to set op fails, and if there are register
states pushed onto the stack, the stack is not emptied before the function
returns. This change addresses that.
Most of the rest is eliminating ‘goto fail’ from the code.
0-valued hints are used when they are just initial values with no use as
hints. And I think there were some other problems as well.
This patch cleans up and stores hint values with 1 added, so that 0-valued
hints can be ignored.
unw_get_proc_info must always load the unwind info so that unw_resume
works with GNU_args_size expressions, but must not update
use_prev_expr unless we are unw_step()ing.
blame rev: 4b63a536ee
reported-by: Doug Moore <dougm@rice.edu>
Centralize gnu_debuglink logic in elfxx. Remove previous duplicated logic
from Gfind_proc_info and os-linux.
Logic is roughly the same as previous load_debug_frame, but uses VLAs
instead of malloc.
This fixes GCC 4.9.3 warnings (Linux/mipsel):
In file included from dwarf/Lfind_unwind_table.c:4:0:
dwarf/Gfind_unwind_table.c: In function '_ULmips_dwarf_find_unwind_table':
dwarf/Gfind_unwind_table.c:140:14: warning: cast from pointer to integer of
different size [-Wpointer-to-int-cast]
addr = (unw_word_t) (hdr + 1);
^
dwarf/Gfind_unwind_table.c:196:50: warning: cast from pointer to integer of
different size [-Wpointer-to-int-cast]
+ (addr - (unw_word_t) edi->ei.image
^
dwarf/Gfind_unwind_table.c:202:40: warning: cast from pointer to integer of
different size [-Wpointer-to-int-cast]
+ ((unw_word_t) hdr - (unw_word_t)
edi->ei.image
^
dwarf/Gfind_unwind_table.c:202:59: warning: cast from pointer to integer of
different size [-Wpointer-to-int-cast]
+ ((unw_word_t) hdr - (unw_word_t)
edi->ei.image
^
```
src/dwarf/Gfind_proc_info-lsb.c:536:16: warning: 'eh_frame' may be used uninitialized in this function [-Wmaybe-uninitialized]
Elf_W (Addr) eh_frame;
^
```
Introduced-in: 25413c729a
It is possible to have multiple CFA_args_size adjustments for a single
frame. If the CFA_args_size adjustment is immediately following the
return from a function which can raise an exception, it is possible to
incorrectly adjust the stack pointer. Consider the following:
...
.cfi_escape 0x2e, 0x00
call f
.Ltmp:
.cfi_escape 0x2e, 0x10
lea label@GOTOFF(%ebx), %eax
...
Because we process the CFI program up to and *INCLUDING* IP, where the
IP is the RA, we would process the associated DW_CFA_GNU_args_size for
the post-call instruction. The result would be a DW_CFA_GNU_args_size
of 0x10 rather than 0x00, resulting in an incorrect stack adjustment.
Handle this by processing the CFI operation but not adjusting the state
record unless we are below the current IP.
Add interface for configurable dwarf cache size
* Use item size and round up to nearest power of 2.
* Initial cache still exists in BSS. Without this, it means we would fail
backtrace when out of memory. The test-mem test fails without this
When resuming execution, DW_CFA_GNU_args_size from the current frame
must be added back to the stack pointer. Clang now generates these frequently
at -O3. A simple repro for x86_64, that will crash with clang ~3.9 or newer:
void f(int, int,int,int,int,int,int,int,int);
int main() {
try {
f(0,1,2,3,4,5,6,7,8);
} catch (int) {
return 0;
}
return 1;
}
Where f is something that throws an int, but in a different translation unit to
prevent optimization.
This results in cfi instructions before the call:
.cfi_escape 0x2e, 0x20
Grabbing the args_size means fully parsing the cfi in the current frame, which
is unfortunate because it means nearly twice the work at each step. The logic
to grab args_size can be in unw_step or get_proc_info (since this is always
called before resuming in stack unwinding). Putting it in get_proc_info allows
the more common unw_step code to remain fast.
It would potentially fit in nicely with a proc info cache (as mentioned in the
if0 comment block)
GCC versions 4.9~current will often generate stack alignment prologues like:
lea 0x8(%rsp),%r10
and $0xfffffffffffffff0,%rsp
...
push %rbp
mov %rsp, %rbp
push %r10
resulting in dwarf expressions:
DW_CFA_def_cfa_expression (DW_OP_breg6: -8; DW_OP_deref)
DW_CFA_expression: r6 (rbp) (DW_OP_breg6: 0)
These prologues seem to be generated for SSE/AVX code, but sometimes
other times as well.
tdep_trace fastpath currently falls back to the slow dwarf parsing path
if it encounters any cfa_expressions. Unfortunately this is happening
often enough in our codebase to cause perf issues. We could also fix the
fallback path (make the rs cache bigger, lock-free instead of locking, etc),
but that seems like a separate issue, and it will ever be as fast as the tracing
code. Our binaries each have at least ~100 functions in them like this.
This patch teaches the tdep_trace about the two specific cfa_expressions,
which really just result in a single extra memory dereference of the stack
at a fixed offset from rbp.
Improves support for binaries missing the GNU_EH_FRAME segment
(.eh_frame_hdr section) by adding a function
'dwarf_find_eh_frame_section' that can create a synthetic one.
This fixes GCC 4.9.3 warnings (Linux/mipsel):
dwarf/Gfind_proc_info-lsb.c: In function 'locate_debug_info':
dwarf/Gfind_proc_info-lsb.c:244:23: warning: 'mi.buf_end' may be used
uninitialized in this function [-Wmaybe-uninitialized]
struct map_iterator mi;
^
dwarf/Gfind_proc_info-lsb.c:244:23: warning: 'mi.buf' may be used
uninitialized in this function [-Wmaybe-uninitialized]
In file included from dwarf/Gfind_proc_info-lsb.c:46:0:
./os-linux.h:292:27: warning: 'mi.buf_size' may be used uninitialized in
this function [-Wmaybe-uninitialized]
munmap (mi->buf_end - mi->buf_size, mi->buf_size);
^
dwarf/Gfind_proc_info-lsb.c:244:23: note: 'mi.buf_size' was declared here
struct map_iterator mi;
^
By default, the start_ip_offset in libunwind's table_entry struct is
relative to the unw_dyn_info_t's segbase. This presents a problem
for us in conjunction with using LLVM's MCJIT because it likes to
spread text sections and the corresponding eh_frame sections quite
far apart. This represents my attempt to support this use case in the
simplest manner that is backwards compatible, by adding a new format
kind (UNW_INFO_FORMAT_REMOTE_TABLE2) that indicates that the
`start_ip_offset` should be interpreted as relative to `start_ip`
rather than segbase.
Due to a bug in the gold linker[1], the .eh_frame and .eh_frame_hdr
sections contains garbage. When dwarf_extract_proc_info_from_fde tried
to look up the begin of the CIE subsection, it would underflow the
.eh_frame segment, resulting in a crash[2].
This patch avoids that crash by checking whether the CIE pointer is
located after the begin of the .eh_frame section. The variable "base"
was misused in various places as a boolean (decode as .debug_frame or
decode as .eh_frame). These instances have been renamed to
is_debug_frame where applicable.
Tested on Linux x86_64.
[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=17639
[2]: http://lists.nongnu.org/archive/html/libunwind-devel/2014-11/msg00009.html
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Ubuntu's libc-bin (2.15-0ubuntu20.2) on x86_64 uses DW_CFA_val_expression
in describing the pthread spinlock operations __lll_unlock_wake() and
__lll_lock_wait(). libunwind 1.1 doesn't understand that opcode and
so backtraces from those operations are truncated.
This changeset adds basic support for it, by adding a new type to
dwarf_loc_t that describes the register's actual contents rather than
its location. I've only implemented the new type for x86_64, and
stubbed it out for all other architectures -- it looks like a lot
of that code is duplicated so oughtn't to be that hard, but I don't
have test cases for them.
Tested that DW_CFA_val_expression works on x86_64 (by using
https://code.google.com/p/gperftools/ on a lock-heavy program).
Build-tested on x86, x86_64 and arm. The unit tests don't pass for me
on any of those archs, but this cset doesn't break anything that was
passing before.
Signed-off-by: Tim Deegan <tjd@phlegethon.org>
The dwarf_eval_expr routine uses macros push, pop, and pick to
manipulate the DWARF expression stack. When these macros are
nested, e.g. in the implementation of DW_OP_dup:
push (pick (0));
the combination can lead to unfortunate results.
In particular, when substituting into:
do {
if (tos >= MAX_EXPR_STACK_SIZE)
{
Debug (1, "Stack overflow\n");
return -UNW_EINVAL;
}
stack[tos++] = (x);
} while (0)
a value of "x" that makes use of "tos" (as instances of the
pick or pop macros do), the resulting expression will both
use and modify tos without an intervening sequence point,
which is undefined behavior according to the C standard.
And in fact with current GCC on PowerPC, this leads to a
miscompilation of the DW_OP_dup implementation.
This patch fixes the problem by assigning "x" to a
temporary variable before modifying tos.
Signed-off-by: Ulrich Weigand <uweigand@de.ibm.com>
The DWARF code allocates its unwind_info objects out of a
memory pool. The code which frees the object therefore calls
the mempool freeing code. However, there are cases where the
free code will be run with an unwind_info that was allocated
through a different mechanism (e.g. an ARM exidx table entry).
In these cases, the object should not be freed through the
mempool code.
To correct this, a check was added to ensure that the unwind_info
is of the appropriate type before passing the object along to the
mempool to be freed.
This change adds some special cases to allow libunwind to compile
for QNX.
* QNX's copy of <elf.h> and <link.h> reside in sys/ instead. To deal
with this, an AC_CHECK_HEADERS() was added to check for the files
in both locations.
* Similarly, QNX does not have <endian.h>. In cases where the file is
not found, logic was added to refer to QNX-specific macros to determine
endianness.
* The QCC compiler, which is a wrapper around GCC, cannot handle some
standard GCC options. Therefore, logic was added to check for QCC,
and when it is found, to suppress the use of -lgcc, and to express the
option -nostartfiles as -Wc,-nostartfiles instead, which is correctly
passed on to the underlying GCC.
* Finally, the support file os-qnx.c was added, patterned after the existing
os-*.c files. Only local image lookup is currently supported (see the
comments for more information), but this is sufficient for QNX, since
ptrace is not supported there anyway, and that is the only case where the
function is required to do remote image lookup.
Change-Id: Ie7934f94a7317bdde59335f2acd4c3a97c0384c1
Unwinding over ptrace and unwinding coredump fail to lookup the
.debug_frame dwarf data when the ELF file text segment virtual address
is non-zero. Looking at some binaries, the virtual address is non-zero
for non-pie binaries, and zero for PIC shared libraries and PIE
executables.
The core dump unwinder can be used for demonstrating the bug. Without
this patch, the unwinding fails badly (testing with a ARM qemu image):
$ UNW_ARM_UNWIND_METHOD=1 ./test-coredump-unwind core `cat backing_files`
test-coredump-unwind: unw_get_proc_info(ip=0x86d8) failed: ret=-10
After applying this patch, we can unwind all the way until running out
of dwarf data:
$ UNW_ARM_UNWIND_METHOD=1 ./test-coredump-unwind core `cat backing_files`
ip=0x000086d8 proc=000086d4-000086dc handler=0x00000000 lsda=0x00000000
test-coredump-unwind: step
test-coredump-unwind: step done:1
ip=0x000086ef proc=000086dc-000086f2 handler=0x00000000 lsda=0x00000000
test-coredump-unwind: step
test-coredump-unwind: step done:1
ip=0x000086e7 proc=000086dc-000086f2 handler=0x00000000 lsda=0x00000000
test-coredump-unwind: step
test-coredump-unwind: step done:1
ip=0x00008597 proc=00008584-0000859a handler=0x00000000 lsda=0x00000000
test-coredump-unwind: step
test-coredump-unwind: step done:1
ip=0x76eacc3b proc=76eacba0-76eaccec handler=0x00000000 lsda=0x00000000
test-coredump-unwind: step
test-coredump-unwind: step done:1
test-coredump-unwind: unw_get_proc_info(ip=0x85c3) failed: ret=-10
Note how the binary itself is mapped to address 0x8000, the virtual
address for the text segment is 0x8000, and the .debug_frame program
counter values are relative to 0:
$ tr ' ' '\n' < backing_files
0x8000:/home/user/tests/crasher
0x76e96000:/lib/arm-linux-gnueabi/libc-2.13.so
0x76f77000:/lib/arm-linux-gnueabi/libgcc_s.so.1
0x76f88000:/lib/arm-linux-gnueabi/ld-2.13.so
$ readelf -l crasher
Elf file type is EXEC (Executable file)
Entry point 0x859d
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
EXIDX 0x0007b0 0x000087b0 0x000087b0 0x00030 0x00030 R 0x4
PHDR 0x000034 0x00008034 0x00008034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x00008154 0x00008154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.3]
LOAD 0x000000 0x00008000 0x00008000 0x007e4 0x007e4 R E 0x8000
LOAD 0x000efc 0x00010efc 0x00010efc 0x00148 0x00154 RW 0x8000
DYNAMIC 0x000f08 0x00010f08 0x00010f08 0x000f8 0x000f8 RW 0x4
NOTE 0x000168 0x00008168 0x00008168 0x00044 0x00044 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
GNU_RELRO 0x000efc 0x00010efc 0x00010efc 0x00104 0x00104 R 0x1
$ readelf --debug-dump=frames crasher | grep FDE
00000010 00000024 00000000 FDE cie=00000000 pc=00008614..000086d4
00000038 0000000c 00000000 FDE cie=00000000 pc=000086d4..000086dc
00000048 00000014 00000000 FDE cie=00000000 pc=000086dc..000086f2
00000060 00000014 00000000 FDE cie=00000000 pc=00008584..0000859a
Just pass potentially NULL pointers to free() in the error path in
load_debug_frame(). Saved 40 bytes of code in libunwind.so on ARM -O2
thumb build at the expense of slightly slower execution.