Insert static branch prediction predicates in useful places and avoid
unnecessary code in the hottest paths. Bypass unnecessary indirect
calls, in particular to access_mem(), when known to be safe.
This is rather on the obvious side.
While doing strace on an executable using libunwind, I noticed a
lot of:
msync(0, 1, MS_SYNC) = -1 ENOMEM (Cannot allocate memory)
Since we know that the first page isn't mapped (or at least doesn't
contain the data we are looking for), we can eliminate all such
msync calls.
Tested on Linux/x86_64 with no regressions.
bad/missing unwind information, which could result in libunwind
dereferencing bad pointers. This mechanism is based on msync(2) system
call and significantly reduces the chances of a bad pointer
dereference in libunwind.
The original idea was to turn this mechanism on only when necessary
i.e. libunwind didn't find proper unwind information for a IP.
There are a couple of problems in the current implementation.
* The flag is global and is modified without locking
* The flag isn't reset when starting a new unwind
The attached patch makes ->validate a per-thread setting by moving it
into struct cursor from unw_local_addr_space and resets it to false
when starting a new unwind. As a result, cursor->as_arg points to the
cursor itself instead of the ucontext (for the local case).
This was found to reduce the number of msync() system calls from an
application using libunwind significantly.
Signed-off-by: Paul Pluzhnikov <ppluzhnikov@google.com>
Signed-off-by: Arun Sharma <arun.sharma@google.com>
routine and add address-space argument. This is needed because on
PPC64, a the function-name symbol refers to a function descriptor
(unlike, for example, on ia64, where the @fptr() operator is needed to
refer to a function descriptor). Thus, in order to look up the name
of a function, we need to dereference the function descriptor. To
make matters more "interesting", the function descriptors are normally
resolved by the dynamic linker, so we can't get their values from the
ELF file. Instead, we have to read them from the running image, hence
the need for the address-space argument.