On FreeBSD, as well as on the Solaris < 10, weak pthread_once stub is
always exported from libc. But it does nothing, which means that if
threaded library is not loaded, then pthread_once() call do not actually
call the initializer finction. The construct
if (likely (pthread_once != 0))
{
pthread_once(&trace_cache_once, &trace_cache_init_once);
then fails to initialize the trace cache on x86_64.
Work around by checking that the initializer was indeed called.
Note that this can break if libthr is loaded dynamically, but my belief
is that there is no platforms which allow dynamic loading of the threading
library.
Insert static branch prediction predicates in useful places and avoid
unnecessary code in the hottest paths. Bypass unnecessary indirect
calls, in particular to access_mem(), when known to be safe.
Since the fast unwinding code path doesn't need the full context,
a faster target dependent getcontext is implemented.
Signed-off-by: Lassi Tuura <lat@cern.ch>
Dropping the extra frame for unw_backtrace itself using unw_step is
approximately 15% slower than skipping the frame in tdep_trace. So
drop the frame in the latter, and make the function a private
implementation detail for libunwind, not an exported interface.
Also moves unw_getcontext call back into unw_backtrace to avoid an
extra call frame in case slow_backtrace does not get inlined into
unw_backtrace.
Adds new function to perform a pure stack walk without unwinding,
functionally similar to backtrace() but accelerated by an address
attribute cache the caller maintains across calls.
the instruction after the call for a normal frame. libunwind uses
IP-1 to lookup unwind information. However, this is not necessary for
interrupted frames such as signal frames (or interrupt frames) in
the kernel context.
This patch handles both cases correctly.
Based on work by Mark Wielaard <mwielaard@redhat.com>