mirror of
https://github.com/tobast/libunwind-eh_elf.git
synced 2025-01-08 18:33:42 +01:00
Update it some more.
(Logical change 1.147)
parent
8365c7bc60
commit
3d24b59dba
1 changed files with 338 additions and 73 deletions
|
@ -10,55 +10,63 @@
|
|||
|
||||
\section{Introduction}
|
||||
|
||||
For \Prog{libunwind} to do its work, it needs to be able to
|
||||
reconstruct the \emph{frame state} of each frame in a call-chain. The
|
||||
frame state consists of some frame registers (such as the
|
||||
instruction-pointer and the stack-pointer) and the locations at which
|
||||
the current value of every callee-saved (``preserved'') register resides.
|
||||
For \Prog{libunwind} to do its job, it needs to be able to reconstruct
|
||||
the \emph{frame state} of each frame in a call-chain. The frame state
|
||||
describes the subset of the machine-state that consists of the
|
||||
\emph{frame registers} (typically the instruction-pointer and the
|
||||
stack-pointer) and all callee-saved registers (preserved registers).
|
||||
The frame state describes each register either by providing its
|
||||
current value (for frame registers) or by providing the location at
|
||||
which the current value is stored (callee-saved registers).
|
||||
|
||||
The purpose of the dynamic unwind-info is therefore to provide
|
||||
\Prog{libunwind} the minimal information it needs about each
|
||||
dynamically generated procedure such that it can reconstruct the
|
||||
procedure's frame state.
|
||||
For statically generated code, the compiler normally takes care of
|
||||
emitting \emph{unwind-info} which provides the minimum amount of
|
||||
information needed to reconstruct the frame-state for each instruction
|
||||
in a procedure. For dynamically generated code, the runtime code
|
||||
generator must use the dynamic unwind-info interface provided by
|
||||
\Prog{libunwind} to supply the equivalent information. This manual
|
||||
page describes the format of this information in detail.
|
||||
|
||||
For the purpose of the following discussion, a \emph{procedure} is any
|
||||
contiguous piece of code. Normally, each procedure directly
|
||||
corresponds to a function in the source-language but this is not
|
||||
strictly required. For example, a runtime code-generator could
|
||||
translate a given function into two separate (discontiguous)
|
||||
procedures: one for frequently-executed (hot) code and one for
|
||||
rarely-executed (cold) code. Similarly, simple source-language
|
||||
functions (usually leaf functions) may get translated into code for
|
||||
which the default unwind-conventions apply and for such code, no
|
||||
dynamic unwind info needs to be registered.
|
||||
For the purpose of this discussion, a \emph{procedure} is defined to
|
||||
be an arbitrary piece of \emph{contiguous} code. Normally, each
|
||||
procedure directly corresponds to a function in the source-language
|
||||
but this is not strictly required. For example, a runtime
|
||||
code-generator could translate a given function into two separate
|
||||
(discontiguous) procedures: one for frequently-executed (hot) code and
|
||||
one for rarely-executed (cold) code. Similarly, simple
|
||||
source-language functions (usually leaf functions) may get translated
|
||||
into code for which the default unwind-conventions apply and for such
|
||||
code, it is not strictly necessary to register dynamic unwind-info.
|
||||
|
||||
Within a procedure, the code can be thought of as being divided into a
|
||||
sequence of \emph{regions}. Each region logically consists of an
|
||||
optional \emph{prologue}, a \emph{body}, and an optional
|
||||
\emph{epilogue}. If present, the prologue sets up the frame state for
|
||||
the body, which does the actual work of the procedure. For example,
|
||||
the prologue may need to allocate a stack-frame and save some
|
||||
callee-saved registers before the body can start executing.
|
||||
Correspondingly, the epilogue, if present, restores the previous frame
|
||||
state and thereby undoes the effect of the prologue. Regions are
|
||||
nested in the sense that the frame state at the end of a region serves
|
||||
as the entry-state of the next region. At the end of several nested
|
||||
regions, there may be a single epilogue which undoes the effect of all
|
||||
the prologues in the nested regions.
|
||||
A procedure logically consists of a sequence of \emph{regions}.
|
||||
Regions are nested in the sense that the frame state at the end of one
|
||||
region is, by default, assumed to be the frame state for the next
|
||||
region. Each region is thought of as being divided into a
|
||||
\emph{prologue}, a \emph{body}, and an \emph{epilogue}. Each of them
|
||||
can be empty. If non-empty, the prologue sets up the frame state for
|
||||
the body. For example, the prologue may need to allocate some space
|
||||
on the stack and save certain callee-saved registers. The body
|
||||
performs the actual work of the procedure but does not change the
|
||||
frame state in any way. If non-empty, the epilogue restores the
|
||||
previous frame state and as such it undoes or cancels the effect of
|
||||
the prologue. In fact, a single epilogue may undo the effect of the
|
||||
prologues of several (nested) regions.
|
||||
|
||||
Even though logically we think of the prologue, body, and epilogue as
|
||||
separate entities, optimizing code-generators will generally
|
||||
interleave instructions from all three entities to achieve higher
|
||||
performance. In fact, as far as the dynamic unwind-info is concerned,
|
||||
there is no distinction at all between prologue and body. Similarly,
|
||||
the exact set of instructions that make up an epilogue is also
|
||||
irrelevant. The only point in the epilogue that needs to be described
|
||||
explicitly is the point at which the stack-pointer gets restored. The
|
||||
reason this point needs to be described is that once the stack-pointer
|
||||
is restored, all values saved in the deallocated portion of the stack
|
||||
become invalid. All other locations that store the values of
|
||||
callee-saved registers are assumed to remain valid through the end
|
||||
of the region.
|
||||
We should point out that even though the prologue, body, and epilogue
|
||||
are logically separate entities, optimizing code-generators will
|
||||
generally interleave instructions from all three entities. For this
|
||||
reason, the dynamic unwind-info interface of \Prog{libunwind} makes no
|
||||
distinction whatsoever between prologue and body. Similarly, the
|
||||
exact set of instructions that make up an epilogue is also irrelevant.
|
||||
The only point in the epilogue that needs to be described explicitly
|
||||
by the dynamic unwind-info is the point at which the stack-pointer
|
||||
gets restored. The reason this point needs to be described is that
|
||||
once the stack-pointer is restored, all values saved in the
|
||||
deallocated portion of the stack frame become invalid and hence
|
||||
\Prog{libunwind} needs to know about it. The portion of the frame
|
||||
state not saved on the stack is assumed to remain valid through the end
|
||||
of the region. For this reason, there is usually no need to describe
|
||||
instructions which restore the contents of callee-saved registers.
|
||||
|
||||
Within a region, each instruction that affects the frame state in some
|
||||
fashion needs to be described with an operation descriptor. For this
|
||||
|
@ -75,40 +83,286 @@ in the stack frame.
|
|||
|
||||
\section{Procedures}
|
||||
|
||||
unw\_dyn\_info\_t
|
||||
unw\_dyn\_proc\_info\_t
|
||||
unw\_dyn\_table\_info\_t
|
||||
unw\_dyn\_remote\_table\_info\_t
|
||||
A runtime code-generator registers the dynamic unwind-info of a
|
||||
procedure by setting up a structure of type \Type{unw\_dyn\_info\_t}
|
||||
and calling \Func{\_U\_dyn\_register}(), passing the address of the
|
||||
structure as the sole argument. The members of the
|
||||
\Type{unw\_dyn\_info\_t} structure are described below:
|
||||
\begin{itemize}
|
||||
\item[\Type{void~*}next] Private to \Prog{libunwind}. Must not be used
|
||||
by the application.
|
||||
\item[\Type{void~*}prev] Private to \Prog{libunwind}. Must not be used
|
||||
by the application.
|
||||
\item[\Type{unw\_word\_t} \Var{start\_ip}] The start-address of the
|
||||
instructions of the procedure (remember: procedures are defined to be
|
||||
contiguous pieces of code, so a single code-range is sufficient).
|
||||
\item[\Type{unw\_word\_t} \Var{end\_ip}] The end-address of the
|
||||
instructions of the procedure (non-inclusive, that is,
|
||||
\Var{end\_ip}-\Var{start\_ip} is the size of the procedure in
|
||||
bytes).
|
||||
\item[\Type{unw\_word\_t} \Var{gp}] The global-pointer value in use
|
||||
for this procedure. The exact meaning of the global-pointer is
|
||||
architecture-specific and on some architectures, it is not used at
|
||||
all.
|
||||
\item[\Type{int32\_t} \Var{format}] The format of the unwind-info.
|
||||
This member can be one of \Const{UNW\_INFO\_FORMAT\_DYNAMIC},
|
||||
\Const{UNW\_INFO\_FORMAT\_TABLE}, or
|
||||
\Const{UNW\_INFO\_FORMAT\_REMOTE\_TABLE}.
|
||||
\item[\Type{union} \Var{u}] This union contains one sub-member
|
||||
structure for every possible unwind-info format:
|
||||
\begin{description}
|
||||
\item[\Type{unw\_dyn\_proc\_info\_t} \Var{pi}] This member is used
|
||||
for format \Const{UNW\_INFO\_FORMAT\_DYNAMIC}.
|
||||
\item[\Type{unw\_dyn\_table\_info\_t} \Var{ti}] This member is used
|
||||
for format \Const{UNW\_INFO\_FORMAT\_TABLE}.
|
||||
\item[\Type{unw\_dyn\_remote\_table\_info\_t} \Var{rti}] This member
|
||||
is used for format \Const{UNW\_INFO\_FORMAT\_REMOTE\_TABLE}.
|
||||
\end{description}\
|
||||
The format of these sub-members is described in detail below.
|
||||
\end{itemize}
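
As a minimal sketch of how the address-range and format members listed above might be filled in for a freshly generated code buffer, consider the following C fragment. The type \Type{sketch\_dyn\_info\_t}, the constant \Const{SKETCH\_INFO\_FORMAT\_DYNAMIC}, and the function \Func{sketch\_describe}() are local stand-ins invented for this illustration; real code would include the \Prog{libunwind} header, use \Type{unw\_dyn\_info\_t}, and pass the structure to \Func{\_U\_dyn\_register}().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for illustration only; real code would use the
   types and constants declared by the libunwind header. */
typedef uintptr_t sketch_word_t;

enum { SKETCH_INFO_FORMAT_DYNAMIC = 0 };  /* placeholder value (assumption) */

typedef struct sketch_dyn_info {
    void *next, *prev;        /* private to the unwinder; left NULL here */
    sketch_word_t start_ip;   /* first byte of the procedure */
    sketch_word_t end_ip;     /* one past the last byte (non-inclusive) */
    sketch_word_t gp;         /* global pointer; 0 if unused */
    int32_t format;           /* selects which union member is valid */
    /* the union u is omitted from this sketch */
} sketch_dyn_info_t;

/* Describe a contiguous code buffer of `size` bytes starting at `code`. */
static void sketch_describe(sketch_dyn_info_t *di, const void *code, size_t size)
{
    assert(di != NULL && code != NULL);
    memset(di, 0, sizeof *di);
    di->start_ip = (sketch_word_t) code;
    di->end_ip = di->start_ip + size;   /* end_ip - start_ip == size */
    di->format = SKETCH_INFO_FORMAT_DYNAMIC;
}
```

Because \Var{end\_ip} is non-inclusive, the difference between the two addresses is exactly the size of the buffer in bytes.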
|
||||
|
||||
\section{Regions}
|
||||
\subsection{Proc-info format}
|
||||
|
||||
unw\_dyn\_region\_info\_t:
|
||||
- insn\_count can be negative to indicate that the region is
|
||||
at the end of the procedure; in such a case, the negated
|
||||
insn\_count value specifies the length of the final region
|
||||
in number of instructions. There must be at most one region
|
||||
with a negative insn\_count and only the last region in a
|
||||
procedure's region list may be negative. Furthermore, both
|
||||
di->start\_ip and di->end\_ip must be valid.
|
||||
This is the preferred dynamic unwind-info format and it is generally
|
||||
the one used by full-blown runtime code-generators. In this format,
|
||||
the details of a procedure are described by a structure of type
|
||||
\Type{unw\_dyn\_proc\_info\_t}. This structure contains the following
|
||||
members:
|
||||
\begin{description}
|
||||
|
||||
\section{Operations}
|
||||
\item[\Type{unw\_word\_t} \Var{name\_ptr}] The address of a
|
||||
(human-readable) name of the procedure or 0 if no such name is
|
||||
available. If non-zero, the string stored at this address must be
|
||||
ASCII NUL terminated. For source languages that use name-mangling
|
||||
(such as C++ or Java) the string stored at this address should be
|
||||
the \emph{demangled} version of the name.
|
||||
|
||||
\item[\Type{unw\_word\_t} \Var{handler}] The address of the
|
||||
personality-routine for this procedure. Personality-routines are
|
||||
used in conjunction with exception handling. See the C++ ABI draft
|
||||
(http://www.codesourcery.com/cxx-abi/) for an overview and a
|
||||
description of the personality routine. If the procedure has no
|
||||
personality routine, \Var{handler} must be set to 0.
|
||||
|
||||
\item[\Type{uint32\_t} \Var{flags}] A bitmask of flags. At the
|
||||
moment, no flags have been defined and this member must be
|
||||
set to 0.
|
||||
|
||||
\item[\Type{unw\_dyn\_region\_info\_t~*}\Var{regions}] A NULL-terminated
|
||||
linked list of region-descriptors. See section ``Region
|
||||
descriptors'' below for more details.
|
||||
|
||||
\end{description}
|
||||
|
||||
\subsection{Table-info format}
|
||||
|
||||
This format is generally used when the dynamically generated code was
|
||||
derived from static code and the unwind-info for the dynamic and the
|
||||
static versions is identical. For example, this format can be useful
|
||||
when loading statically-generated code into an address-space in a
|
||||
non-standard fashion (i.e., through some means other than
|
||||
\Func{dlopen}()). In this format, the details of a group of procedures
|
||||
are described by a structure of type \Type{unw\_dyn\_table\_info\_t}.
|
||||
This structure contains the following members:
|
||||
\begin{description}
|
||||
|
||||
\item[\Type{unw\_word\_t} \Var{name\_ptr}] The address of a
|
||||
(human-readable) name of the procedure or 0 if no such name is
|
||||
available. If non-zero, the string stored at this address must be
|
||||
ASCII NUL terminated. For source languages that use name-mangling
|
||||
(such as C++ or Java) the string stored at this address should be
|
||||
the \emph{demangled} version of the name.
|
||||
|
||||
\item[\Type{unw\_word\_t} \Var{segbase}] The segment-base value
|
||||
that needs to be added to the segment-relative values stored in the
|
||||
unwind-info. The exact meaning of this value is
|
||||
architecture-specific.
|
||||
|
||||
\item[\Type{unw\_word\_t} \Var{table\_len}] The length of the
|
||||
unwind-info (\Var{table\_data}) counted in units of words
|
||||
(\Type{unw\_word\_t}).
|
||||
|
||||
\item[\Type{unw\_word\_t~*}\Var{table\_data}] A pointer to the actual
|
||||
data encoding the unwind-info. The exact format is
|
||||
architecture-specific (see architecture-specific sections below).
|
||||
|
||||
\end{description}
|
||||
|
||||
\subsection{Remote table-info format}
|
||||
|
||||
The remote table-info format has the same basic purpose as the regular
|
||||
table-info format. The only difference is that when \Prog{libunwind}
|
||||
uses the unwind-info, it will keep the table data in the target
|
||||
address-space (which may be remote). Consequently, the type of the
|
||||
\Var{table\_data} member is \Type{unw\_word\_t} rather than a pointer.
|
||||
This implies that \Prog{libunwind} will have to access the table-data
|
||||
via the address-space's \Func{access\_mem}() call-back, rather than
|
||||
through a direct memory reference.
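
The difference between the two table formats can be sketched in C as follows. This is only an illustration: \Type{sketch\_read\_word\_t} is a simplified, hypothetical callback type (the real \Func{access\_mem}() call-back takes additional arguments), \Type{sketch\_word\_t} stands in for \Type{unw\_word\_t} with an assumed width of 64 bits, and \Func{sketch\_read\_from\_image}() is a demo callback that treats a local buffer as the target address-space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t sketch_word_t;  /* stand-in for unw_word_t (assumed width) */

/* Simplified, hypothetical read callback; returns 0 on success. */
typedef int (*sketch_read_word_t)(sketch_word_t addr, sketch_word_t *valp,
                                  void *arg);

/* Table-info style: table_data is a local pointer, read directly. */
static sketch_word_t sketch_local_entry(const sketch_word_t *table_data,
                                        size_t i)
{
    return table_data[i];
}

/* Remote style: table_data is an address in the target address-space,
   so every access goes through the callback. */
static int sketch_remote_entry(sketch_word_t table_data, size_t i,
                               sketch_read_word_t read_word, void *arg,
                               sketch_word_t *valp)
{
    assert(read_word != NULL && valp != NULL);
    return read_word(table_data + i * sizeof(sketch_word_t), valp, arg);
}

/* Demo callback: `arg` points to a fake target memory image that is
   mapped at target address 0. */
static int sketch_read_from_image(sketch_word_t addr, sketch_word_t *valp,
                                  void *arg)
{
    const unsigned char *image = arg;
    memcpy(valp, image + addr, sizeof *valp);
    return 0;
}
```

In the remote case only the words that are actually needed are fetched, which is where the saving for a debugger comes from.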
|
||||
|
||||
From the point of view of a runtime-code generator, the remote
|
||||
table-info format offers no advantage and it is expected that such
|
||||
generators will describe their procedures either with the proc-info
|
||||
format or the normal table-info format. The main reason that the
|
||||
remote table-info format exists is to enable the
|
||||
address-space-specific \Func{find\_proc\_info}() callback (see
|
||||
\SeeAlso{unw\_create\_addr\_space}(3)) to return unwind tables whose
|
||||
data remains in remote memory. This can speed up unwinding (e.g., for
|
||||
a debugger) because it reduces the amount of data that needs to be
|
||||
loaded from remote memory.
|
||||
|
||||
\section{Region descriptors}
|
||||
|
||||
A region descriptor is a variable length structure that describes how
|
||||
each instruction in the region affects the frame state. Of course,
|
||||
most instructions in a region usually do not change the frame state and
|
||||
for those, nothing needs to be recorded in the region descriptor. A
|
||||
region descriptor is a structure of type
|
||||
\Type{unw\_dyn\_region\_info\_t} and has the following members:
|
||||
\begin{description}
|
||||
\item[\Type{unw\_dyn\_region\_info\_t~*}\Var{next}] A pointer to the
|
||||
next region. If this is the last region, \Var{next} is \Const{NULL}.
|
||||
\item[\Type{int32\_t} \Var{insn\_count}] The length of the region in
|
||||
instructions. Each instruction is assumed to have a fixed size (see
|
||||
architecture-specific sections for details). The value of
|
||||
\Var{insn\_count} may be negative in the last region of a procedure
|
||||
(i.e., it may be negative only if \Var{next} is \Const{NULL}). A
|
||||
negative value indicates that the region covers the last \emph{N}
|
||||
instructions of the procedure, where \emph{N} is the absolute value
|
||||
of \Var{insn\_count}.
|
||||
\item[\Type{uint32\_t} \Var{op\_count}] The (allocated) length of
|
||||
the \Var{op} array.
|
||||
\item[\Type{unw\_dyn\_op\_t} \Var{op}] An array of dynamic unwind
|
||||
directives. See Section ``Dynamic unwind directives'' for a
|
||||
description of the directives.
|
||||
\end{description}
|
||||
A region descriptor with an \Var{insn\_count} of zero is an
|
||||
\emph{empty region} and such regions are perfectly legal. In fact,
|
||||
empty regions can be useful to establish a particular frame state
|
||||
before the start of another region.
|
||||
|
||||
A single region list can be shared across multiple procedures provided
|
||||
those procedures share a common prologue and epilogue (their bodies
|
||||
may differ, of course). Normally, such procedures consist of a canned
|
||||
prologue, the body, and a canned epilogue. This could be described by
|
||||
two regions: one covering the prologue and one covering the epilogue.
|
||||
Since the body length is variable, the latter region would need to
|
||||
specify a negative value in \Var{insn\_count} such that
|
||||
\Prog{libunwind} knows that the region covers the end of the procedure
|
||||
(up to the address specified by \Var{end\_ip}).
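
The way a negative \Var{insn\_count} anchors the last region to \Var{end\_ip} can be sketched as follows. The structure \Type{sketch\_region\_t} keeps only the two members needed here, and \Func{sketch\_find\_region}() together with its \Var{insn\_size} parameter are hypothetical names introduced for this illustration. It assumes a fixed instruction size and is not meant to reflect how \Prog{libunwind} itself walks the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for a region descriptor (illustration only). */
typedef struct sketch_region {
    struct sketch_region *next;
    int32_t insn_count;       /* negative only in the last region */
} sketch_region_t;

/* Return the region containing `ip` and store its first address in
   *region_start.  Returns NULL if `ip` lies in a part of the procedure
   that no region covers (e.g., the variable-length body). */
static const sketch_region_t *
sketch_find_region(const sketch_region_t *r, uintptr_t start_ip,
                   uintptr_t end_ip, uintptr_t insn_size, uintptr_t ip,
                   uintptr_t *region_start)
{
    uintptr_t addr = start_ip;

    for (; r != NULL; r = r->next) {
        uintptr_t len;

        if (r->insn_count < 0) {
            /* Covers the last N instructions: anchor it to end_ip. */
            assert(r->next == NULL);
            len = (uintptr_t) (-(int64_t) r->insn_count) * insn_size;
            addr = end_ip - len;
        } else {
            len = (uintptr_t) r->insn_count * insn_size;
        }
        if (ip >= addr && ip < addr + len) {
            *region_start = addr;
            return r;
        }
        addr += len;
    }
    return NULL;
}
```

With a four-instruction prologue region followed by a region whose \Var{insn\_count} is $-2$, the same two descriptors describe procedures of any length, since only \Var{start\_ip} and \Var{end\_ip} differ between them.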
|
||||
|
||||
The region descriptor is a variable length structure to make it
|
||||
possible to allocate all the necessary memory with a single
|
||||
memory-allocation request. To facilitate the allocation of a region
|
||||
descriptor, \Prog{libunwind} provides a helper routine with the
|
||||
following synopsis:
|
||||
|
||||
\noindent
|
||||
\Type{size\_t} \Func{\_U\_dyn\_region\_info\_size}(\Type{int} \Var{op\_count});
|
||||
|
||||
This routine returns the number of bytes needed to hold a region
|
||||
descriptor with space for \Var{op\_count} unwind directives. Note
|
||||
that the length of the \Var{op} array does not have to match exactly
|
||||
with the number of directives in a region. Instead, it is sufficient
|
||||
if the \Var{op} array contains at least as many entries as there are
|
||||
directives, since the end of the directives can always be indicated
|
||||
with the \Const{UNW\_DYN\_STOP} directive.
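
A minimal sketch of such a single-allocation scheme is shown below. The types \Type{sketch\_op\_t} and \Type{sketch\_region\_t} and the two functions are local stand-ins that mirror the members described in this manual page; the use of a C99 flexible array member and of \Func{calloc}() are assumptions of this sketch, not a description of the \Prog{libunwind} implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins mirroring the members described in the text. */
typedef struct sketch_op {
    int8_t tag;
    int8_t qp;
    int16_t reg;
    int32_t when;
    uintptr_t val;
} sketch_op_t;

typedef struct sketch_region {
    struct sketch_region *next;
    int32_t insn_count;
    uint32_t op_count;
    sketch_op_t op[];         /* flexible array member (C99) */
} sketch_region_t;

/* Bytes needed for a descriptor with room for op_count directives. */
static size_t sketch_region_size(int op_count)
{
    assert(op_count >= 0);
    return offsetof(sketch_region_t, op)
           + (size_t) op_count * sizeof(sketch_op_t);
}

/* Allocate header and directive array with one request.  calloc()
   zero-fills the block, so every unused entry has tag 0. */
static sketch_region_t *sketch_region_alloc(int insn_count, int op_count)
{
    sketch_region_t *r = calloc(1, sketch_region_size(op_count));

    if (r == NULL)
        return NULL;
    r->insn_count = insn_count;
    r->op_count = (uint32_t) op_count;
    return r;
}
```

Since the stop tag is guaranteed to have the value 0, a zero-filled \Var{op} array reads as an empty directive list until entries are filled in, which is convenient when the array is allocated larger than needed.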
|
||||
|
||||
\section{Dynamic unwind directives}
|
||||
|
||||
A dynamic unwind directive describes how the frame state changes
|
||||
at a particular point within a region. The description is in
|
||||
the form of a structure of type \Type{unw\_dyn\_op\_t}. This
|
||||
structure has the following members:
|
||||
\begin{description}
|
||||
\item[\Type{int8\_t} \Var{tag}] The operation tag. Must be one
|
||||
of the \Type{unw\_dyn\_operation\_t} values described below.
|
||||
\item[\Type{int8\_t} \Var{qp}] The qualifying predicate that controls
|
||||
whether or not this directive is active. This is useful for
|
||||
predicated architectures such as IA-64 or ARM, where the contents of
|
||||
another (callee-saved) register determines whether or not an
|
||||
instruction is executed (takes effect). If the directive is always
|
||||
active, this member should be set to the manifest constant
|
||||
\Const{\_U\_QP\_TRUE} (this constant is defined for all
|
||||
architectures, predicated or not).
|
||||
\item[\Type{int16\_t} \Var{reg}] The number of the register affected
|
||||
by the instruction.
|
||||
\item[\Type{int32\_t} \Var{when}] The region-relative number of
|
||||
the instruction to which this directive applies. For example,
|
||||
a value of 0 means that the effect described by this directive
|
||||
has taken place once the first instruction in the region has
|
||||
executed.
|
||||
\item[\Type{unw\_word\_t} \Var{val}] The value to be applied by the
|
||||
operation tag. The exact meaning of this value varies by tag. See
|
||||
Section ``Operation tags'' below.
|
||||
\end{description}
|
||||
It is perfectly legitimate to specify multiple dynamic unwind
|
||||
directives with the same \Var{when} value, if a particular instruction
|
||||
has a complex effect on the frame state.
|
||||
|
||||
Empty regions by definition contain no actual instructions and as such
|
||||
the directives are not tied to a particular instruction. By
|
||||
convention, the \Var{when} member should be set to 0, however.
|
||||
|
||||
There is no need for the dynamic unwind directives to appear
|
||||
in order of increasing \Var{when} values. If the directives happen to
|
||||
be sorted in that order, it may result in slightly faster execution,
|
||||
but a runtime code-generator should not go to extra lengths just to
|
||||
ensure that the directives are sorted.
|
||||
|
||||
IMPLEMENTATION NOTE: should \Prog{libunwind} implementations for
|
||||
certain architectures prefer the list of unwind directives to be
|
||||
sorted, it is recommended that such implementations first check
|
||||
whether the list happens to be sorted already and, if not, sort the
|
||||
directives explicitly before the first use. With this approach, the
|
||||
overhead of explicit sorting is only paid when there is a real benefit
|
||||
and if the runtime code-generator happens to generate sorted lists
|
||||
naturally, the performance penalty is limited to a simple O(N) check.
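
The check-then-sort approach from the implementation note can be sketched as follows. \Type{sketch\_op\_t} is a local stand-in for the directive structure, and the function names are hypothetical. The insertion sort is a choice made for this sketch: it is stable, so directives that share a \Var{when} value keep their relative order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the directive structure described above. */
typedef struct sketch_op {
    int8_t tag;
    int8_t qp;
    int16_t reg;
    int32_t when;
    uintptr_t val;
} sketch_op_t;

/* O(N) check: are the first n directives ordered by increasing when? */
static int sketch_is_sorted(const sketch_op_t *op, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        if (op[i - 1].when > op[i].when)
            return 0;
    return 1;
}

/* Sort only if the check fails.  n counts the directives before the
   stop tag, which the caller is expected to have located already. */
static void sketch_sort_if_needed(sketch_op_t *op, size_t n)
{
    assert(op != NULL || n == 0);
    if (sketch_is_sorted(op, n))
        return;
    for (size_t i = 1; i < n; ++i) {   /* stable insertion sort */
        sketch_op_t key = op[i];
        size_t j = i;

        while (j > 0 && op[j - 1].when > key.when) {
            op[j] = op[j - 1];
            --j;
        }
        op[j] = key;
    }
}
```

A code-generator that emits directives in instruction order pays only for the linear scan.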
|
||||
|
||||
\subsection{Operation tags}
|
||||
|
||||
The possible operation tags are defined by enumeration type
|
||||
\Type{unw\_dyn\_operation\_t} which defines the following
|
||||
values:
|
||||
\begin{description}
|
||||
|
||||
\item[\Const{UNW\_DYN\_STOP}] Marks the end of the dynamic unwind
|
||||
directive list. All remaining entries in the \Var{op} array of the
|
||||
region-descriptor are ignored. This tag is guaranteed to have a
|
||||
value of 0.
|
||||
|
||||
\item[\Const{UNW\_DYN\_SAVE\_REG}] Marks an instruction which saves
|
||||
register \Var{reg} to register \Var{val}.
|
||||
|
||||
\item[\Const{UNW\_DYN\_SPILL\_FP\_REL}] Marks an instruction which
|
||||
spills register \Var{reg} to a frame-pointer-relative location. The
|
||||
frame-pointer-relative offset is given by the value stored in member
|
||||
\Var{val}. See the architecture-specific sections for a description
|
||||
of the stack frame layout.
|
||||
|
||||
\item[\Const{UNW\_DYN\_SPILL\_SP\_REL}] Marks an instruction which
|
||||
spills register \Var{reg} to a stack-pointer-relative location. The
|
||||
stack-pointer-relative offset is given by the value stored in member
|
||||
\Var{val}. See the architecture-specific sections for a description
|
||||
of the stack frame layout.
|
||||
|
||||
\item[\Const{UNW\_DYN\_ADD}] Marks an instruction which adds
|
||||
the constant value \Var{val} to register \Var{reg}. To subtract
|
||||
a constant value, store the two's-complement of the value in
|
||||
\Var{val}. The set of registers that can be specified for this tag
|
||||
is described in the architecture-specific sections below.
|
||||
|
||||
\item[\Const{UNW\_DYN\_POP\_FRAMES}]
|
||||
|
||||
\item[\Const{UNW\_DYN\_LABEL\_STATE}]
|
||||
|
||||
\item[\Const{UNW\_DYN\_COPY\_STATE}]
|
||||
|
||||
\item[\Const{UNW\_DYN\_ALIAS}]
|
||||
|
||||
\end{description}
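
The two's-complement convention used with \Const{UNW\_DYN\_ADD} can be illustrated with a short C sketch. Here \Type{sketch\_word\_t} stands in for \Type{unw\_word\_t} and is assumed to be 64 bits wide; both function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

typedef uint64_t sketch_word_t;  /* stand-in for unw_word_t (assumed width) */

/* Encode a signed adjustment as the unsigned word stored in val.
   Converting a negative value to an unsigned type is defined to wrap
   modulo 2^64, which yields its two's-complement representation. */
static sketch_word_t sketch_encode_delta(int64_t delta)
{
    return (sketch_word_t) delta;
}

/* Effect of an add directive on a register value.  Unsigned addition
   wraps, so adding the encoding of -16 subtracts 16. */
static sketch_word_t sketch_apply_add(sketch_word_t reg_value,
                                      sketch_word_t val)
{
    return reg_value + val;
}
```

For example, a prologue that allocates 16 bytes of stack would store the encoding of $-16$ in \Var{val}; applying it to a stack-pointer value of 0x1000 gives 0xff0.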
|
||||
|
||||
unw\_dyn\_operation\_t
|
||||
unw\_dyn\_op\_t
|
||||
\_U\_QP\_TRUE
|
||||
|
||||
unw\_dyn\_info\_format\_t
|
||||
|
||||
- instructions don't have to be sorted in increasing order of ``when''
|
||||
values: In general, if you can generate the sorted order easily
|
||||
(e.g., without an explicit sorting step), I'd recommend doing so
|
||||
because in that case, should some version of libunwind ever require
|
||||
sorted order, libunwind can verify in O(N) that the list is sorted
|
||||
already. In the particular case of the ia64-version of libunwind, a
|
||||
sorted order won't help, since it always scans the instructions up
|
||||
to UNW\_DYN\_STOP.
|
||||
|
||||
\_U\_dyn\_region\_info\_size(opcount);
|
||||
\_U\_dyn\_op\_save\_reg();
|
||||
\_U\_dyn\_op\_spill\_fp\_rel();
|
||||
\_U\_dyn\_op\_spill\_sp\_rel();
|
||||
|
@ -119,6 +373,17 @@ unw\_dyn\_info\_format\_t
|
|||
\_U\_dyn\_op\_alias();
|
||||
\_U\_dyn\_op\_stop();
|
||||
|
||||
\section{IA-64 specifics}
|
||||
|
||||
- meaning of segbase member in table-info/table-remote-info format
|
||||
- format of table\_data in table-info/table-remote-info format
|
||||
- instruction size: each bundle is counted as 3 instructions, regardless
|
||||
of template (MLX)
|
||||
- describe stack-frame layout, especially with regards to sp-relative
|
||||
and fp-relative addressing
|
||||
- UNW\_DYN\_ADD can only add to ``sp'' (always a negative value); use
|
||||
POP\_FRAMES otherwise
|
||||
|
||||
\section{See Also}
|
||||
|
||||
\SeeAlso{libunwind(3)},
|
||||
|
|