Writeup: ELF files

Théophile Bastian 2024-01-06 12:13:21 +01:00
parent 93ffefc8f4
commit 5d4d3e34ae
2 changed files with 50 additions and 2 deletions


@@ -117,8 +117,49 @@ In the case of assembled binaries, as all analyzers were run on Linux,
executables or object files are ELF files. Some analyzers work on sections of
the file defined by user-provided offsets in the binary, while others require
the presence of \iaca{} markers around the code portion or portions to be
analyzed. Those markers, provided by \iaca{} as C preprocessor macros, expand
to the following x86 assembly snippets:
\hfill\begin{minipage}{0.35\textwidth}
\begin{lstlisting}[language={[x86masm]Assembler}]
mov ebx, 111
db 0x64, 0x67, 0x90
\end{lstlisting}
\textit{\iaca{} start marker}
\end{minipage}\hfill\begin{minipage}{0.35\textwidth}
\begin{lstlisting}[language={[x86masm]Assembler}]
mov ebx, 222
db 0x64, 0x67, 0x90
\end{lstlisting}
\textit{\iaca{} end marker}
\end{minipage}
\medskip
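As an illustration of how an analyzer might locate such markers, the following
minimal Python sketch scans raw machine code for the byte encodings of the two
snippets above. It assumes the 32-bit immediate encoding of \texttt{mov ebx}
(opcode \texttt{0xBB} followed by a little-endian immediate); the function name
\texttt{find\_marked\_regions} is hypothetical and is not part of \iaca{} or of
any analyzer discussed here.

```python
from typing import List, Tuple

# Assumed encodings: "mov ebx, imm32" is 0xBB + little-endian imm32,
# followed by the three marker bytes 0x64, 0x67, 0x90.
START_MARKER = bytes([0xBB, 0x6F, 0x00, 0x00, 0x00, 0x64, 0x67, 0x90])  # mov ebx, 111
END_MARKER = bytes([0xBB, 0xDE, 0x00, 0x00, 0x00, 0x64, 0x67, 0x90])    # mov ebx, 222


def find_marked_regions(code: bytes) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of the code enclosed by each marker pair.

    `start` is the offset of the first byte after a start marker, `end` the
    offset of the first byte of the matching end marker (exclusive bound).
    """
    regions = []
    pos = 0
    while True:
        start = code.find(START_MARKER, pos)
        if start == -1:
            break
        body_start = start + len(START_MARKER)
        end = code.find(END_MARKER, body_start)
        if end == -1:
            break  # unmatched start marker: ignore it
        regions.append((body_start, end))
        pos = end + len(END_MARKER)
    return regions
```

Calling \texttt{find\_marked\_regions} on the contents of the \texttt{.text}
section yields one pair of offsets per marked portion, which is consistent with
analyzers accepting several portions to be analyzed.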
On UNIX-based operating systems, the standard format for assembled binaries
---~either object files (\lstc{.o}) or executables~--- is ELF~\cite{elf_tis}.
Such files are organized in sections, the assembled instructions themselves
being found in the \texttt{.text} section ---~the rest holding metadata,
program data (strings, icons, \ldots), debugging information, etc. When an ELF
file is loaded into memory for execution, each section may be \emph{mapped} to a
portion of the address space. For instance, if the \texttt{.text} section has
1024 bytes, starting at offset 4096 of the ELF file itself, it may be mapped at
virtual address \texttt{0x454000}; as such, the byte that could be read from
the program by dereferencing address \texttt{0x454010} would be the byte at
offset 16 in the \texttt{.text} section, that is, the byte at offset 4112 in the
ELF file.
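The arithmetic of this example can be spelled out as follows, where
$o_{\mathrm{sec}}$ denotes the offset of the section in the file,
$a_{\mathrm{sec}}$ the virtual address at which it is mapped, and $a$ the
dereferenced address (these symbols are introduced here for illustration only):
\[
  o_{\mathrm{file}} = o_{\mathrm{sec}} + (a - a_{\mathrm{sec}})
  = 4096 + (\mathtt{0x454010} - \mathtt{0x454000})
  = 4096 + 16 = 4112.
\]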
Throughout the ELF file, \emph{symbols} are defined as references, or pointers,
to specific offsets or chunks in the file. This mechanism is used, among
other things, to refer to the program's functions. For instance, a symbol
\texttt{main} may be defined that points to the offset of the first byte
of the \lstc{main} function, and may also hold its total number of bytes.
Both of these mechanisms can be used to identify, without \iaca{} markers or the
like, a portion of the ELF file to be analyzed: either an offset and size in the
\texttt{.text} section can be provided (both can be found with tools like
\lstc{objdump}), or a symbol name can be provided, if an entire function is to
be analyzed.
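The two selection modes can be sketched in a few lines of Python. This is only
an illustration under simplifying assumptions: the \texttt{.text} section is
taken to be already extracted as a byte string, and the symbol table is
modelled as a plain dictionary mapping each name to an offset within
\texttt{.text} and a size in bytes. The function \texttt{select\_region} and
this dictionary layout are hypothetical; an actual tool would read both from
the ELF file with a dedicated parser.

```python
from typing import Dict, Optional, Tuple


def select_region(
    text: bytes,
    symbols: Dict[str, Tuple[int, int]],
    symbol: Optional[str] = None,
    offset: Optional[int] = None,
    size: Optional[int] = None,
) -> bytes:
    """Return the bytes of .text to analyze, chosen by symbol or by (offset, size)."""
    if symbol is not None:
        # A symbol provides both the first byte of the function and its length.
        offset, size = symbols[symbol]
    if offset is None or size is None:
        raise ValueError("provide either a symbol name or both offset and size")
    if offset < 0 or size < 0 or offset + size > len(text):
        raise ValueError("requested region falls outside the .text section")
    return text[offset:offset + size]
```

For instance, with \texttt{symbols = \{"main": (4, 8)\}}, the call
\texttt{select\_region(text, symbols, symbol="main")} returns the eight bytes
of \texttt{.text} starting at offset 4, the same bytes as
\texttt{select\_region(text, symbols, offset=4, size=8)}.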
\subsection{Examples with \llvmmca}


@@ -158,3 +158,10 @@
archivePrefix={arXiv},
primaryClass={cs.PF}
}
@misc{elf_tis,
title={Tool interface standard (TIS) executable and linking format (ELF) specification version 1.2},
author={{TIS} Committee and others},
year={1995},
month={May}
}