Eliminate future tense

2018-08-19 18:55:16 +02:00
commit 887027f0f3
parent d031d8ec49
1 changed file with 22 additions and 23 deletions
--- a/report/report.tex
+++ b/report/report.tex
@@ -125,7 +125,7 @@ from it. For instance, when running a debugger, a frequent usage is to obtain a
 IP\@. This actually observes the stack to find the different stack frames, and
 decode them to identify the function names, parameter values, etc.
-This operation is far from trivial. Often, a stack frame will only make sense
+This operation is far from trivial. Often, a stack frame only makes sense
 when the machine registers hold the right values. These values,
 however, are to be restored from the previous stack frame, where they are
 stored. This imposes to \emph{walk} the stack, reading the frames one after
@@ -140,8 +140,8 @@ frame, and thus be able to decode the next frame recursively, is called
 Let us consider a stack with x86\_64 calling conventions, such as shown in
 Figure~\ref{fig:call_stack}. Assuming the compiler decided here \emph{not} to
 use \reg{rbp}, and assuming the function allocates \eg{} a buffer of 8
-integers, the area allocated for local variables should be at least $32$ bytes
+integers, the area allocated for local variables is at least $32$ bytes
-long (for 4-bytes integers), and \reg{rsp} will be pointing below this area.
+long (for 4-byte integers), and \reg{rsp} points below this area.
 Left apart analyzing the assembly code produced, there is no way to find where
 the return address is stored, relatively to \reg{rsp}, at some arbitrary point
 of the function. Even when \reg{rbp} is used, there is no easy way to guess
@@ -347,9 +347,9 @@ clone of the previous one, which can then be altered (\eg{} here by setting
 \lstc{CFA} to $\reg{rsp} + 48$). This means that every line is defined \wrt{}
 the previous one, and that the IPs of the successive rows cannot be determined
 without evaluating every row that comes before in the first place. Thus,
-unwinding a frame from an IP close to the end of the frame will require
+unwinding a frame from an IP close to the end of the frame requires evaluating
-evaluating pretty much every DWARF row in the table before reaching the
+pretty much every DWARF row in the table before reaching the relevant
-relevant information, slowing down drastically the unwinding process.
+information, drastically slowing down the unwinding process.
 \FloatBarrier{}
@@ -397,19 +397,18 @@ parse the relevant FDE from its start, until it finds the row it was seeking.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{DWARF semantics}\label{sec:semantics}
-We will now define semantics covering the operations used for FDEs described in
+We now define semantics covering the operations used for FDEs described in the
-the DWARF standard~\cite{dwarf5std}, such as seen in
+DWARF standard~\cite{dwarf5std}, such as seen in Listing~\ref{lst:ex1_dwraw},
-Listing~\ref{lst:ex1_dwraw}, with the exception of DWARF expressions. These are
+with the exception of DWARF expressions. These are not treated here, because
-not treated here, because they form a rich language and would take a lot of
+they form a rich language and would take a lot of time and space to formalize,
-time and space to formalize, while in the mean time being seldom used --~see
+while in the mean time being seldom used --~see Section~\ref{ssec:instr_cov}.
-Section~\ref{ssec:instr_cov}.
 These semantics are defined \wrt{} the well-formalized C language, and
 are passing through an intermediary language. The DWARF language can read the
 whole memory, as well as registers, and is always executed for some instruction
-pointer. The C function representing it will thus take as parameters an array
+pointer. The C function representing it thus takes as parameters an array
-of the registers' values as well as an IP, and will return another array of
+of the registers' values as well as an IP, and returns another array of
-registers values, which will represent the evaluated DWARF row.
+register values, which represents the evaluated DWARF row.
 \subsection{Concerning correctness}\label{ssec:sem_correctness}
@@ -429,7 +428,7 @@ instructions are up to variants --~most instructions exist in multiple formats
 to handle various operands formatting for space optimisation. Since we won't be
 talking about the underlying file format here, those variations between \eg{}
 \dwcfa{advance\_loc1} and \dwcfa{advance\_loc2} --~which differ only on the
-number of bytes of their operand~-- are irrelevant and will be eluded.
+number of bytes of their operand~-- are irrelevant and are elided.
 As said before, we also elude here references to DWARF expressions, as they are
 complex and are mostly not implemented in the actual compiler anyway --~left
@@ -487,7 +486,7 @@ a language.
 \subsection{Intermediary language $\intermedlang$}
-A first pass will translate DWARF instructions into this intermediary language
+A first pass translates DWARF instructions into this intermediary language
 $\intermedlang$. It is designed to be more mathematical, representing the same
 thing, but abstracting all the data compression of the DWARF format away, so
 that we can better reason on it and transform it into C code.
@@ -535,7 +534,7 @@ here.
 The target language of these semantics is a C function, to be interpreted
 \wrt{} the C11 standard~\cite{c11std}. The function is supposed to be run
 in the context of the program being unwound. In particular, it must be able to
-dereference some pointer derived from DWARF instructions that will point to the
+dereference some pointer derived from DWARF instructions that points to the
 execution stack, or even the heap.
 This function takes as arguments an instruction pointer --~supposedly
@@ -544,7 +543,7 @@ fresh array of register values after unwinding this call frame. The function is
 compositional: it can be called twice in a row to unwind two stack frames,
 unless the IP obtained after the first unwinding comes from another shared
 object file, for instance a call to \prog{libc}. In this case, unwinding the
-second frame will require loading the corresponding DWARF information.
+second frame requires loading the corresponding DWARF information.
 The function is the following:
@@ -558,7 +557,7 @@ duly defined elsewhere, unwinding multiple frames would then look like this:
 \lstinputlisting[language=C]{src/dw_semantics/stack_walker.c}
-Thus, if we hold for true that the IP will remain in the same memory segment
+Thus, if we hold for true that the IP remains in the same memory segment
 --~\ie{} binary file~-- for two frames, we can safely unwind two frames this
 way:
@@ -578,7 +577,7 @@ interpreted. We then define the interpretation function $\llbracket t
 having the knowledge of $H$, the current interpreted row.
 But we also need to keep track of this state-saving stack DWARF uses, which
-will be kept in subscript.
+is kept in subscript.
 Thus, we define $\semI{\bullet}{s}(\bullet): \DWARF \times \FDE \to \FDE$, for
 $s$ a stack of $\dwrow$, that is,
@@ -756,7 +755,7 @@ switch cases bodies then fill a context with unwound values before return it.
 A setting of the compiler also optionally enables another parameter to the
 \lstc{_eh_elf} function, \lstc{deref}, which is a function pointer. This
 \lstc{deref} function, when present, replaces everywhere the dereferencing
-\lstc{*} operator, and can be used to generate \ehelfs{} that will work on
+\lstc{*} operator, and can be used to generate \ehelfs{} that work on
 remote address spaces, that is, whenever the unwinding is not done on the
 process reading the \ehelf{} itself, but some other process, or even on a stack
 dump of a long-terminated process.
@@ -1022,7 +1021,7 @@ the program's stack, and all the auxiliary information that is needed to unwind
 later. This is done when running \lstbash{perf record}. Then, a subsequent call
 to \lstbash{perf report} unwinds the stack to analyze it; but at this point of
 time, the traced process is long dead. Thus, any PID-based approach, or any
-approach using \texttt{/proc} information will fail. However, as this was the
+approach using \texttt{/proc} information fails. However, as this was the
 easiest method, the first version of \ehelfs{} used those mechanisms; it took
 some code rewriting to move to a PID- and \texttt{/proc}-agnostic
 implementation.
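
For readers of the hunks above that mention the \lstc{deref} parameter and unwinding a long-dead process, here is a minimal C sketch of the idea. It is not the report's actual code: the names regs_t, deref_fn, stack_dump_t, deref_dump and unwind_one are hypothetical, and the single hard-coded row (CFA = rsp + 48, return address stored at CFA - 8, as is usual on x86_64) only stands in for a generated unwinding function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register set, reduced to what this sketch needs. */
typedef struct {
    uintptr_t rip; /* instruction pointer */
    uintptr_t rsp; /* stack pointer */
} regs_t;

/* Hypothetical callback standing in for the `*` dereferencing operator. */
typedef uintptr_t (*deref_fn)(uintptr_t addr, void *ctx);

/* A saved stack dump: `size` bytes that used to live at address `base`. */
typedef struct {
    uintptr_t base;
    const unsigned char *bytes;
    size_t size;
} stack_dump_t;

/* Read one machine word from the dump instead of from live memory.
 * Returns 0 when the address falls outside the dump (assumed convention). */
static uintptr_t deref_dump(uintptr_t addr, void *ctx)
{
    const stack_dump_t *dump = ctx;
    uintptr_t word = 0;
    if (addr >= dump->base &&
        addr - dump->base + sizeof word <= dump->size) {
        memcpy(&word, dump->bytes + (addr - dump->base), sizeof word);
    }
    return word;
}

/* Unwind one frame for a single illustrative row:
 * CFA = rsp + 48, return address stored at CFA - 8. */
static regs_t unwind_one(regs_t in, deref_fn deref, void *ctx)
{
    regs_t out;
    uintptr_t cfa = in.rsp + 48;
    out.rip = deref(cfa - 8, ctx); /* every memory read goes through deref */
    out.rsp = cfa;                 /* the caller's rsp is the CFA */
    return out;
}
```

Because memory is only ever read through the callback, the same unwinding code could serve a live process or a recorded dump; only the \lstc{deref} implementation passed by the caller would change.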