Abstract and symbolic execution notes (with possible application on reverse engineering)
This post will be probably a too ambitious one, since I'm going to write about a very theoretical aspect of computation, in particular the so called abstract interpretation: for now let me just say that this field tries to give the best possible answer to unsolvable characterization of computer programs.
Another point of view that I want to address is the use of this kind of analysis with reverse engineering, since a lot of aspects between these two kind of analysis are shared.
This post will have pretty mathematical formalism but should be readable enough also without a PhD.
Languages
As a first thing to start thinking about code we need to define what is a program: let me show you a grammar of a pseudo-C language:
$$ \begin{array} {rlll} \mathsf{S} & ::= & \mathsf{x = A} \\ & | & ; \\ & | & \mathsf{{\bf if}\, (B)\, S} \\ & | & \mathsf{{\bf if}\, (B)\, S\, {\bf else}\, S} \\ & | & \mathsf{{\bf while}\, (B)\, S} \\ & | & \mathsf{{\bf break} ; } \\ & | & \mathsf{\{ Sl \} } \\ \mathsf{Sl} & ::= &\mathsf{Sl}\; \mathsf{S}\;|\,\epsilon \\ \mathsf{P} & ::= &\mathsf{Sl} \\ \end{array} $$
We'll use this grammar to define a structural analysis of the code, a pseudo-recursive approach: to understand what I mean, let's start with the simplest analysis possible, i.e., how many statements there are in the program; obviously you would think that simply counting the "lines" should suffice, but we can instead define the counting in a structural way, starting from the terminals, i.e. the statements themselves
$$ \begin{array} {rlll} \mathsf{S} & ::= & \mathsf{x = A} & \hbox{count}(\mathsf{S}) = 1 \\ \mathsf{Sl} & ::= & \mathsf{Sl}^\prime\;\mathsf{S} & \hbox{count}(\mathsf{Sl})=\hbox{count}(\mathsf{Sl}^\prime) + \hbox{count}(\mathsf{S}) \\ \mathsf{S} & ::= & \mathsf{{\bf if}\, (B)\, S_t} & \hbox{count}(\mathsf{S}) = \hbox{count}(\mathsf{S_t}) + 1 \\ \mathsf{S} & ::= & \mathsf{{\bf if}\, (B)\, S_t\, {\bf else}\, S_f} & \hbox{count}(\mathsf{S}) = \hbox{count}(\mathsf{S_t}) + \hbox{count}(\mathsf{S_f}) + 2 \\ \mathsf{S} & ::= & \mathsf{{\bf while}\, (B)\, S_b} & \hbox{count}(\mathsf{S}) = \hbox{count}(\mathsf{S_b}) + 1 \\ \end{array} $$
This works because since a program it's a finite set of statements, there are finite numbers of steps to reach the terminating elements of the grammar.
Now, for something more interesting that will come in hand later, we can associate at each structure a control flow graph: these diagrams show the \(\mathsf{\bf if}\), \(\mathsf{\bf if else}\) and \(\mathsf{\bf while}\) diagramatical structure
stateDiagram head: if (B) [*] --> head head --> [*] head --> S: true S --> [*]
stateDiagram head: if (B) left: S right: S [*] --> head head --> left: false head --> right: true right --> [*] left --> [*]
stateDiagram head: while (B) body: S [*] --> head head --> body: true body --> head head --> [*]: false
Now, using structural analysis, it's possible to deduce that relationship between the number of entering and exiting edges in a particular block
$$ \begin{array} {rlll} \mathsf{S} & ::= & \mathsf{x = A} & \sharp\hbox{out}(\mathsf{S}) = 1 \\ \mathsf{Sl} & ::= & \mathsf{Sl}^\prime\;\mathsf{S} & \sharp\hbox{in}(\mathsf{Sl})=\sharp\hbox{in}(\mathsf{Sl}^\prime),\;\sharp\hbox{out}(\mathsf{Sl}) = \sharp\hbox{out}(\mathsf{S}) \\ \mathsf{S} & ::= & \mathsf{{\bf if}\, (B)\, S_t} & \sharp\hbox{out}(\mathsf{S}) = \sharp\hbox{out}(\mathsf{S_t}) + 1 \\ \mathsf{S} & ::= & \mathsf{{\bf if}\, (B)\, S_t\, {\bf else}\, S_f} & \sharp\hbox{out}(\mathsf{S}) = \sharp\hbox{out}(\mathsf{S_t}) + \sharp\hbox{out}(\mathsf{S_f}) \\ \mathsf{S} & ::= & \mathsf{{\bf while}\, (B)\, S_b} & \sharp\hbox{out}(\mathsf{S}) = 1 \\ \end{array} $$
and in particular it's obvious that the convergence node of an \(\mathsf{\bf if}\) block might have a number of entering edges without an upper bound.
Note: you can build directly the flowchart from the CFG.
With this we can assume that the final CFG is composed of nodes that have at most two exiting edges but an unlimited number of entering edges.
But, the important fact is that the control flow instructions are the ones with
the two exiting edges, where the while
is identifiable by a returning edge
but the if
is recognizable from the two different paths converging to the
same exit node. Obviously goto
are not take into consideration, but in our
case it's not part of the language.
Abstract interpretation and lattices
What is abstract interpretation and why we should care? to answer this you should know that from theoretical computation studies it's known that it's impossible to build a program that is able to extract information of another program unless it's a trivial property.
This doesn't mean that we cannot know anything about a given program, simply that we must accept the fact the we have to approximate something.
The following in this section is an overview of the formalism needed to actually talk about what means "approximate". It's pretty much taken from two books: the most formal book on the subject "Principles of abstract interpretation" by Patrick Cousot (if you are not a math-person avoid it, it's practically a book about proving theorems) and "Static program analysis" by Anders Møller and Michael I. Schwartzbach, an openly available book that you can find here. The last one is more accessible (it's still pretty formal though).
Let's start with the simplest of the examples (it's the first example in any book about this stuff :P) i.e. the study of the "sign": we know that a variable representing an integer can assume three possible values with respect to the sign, it can be positive, negative or be zero.
With repect to multiplication this "classification" behaves well
$$ \begin{array} {r|ccc} * & + & 0 & - \\ \hline + & + & 0 & - \\ 0 & 0 & 0 & 0 \\ - & - & 0 & + \\ \end{array} $$
but not the addition
$$ \begin{array} {r|ccc} + & + & 0 & - \\ \hline + & + & + & ? \\ 0 & + & 0 & - \\ - & ? & - & - \\ \end{array} $$
since summing two integers with opposite signs can have all the prossible outcomes, this means that we lose any information possible. Moreover this means that every further calculation needs to take that into account (note how in the particular case of multiplication, whatever value is multiplied by zero, it becomes zero).
To formalize mathematically this concept of "anything" we introduce the symbol \(\top\) (you can read it as "top") representing the set of all possible values in this representation (we are talking about signs, so it's the set \(\{+,0,-\}\)) that roughly speaking means "I don't know what value actually is". Let's also introduce the symbol \(\bot\) (you can read it as "bottom") to indicate values that are not numbers (think about pointers or unreachable variables).
This allows us to formalize our reasoning defining an abstract domain composed by five symbols \(\{+, 0, -, \top, \bot\}\) that we can arrange in the following manner
$$ \begin{matrix} && \top & \\ &\huge\diagup & \huge| & \huge\diagdown \\ + & & 0 & & -\\ &\huge\diagdown & \huge| & \huge\diagup \\ &&\bot \end{matrix} $$
where the elements are ordered vertically with regard of "approximation"; let's define an ordering via the operator \(\sqsubseteq\): when \(x\sqsubseteq y\) we say that \(y\) is a safe approximation of \(x\), or that \(x\) is at least as precise as \(y\). In this case we have \(+\sqsubseteq\top\). A set with such operator is called poset (Partial order set). Let me introduce this space:
Posets: a poset \(\langle\mathbb{P}\sqsubseteq\rangle\) is a set \(\mathbb{P}\) with a partial order \(\sqsubseteq\) that is
- Reflexive: \(\forall x\in\mathbb{P}.x\sqsubseteq x\)
- Antisymmetric: \(\forall x,y\in\mathbb{P}.\left((x\sqsubseteq y)\wedge(y\sqsubseteq x)\right)\Rightarrow(x = y)\)
- Transitive: \(\forall x,y,z\in\mathbb{P}.\left((x\sqsubseteq y)\wedge(y\sqsubseteq z)\right)\Rightarrow(x \sqsubseteq z)\)
Note: not every couple of elements are supposed to have a well defined relation with respect to the operator \(\sqsubseteq\): if for a couple of element \(x,y\in\mathbb{P}\) we have \(x\sqsubseteq\) or \(y\sqsubset x\) we say that are comparable, otherwise they are incomparable. When in a poset all the elements are comparable we have a poset that is total.
Let \(\langle \mathbb{P}, \sqsubseteq\rangle\) be a poset and \(S\in \wp(\mathbb{P})\) be a subset.This subset \(S\) has
- upper bound \(u\) iff \(u\in\mathbb{P}\) and \(\forall x\in S.x\sqsubseteq u\)
- a least upper bound (lub, indicated \(\sqcup S\)) iff \(\sqcup S\) is an upper bound of \(S\) smaller than other upper bound of \(S\). \(\sqcup\{x, y\}\) is denoted with the infix notation \(x\sqcup y\)
The intuition around posets is that they abstract set theory, where \(\sqsubseteq\) is the analogous of \(\subseteq\) and \(\sqcup\), \(\sqcap\) have as "dual" \(\cup\) and \(\cap\).
Finite posets \(\langle \mathbb{P}, \sqsubseteq\rangle\) can be represented by Hassle diagrams, which is a set of points \(\{p_x | x\in P\}\) in the plan such that
- if \(x \sqsubset y\) then \(p_x\) is strictly below \(p_y\) and
- \(p_x\) and \(p_y\) are linked by a segment when \(x \lessdot y\) (\(y\) covers \(x\)) where \(x\lessdot y \triangleq x\sqsubset y\wedge \not\exists z\in P.x\sqsubset z\wedge z\sqsubset y\).
Note: two unlinked elements are incomparable.
$$ \begin{array}{ccl} x\sqcap x = x & x\sqcup x = x & \hbox{idempotency} \\ x\sqcap y = y \sqcap x & x\sqcup y = y \sqcup x & \hbox{commutativity} \\ x\sqcap (y\sqcap z) = (x\sqcap y)\sqcap z & x\sqcup (y\sqcup z) = (x\sqcup y)\sqcap z & \hbox{associativity} \\ (x\sqcap y)\sqcup x = x & (x\sqcup y)\sqcap x = x & \hbox{absorpsion} \\ \end{array} $$
soundness: the results are always correct
completeness: all true facts are provable
Soundness (\(\Longleftarrow\)) states that if a statement is proved bythe proof method then that statement is true. Completeness (\(\Longrightarrow\)) states that the proof method is always applicable to prove a true statement.
If you think it's useless stuff, you know, vulnerabilities come from it
$$ \require{AMScd} \begin{CD} A @>a>> B\\ @V b V V @VV c V\\ C @>>d> D \end{CD} $$
Properties
Properties can be understood as the set of mathematical objects that have this property, e.g
$$ 2\mathbb{Z} = \left\{x\in\mathbb{Z}\,|\,\exists k\in\mathbb{Z}.x=2k\right\} $$
hence if \(P\) is a property then \(x\in P\) means "\(x\) has property \(P\)".
Under this interpretation, logical implication is subset inclusion: as an example
$$ \hbox{"to be greater than 42 implies to be positive"} \Longleftrightarrow \{x\in\mathbb{Z}\,|\,x\gt42\}\subseteq\{x\in\mathbb{Z}\,|\,x\geq0\} $$
In the expression \(P\subseteq Q\), \(P\) is said to be stronger/more precise since its property is satisfied by less elements.
To refer to the abstract domain of sign, we can associate to each abstract sign a set represented by the elements in \(\mathbb{Z}\) that have such property, i.e.
$$ \lt \,\sim\,\{x\in\mathbb{Z}\,|\,x\lt0\} $$
Another example is the property to be an integer constant, that is espressed by the set
$$ \left\{\left\{ n\right\}\,|\,n\in\mathbb{Z}\right\} $$
Liveness and safeness properties
There are two properties that are fundamental:
- liveness: something "good" must happen during execution
- safeness: anything "bad" doesn't happen during execution
Let me borrow an example from the seminal paper that defined these concepts, using a producer/consumer:
it consists of a producer process which puts messages into a buffer and a consumer process which removes the messages. We assume that the buffer can hold at most \\(b\\) messages, \\(b\qeq1\\). The processes must be synchronized so the producer doesn't try to put a message into a buffer if there is no room for it, and the consumer doesn't try to remove a message that has not yet been produced
Let's implement these entities using a buffer \(B\) of dimension \(b\) using as references respectively to the sent and received messages with the variables \(s\) and \(r\): when a producer wants to send a message it puts it into \(B[s\mod b]\) and the consumers takes it from \(B[r \mod b]\).
while (1) { while ((s - r) % b == 0 && r != s) ; send_message(s); s = s + 1; }
The following diagrams explain the cfg of producer and consumer:
stateDiagram-v2 main: while (1) head: s - r mod b = 0 and r != s message: deliver message cursor: s = s + 1 [*] --> main main --> head head --> message: false message --> cursor cursor --> main head --> head: true
stateDiagram-v2 main: while (1) head: s - r mod b = 0 and r == s message: receive message cursor: r = r + 1 [*] --> main main --> head head --> message: false message --> cursor cursor --> main head --> head: true
Now, we define \(n = s - r\) as the number of messages in the buffer, to prove the desired safety property we need to prove that \(0\leq n\leq b\) holds for the execution of the entire program. Here program is the combined execution of these two processes.
An assertion is a logical function of program variables and tokens positions and it's said to be invariant if it's always true during the execution of the program.
An interpretation of a program is an assignment of an assertion to each arc of the cfg. We say that the program is in a legitimate state if each token is on an arc whose interpretation is true. The interpretation is said to be invariant if the program always remains in a legitimate state throughout its execution.
So, under these definitions, proving the program correctness can be done verifying the following conditions:
- the program starts in a legitimate state
- during execution it remains in a legitimate state, i.e. if at a certain moment it's in a legitimate state, any execution step will leave it in a legitimate state.
Turing my beloved
Assume \(\hbox{termination}(P)\) would always terminate and returns true iff \(P\) always terminates on all input data; the following statement is a contradiction
$$ P = \mathsf{\bf while}(\hbox{termination}(P)); $$
Structural deductive stateless prefix trace semantics
This is from chapter 6 of PoAI;
- Stateless: does not register the values of variables
- Structural: by induction on the program syntax
- Deductive: the semantics of each program component is specified by a deductive system of axioms and rules of inference, a program execution being a proof in such a system.
\(\def\trace{\mathbb{T}}\)
A trace is a finite sequence of configurations (labels) separated by actions. We let \(\trace^+\) be the set of finite traces, \(\trace^\infty\) be the set of infinite traces and \(\trace^{+\infty}\buildrel\triangle\over=\trace^+\cup\trace^\infty\)
\(\def\action#1{\xrightarrow{\hspace{1cm}\displaystyle{#1}\hspace{1cm}}}\) \(\def\l#1{\vphantom{x}^{#1}}\) \(\def\pijoin{\buildrel\frown\over\cdot}\)
$$ \pi_1\l{l_1}\pijoin\l{l_2}\pi_2=\cases{ \pi_1\l{l_1}\pi_2 \,\hbox{when}\,\l{l_1}=\l{l_2}\cr \hbox{undefined when }\,\l{l_1}\not=\l{l_2}\cr } $$
$$ \action{\pi_1}\underbrace{\hbox{at}[\![\mathsf{S}]\!]\action{\pi_2}l}_{\displaystyle\in\mathcal{S}^\star[\![\mathsf{S}]\!]\left( \pi_1\hbox{at}[\![\mathsf{S}]\!] \right)} $$
Because \(\mathbb{T}\longrightarrow\wp(\mathbb{T^\star})\) is isomorph to \(\wp(\mathbb{T}^+\times\mathbb{T}^\star)\) then we can write
$$ \mathcal{S}^{\hat\star}[\![\mathsf{S}]\!] = \left\{\langle\pi,\pi^\prime\rangle\,|\,\pi\in\trace^+\wedge\pi^\prime\in\mathcal{S}^\star[\![\mathsf{S}]\!]\pi\right\} $$
so that \(\mathcal{S}^{\star}[\![\mathsf{S}]\!]\) is the right image of \(\mathcal{S}^{\hat\star}[\![\mathsf{S}]\!]\)
Maximal trace semantics
A trace is maximal when it's finite and represents an execution that terminates or it is infinite and represents an execution that doesn't terminate. Maximal finite traces of statements are prefix executions that exit the statement. $$ \mathcal{S}^{\infty}[\![\mathsf{S}]\!]\left( \pi^l \right) \buildrel\triangle\over= \lim\left( \mathcal{S}^{\star}[\![\mathsf{S}]\!]\left( \pi^l \right) \right) $$
$$ \mathcal{S}^{+}[\![\mathsf{S}]\!]\left(\pi_1\hbox{at}[ \![\mathsf{S}]\!]\right) \buildrel\triangle\over= \left\{ \pi^{\phantom{2}l}_2\in\mathcal{S}^{\star}[\![\mathsf{S}]\!](\pi_1\hbox{at}[\![\mathsf{S}]\!])\,|\, \vphantom{\pi}^l=\hbox{after}[\![\mathsf{S}]\!]\vee\left(\hbox{escape}[\![S]\!] \wedge\vphantom{\pi}^l=\hbox{break-to}[\![\mathsf{S}]\!]\right) \right\} $$
$$ \eqalign{ \mathcal{S}^{+\infty}[\![\mathsf{S}]\!]\left(\pi^l\right) & \buildrel\triangle\over= \mathcal{S}^{+}[\![\mathsf{S}]\!]\left(\pi^l\right) \cup \mathcal{S}^{\infty}[\![\mathsf{S}]\!]\left(\pi^l\right) \\ \mathcal{S}^{+\infty}[\![\mathsf{S}]\!]\Pi & \buildrel\triangle\over= \bigcup\left\{ \mathcal{S}^{+\infty}[\![\mathsf{S}]\!]\left(\pi^l\right) \, | \, \pi^l\in\Pi \right\} \\ \mathcal{S}^{+\infty}[\![\mathsf{S}]\!] & \buildrel\triangle\over= \mathcal{S}^{+\infty}[\![\mathsf{S}]\!]\left( \mathbb{T}^+ \right) \\ \mathcal{S}^{+\infty}[\![\mathsf{P}]\!] & \buildrel\triangle\over= \mathcal{S}^{+\infty}[\![\mathsf{S}]\!]\left( \left\{ \hbox{at}[\![\mathsf{P}]\!] \right\}\right) \\ } $$
Subroutines
In our syntax is not included explicitely the execution of subroutine but this should not be a problem since you can think of that as a sub-CFG implictily defined by that instruction.
CFG from instructions
In a more abstract sense some code is a sequence of instructions, in this context I'm going to talk about binary code, i.e. code that is usually represented and stored via bits and organized in a flat way.
To clarify: in general the following argument could apply also to source code (in particular control flow and stuff) but it's a little more complex since allow for nesting and it's explicit about the "intended" code, but here I'm more interested in reconstructing it. Moreover, the compiler is doing in reverse what in general a decompiler is trying to do.
The core point here is that, being the instructions laid out one after the other, some of them must handle the control flow and these are the instructions we are interested in studying, at least as appetizer: let's organize the instruction based on their behaviour with respect to control flow
- standard: the large majority of instructions flow to the next instruction
- jump: unconditional jump
- branch: conditional jump if some condition holds, otherwise fallthrough
- call: invokes a function
- return: transfer back control to the caller
Note: we can create a taxonomy for classes of instructions, first with the
subdivision "standard"/"control flow", then the latter can be subdivided again
in "intra"/"extra"-procedural, where call
s and return
s create a
context that is not present with jmp
s and branch
es.
If we abstract each instruction as a vertex and we set directed edges between an instruction and its (possible) successors we obtain what is called a control flow graph (CFG from now on), this is the next step after having disassembled your code.
Take in mind that usually the analysis is done at function level, if you
encounter a call
instruction, you treat it like a standard
one, if you encounter
a return
instruction you set the corresponding vertex as an end point.
Two particular vertices are the entry point and the exit, the first is
usually not explicitely indicated in binary code and it's a challenging problem
by itself to find it, the latter is usually represented by the return
instruction.
Calling functions is not usually included in the CFG, that generates another
graph of its own, the call graph.
Note: obviously in real code you can have signals and exceptions or interrupts that go against this reasoning.
Note: there are cases where a stream of bytes can have different
interpretation regarding the instructions in it, I'm looking at you x86
. In
particular, a jump can reach in the "middle" of another possible instruction
creating a different interpretation.
It's possible to transform a little the graph, grouping instructions in regions having a single entry point and a single exit point: the so called basic blocks.
The first instruction in them is named leader, the last instruction is the tail. The leader is either the first instruction in a program, the target of a jump, branch or call, or the fall-through instruction of a non-taken branch. Instructions of type jump, branch, call and return cannot be followed by any other instruction in a block.
So at the end, basic blocks are blocks of instructions connected by branches and are a less grained representation of a CFG; loops are obvious in such representation.
Intermediate language
But there is much more: in the last post about reversing C++
code with
ghidra
I also
talked about the capabilities of the tools to "abstract away" the actual
instructions and via Pcode
that is an intermediate language used by
ghidra
to have a common language for all the architecture it understands.
A lot of tools have an intermediate language (IL
) or intermediate
representation (IR
) used for their own purposes and here I'm talking about
tools that are not even reverse engineering tools: for example
LLVM
Binary Ninja
radare2
IDA
Whatever language you use to encode the binary instructions, exists a representation the in invariant and that is used to extract core information about the flow of execution.
P-code
It uses definition in SLEIGH
, it's human readable; P-code
is a Register
Transfer Language: each native instruction is translated in (possibly) multiple
p-code operations that takes parts of the processor state as input and output
variables (so called Varnode
s).
The p-code obtained from direct translation is referred to as raw but can be further refined and exist additional operations that are not available initially.
Take in mind that all p-code operations are associated with the address of the original instruction they were translated from. For a single instruction, a counter is used to enumerate the multiple p-code operations involved in the translation. The address and this counter, as a pair, are referred with the name of sequence number.
The core p-code defined are the following (here the documentation)
Name | Arguments | Description |
---|---|---|
COPY |
input0 output |
output = input |
LOAD |
input0 input1 output |
output = *input or output = *[input0]input1
|
STORE |
input0 input1 input2 |
*input1 = input2 or *[input0] = input2
|
BRANCH |
input0 |
goto input0 |
CBRANCH |
input0 input1 |
if (input1) goto input0 |
BRANCHIND |
input0 |
goto [input0] |
CALL |
input0 ... |
call [input0] |
CALLIND |
input0 ... |
call [input0] |
RETURN |
input0 ... |
return [input0] |
PUSH RBP (unique, 0xea00, 8) = COPY (register, 0x28, 8) (register, 0x20, 8) = INT_SUB (register, 0x20, 8), (const, 0x8, 8) STORE (const, 0x1b1, 8), (register, 0x20, 8), (unique, 0xea00, 8) MOV RBP,RSP (register, 0x28, 8) = COPY (register, 0x20, 8) SUB RSP,0x20 (register, 0x200, 1) = INT_LESS (register, 0x20, 8), (const, 0x20, 8) (register, 0x20b, 1) = INT_SBORROW (register, 0x20, 8), (const, 0x20, 8) (register, 0x20, 8) = INT_SUB (register, 0x20, 8), (const, 0x20, 8) (register, 0x207, 1) = INT_SLESS (register, 0x20, 8), (const, 0x0, 8) (register, 0x206, 1) = INT_EQUAL (register, 0x20, 8), (const, 0x0, 8) (unique, 0x12e80, 8) = INT_AND (register, 0x20, 8), (const, 0xff, 8) (unique, 0x12f00, 1) = POPCOUNT (unique, 0x12e80, 8) (unique, 0x12f80, 1) = INT_AND (unique, 0x12f00, 1), (const, 0x1, 1) (register, 0x202, 1) = INT_EQUAL (unique, 0x12f80, 1), (const, 0x0, 1) MOV dword ptr [RBP + local_1c],EDI (unique, 0x3100, 8) = INT_ADD (register, 0x28, 8), (const, 0xfffffffffff (unique, 0xbf00, 4) = COPY (register, 0x38, 4) STORE (const, 0x1b1, 4), (unique, 0x3100, 8), (unique, 0xbf00, 4) MOV qword ptr [RBP + local_28],RSI (unique, 0x3100, 8) = INT_ADD (register, 0x28, 8), (const, 0xfffffffffff (unique, 0xc000, 8) = COPY (register, 0x30, 8) STORE (const, 0x1b1, 4), (unique, 0x3100, 8), (unique, 0xc000, 8) MOV RAX,qword ptr [RBP + local_28] (unique, 0x3100, 8) = INT_ADD (register, 0x28, 8), (const, 0xfffffffffff (unique, 0xc000, 8) = LOAD (const, 0x1b1, 4), (unique, 0x3100, 8) (register, 0x0, 8) = COPY (unique, 0xc000, 8) ADD RAX,0x8 (register, 0x200, 1) = INT_CARRY (register, 0x0, 8), (const, 0x8, 8) (register, 0x20b, 1) = INT_SCARRY (register, 0x0, 8), (const, 0x8, 8) (register, 0x0, 8) = INT_ADD (register, 0x0, 8), (const, 0x8, 8) (register, 0x207, 1) = INT_SLESS (register, 0x0, 8), (const, 0x0, 8) (register, 0x206, 1) = INT_EQUAL (register, 0x0, 8), (const, 0x0, 8) (unique, 0x12e80, 8) = INT_AND (register, 0x0, 8), (const, 0xff, 8) (unique, 0x12f00, 1) = POPCOUNT (unique, 0x12e80, 8) (unique, 0x12f80, 1) = INT_AND (unique, 0x12f00, 1), (const, 0x1, 1) (register, 0x202, 1) = INT_EQUAL (unique, 0x12f80, 1), (const, 0x0, 1) MOV RAX,qword ptr [RAX] (unique, 0xc000, 8) = LOAD (const, 0x1b1, 4), (register, 0x0, 8) (register, 0x0, 8) = COPY (unique, 0xc000, 8) MOV RDI,RAX (register, 0x38, 8) = COPY (register, 0x0, 8) CALL <EXTERNAL>::atoi (register, 0x20, 8) = INT_SUB (register, 0x20, 8), (const, 0x8, 8) STORE (const, 0x1b1, 8), (register, 0x20, 8), (const, 0x10117b, 8) CALL (ram, 0x101050, 8) MOV dword ptr [RBP + index],EAX (unique, 0x3100, 8) = INT_ADD (register, 0x28, 8), (const, 0xfffffffffff (unique, 0xbf00, 4) = COPY (register, 0x0, 4) STORE (const, 0x1b1, 4), (unique, 0x3100, 8), (unique, 0xbf00, 4) MOV EAX,dword ptr [RBP + index] (unique, 0x3100, 8) = INT_ADD (register, 0x28, 8), (const, 0xfffffffffff (unique, 0xbf00, 4) = LOAD (const, 0x1b1, 4), (unique, 0x3100, 8) (register, 0x0, 4) = COPY (unique, 0xbf00, 4) (register, 0x0, 8) = INT_ZEXT (register, 0x0, 4) CMP EAX,0x2 (register, 0x200, 1) = INT_LESS (register, 0x0, 4), (const, 0x2, 4) (register, 0x20b, 1) = INT_SBORROW (register, 0x0, 4), (const, 0x2, 4) (unique, 0x29000, 4) = INT_SUB (register, 0x0, 4), (const, 0x2, 4) (register, 0x207, 1) = INT_SLESS (unique, 0x29000, 4), (const, 0x0, 4) (register, 0x206, 1) = INT_EQUAL (unique, 0x29000, 4), (const, 0x0, 4) (unique, 0x12e80, 4) = INT_AND (unique, 0x29000, 4), (const, 0xff, 4) (unique, 0x12f00, 1) = POPCOUNT (unique, 0x12e80, 4) (unique, 0x12f80, 1) = INT_AND (unique, 0x12f00, 1), (const, 0x1, 1) (register, 0x202, 1) = INT_EQUAL (unique, 0x12f80, 1), (const, 0x0, 1) JBE LAB_0010119c (unique, 0xca00, 1) = BOOL_OR (register, 0x200, 1), (register, 0x206, 1) CBRANCH (ram, 0x10119c, 8), (unique, 0xca00, 1) LEA RAX,[s_forbidden!_0010200d] (register, 0x0, 8) = COPY (const, 0x10200d, 8) MOV RDI=>s_forbidden!_0010200d,RAX (register, 0x38, 8) = COPY (register, 0x0, 8) CALL <EXTERNAL>::puts (register, 0x20, 8) = INT_SUB (register, 0x20, 8), (const, 0x8, 8) STORE (const, 0x1b1, 8), (register, 0x20, 8), (const, 0x101195, 8) CALL (ram, 0x101030, 8) MOV EAX,0x1 (register, 0x0, 8) = COPY (const, 0x1, 8) JMP LAB_001011d0 BRANCH (ram, 0x1011d0, 8) MOV EAX,dword ptr [RBP + index] (unique, 0x3100, 8) = INT_ADD (register, 0x28, 8), (const, 0xfffffffffff (unique, 0xbf00, 4) = LOAD (const, 0x1b1, 4), (unique, 0x3100, 8) (register, 0x0, 4) = COPY (unique, 0xbf00, 4) (register, 0x0, 8) = INT_ZEXT (register, 0x0, 4) CDQE (register, 0x0, 8) = INT_SEXT (register, 0x0, 4) LEA RDX,[RAX*0x8] (unique, 0x3680, 8) = INT_MULT (register, 0x0, 8), (const, 0x8, 8) (register, 0x10, 8) = COPY (unique, 0x3680, 8) LEA RAX,[messages] (register, 0x0, 8) = COPY (const, 0x104030, 8) MOV RAX=>messages,qword ptr [RDX + RAX*0x1] (unique, 0x3300, 8) = INT_MULT (register, 0x0, 8), (const, 0x1, 8) (unique, 0x3400, 8) = INT_ADD (register, 0x10, 8), (unique, 0x3300, 8) (unique, 0xc000, 8) = LOAD (const, 0x1b1, 4), (unique, 0x3400, 8) (register, 0x0, 8) = COPY (unique, 0xc000, 8) MOV RSI,RAX (register, 0x30, 8) = COPY (register, 0x0, 8) LEA RAX,[s_message:_'%s'_00102018] (register, 0x0, 8) = COPY (const, 0x102018, 8) MOV RDI=>s_message:_'%s'_00102018,RAX (register, 0x38, 8) = COPY (register, 0x0, 8) MOV EAX,0x0 (register, 0x0, 8) = COPY (const, 0x0, 8) 6_c32105f2 CALL <EXTERNAL>::printf (register, 0x20, 8) = INT_SUB (register, 0x20, 8), (const, 0x8, 8) STORE (const, 0x1b1, 8), (register, 0x20, 8), (const, 0x1011cb, 8) CALL (ram, 0x101040, 8) MOV EAX,0x0 (register, 0x0, 8) = COPY (const, 0x0, 8) LEAVE (register, 0x20, 8) = COPY (register, 0x28, 8) (register, 0x28, 8) = LOAD (const, 0x1b1, 8), (register, 0x20, 8) (register, 0x20, 8) = INT_ADD (register, 0x20, 8), (const, 0x8, 8) RET (register, 0x288, 8) = LOAD (const, 0x1b1, 8), (register, 0x20, 8) (register, 0x20, 8) = INT_ADD (register, 0x20, 8), (const, 0x8, 8) RETURN (register, 0x288, 8)
Ghidra's decompiler implementation
On Debian gradle is at version 4 but it requires version 7 :P https://github.com/NationalSecurityAgency/ghidra/blob/master/DevGuide.md
export PATH=/opt/gradle-7.5.1/bin:$PATH export JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")
be sure to have openjdk17
These are my personal notes about the implementation of the decompiling process in ghidra
,
it's not authoritative and could be all a fever dream from my side, take it with
a grain of salt.
First of all, ghidra
is built using two different languages, C++
for the
core analysis and Java
for firther manipulation and the application itself:
user interface, plugins etc...
The decompiler it's written in C++
, it's possible to compile it separately
using the following encantation
$ cd Ghidra/Features/Decompiler/src/decompile/cpp $ make decomp_dbg $ export SLEIGHHOME=~/git/ghidra/ $ ./decomp_dbg [decomp]>
but ghidra
itself calls it via the DecompInterface
class that in reality
uses a client in C++
located at Ghidra/Features/Decompiler/src/decompile/cpp/ghidra_process.cc
that registers the following commands
registerProgram
deregisterProgram
flushNative
-
decompileAt
: decompile a specific function structureGraph
setAction
setOptions
(if you want to know more details you can look at the source code that has a lot
of comments); you can find here
where it calls decompileAt
The communication is done via XML
with the actual decompiler process that is implemented
Ghidra/Features/Decompiler/src/decompile/cpp/ghidra_process.cc
is the client
The actual binary is named decompile
(at ./Ghidra/Features/Decompiler/build/os/linux_x86_64/decompile
)
the command line decompiler is implemented in
Ghidra/Features/Decompiler/src/decompile/cpp/consolemain.cc
Obviously you need to registerProgram
before
protected void initializeProcess() throws IOException, DecompileException { if (decompCallback == null) { throw new IOException("Program not opened in decompiler"); } if (decompProcess == null) { decompProcess = DecompileProcessFactory.get(); } else if (!decompProcess.isReady()) { DecompileProcessFactory.release(decompProcess); decompProcess = DecompileProcessFactory.get(); } long uniqueBase = UniqueLayout.SLEIGH_BASE.getOffset(pcodelanguage); XmlEncode xmlEncode = new XmlEncode(); pcodelanguage.encodeTranslator(xmlEncode, program.getAddressFactory(), uniqueBase); String tspec = xmlEncode.toString(); xmlEncode.clear(); dtmanage.encodeCoreTypes(xmlEncode); String coretypes = xmlEncode.toString(); SleighLanguageDescription sleighdescription = (SleighLanguageDescription) pcodelanguage.getLanguageDescription(); ResourceFile pspecfile = sleighdescription.getSpecFile(); String pspecxml = fileToString(pspecfile); xmlEncode.clear(); compilerSpec.encode(xmlEncode); String cspecxml = xmlEncode.toString(); decompCallback.setNativeMessage(null); decompProcess.registerProgram(decompCallback, pspecxml, cspecxml, tspec, coretypes, program);
- @param pspecxml = string containing .pspec xml
- @param cspecxml = string containing .cspec xml
- @param tspecxml = XML string containing translator spec
- @param coretypesxml = XML description of core data-types
- @param cspecxml = string containing .cspec xml
./Ghidra/Processors/x86/data/languages/x86.pspec ./Ghidra/Processors/x86/data/languages/x86.ldefs ./Ghidra/Processors/x86/data/languages/x86.slaspec ./Ghidra/Processors/x86/data/languages/x86.opinion ./Ghidra/Processors/x86/data/languages/x86.sla
/opt/ghidra-github/Ghidra/Processors/x86/data/languages/x86-64.pspec
cspec
tspec
coretypes
void RegisterProgram::rawAction(void) ghidra = new ArchitectureGhidra(pspec,cspec,tspec,corespec,sin,sout); ghidra->init(store);
Theorems
Now, the title of this section can be puzzling, why theorems? remember when I told you that every basic block controlling the flow of the code has a logic condition at end? this means that a continous path from vertex \(A\) to vertex \(B\) is practically "building" a formula using boolean conditions.
Let me step back a little to explain some maths to you
-
SAT
: Boolean Satisfability Problem, i.e. determining if there exists an interpretation that satisfies a given boolean formula. -
SMT
: Satisfability Modulo Theory, an extension ofSAT
-
BMC
: Bounded Model Checking
Experiment with ghidra
Since I'm interested to practical applications but I want to be able to use this
mechanism directly while I'm reverse engineering code with ghidra
, I want a
way to "lift" the Pcode
as logical expression and work on that.
It's not a difficult task, I mean, Pcode
has not many instructions and it's
a tedious but doable task, for example maat does that
but it's C++
but fortunately for me someone has done that in Java
, a tool named,
GhiHorn,
thanks to the magic of Jython I can use directly from the console since the
classes are defined as public
, take note that here however we are using the
Java
's APIs
(here
the documentation)
>>> from kaiju.tools.ghihorn.decompiler import GhiHornSimpleDecompiler >>> from kaiju.tools.ghihorn import GhiHornifierBuilder >>> from kaiju.tools.ghihorn.api import GhiHornApiDatabase >>> apiDatabase = GhiHornApiDatabase(GhiHornSimpleDecompiler()) >>> hornifier = SimpleHornifier("kebab") >>> hornifier.setDecompiler(GhiHornSimpleDecompiler()) >>> hornifier.setEntryPoint(toAddr(0x0101149)) >>> hornifier.setApiDatabase(apiDatabase) >>> hornifier.hornify(currentProgram, ConsoleTaskMonitor())
/** * Expressions that represent a p-code operation. They basically have the form: * * out = OP(in...) */ public class PcodeExpression implements HornExpression { ... }
If I place the cursor on the uvar1 < 4
block of this code
int main(int argc,char **argv) { uint uVar1; int jim; uVar1 = atoi(argv[1]); if (uVar1 < 4) { puts("!"); } return 0; }
I can extract the logic expression from the Pcode
>>> from kaiju.tools.ghihorn.hornifer.horn.expression import PcodeExpression >>> currentLocation.getToken().getPcodeOp() (unique, 0xcb80, 1) INT_LESS (register, 0x0, 4) , (const, 0x4, 4) >>> pcode = currentLocation.getToken().getPcodeOp() >>> expr = PcodeExpression(pcode) >>> expr.getOperation() uVar1@main!cast_argv < const=4
seems pretty ok, I want to investigate a little more, in particular I'm interested to see if it's encoded as an unsigned operation
>>> type(expr) <type 'kaiju.tools.ghihorn.hornifer.horn.expression.PcodeExpression'> >>> expr.getComponents() array(kaiju.tools.ghihorn.hornifer.horn.expression.HornExpression, [uVar1@main!cast_argv, const=4]) >>> expr.getComponents()[0] uVar1@main!cast_argv >>> var = expr.getComponents()[0] >>> var.getDataType() kaiju.tools.ghihorn.z3.GhiHornBitVectorType@6385652a >>> expr.getOperation().getClass() <type 'kaiju.tools.ghihorn.hornifer.horn.expression.UltExpression'>
and indeed, from the last line, it is!
Now I want to use it
>>> from kaiju.tools.ghihorn.z3 import GhiHornContext >>> ctx = GhiHornContext() >>> solver = ctx.mkSolver() >>> type(solver) <type 'com.microsoft.z3.Solver'> >>> instance = expr.instantiate(ctx) >>> instance (= bVar148@main!cast_argv (bvult uVar1@main!cast_argv #x0000000000000004)) >>> type(instance) <type 'com.microsoft.z3.BoolExpr'> >>> solver.add(instance) >>> solver.check() SATISFIABLE >>> model = solver.getModel() >>> model (define-fun uVar1@main!cast_argv () (_ BitVec 64) #x0000000000000000) (define-fun bVar148@main!cast_argv () Bool true)
Now, listen, I know that Java
is shit so I'm teaching a neat trick to export
all, from Java
>>> print solver.toString() (declare-fun bVar145@main!array () Bool) (declare-fun uVar1@main!array () (_ BitVec 64)) (assert (= bVar145@main!array (bvult uVar1@main!array #x0000000000000003))) (model-add bVar145@main!array () Bool (not (bvule #x0000000000000003 uVar1@main!array)))
you can import into python
In [1]: import z3 In [2]: s = z3.Solver() In [3]: s.from_string("(declare-fun bVar145@main!array () Bool)\n(declare-fun uVar1@main!array () (_ BitVec 64))\n(assert (= bVar145@main!array (bvult uVar1@main!array #x0000000000000003)))\n(model-add bVar145@main!array\n ()\n ...: Bool\n (not (bvule #x0000000000000003 uVar1@main!array)))\n") In [4]: print(s) [bVar145@main!array == ULT(uVar1@main!array, 3)]
Ideas
- Maybe having a distribution of the input and looking for the element in the tail of such distribution. Note here the "right" or "compliant" input is tricky: in theory everything accepted is compliant.
- Try to build a finite state machine from the CFG
struct { size_t length; char payload[]; } miao_t void do_something(char* buffer, miao_t* m) { size_t length = m->length - sizeof(m->head); memcpy(buffer, m->payload, length); }
here, with the assumption that the attacker controls the m
variable in its
entirety,
the obvious vulnerability is that if m->length
is less than the size in
bytes of the field head
then an underflow happens.
But if the attacker controls the two arguments, i.e. the two pointers now you have a read/write primitive :)
Links
- Abstract Interpretation spring 2005 course with slides
- Awesome symbolic execution github repo
- Why Horn Formulas Matter in Computer Science: Initial Structures and Generic Examples
- Z3 Python tutorials
- SAT/SMT by example
- Dynamic Symbolic Execution for Software Analysis
- LLBMC — The Low-Level Bounded Model Checker
- Constrained Horn Clauses (CHC) from Automated Program Verification (APV) Fall 2018
- Identifying Loops In Almost Linear Time
- Practical Dynamic Reconstruction of Control Flow Graphs
- A Survey of Symbolic Execution Techniques
- Regular Property Guided Dynamic Symbolic Execution
- Symbolic Verification of Regular Properties
- ESP: Path-Sensitive Program Verification in Polynomial Time
- Algorithms for Computing the Static Single Assignment Form
- Efficiently Computing Static Single Assignment Form and the Control Dependence Graph
- Static Program Analysis
- static program analysis technique of Abstract Interpretation by Mozilla
- Abstract Interpretation
- Abstract Interpretation and Abstract Domains
- An abstract interpretation-based deobfuscation plugin for ghidra
- RolfRolles/GhidraPAL Ghidra Program Analysis Library
- Concrete interpretation of Chess
- Abstract intepretation of Chess
- Be a binary rockstar (video)
- Look out! Divergent representations are everywhere!
- BINSEC binary code analysis for security
- Proving the security of software-intensive embedded systems by abstract interpretation, PhD thesis by Marc Chevalier where a more realistic semantics of C is defined.
-
Finding JIT Optimizer Bugs using SMT Solvers and Fuzzing in
PyPy
. - gtirb Intermediate Representation for Binary analysis and transformation
- ddisasm A fast and accurate disassembler
- Ghidraton, Snaking Ghidra with Python 3 Scripting
- Hyperproperties, paper by Michael R. Clarkson and Fred B. Schneider
- IKOS Static analyzer for C/C++ based on the theory of Abstract Interpretation by NASA (slide)
- Proving the correctness of multiprocess programs, papers where "liveness" and "safeness" were defined
- “Static Analysis by Abstract Interpretation and Decision Procedures” PhD thesis by Julien Henry and PAGAI, a tool which does Path Analysis for invariant Generation by Abstract Interpretation
Comments
Comments powered by Disqus