Please enable JavaScript.
Coggle requires JavaScript to display documents.
2016 - BINSEC/SE: A Dynamic Symbolic Execution Toolkit for Binary-level…
2016 - BINSEC/SE: A Dynamic Symbolic Execution
Toolkit for Binary-level Analysis
Information
BINSEC/SE
Dynamic Symbolic Execution Engine
Formal technique for exploring program paths in a systematic way
Path predicate
Computed for each path
A set of constraints on the program input that leads to follow that path at runtime
Fed to an automatic solver: a solution to the predicate is a new test input exploring the targeted path
Systematic exploration is achieved through iterating on all (user-bounded) paths of the program
Implemented in BINSEC
Architecture
PINSEC
Written in C++
Based on the Pin DBI framework
Offer a generic and modular tracing for both Linux and
Windows binaries
Export results into a generic format, here protobuf2 which is usable with major programming languages
Parameters
start
Addresses indicating where to begin the tracing
stop
Addresses indicating where to end the tracing
call_skips
Address of calls for which the callee should not be traced,
fun_skips
Function addresses that should not be traced.
A pintool
Tools created using Pin
Can be used to perform program analysis on user space applications on Linux, Windows and OS X
DSE engine
Written with OCaml
Generate path predicates, to send them to a solver and to get back new input data
Solver
SMT solver is finally used to check the satisfiability of the formula and if so, to get back a solution
Z3
CVC4
Boolector
Steps
x86 instructions are translated into DBA instructions on
which the path predicate is computed
Generate a formula (theory of arrays and bitvectors) which is then exported to the SMTLIB2 format
SMT solver is finally used to check the satisfiability of the formula and if so, to get back a solution
Formula generation and solving
Optimization
Optimizations of the path predicate
Standard simplifications such as constant folding
Local rewriting
Pruning useless parts of the formula
Read-over-Write (RoW) simplifications
over array load and store
Initial memory state
Logical arrays are unconstrained by default
Modelling the initial memory state as a logical array implies that the solver is free to give any value to any memory location of the initial state, possibly leading to meaningless results since the initial mapping of addresses in a real program is more complex.
BINSEC/SE allows to specify constraints on
the initial memory state
Concretization/Symbolization
For scalability issues, the path predicate is often approximated, either by forcing some logical values to their runtime values (observed by the tracer) – concretization, or by injecting fresh logical input – symbolization.
Reduce the complexity of the formula, at the price of completeness and/or correctness
BINSEC provides a complete set of callbacks to specify when and what to concretize/symbolize
Approach
Maintaining a symbolic memory state along the computation together with the encountered path constraints
Path selector
Written with OCaml
Choosing which path to explore next
Relies on a simple API allowing to easily implement various
exploration strategies
Offers a function that returns the best trace from a set of traces based on a function that calculate score from implemented strategies
Strategies
DFS
BFS
Random path
MinCall-DFS and MinCall-BFS
Binary-level security analyses
Example
Malware comprehension
Vulnerability analysis
Reverse engineering of binary codes is a key component yet can be difficult, especially, low-level assembly constructs can deceive disassemblers, while dynamic execution explores only a few possible behaviors
BINSEC
Platform for formal analysis of binary codes
Proposes a front-end from x86(32bits) to a generic intermediate representation called DBA
What is
Ocaml
Formal
Tracer
protobuf
Concretization
Problem
Lack of analysis platform that provide multiple analysis technique in the same platform
Authors
R David
Contribution
Highly configurable generic DSE engine toolkit, with a strong interaction between the tracer and the symbolic execution core, as well as heavy optimizations on the path predicate