Please enable JavaScript.
Coggle requires JavaScript to display documents.
2015 - Multi-Platform Binary Program Testing Using Concolic Execution…
2015 - Multi-Platform Binary Program Testing Using Concolic Execution
Information
Approach
PANDA
Steps
Input files
Input selection
Scoring
Produces a new path of execution
Number of inputs executed targeting this branch
Targets the uncovered edge of a known branch
Length of the path
Initialize algorithm with seed input
Length of the content should vary since the system only varies the content of the file, but not the length of it.
New input obtained from mutating previous input based on constraint information from the model
Begin recording of program execution with input
Concolic execution of PANDA replay
Branch Constraint
LLVM instrumentation
Augment LLVM code with callback function
Implemented by LLVM pass
Memory I/O
If the memory location is not tainted, the byte from the concrete memory is picked and turned into a bit-vector.
If an instruction reads a value from memory as an operator, and that memory is marked as tainted, we construct a new bit-vector expression with the size of the read.
Need to capture potential symbolic values read from (or written to) shadow memory, while still mixing in concrete values from the underlying storage if no symbolic values are present.
Operator Translation
Translates LLVM instruction in the form of a symbolic SMT-LIB expression
Symbolic Pointers
When the address read from or written to is tainted i.e. is a symbolic value. This involves that the exact result of the operation depends on the input
SMT solver solves selected constraint
Z3 SMT solver
Generate new input
Test Case Generation
Once the constraints of the current test case have been captured, new inputs are generated using the solutions to the constraints produced by Microsoft's Z3 SMT solver
Individual constraints are iteratively negated
Dynamic Taint Analysis
Shadow memory
Taint analysis requires bookkeeping of the taint state of the systems memory, including I/O memory and CPU registers, in order to properly propagate taint.
Symbolic Execution
Concolic Execution
Restricted variant of symbolic execution
Instead of covering all possible execution paths in one execution, concolic execution outputs the PC that is recorded during a single execution.
Generate new input files by selectively negating individual branch constraints and solving the resulting, new, path constraint.
Extending taint analysis in order to learn about how exactly the inputs to the program affect its execution
Program is executed while all its inputs are treated as symbolic variables
Transformations on inputs during the execution are tracked in the form of logical formulas depending on these symbolic variables
System performing execution kept symbolic values in a symbolic state that mirrors the memory locations of the system
Path constraint
List of logical frst-order expressions
depending on the input variables.
Forking
Not scaling well to larger applications
Weakness
LLVM
Panda
LLVM JIT
QEMU
Record & Replay
Satisfiability Modulo Theories
Standardized format of representing SMT formulas is SMT-LIB
SAT
Determining if an interpretation of boolean variables for a given boolean formula exists
e.g. Z3
What is
First-order logic
NP-complete
Fuzzing
Executing a program with (semi-)random inputs in an automated fashion with the goal of causing errors in the program.
Bit-vector
atoi
Evaluation
Validate the correctness of the symbolic execution implementation and its performance against complex branch conditions in isolation
Category
Complex Control Flow Logic
Samples from real-world applications
Server software
Sendmail
WU-FTP
BIND
Source code with buffer overflow vulnerabilities
Not all of the samples could be run
Complex Branch Conditions
Simple Comparison
Nested if statements
Loop:
Simple Arithmetic
Arithmetic operations to verify common binary operations
Complex Arithmetic
Larger equation based on two integers read from the input file is compared to a constant value
strcmp
Larger equation based on two integers read from
the input file is compared to a constant value
Compared with
American Fuzzy Lop fuzzing tool
Author
Eike Siewertsen
Contribution
Platform to dynamically generate test les for programs that cause the program to behave dierently.
Analysis of binary programs with zero knowledge about the program internals, without any source code, and on multiple architectures
Goals