2015 - Platform Agnostic Binary Maximization
Information
Stripped binary
Binary files without debugging information
Targets
Approach
Generate a thorough test suite given a stripped binary and a corpus of input files
Grey box dynamic analysis of the program as it consumes each input file
Derive useful information for future test candidates to iterate toward 100% coverage
Coverage maximization
Dynamic Taint Analysis
PANDA records the execution; taint tracing on the later replay identifies key branch conditions influenced by the input file
Each input file is packaged together with a script file, dependencies, the target program, and utilities that communicate with PANDA via hypercalls
The package is converted into an ISO image and loaded into the guest system, where it is extracted and executed via the script
PANDA is told to start recording through custom hypercalls handled by a custom signaled-recorded plugin
Run the target program with the input
Taint utility accepts the input file as an argument and labels each byte as tainted
During the recording stage, the taint plugin does not react to the hypercall. When replay begins, the taint plugin finds the hypercall produced by the panda_taint plugin and begins tracing
Taint callback only produces the LLVM register number, the taint compute number (TCN), and the set of taint labels present on that register.
To fetch additional information
Derive information: which branches are tainted and how each is influenced by the input
Taint label
User-customizable set of labels that can be assigned to each byte of the tainted input.
Integer indices of the bytes of an input file from 0..n.
Taint Compute Number (TCN)
The TCN is incremented on each compute operation (addition, multiplication, etc.)
Number of operations that have taken place to yield the current tainted register from its initial state with TCN = 1
This information can then be used to determine whether a branch condition is highly or only slightly correlated with its original input value.
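A minimal sketch of how taint labels and the TCN could travel through compute operations. This is not PANDA's implementation: the `Tainted` class and the `label_input` and `compute` helpers are hypothetical names, and taking the larger operand TCN plus one as the result's TCN is an assumption about what "incremented" means here.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value with its taint metadata (hypothetical model, not PANDA's)."""
    value: int
    labels: frozenset  # byte offsets of the input file that influenced the value
    tcn: int           # taint compute number


def label_input(data: bytes, initial_tcn: int = 1) -> list:
    """Label byte i of the input with the integer index i (labels 0..n)."""
    return [Tainted(b, frozenset({i}), initial_tcn) for i, b in enumerate(data)]


def compute(op, a: Tainted, b: Tainted) -> Tainted:
    """Apply a binary operation, merge the label sets, and increment the TCN.

    Assumption: the result's TCN is the larger operand TCN plus one.
    """
    return Tainted(op(a.value, b.value), a.labels | b.labels, max(a.tcn, b.tcn) + 1)


regs = label_input(b"\x02\x03\x04")
total = compute(operator.add, regs[0], regs[1])    # depends on bytes 0 and 1
product = compute(operator.mul, total, regs[2])    # depends on bytes 0, 1 and 2
```

Under this model a branch condition that compares `product` has a higher TCN than one that compares `regs[0]` directly, which is the sense in which it is only slightly correlated with the original input.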
Takes a test file from the input corpus
Enter information into the database, which can be further filtered for the coverage frontier
Coverage frontier:
number_attempts counter
Incremented when a previously detected branch appears again
Ensures all parts of the application are tested to a reasonable extent
Highlights branches that are difficult to explore
Identify half-covered conditional
Picking next branch for target
Filter the coverage frontier for half-covered conditionals reachable from the current input file
Prioritize the half-covered conditionals with the lowest number_attempts
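The frontier bookkeeping and the picking step can be sketched as follows. The `Branch` record, `record_branch`, and `pick_next_branch` are hypothetical names, an in-memory dict stands in for the database, and the `reachable` set of branch addresses is assumed to come from the trace of the current input file.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One conditional in the coverage frontier (hypothetical schema)."""
    address: int
    taken_seen: bool = False
    not_taken_seen: bool = False
    number_attempts: int = 0

    @property
    def half_covered(self) -> bool:
        # Exactly one of the two outcomes has been observed so far.
        return self.taken_seen != self.not_taken_seen


def record_branch(frontier: dict, address: int, taken: bool) -> None:
    """Store an observed branch outcome; bump number_attempts on a repeat sighting."""
    branch = frontier.get(address)
    if branch is None:
        branch = frontier[address] = Branch(address)
    else:
        branch.number_attempts += 1
    if taken:
        branch.taken_seen = True
    else:
        branch.not_taken_seen = True


def pick_next_branch(frontier: dict, reachable: set):
    """Return the reachable half-covered conditional with the lowest number_attempts."""
    candidates = [b for addr, b in frontier.items()
                  if addr in reachable and b.half_covered]
    return min(candidates, key=lambda b: b.number_attempts, default=None)


frontier = {}
record_branch(frontier, 0x10, True)
record_branch(frontier, 0x10, True)    # seen again: number_attempts becomes 1
record_branch(frontier, 0x20, False)   # first sighting: number_attempts stays 0
record_branch(frontier, 0x30, True)
record_branch(frontier, 0x30, False)   # both outcomes seen: fully covered
target = pick_next_branch(frontier, {0x10, 0x20, 0x30})
```

Here `target` is the branch at `0x20`: it is half-covered and has been attempted least often, so it is explored before the branch at `0x10`.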
Generate Children
Takes information provided by picking process
Create an input that will explore the uncovered portion of the conditional
Run another iteration with the child input file
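A rough sketch of child generation, assuming the picking process hands over the parent input together with the taint labels (byte offsets) that influence the chosen conditional. These notes do not say how replacement values are chosen, so the fixed `candidates` tuple and the `generate_children` helper are purely illustrative.

```python
def generate_children(parent: bytes, labels, candidates=(0x00, 0xFF)) -> list:
    """Produce child inputs by overwriting one tainted byte at a time.

    Assumption: each label is the offset of an input byte that influences the
    target conditional; only those bytes are mutated.
    """
    children = []
    for offset in sorted(labels):
        if offset >= len(parent):
            continue  # label points outside this input, skip it
        for value in candidates:
            if parent[offset] == value:
                continue  # would reproduce the parent unchanged
            child = bytearray(parent)
            child[offset] = value
            children.append(bytes(child))
    return children


children = generate_children(b"\x00AB", {0, 2})
```

Each child differs from the parent in exactly one byte that taint analysis tied to the conditional, so the next iteration spends its runs on inputs that can plausibly flip that branch.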
Dynamic Values
Instrument LLVM to include custom logging functionality
Called during each block execution
Used to record values which are only retrievable at runtime, such as the arguments to load, store, call, and branch instructions.
Automated test case generation related to user controlled inputs
No software is completely safe; developers consistently introduce vulnerabilities
Consume structured inputs
Example
Image / media parsing libraries
PDF readers
Compression parser
Information access for test case generator
Black box testing
Assumes the source code is not available and internal state cannot be inspected. Information can only be gained by monitoring externally visible behavior such as return codes and running status
Grey box testing
Hybrid: the source code is unavailable, but internal state information is retrievable via dynamic instrumentation
White box testing
Assumes the test case generator has full access to the source code and is able to recompile it with additional instrumentation
Notes
What is
Hypercall
Conclusion
PANDA
Hypercall instruction
Goals
Problem
Test suites often fail to exercise the full functionality of the tested program
Producing test cases is generally a manual process and requires knowledge of the program
Manually producing a thorough test suite requires great effort
Vulnerabilities
Objective
Contribution