2017 - An Improved Method to Unveil Malware’s Hidden Behavior
Goals
Problem
How to unleash the hidden behaviors of targeted malware.
Many approaches have been proposed to mitigate this problem. However, they either rely on limited, fixed sandbox environments or introduce time- and space-consuming multi-path exploration.
Objective
Focused on environment-targeted malware
Identify possible targeted malware
Unveil targeted malware’s environment-sensitive behaviors
Information
Approach
Hybrid dynamic analysis scheme that applies function-summary-based symbolic execution to malware.
Pre-selection of Potential Targeted Malware
Select potential targeted malware using its static function call graph and dynamic analysis information
Static Function Call Graph
Graph that represents the calling relationships between functions in a program
Each node represents a function; each edge (f1, f2) indicates that function f1 calls function f2
The static FCG is created by static analysis of the program, which can be in source code or binary form
Binary code is first disassembled with IDA Pro, then the FCG is built by analyzing the disassembly
Packed malware must be unpacked (de-obfuscated) first; only currently known packer types are handled
A runtime FCG only represents the part of the code that was executed, so it is not suitable
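As an illustration of the node and edge definition above, here is a minimal sketch that stores an FCG as an adjacency map. The function `build_fcg` and the edge list are hypothetical; in the paper's setup the edges would come from the IDA Pro disassembly, which is not modeled here.

```python
from collections import defaultdict

def build_fcg(call_pairs):
    """Build a function call graph as an adjacency map from (caller, callee) pairs."""
    fcg = defaultdict(set)
    for caller, callee in call_pairs:
        fcg[caller].add(callee)
        fcg[callee]  # touch the key so leaf functions also appear as nodes
    return dict(fcg)

# Hypothetical edges, standing in for what a disassembly listing might yield.
edges = [
    ("main", "check_env"),
    ("main", "payload"),
    ("check_env", "GetKeyboardLayout"),
    ("payload", "CreateFileW"),
]
fcg = build_fcg(edges)
```

Each key is a node and each member of its set is the target of an edge (f1, f2), so `fcg["main"]` lists every function that `main` calls.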
Dynamic Information
Cuckoo
Open source automated malware analysis system
It can monitor native API calls as well as regular Windows API calls
Information obtained at runtime, which can change constantly
API call names represent the dynamic execution process and are organized as a sequence
Map Dynamic Information on FCG
Function names can mismatch between the dynamic information and the FCG, since dynamic information can come from different API levels.
Windows API level
Normal API
Native API
System call
High level APIs are wrappers of low level APIs
By comparing a sample’s dynamic analysis information with its function call graph, we calculate a metric that measures how much of its functionality was covered. When this metric is too small, the sample is flagged as having potential hidden behavior.
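The notes do not give the exact metric, so the following is only a sketch under stated assumptions: coverage is taken to be the fraction of API functions in the static FCG that also appear in the dynamic trace, after mapping native API names to the higher-level API that wraps them. The mapping table, `api_coverage`, and the threshold value are all hypothetical.

```python
# Hypothetical mapping from native API names to the high-level API wrapping them.
NATIVE_TO_WIN32 = {
    "NtCreateFile": "CreateFileW",
    "NtWriteFile": "WriteFile",
}

def normalize(api_name):
    """Lift a native API name to its wrapper so both sources use one API level."""
    return NATIVE_TO_WIN32.get(api_name, api_name)

def api_coverage(static_apis, dynamic_trace):
    """Fraction of statically referenced APIs that were observed at runtime."""
    static_set = {normalize(name) for name in static_apis}
    if not static_set:
        return 1.0  # nothing to cover
    observed = {normalize(name) for name in dynamic_trace}
    return len(static_set & observed) / len(static_set)

# Hypothetical inputs: APIs found in the FCG and an API call sequence from the sandbox.
static_apis = ["CreateFileW", "WriteFile", "GetKeyboardLayout", "RegOpenKeyExW"]
dynamic_trace = ["GetKeyboardLayout", "NtCreateFile", "GetKeyboardLayout"]

coverage = api_coverage(static_apis, dynamic_trace)
THRESHOLD = 0.6  # assumed cut-off, not taken from the paper
is_candidate = coverage < THRESHOLD
```

Here two of the four statically referenced APIs are observed, so the sample falls below the assumed threshold and would be pre-selected for symbolic execution.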
Function Summary Based Symbolic Execution
Function Summary
e.g. (x > 0 ∧ ret = 1) ∨ (x < 0 ∧ ret = 0)
Propositional logic whose propositions are constraints expressed in theory T
If the execution of the function terminates on a return statement, a post condition can be computed by taking the conjunction of constraints associated with memory locations
If the function terminates on a halt or abort statement, we define postw= false
We optimized the function summary to reduce unnecessary symbolic values
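To make the example summary concrete, here is a small sketch that encodes (x > 0 ∧ ret = 1) ∨ (x < 0 ∧ ret = 0) as a list of (precondition, return value) pairs and evaluates it on concrete inputs. This is a plain-Python illustration, not symbolic execution; the names `SUMMARY` and `apply_summary` are hypothetical.

```python
# Each disjunct of the summary is a (precondition on x, resulting ret) pair.
SUMMARY = [
    (lambda x: x > 0, 1),
    (lambda x: x < 0, 0),
]

def apply_summary(summary, x):
    """Return the summarized return value for x, or None if no disjunct covers x."""
    for precondition, ret in summary:
        if precondition(x):
            return ret
    return None
```

Note that the example formula has no disjunct for x = 0, so `apply_summary(SUMMARY, 0)` yields `None`. A symbolic executor would instead hand both disjuncts to a constraint solver rather than run the function body.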
Automatically Generate Windows API Function Summaries
Most Windows APIs are exposed through DLL files; developers can use API functions through static import or dynamic loading. With static import, the API is listed in the Import Address Table (IAT)
For most APIs, the original code is not actually executed; a fake API stub summarizes its behavior instead
When implementing a Windows API summary, we have to detect the number and types of its parameters and release their stack space manually
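The manual stack release can be sketched as follows, assuming 32-bit x86 and the stdcall convention (used by most Win32 APIs), where the callee removes its own arguments. The notes do not state the architecture or convention, so both are assumptions, and `esp_after_stdcall_return` is a hypothetical helper.

```python
def esp_after_stdcall_return(esp_at_entry, num_args, word_size=4):
    """Stack pointer after a stub returns like a stdcall function.

    At entry, ESP points at the return address. Returning pops that address
    plus one word per argument, assuming every argument occupies one word.
    """
    return esp_at_entry + word_size * (1 + num_args)
```

For a two-argument API entered with ESP at 0x7FFF0000, the stub would leave ESP at 0x7FFF000C. This is why the parameter count has to be known before a stub can stand in for the real function.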
Emulate Key Windows API Function
Optimized the function summary to reduce unnecessary symbolic values
The symbolic execution process is very expensive
Static and Dynamic Hook Windows API
Hooking of Windows API functions is done in the IAT of the PE executable file
To hook dynamically loaded APIs, we hooked the GetProcAddress function and implemented the hook of the queried APIs inside GetProcAddress’s summary
By parsing the arguments of GetProcAddress, we determine the name of the called API, then assign the address of its corresponding function summary.
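The two hooking paths can be sketched in plain Python. The IAT is modeled as a name-to-address dictionary, and all addresses, `SUMMARY_ADDRESSES`, `GENERIC_STUB`, and both functions are hypothetical stand-ins; the fallback to a generic stub for APIs without a dedicated summary is also an assumption. In the actual system this logic would live in the symbolic execution engine's hooks.

```python
# Hypothetical addresses where function summaries (stubs) are placed.
SUMMARY_ADDRESSES = {
    "GetKeyboardLayout": 0x1000,
    "CreateFileW": 0x1008,
}
GENERIC_STUB = 0x1FF0  # assumed fallback for APIs without a dedicated summary

def get_proc_address_summary(module_handle, proc_name):
    """Stand-in for GetProcAddress: resolve the queried name to its summary."""
    return SUMMARY_ADDRESSES.get(proc_name, GENERIC_STUB)

def hook_iat(iat):
    """Static hook: return a copy of the IAT with every entry redirected to a summary."""
    return {name: get_proc_address_summary(None, name) for name in iat}

# Hypothetical IAT of a sample, with the addresses the loader would have filled in.
original_iat = {"CreateFileW": 0x77001234, "Sleep": 0x77005678}
hooked_iat = hook_iat(original_iat)
```

Statically imported APIs are redirected once through `hook_iat`, while dynamically loaded ones are redirected at the moment the sample calls the hooked GetProcAddress.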
System Setup
unicorn
CPU emulator
IDA Pro
Multi-processor disassembler and debugger
Extract the FCG
angr
Multi-architecture binary analysis platform
Perform dynamic symbolic execution
Cuckoo
Sandbox malware analysis system
Provides dynamic information
Windows guest system
Targeted malware
Malware that determines whether it has infected its intended target machine by querying the victim environment
Example
Keyboard layout
Version of the operating system
What is?
Environment
A system configuration, such as the operating system version and system language, and the existence of certain system objects, such as files, registry entries, and devices
Hypothesis
Using the function summary technique, we improved the analysis speed of symbolic execution
Related Works
Contribution
Achieves much higher speed without using full system emulation
Method to evaluate the execution progress of the sample in sandbox analysis
Authors
Qiang Li
Yunan Zhang
Liya Su
Yang Wu
Xinjian Ma
Zeming Yang