2007 - Panorama: Capturing System-wide Information Flow for Malware Detection and Analysis
Malware detection method
Signature based detection
Unable to detect malware without its signature
Heuristic based detection
Based on heuristics rather than fundamental characteristics of malware, so it may incur high false positive and false negative rates
Most malware types share one trait: malicious or suspicious information access and processing behavior
Representation of how the information flows
Combination of taint propagation information at the hardware level and OS-level knowledge (e.g. which module is called or which file tainted data is written to)
Basis for further analysis and detection policy creation
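A minimal sketch of how such a graph could be recorded, assuming each node is an OS-level object (a taint source, a process or module, a file, a network connection) and each edge is one observed propagation of tainted data. The class name, node labels, and the sample name are hypothetical and not taken from the paper.

```python
from collections import defaultdict

class TaintGraph:
    """Directed graph: an edge src -> dst means tainted data moved from src to dst."""

    def __init__(self):
        self.successors = defaultdict(set)

    def add_flow(self, src, dst):
        # Record one propagation event reported by the taint tracker.
        self.successors[src].add(dst)
        self.successors[dst]  # touching the key makes sure dst exists as a node

    def nodes(self):
        return set(self.successors)

# Hypothetical events: a keystroke reaches the sample, which writes it to a file.
g = TaintGraph()
g.add_flow(("source", "keystroke"), ("module", "sample.dll"))
g.add_flow(("module", "sample.dll"), ("file", "log.txt"))
```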
Load the sample into emulation environment
Input sensitive information not intended for the sample
Taint the sensitive information for monitoring
Generate test case
Monitor access and processing behavior of the sample
Especially with regard to the sensitive information
Results in taint graph
Analyze the recorded information
Determine whether sample is malicious / benign
System to demonstrate the approach
Malware Analysis Engine
Examine the taint graphs for detailed analysis information
Malware Detection Engine
Use a set of devised policies to detect malware from unknown samples
HTTP & HTTPS
Data received as a response to the web requests
Packets received in response to ping requests are labeled ICMP.
When listing a directory, all accessed disk blocks that hold file directory information are tainted as directory
Instructions that always produce the same result
e.g. xor eax, eax
Tainted input may be used as an index to access an entry in a table
Result of read is tainted
Control flow evasion
Taint information may also propagate through control flow
Current implementation of Panorama does not handle this situation.
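The special cases above can be illustrated with a toy propagation rule. This is a sketch under simplifying assumptions, not Panorama's implementation: instructions are reduced to a mnemonic, one destination, and a list of source names, and the function and operand names are hypothetical. The address register of a table read is listed as a source, so a tainted index taints the value read. Control-flow propagation is deliberately not modeled.

```python
def propagate(op, dst, srcs, taint):
    """Update the set of tainted locations for one simplified instruction."""
    if op in ("xor", "sub") and len(srcs) == 2 and srcs[0] == srcs[1]:
        # Constant function: the result is always zero, so it carries no taint.
        taint.discard(dst)
    elif any(s in taint for s in srcs):
        # Any tainted source (including an index register) taints the result.
        taint.add(dst)
    else:
        # Overwritten with untainted data.
        taint.discard(dst)
    return taint

taint = {"eax", "ebx"}
propagate("xor", "eax", ["eax", "eax"], taint)         # xor eax, eax
propagate("load", "ecx", ["ebx", "mem:table"], taint)  # mov ecx, [table + ebx]
propagate("mov", "edx", ["esi"], taint)                # untainted copy
```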
OS-Aware Taint Tracking
Extract operating-system level information.
Identifying code under analysis
Identify the actions of the code under analysis
Code under analysis operates on tainted data if an instruction in it accesses the taint directly
This can be checked in a straightforward fashion by consulting the mapping between instruction addresses and modules.
Record the current value of the stack pointer, together with the current thread identifier
Check whether there is a recorded stack pointer for the current thread identifier when execution jumps out of the code
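A rough sketch of this bookkeeping, written to show the idea rather than the paper's exact rule. It assumes a downward-growing x86 stack, so a call into library code leaves the stack pointer below the recorded value while a return leaves it above; that comparison and all names here are assumptions.

```python
class CallTracker:
    """Tracks, per thread, whether the code under analysis is still on the stack."""

    def __init__(self):
        self.entry_sp = {}  # thread id -> stack pointer recorded on entry

    def on_enter(self, tid, sp):
        # Execution jumps into the code under analysis: remember the first entry.
        self.entry_sp.setdefault(tid, sp)

    def on_exit(self, tid, sp):
        # Execution jumps out: drop the record only if the frame has returned.
        recorded = self.entry_sp.get(tid)
        if recorded is not None and sp > recorded:
            del self.entry_sp[tid]

    def acting_for_sample(self, tid):
        # A surviving record means the running code was called by the sample.
        return tid in self.entry_sp

tracker = CallTracker()
tracker.on_enter(tid=1, sp=0x1000)
tracker.on_exit(tid=1, sp=0x0FF0)   # call into a library: record kept
during_call = tracker.acting_for_sample(1)
tracker.on_exit(tid=1, sp=0x1004)   # return past the entry frame: record removed
after_return = tracker.acting_for_sample(1)
```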
Process and module information
Knowing which process and module an instruction comes from
Maintaining a mapping between addresses in memory and modules requires information from the guest operating system.
Developed a kernel module called module notifier, which is loaded into the guest to collect up-to-date memory map information.
Module notifier registers two callback routines.
First callback routine is invoked whenever a process is created or deleted.
Second callback routine is called whenever a new module is loaded and gathers the address range in the virtual memory that the new module occupies.
Module notifier obtains the value of the CR3 register for each process.
CR3 register contains the physical address of the page table of the current process
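On the emulator side, the information reported by the two callbacks could be kept in a per-process table keyed by CR3 and queried for each instruction address. The following is a minimal sketch assuming modules within one process do not overlap; the class and method names are hypothetical and do not correspond to the module notifier's real interface.

```python
import bisect

class ModuleMap:
    """Maps (CR3, virtual address) to the name of the module containing it."""

    def __init__(self):
        self.by_cr3 = {}  # CR3 value -> sorted list of (base, end, name)

    def on_process_create(self, cr3):
        self.by_cr3[cr3] = []

    def on_process_delete(self, cr3):
        self.by_cr3.pop(cr3, None)

    def on_module_load(self, cr3, base, size, name):
        bisect.insort(self.by_cr3.setdefault(cr3, []), (base, base + size, name))

    def lookup(self, cr3, addr):
        mods = self.by_cr3.get(cr3, [])
        # Find the last module whose base address is <= addr.
        i = bisect.bisect_right(mods, (addr, float("inf"), "")) - 1
        if i >= 0 and mods[i][0] <= addr < mods[i][1]:
            return mods[i][2]
        return None

modules = ModuleMap()
modules.on_process_create(0x3000)
modules.on_module_load(0x3000, base=0x400000, size=0x1000, name="sample.exe")
```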
Filesystem and network information
When tainted data is written to the hard disk, we wish to identify which file it is written to.
Interested in when tainted data is written to the hard disk or sent over the network.
The Sleuth Kit
Disk forensic tool
When tainted data is written to a block on the hard disk, TSK can determine which file this block belongs to.
When tainted data is sent out, we simply check the packet header to find out which connection it belongs to.
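As an illustration of the packet header check, the sketch below extracts the connection 4-tuple from a raw IPv4 packet carrying TCP or UDP. It assumes the packet starts at the IP header (no Ethernet framing); the function name and the example addresses are made up for the illustration.

```python
import socket
import struct

def connection_of(packet):
    """Return (src_ip, src_port, dst_ip, dst_port) for IPv4 TCP/UDP, else None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    ihl = (packet[0] & 0x0F) * 4          # IP header length in bytes
    proto = packet[9]                     # 6 = TCP, 17 = UDP
    if ihl < 20 or proto not in (6, 17) or len(packet) < ihl + 4:
        return None
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    # Both TCP and UDP start with source port, then destination port.
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    return (src_ip, src_port, dst_ip, dst_port)

# Hand-built example: 20-byte IPv4 header (TCP) followed by the two ports.
ip_header = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 24, 0, 0, 64, 6, 0,
    socket.inet_aton("10.0.0.5"), socket.inet_aton("192.0.2.10"),
)
pkt = ip_header + struct.pack("!HH", 49152, 80)
```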
Open source emulator software
Automatically perform the analysis
Cooperate with Taint Engine to determine which part of the input should be executed
Taint graph based analysis
Taint-Graph-Based Malware Detection
Malicious code exhibits anomalous behavior
Anomalous information access
For some information sources, a simple access performed by the sample under analysis is already suspicious
Excessive information access
For some information sources, benign samples may access some of them occasionally, while malicious samples will access them excessively to achieve their malicious intent.
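The two kinds of policy can be sketched as a check over per-source access counts. Which sources fall under which policy, and any numeric limit, are placeholders chosen for the example and are not values from the paper.

```python
def violated_policies(access_counts, anomalous_sources, excessive_limits):
    """Return (source, reason) pairs for every policy the sample violates."""
    hits = []
    for source, count in access_counts.items():
        if source in anomalous_sources and count > 0:
            # Any access at all to this kind of source is suspicious.
            hits.append((source, "anomalous access"))
        elif count > excessive_limits.get(source, float("inf")):
            # Occasional access is tolerated, heavy access is not.
            hits.append((source, "excessive access"))
    return hits

# Hypothetical counts taken from a sample's taint graph.
counts = {"keystroke": 1, "directory": 500, "HTTP": 3}
hits = violated_policies(
    counts,
    anomalous_sources={"keystroke"},      # placeholder assignment
    excessive_limits={"directory": 100},  # placeholder threshold
)
```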
Taint-Graph-Based Malware Analysis
Given a taint graph, the first step is to check this graph for the presence of a node that corresponds to the sample
The existence of such a node indicates the sample is suspicious, as the test-case inputs are designed so that the sample should never access them
Sample’s successor nodes in the graph can be examined for information
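These two analysis steps amount to a node-presence check followed by a graph traversal. The sketch below assumes the taint graph is stored as a plain adjacency dictionary; the function name and node labels are hypothetical.

```python
from collections import deque

def downstream_of_sample(graph, sample):
    """Return nodes reachable from the sample, or None if the sample is absent."""
    all_nodes = set(graph) | {n for succ in graph.values() for n in succ}
    if sample not in all_nodes:
        return None  # the sample never touched the tainted test inputs
    order, seen, queue = [], {sample}, deque([sample])
    while queue:
        node = queue.popleft()
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

# Hypothetical graph: a keystroke reaches the sample, which logs and sends it.
graph = {
    "keystroke": ["sample.exe"],
    "sample.exe": ["file:log.txt", "net:10.0.0.9:80"],
    "file:log.txt": [],
}
flows = downstream_of_sample(graph, "sample.exe")
```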
Focus on Windows-based malware