2016 - CodeXt: Automatic Extraction of Obfuscated Attack Code from Memory…

- - - - Supports in-vivo multi-path analysis and allows us to execute any basic block either concretely with QEMU or symbolically with KLEE
    - - Pinpoint the exact code start and boundaries by exploring all the legitimate execution start points and paths
    - - Handle potential dynamic binary transformation and self-modifying code
    - - An intrusion or malware detection system is assumed to detect the execution of attack code in real time; it dumps the memory around the instruction where the attack was detected, along with other attack context information
      - Assume the attack context information includes some system call triggered by the attack code and corresponding register values
      - Dumped memory is large enough to contain all hidden attack code present in the runtime memory when the attack was detected
      - No infinite loop in the attack code; the system terminates after a configurable maximum number of instructions has been executed
    - - S2E plugins which can monitor, track, and direct the selective symbolic execution of any given byte stream by exploring all execution paths from all offsets
      - Filters out impossible code snippets
      - Records those that are feasible and satisfy the attack context information given
    - - Further analyzes the online results to derive the hidden code’s start and boundaries
    - - Determine the existence of, exact start, and the boundaries of any hidden code from a given memory dump
      - To leverage the system call information from the IDS, we have developed an S2E plugin to catch all the system calls triggered from within a given memory dump
      - The hidden code is usually mingled with random data/code
      - Every offset in the memory dump is treated as a possible logical start, or entry point, of the hidden code
      - Online kill conditions
        - Immediately terminate an offset’s execution to avoid unnecessary symbolic execution
        - Conditions
          - An instruction does not align with the known system call
          - Invalid memory access, such as a segmentation fault
          - Exception due to an invalid instruction
          - Detected system call number or address does not match the context given by the IDS
          - Execution of end-of-path system calls
          - Jump out of the bounds of the memory buffer
      - Record the symbolically executed instructions that end with a system call as a code fragment for each starting offset
        - Any application-level attack code must execute one or more segments of privileged code (i.e., system calls) to cause any real harm
      - To model code with multiple system calls, we define a code chunk as a sequence of code fragments in a control flow. To extract code with multiple system calls, we merge adjacent code fragments into a code chunk
    - - Recovering transient code involved in multiple layers of self-modification
      - Need to take snapshots for each layer of decoding
      - Self-modifying code
        - Executes dynamically generated instructions
        - Can be reliably identified if any instruction consists of bytes written by the code under observation
        - Achieved by tracking all memory updates within the memory buffer range at run time
      - We do not want to take a snapshot for each dynamically generated instruction, as one layer of decoding normally consists of multiple correlated instruction blocks
      - Instead, we developed a clustering-based approach for obtaining appropriate snapshots of self-modifying code
      - Maintain a global counter of all the instructions executed, and assign its current value to each instruction about to be executed as a unique sequence number, which reflects the temporal order in which all instructions execute
      - We treat one cluster of writes as one snapshot. We mark those snapshots from which we executed any instructions after the snapshot was created. These marked snapshots correspond to each layer of self-modifying code executed
      - Stringing the snapshots together generates a memory map that shows the changes over time. Specifically, it shows the values of all memory bytes translated, executed, or written, even if the same memory location was overwritten multiple times during the execution
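
The snapshot clustering described in the last few items can be sketched roughly as follows. This is a minimal illustration, not the CodeXt implementation: the notes do not say how writes are clustered, so grouping by a gap in sequence numbers (`max_gap`) is an assumption, and the names `Write`, `Snapshot`, `cluster_writes`, `mark_executed`, and `memory_map` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Write:
    seq: int    # global sequence number of the instruction that wrote the byte
    addr: int   # address written, inside the dumped memory buffer
    value: int  # byte value written


@dataclass
class Snapshot:
    writes: list            # one cluster of writes, in temporal order
    executed: bool = False  # set once an instruction runs from bytes written here


def cluster_writes(writes, max_gap=8):
    """Group writes into snapshots (assumed rule: a new cluster starts when
    the sequence-number gap to the previous write exceeds max_gap)."""
    snapshots = []
    for w in sorted(writes, key=lambda w: w.seq):
        if snapshots and w.seq - snapshots[-1].writes[-1].seq <= max_gap:
            snapshots[-1].writes.append(w)
        else:
            snapshots.append(Snapshot(writes=[w]))
    return snapshots


def mark_executed(snapshots, executed):
    """executed is an iterable of (seq, addr) pairs for executed instruction bytes.
    Mark the snapshot holding the most recent earlier write to each address."""
    for seq, addr in executed:
        latest_seq, latest_snap = -1, None
        for snap in snapshots:
            for w in snap.writes:
                if w.addr == addr and latest_seq < w.seq < seq:
                    latest_seq, latest_snap = w.seq, snap
        if latest_snap is not None:
            latest_snap.executed = True
    return [s for s in snapshots if s.executed]


def memory_map(snapshots):
    """Map each address to its (seq, value) history, keeping every overwrite."""
    history = {}
    for snap in snapshots:
        for w in snap.writes:
            history.setdefault(w.addr, []).append((w.seq, w.value))
    return history
```

With this sketch, two bursts of writes separated by a long run of other instructions fall into two snapshots, and only the snapshots whose bytes are later executed are kept as decoding layers. The per-address history preserves every value a location held, even when it is overwritten more than once.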