Instruction-Level Parallelism and Superscalar Processors (Design Issues,…

- - - - Refers to the degree to which the instructions of a program can be executed in parallel
      - A combination of compiler-based optimization and hardware techniques can be used to maximize instruction-level parallelism
    - - True data dependency
        An instruction needs as input a value produced by an earlier instruction (read-after-write, RAW)
      - Procedural dependency
        Instructions that follow a branch cannot execute until the branch is resolved, because the branch outcome determines which instructions run next
      - Resource conflicts
        Two instructions compete for the same resource at the same time (bus, registers,…)
      - Output dependency
        Two instructions write to the same location (write-after-write, WAW)
      - Anti-dependency
        A later instruction writes a location that an earlier instruction still has to read (write-after-read, WAR)
      - Situations in which parallel execution cannot be used
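
The three data dependencies above (RAW, WAR, WAW) can be detected by comparing the registers two instructions read and write. The sketch below assumes a simplified, hypothetical instruction model (`Instr`, with one destination and a tuple of sources) that is not part of the original notes; procedural dependencies and resource conflicts are not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: one destination register, any number of sources."""
    text: str
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()


def hazards(first: Instr, second: Instr) -> set:
    """Return the data dependencies of `second` on the earlier instruction `first`."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")  # true data dependency: second reads what first writes
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")  # anti-dependency: second overwrites what first reads
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")  # output dependency: both write the same register
    return found


# Illustrative four-instruction fragment (register names are arbitrary)
i1 = Instr("R3 = R3 op R5", "R3", ("R3", "R5"))
i2 = Instr("R4 = R3 + 1", "R4", ("R3",))
i3 = Instr("R3 = R5 + 1", "R3", ("R5",))
i4 = Instr("R7 = R3 op R4", "R7", ("R3", "R4"))
```

Here `hazards(i1, i2)` reports a RAW dependency, `hazards(i2, i3)` a WAR dependency, and `hazards(i1, i3)` both WAR and WAW, since `i1` reads and writes the register that `i3` later overwrites.
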
- - - - Renaming registers
        
        Registers allocated dynamically
        
        Compiler techniques that try to maximize register use also maximize the number of storage conflicts once instructions execute in parallel. Register renaming resolves these conflicts by duplicating resources: more registers are added. Registers are allocated dynamically by the processor hardware and are associated with the values needed by instructions at various points in time. Thus, the same original register reference in several different instructions may refer to different actual registers.
        
        Output and antidependencies may result in a pipeline stall
        
        Output and antidependencies occur because register contents may not reflect the correct ordering from the program
      - Duplication of resources
      - Out-of-order issue
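
Register renaming as described above can be illustrated with a small table that maps each original register name to the most recently allocated actual register. This is a minimal sketch under assumptions: the `(dest, srcs)` tuple format and the `P0`, `P1`, … names are hypothetical, and real hardware has a finite register pool that must be recycled.

```python
from itertools import count


def rename(program):
    """Give every write a fresh physical register.

    `program` is a list of (dest, srcs) tuples over original register names.
    Returns the same program expressed over physical register names.
    """
    mapping = {}      # original register name -> current physical register
    fresh = count()   # unbounded supply of names (an assumption of this sketch)
    renamed = []
    for dest, srcs in program:
        # A source reads whichever physical register currently holds the value;
        # a register never written so far keeps its original name (initial value).
        new_srcs = tuple(mapping.get(s, s) for s in srcs)
        # Each write is allocated a new physical register.
        phys = f"P{next(fresh)}"
        mapping[dest] = phys
        renamed.append((phys, new_srcs))
    return renamed


program = [
    ("R3", ("R3", "R5")),  # I1
    ("R4", ("R3",)),       # I2
    ("R3", ("R5",)),       # I3
    ("R7", ("R3", "R4")),  # I4
]
renamed = rename(program)
```

After renaming, I1 and I3 write different physical registers (`P0` and `P2`), so the output dependency between them and the antidependency between I2 and I3 disappear. The true dependencies (I2 on I1, I4 on I3 and I2) remain.
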
  - - - The order in which instructions are fetched
      - The order in which instructions are executed
      - The order in which instructions update the contents of register and memory locations
    - - In-order issue with in-order completion
      - In-order issue with out-of-order completion
      - Out-of-order issue with out-of-order completion
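
The difference between in-order and out-of-order issue can be seen in a toy scheduler. The sketch below rests on simplifying assumptions that are not in the notes: every instruction takes one cycle, only true data dependencies are tracked, and the function name `schedule` and its parameters are hypothetical. Because all latencies are equal, it shows issue order only and does not model out-of-order completion.

```python
def schedule(deps, width, in_order):
    """Return the cycle in which each instruction is issued.

    deps[i] is the set of earlier instructions whose results instruction i needs.
    `width` is the number of instructions that can be issued per cycle.
    """
    n = len(deps)
    issue = [None] * n
    cycle = 0
    while None in issue:
        cycle += 1
        slots = width
        for i in range(n):
            if issue[i] is not None:
                continue
            # Ready once every producer was issued in an earlier cycle.
            ready = all(issue[d] is not None and issue[d] < cycle for d in deps[i])
            if ready and slots > 0:
                issue[i] = cycle
                slots -= 1
            elif in_order:
                break  # in-order issue cannot look past a stalled instruction
    return issue


# I1 depends on I0; I2 and I3 are independent.
deps = [set(), {0}, set(), set()]
in_order_cycles = schedule(deps, width=2, in_order=True)
out_of_order_cycles = schedule(deps, width=2, in_order=False)
```

With a two-wide machine, in-order issue leaves one slot empty in the first cycle while I1 waits for I0, so the fragment needs three cycles. Out-of-order issue fills that slot with I2 and finishes in two.
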
  - - - Delayed branch strategy was explored
      - Processor always executes the single instruction that immediately follows the branch
      - Keeps the pipeline full while the processor fetches a new instruction stream
    - - Delayed branch strategy has less appeal (it is not a requirement)
      - Have returned to pre-RISC techniques of branch prediction
      - Reasons: multiple instructions would need to execute in the delay slot, which raises several instruction-dependency problems
      - Superscalar Execution: see slide 19 of lecture 16
      - Superscalar Implementation
        
        Instruction fetch strategies that simultaneously fetch multiple instructions
        
        Logic for determining true dependencies involving register values, and mechanisms for communicating these values to where they are needed during execution
        
        Mechanisms for initiating, or issuing, multiple instructions in parallel
        
        Resources for parallel execution of multiple instructions, including multiple pipelined functional units and memory hierarchies capable of simultaneously servicing multiple memory references
        
        Mechanisms for committing the process state in correct order
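
The last requirement, committing state in the correct order, can be sketched with a queue that retires instructions only from its head. The notes do not name a specific mechanism; this is an illustrative model in the spirit of what is commonly called a reorder buffer, and the function name and instruction labels are hypothetical.

```python
from collections import deque


def commit_order(program_order, completion_order):
    """Return the order in which instructions are committed.

    Results may arrive in any order, but state is updated only from the
    head of the queue, so commits always follow program order.
    """
    pending = deque(program_order)  # oldest instruction on the left
    done = set()
    committed = []
    for instr in completion_order:
        done.add(instr)
        # Retire every finished instruction that has reached the head.
        while pending and pending[0] in done:
            committed.append(pending.popleft())
    return committed


program = ["I1", "I2", "I3", "I4"]
finished = ["I2", "I1", "I4", "I3"]  # execution finished out of order
committed = commit_order(program, finished)
```

Even though I2 finishes before I1 and I4 before I3, each waits until the instructions ahead of it have finished, so registers and memory are updated in program order.
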