3.1 Configuration
USIMM, an x86 simulator with a detailed memory system model, extended with a DRAM cache
4-level cache hierarchy (L1, L2, and L3 are on-chip SRAM caches; L4 is an off-chip DRAM cache; 64B line size)
the virtual memory system
L4 is an Alloy Cache, the baseline to which results are normalized
cache misses fill all levels of the hierarchy
the L4 is equipped with a MAP-I predictor to hide the tag lookup latency on cache misses
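A minimal sketch of how an instruction-indexed hit/miss predictor in the style of MAP-I could work: a small table of saturating counters, indexed by the address of the miss-causing instruction, predicts whether the L4 lookup will miss, so that the main-memory access can be issued without waiting for the tag lookup. The class name, the table size (256 entries), the counter width (3 bits), and the modulo hash are assumptions for illustration, not parameters taken from this configuration.

```python
from dataclasses import dataclass, field


@dataclass
class MapIPredictor:
    """Hypothetical MAP-I-style predictor; all sizes are assumptions."""
    entries: int = 256       # assumed number of counters
    counter_bits: int = 3    # assumed saturating-counter width
    table: list = field(default_factory=list)

    def __post_init__(self):
        # Every counter starts at 0, which predicts "hit in the DRAM cache".
        self.table = [0] * self.entries

    def _index(self, pc: int) -> int:
        # Assumed hash: instruction address modulo the table size.
        return pc % self.entries

    def predict_miss(self, pc: int) -> bool:
        # Predict a miss when the counter's most significant bit is set.
        return self.table[self._index(pc)] >= 1 << (self.counter_bits - 1)

    def update(self, pc: int, was_miss: bool) -> None:
        # Count up on an actual miss, down on an actual hit, saturating.
        i = self._index(pc)
        max_value = (1 << self.counter_bits) - 1
        if was_miss:
            self.table[i] = min(self.table[i] + 1, max_value)
        else:
            self.table[i] = max(self.table[i] - 1, 0)


predictor = MapIPredictor()
for _ in range(4):
    predictor.update(pc=0x400, was_miss=True)
# After four misses from the same instruction, that instruction is predicted to miss.
print(predictor.predict_miss(0x400))
```

When a miss is predicted, the memory request can go out in parallel with the tag lookup; when a hit is predicted, the lookup proceeds alone and main-memory bandwidth is not spent.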
assume:
a heterogeneous memory system in which the DRAM cache uses HBM technology
and the main memory uses DDR-based DIMM technology,
corresponding to a 1/8th-scale Knights Landing
the same access latency for both technologies, in accordance with stacked-memory specifications
however:
the bandwidth of stacked DRAM is 8x that of main memory, from 4x the channels and 2x the bus width
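The 8x figure follows from multiplying the two ratios, 4 x 2 = 8. A small sketch of that arithmetic, assuming bandwidth scales with channel count times bus width at an equal per-pin data rate; the class and field names are hypothetical and all values are relative to main memory rather than absolute:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTech:
    """Hypothetical description of a memory technology, relative to main memory."""
    name: str
    relative_channels: int    # channel count relative to main memory
    relative_bus_width: int   # bus width relative to main memory
    relative_latency: float   # access latency relative to main memory

    def relative_bandwidth(self) -> int:
        # Assumption: bandwidth is proportional to channels x bus width.
        return self.relative_channels * self.relative_bus_width


MAIN_MEMORY = MemoryTech("DDR main memory", 1, 1, 1.0)
DRAM_CACHE = MemoryTech("HBM DRAM cache", 4, 2, 1.0)


def bandwidth_ratio(fast: MemoryTech, slow: MemoryTech) -> float:
    """Return how many times more bandwidth `fast` offers than `slow`."""
    return fast.relative_bandwidth() / slow.relative_bandwidth()


print(bandwidth_ratio(DRAM_CACHE, MAIN_MEMORY))
```

Both technologies carry the same relative latency, so in this model the DRAM cache helps only through its higher bandwidth.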