Caching
-
Glossary
- Cache Hit
When requested data is in the cache
- Cache Miss
When requested data is not in the cache
- Cache Flushing
The controller writes cached data out to the drive, either periodically or when the amount of unwritten data in the cache reaches a certain level.
- Cache eviction
When an entry is removed from the cache to free resources (memory)
- False Cache Eviction
When a key is used often but is still evicted from the cache (may happen with LRU)
-
Caching Strategies
- Cache Aside
The most common strategy: the application has access to both the cache and the storage. If the data is not in the cache, the application retrieves it from the storage and adds it to the cache (a sketch follows the pros & cons below).
- Cache Aside (Pros & Cons)
-- Pros:
1) Only data that is needed is in the cache
-- Cons:
1) Cache misses are expensive
2) Data Staleness
3) Implementation complexity (the app must work with two storage places, etc.)
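A minimal cache-aside sketch in Python; the plain dict and the db_read function are hypothetical stand-ins for a real cache and the slower storage:

```python
cache = {}

def db_read(key):
    # Stand-in for a slow storage lookup (e.g., a SQL query).
    return f"value-for-{key}"

def get(key):
    if key in cache:          # cache hit: serve from memory
        return cache[key]
    value = db_read(key)      # cache miss: go to the storage...
    cache[key] = value        # ...and add the result to the cache
    return value
```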
- Read Through
The application does not have direct access to the storage and communicates only with the cache API. On a miss, the cache retrieves the data from the storage and caches it before returning it to the application. (This pattern is common in ORM frameworks; a sketch follows the pros & cons below.)
- Read Through (Pros & Cons)
-- Pros
1) Cache only what's needed
2) Transparent
-- Cons
1) Cache misses are expensive
2) Data Staleness
3) Reliability (the cache becomes a critical component on the read path)
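A minimal read-through sketch: the application calls only the cache, and the cache itself loads from storage on a miss. The injected loader callable is a hypothetical stand-in for the storage access:

```python
class ReadThroughCache:
    def __init__(self, loader):
        self._data = {}
        self._loader = loader

    def get(self, key):
        if key not in self._data:                # miss: the cache itself,
            self._data[key] = self._loader(key)  # not the app, hits storage
        return self._data[key]

# The lambda stands in for a slow storage lookup.
cache = ReadThroughCache(loader=lambda k: f"value-for-{k}")
print(cache.get("user:42"))  # first call is a miss, repeated calls are hits
```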
- Write Through
The application has access only to the cache API; on every update, the API writes the data to the storage and also stores it in the cache (a sketch follows the pros & cons below).
- Write Through (Pros & Cons)
-- Pros
1) Data is never stale (it's always up to date)
-- Cons
1) Writes are expensive (each write goes to both the cache and the storage)
2) Cache may contain redundant data
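A minimal write-through sketch; the db dict is a hypothetical stand-in for the slower storage:

```python
class WriteThroughCache:
    def __init__(self, db):
        self._cache = {}
        self._db = db

    def put(self, key, value):
        self._db[key] = value     # write to the slow storage...
        self._cache[key] = value  # ...and keep the cache in sync

    def get(self, key):
        return self._cache.get(key)

db = {}
wt = WriteThroughCache(db)
wt.put("user:42", "Alice")
print(wt.get("user:42"), db["user:42"])  # both are always up to date
```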
- Write Behind
The application interacts only with the API. When an update occurs, the data is not written immediately to the slower storage; it waits in the cache until more events/data accumulate or a timeout occurs, and everything is then written to the storage in bulk (flushed; see Cache Flushing in the glossary). A sketch follows the pros & cons below.
- Write Behind (Pros & Cons)
-- Pros
1) Fast writes
2) Reduces load for the storage
-- Cons
1) Not reliable (if the cache's memory fails before a flush, the unwritten data is lost)
2) Lack of consistency
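A minimal write-behind sketch that flushes when a batch-size threshold is reached; a real implementation would also flush on a timer. The db dict and the batch_size parameter are assumptions for illustration:

```python
class WriteBehindCache:
    def __init__(self, db, batch_size=3):
        self._cache = {}
        self._dirty = {}        # unwritten entries awaiting a flush
        self._db = db
        self._batch_size = batch_size

    def put(self, key, value):
        self._cache[key] = value
        self._dirty[key] = value
        if len(self._dirty) >= self._batch_size:
            self.flush()        # threshold reached: write in bulk

    def flush(self):
        self._db.update(self._dirty)  # one bulk write to the storage
        self._dirty.clear()
```

Fast writes come at the cost named in the cons: anything still sitting in _dirty when the process dies never reaches the storage.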
-
Cache Eviction Policies
1) LRU Policy (Least Recently Used) - the most widely used policy
An LRU cache is implemented with a linked list (usually a doubly linked list plus a hash map for fast lookup), where the head points to the next item to be removed from the cache (evicted) and the tail points to the most recently used item. On a cache hit, the item is moved to the tail of the list; on a cache miss, the missing entry is inserted at the tail and becomes the most recently used item. When inserting at the tail overflows the capacity, the head element is removed from the cache (a minimal sketch follows the pros & cons below).
- Cons:
-- When many keys are requested in a short period of time or one after another, popular keys may be evicted in order to free space (this problem is solved by LFU) = false cache eviction
- Pros:
-- Very efficient
-- Faster & cheaper than LFU
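A minimal LRU sketch using Python's OrderedDict, which plays the role of the linked list (front = head, next to evict; back = tail, most recently used):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self._data = OrderedDict()
        self._capacity = capacity

    def get(self, key):
        if key not in self._data:
            return None                     # cache miss
        self._data.move_to_end(key)         # cache hit: move to the tail
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value             # new entries go to the tail
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict the head (least recent)
```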
2) LFU Policy (Least Frequently Used)
Very similar to LRU, but every key in the cache has a counter. On a hit, the entry's counter is incremented; on a miss, when space is needed, the key with the smallest counter (the least frequently used one) is evicted. This means that entries that are accessed often stay in the cache (a minimal sketch follows the pros & cons below).
- Pros:
-- Entries that are often used stay in the cache (less false eviction)
- Cons:
-- There is overhead to maintain counters
-- More expensive
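A minimal LFU sketch with a counter per key; eviction scans for the smallest counter, which is O(n) and fine for illustration (production LFU implementations use frequency buckets instead):

```python
class LFUCache:
    def __init__(self, capacity):
        self._data = {}
        self._counts = {}       # per-key access counters
        self._capacity = capacity

    def get(self, key):
        if key not in self._data:
            return None                     # cache miss
        self._counts[key] += 1              # cache hit: bump the counter
        return self._data[key]

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self._capacity:
            victim = min(self._counts, key=self._counts.get)  # smallest counter
            del self._data[victim]
            del self._counts[victim]
        self._data[key] = value
        self._counts[key] = self._counts.get(key, 0) + 1
```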
-
Products
- Redis
PROS:
-- Most popular choice for a distributed cache
-- In-memory key-value store
-- Supports additional data structures (strings, lists, sets, hashes)
-- Capacity limited only by RAM
-- Supports 100K+ requests per second on a single node
-- Keys support TTL (time to live)
-- Supports persistence (can save data to disk) => if it fails, it can recover from disk
-- By default it persists data once every second => if it crashes, the data from the last second may be lost
-- Extremely fast, high throughput
CONS:
-- Does NOT support JSON or nested data types
-- Added complexity
-- Should the app keep working without Redis if it crashes?
-- How many instances will you need if a node's RAM is limited (e.g., to 500GB)?
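A cache-aside sketch using Redis TTL keys via the redis-py client, assuming a local Redis server on the default port; load_user is a hypothetical stand-in for the slow storage lookup:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def load_user(user_id):
    return f"user-record-{user_id}"    # stand-in for a DB query

def get_user(user_id):
    key = f"user:{user_id}"
    value = r.get(key)
    if value is None:                  # miss: fetch and cache with a TTL
        value = load_user(user_id)
        r.set(key, value, ex=60)       # key expires after 60 seconds
    return value
```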
-
What is Caching?
A common technique used in many places (CPUs, browsers, etc.) to improve read times. It keeps data from slower storage (like disk) in faster storage (like memory) in order to speed up a system. For example, browsers store data on disk so they don't have to fetch it again over the network.
What is a Cache?
It is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere