Please enable JavaScript.
Coggle requires JavaScript to display documents.
How to scale databases? - Coggle Diagram
How to scale databases?
2. Sharding Methods
(Horizontal Scaling)
Key Based Sharding (Hash Function)
Pros
The most efficient and best one to prevent unevenly distributed data (such as geo location of users and stores on a global map).
Fast query time to find the correct partition node
Don't need to save a map to look up for the correct node that have the queried data stored.
Cons
Dynamically adding new node is tricky!
When you add a new node, all the original value that saved in that range of the hash function output will need to be rehashed. Because hash function is one-way, and for each node that's basically a HashSet that we probably won't be able to directly separate the data into two or multiple parts based on the location they are in the set.
Range Based Sharding
Pros
Easy to implement
Dynamic adding nodes is not a big issue
Cons
Hotspot issue: can easily cause uneven distributed request if the data is not uniformly distributed.
Directory Based Sharding
Pros
Simple and fit well for small data sets that need to scald
Easy dynamic adding nodes
Cons
Cannot be Scala really well when the data is really huge, because the need to save a map to find the nodes is causing high for performances
1. Non-Sharding Methods
Cheaper
Setting up a remote database
Implement cache
Expensive
Creating one or more reading replicas
Upgrading to a larger server (vertical-scaling)
3. Pros and Cons of Sharding
Pros
The most economic way to scale large amount of data storage infrastructure. Comparing to vertical-scaling, the horizontal-scaling with sharding can save a lot.
Speeding up the query response time by having multiple nodes to distribute the work load.
Application can be more reliable and prevent one single point of failure.
Cons
Harder to maintain than single node of DB
Implementation can be tricky and might cause data loss or errors
The revere sharding is hard, i.e. it is not easy to change the data back. Also the backup of old single node DB and new sharded DB are not compatible.
Some database engine simply doesn't support sharding.