Backend System Design

All terms

Scalability


Definition
Scalability is the capability of a system, process, or network to grow and manage increased demand.


Types: Vertical and Horizontal

Availability


Definition
Availability is the time a system remains operational to perform its required function in a specific period. It is a simple measure of the percentage of time that a system, service, or machine remains operational under normal conditions.


How to calculate
Availability (%) = (Actual operation time/Scheduled operation time) x 100%
Example - 99.99%
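The formula above can be sketched in a few lines (the uptime numbers are illustrative assumptions, not from the notes):

```python
def availability_pct(actual_uptime_hours, scheduled_hours):
    """Availability (%) = (actual operation time / scheduled operation time) x 100."""
    return actual_uptime_hours / scheduled_hours * 100

# A service scheduled for 720 hours in a month, down for ~4.3 minutes (0.072 h):
print(round(availability_pct(720 - 0.072, 720), 2))  # 99.99
```

99.99% ("four nines") corresponds to roughly 52 minutes of downtime per year.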

Reliability


Definition
Reliability is the probability that the system will meet certain performance standards, yielding correct output, for a desired time duration.


Comparison with Availability
A system can be available but not reliable. For example, availability is 99.99%, but the system is vulnerable to cyber-attacks.


How to calculate
MTBF(Mean Time Between Failures) = (total elapsed time – sum of downtime)/number of failures
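The MTBF formula, with illustrative numbers (assumed, not from the notes):

```python
def mtbf(total_elapsed_hours, total_downtime_hours, num_failures):
    """MTBF = (total elapsed time - sum of downtime) / number of failures."""
    return (total_elapsed_hours - total_downtime_hours) / num_failures

# 1,000 hours of scheduled operation, 10 hours of downtime across 4 failures:
print(mtbf(1000, 10, 4))  # 247.5 hours between failures on average
```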

Caching

Types


  1. Client caching


    Caches can be located on the client-side (OS or browser)


  2. CDN caching


  3. Web server caching


    Reverse proxies and caches such as Varnish can serve static and dynamic content directly. Web servers can also cache requests, returning responses without having to contact application servers.


  4. Database caching


  5. Application caching


    In-memory caches such as Memcached and Redis are key-value stores between your application and your data storage.

Cache Invalidation


  1. Cache-aside


    image


  2. Write-through


    image


  3. Write-behind (write-back)


    image
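The first two invalidation strategies above can be sketched with plain dicts standing in for a cache (e.g. Redis) and a database — a minimal illustration, not a production pattern:

```python
# Plain dicts stand in for the cache and the database.
cache, db = {}, {}

def read_cache_aside(key):
    # Cache-aside: the application checks the cache first and, on a miss,
    # loads the value from the database and populates the cache itself.
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write_through(key, value):
    # Write-through: every write goes to the cache and the database
    # synchronously, so the cache is never stale (at the cost of write latency).
    cache[key] = value
    db[key] = value

write_through("user:1", {"name": "Alice"})
print(read_cache_aside("user:1"))  # {'name': 'Alice'}
```

Write-behind differs only in that the database write is queued and applied asynchronously, trading durability for write latency.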

Databases

Types

SQL


A relational database (SQL type) is a collection of data items organized in tables.
🚩ACID is a set of properties of relational database transactions.


There are many techniques to scale a relational database:

  1. 🚩 master-slave replication
  2. 🚩 master-master replication
  3. 🚩 Horizontal Partitioning
  4. 🚩 Vertical Partitioning
  5. Denormalization
    Denormalization attempts to improve read performance at the expense of some write performance. Redundant copies of the data are written in multiple tables to avoid expensive joins. Some RDBMS such as PostgreSQL and Oracle support materialized views which handle the work of storing redundant information and keeping redundant copies consistent.
  6. SQL Tuning

Also, there is a popular solution, Vitess, that makes scaling relational databases simpler.

Availability and Consistency for Data

Data Partitioning

Partitioning Methods


  1. Horizontal Partitioning
    image
  2. Vertical Partitioning
    image
  3. Directory-Based Partitioning

Partitioning Criteria


  1. Key or Hash-based Partitioning


    Under this scheme, we apply a hash function to some key attributes of the entity we are storing; that yields the partition number.


  2. List partitioning


    In this scheme, each partition is assigned a list of values, so whenever we want to insert a new record, we will see which partition contains our key and then store it there.


  3. Round-robin partitioning


    This is a very simple strategy that ensures uniform data distribution. With 'n' partitions, the i-th tuple is assigned to partition (i mod n).


  4. Composite Partitioning


    Under this scheme, we combine any of the above partitioning schemes to devise a new scheme.


  5. Consistent Hashing Ring


    Key or hash-based partitioning has disadvantages: hashes can be distributed unevenly, and redistribution hurts performance badly, because adding or removing a partition forces almost all keys to be remapped.


    Consistent Hashing maps data to physical nodes and ensures that only a small set of keys move when servers are added or removed.
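The two simplest criteria above (hash-based and round-robin) can be sketched as follows; the partition count and keys are illustrative assumptions:

```python
import hashlib

N_PARTITIONS = 4

def hash_partition(key: str) -> int:
    # Key/hash-based: a hash of the key attribute yields the partition number.
    # md5 is used here only to get a stable, well-spread integer.
    # Weakness: changing N_PARTITIONS remaps almost every key -
    # exactly the problem consistent hashing addresses.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % N_PARTITIONS

def round_robin_partition(i: int) -> int:
    # Round-robin: the i-th tuple is assigned to partition (i mod n).
    return i % N_PARTITIONS

print([round_robin_partition(i) for i in range(6)])  # [0, 1, 2, 3, 0, 1]
print(hash_partition("user:42"))  # some partition in 0..3, stable per key
```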

Consistent Hashing details


General scheme
image


Virtual nodes
Vnodes help spread the load more evenly across the physical nodes of the cluster by dividing the hash ranges into smaller subranges.
image


Data replication using Consistent Hashing to ensure high availability and durability
image
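The ring, vnodes, and key lookup described above can be sketched like this (node and key names are illustrative; a real implementation would also handle replication along the ring):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hashing ring with virtual nodes (vnodes)."""

    def __init__(self, vnodes=100):
        self.vnodes = vnodes
        self.ring = []    # sorted hash positions of all vnodes
        self.nodes = {}   # hash position -> physical node

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        # Each physical node owns many small subranges of the ring,
        # which spreads the load more evenly (the vnode idea above).
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self.ring, h)
            self.nodes[h] = node

    def remove_node(self, node):
        # Only keys in this node's subranges move; all other keys stay put.
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            self.ring.remove(h)
            del self.nodes[h]

    def get_node(self, key):
        # A key belongs to the first vnode clockwise from its hash.
        h = self._hash(key)
        idx = bisect.bisect(self.ring, h) % len(self.ring)
        return self.nodes[self.ring[idx]]

ring = ConsistentHashRing()
for n in ("node-a", "node-b", "node-c"):
    ring.add_node(n)
print(ring.get_node("user:42"))  # one of node-a / node-b / node-c
```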

NoSQL (Not Only SQL)

Key-value store


Abstraction: hash table


A key-value store generally allows for O(1) reads and writes and is often backed by memory or SSD. Data stores can maintain keys in lexicographic order, allowing efficient retrieval of key ranges. Key-value stores can allow for storing of metadata with a value.


DB: Redis, Hazelcast, Riak, Voldemort, Memcached
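The two properties above — O(1) point reads/writes plus lexicographically ordered keys for range scans — can be sketched with a dict and a sorted key list (a toy model, not how any of the listed databases is implemented):

```python
import bisect

class OrderedKVStore:
    """Toy key-value store keeping keys in lexicographic order for range scans."""

    def __init__(self):
        self._keys = []   # sorted key list enables efficient range retrieval
        self._data = {}   # hash table gives O(1) point reads and writes

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def range(self, start, end):
        # Binary-search the sorted key list for the range boundaries.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]

kv = OrderedKVStore()
for k in ("user:1", "user:2", "order:7"):
    kv.put(k, k.upper())
print(kv.range("user:", "user:~"))  # [('user:1', 'USER:1'), ('user:2', 'USER:2')]
```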

Document store


Abstraction: key-value store with documents stored as values


A document store is centered around documents (XML, JSON, binary, etc), where a document stores all information for a given object. Document stores provide APIs or a query language to query based on the internal structure of the document itself. Note, many key-value stores include features for working with a value's metadata, blurring the lines between these two storage types. The data is generally semi-structured & stored in a JSON-like format.


DB: MongoDB, CouchDB, DynamoDB

Wide column store


Abstraction: nested map ColumnFamily<RowKey, Columns<ColKey, Value, Timestamp>>


A wide column store's basic unit of data is a column (name/value pair). Columns can be grouped into column families (analogous to a SQL table). Super column families further group column families. You can access each column independently with a row key, and columns with the same row key form a row. Each value contains a timestamp for versioning and for conflict resolution. This may seem similar to traditional relational databases, but rather than grouping columns together into tables, each column is stored in a separate file or region in the system's storage.
image


Hbase structure example:
image
ts - timestamp


DB: Hbase, Bigtable, Cassandra

Graph database


Abstraction: graph


In a graph database, each node is a record and each arc is a relationship between two nodes. Graph databases are optimized to represent complex relationships with many foreign keys or many-to-many relationships.
image

Time-Series databases


Time-Series databases are optimized for tracking & persisting time series data.
Time-series data is primarily used for running analytics, drawing conclusions, and making future business decisions based on the results of those analytics.


DB: InfluxDB, TimescaleDB, Prometheus

SQL or NoSQL


Reasons for SQL:

  • Structured data
  • Strict schema
  • Relational data
  • Need for complex joins
  • Transactions
  • Clear patterns for scaling
  • More established: developers, community, code, tools, etc
  • Lookups by index are very fast

Reasons for NoSQL:

  • Semi-structured data
  • Dynamic or flexible schema
  • Non-relational data
  • No need for complex joins
  • Store many TB (or PB) of data
  • Very data intensive workload
  • Very high throughput for IOPS

CAP and DB


image

Load Balancer

What is it?


A load balancer acts as the "traffic cop" sitting in front of your servers, routing client requests across all servers capable of fulfilling them. It maximizes speed and capacity utilization and ensures that no one server is overworked, which could degrade performance.
LB products: NGINX, NetScaler, etc.


image

Benefits


  • Reduced downtime
  • Scalable
  • Redundancy
  • Flexibility
  • Efficiency

Redundant Load Balancers
image

Load Balancing Algorithms


  • Round Robin – Requests are distributed across the group of servers sequentially.
  • Weighted Round Robin Method
  • Least Connections – A new request is sent to the server with the fewest current connections to clients. The relative computing capacity of each server is factored into determining which one has the least connections.
  • Least Response Time – Sends requests to the server selected by a formula that combines the fastest response time and fewest active connections.
  • Least Bandwidth Method - This method selects the server that is currently serving the least amount of traffic measured in megabits per second (Mbps).
  • Hash – Distributes requests based on a key you define, such as the client IP address or the request URL.
  • IP Hash – The IP address of the client is used to determine which server receives the request.
  • Random with Two Choices – Picks two servers at random and sends the request to the one chosen by then applying the Least Connections algorithm.
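Three of the algorithms above can be sketched in a few lines (server names are illustrative; `ip_hash` here is only stable within one process, since Python salts `hash()` per run):

```python
import itertools

class LoadBalancer:
    """Toy sketch of Round Robin, Least Connections, and IP Hash."""

    def __init__(self, servers):
        self.servers = servers
        self._rr = itertools.cycle(servers)          # round-robin cursor
        self.connections = {s: 0 for s in servers}   # active connection counts

    def round_robin(self):
        # Requests are distributed across the group of servers sequentially.
        return next(self._rr)

    def least_connections(self):
        # A new request goes to the server with the fewest current connections.
        return min(self.servers, key=lambda s: self.connections[s])

    def ip_hash(self, client_ip):
        # The same client IP is always routed to the same server.
        return self.servers[hash(client_ip) % len(self.servers)]

lb = LoadBalancer(["srv-a", "srv-b", "srv-c"])
print([lb.round_robin() for _ in range(4)])  # ['srv-a', 'srv-b', 'srv-c', 'srv-a']
lb.connections["srv-a"] = 5
print(lb.least_connections())  # 'srv-b' (fewest connections; ties broken by order)
```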

Additional benefits (could be expensive here)


  • SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations.
  • Session persistence - Issue cookies and route a specific client's requests to the same instance if the web apps do not keep track of sessions themselves

Types

Layer 4 load balancing


Layer 4 load balancers look at info at the transport layer to decide how to distribute requests. Generally, this involves the source and destination IP addresses and ports in the header, but not the contents of the packet. Layer 4 load balancers forward network packets to and from the upstream server, performing Network Address Translation (NAT).


Example: DNS Load Balancing
image

Layer 7 load balancing


Layer 7 load balancers look at the application layer to decide how to distribute requests. This can involve the contents of the header, the message, and cookies. A layer 7 load balancer terminates network traffic, reads the message, makes a load-balancing decision, then opens a connection to the selected server. For example, a layer 7 load balancer can direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers.
At the cost of flexibility, layer 4 load balancing requires less time and fewer computing resources than layer 7, although the performance impact can be minimal on modern commodity hardware.

Reverse Proxy

What is it?


A reverse proxy is a web server that centralizes internal services and provides unified interfaces to the public. Requests from clients are forwarded to a server that can fulfill them before the reverse proxy returns that server's response to the client.


image


Usually, the reverse proxy is located at the FrontEnd Server Host
image

Benefits


  • Increased security - Hide information about backend servers, blacklist IPs, limit number of connections per client
  • Increased scalability and flexibility - Clients only see the reverse proxy's IP, allowing you to scale servers or change their configuration
  • SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations
  • Compression - Compress server responses
  • Caching - Return the response for cached requests
  • Static content - Serve static content directly
    • HTML/CSS/JS
    • Photos
    • Videos
    • Etc

Load balancer vs reverse proxy


  • Deploying a load balancer is useful when you have multiple servers. Often, load balancers route traffic to a set of servers serving the same function.
  • Reverse proxies can be useful even with just one web server or application server, opening up the benefits described in the previous section.
  • Solutions such as NGINX and HAProxy can support both layer 7 reverse proxying and load balancing.

Asynchronism =>
Message Queue

Definition


A message queue is a queue that routes messages from the source to the destination (the sender to the receiver) following the FIFO (first in, first out) policy.


Message queues facilitate asynchronous behavior.


Message queues facilitate cross-module communication, which is key in service-oriented and microservices architecture. They enable communication in a heterogeneous environment, providing temporary storage for storing messages until they are processed and consumed by the consumer.
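The asynchronous, FIFO behavior described above can be sketched with Python's stdlib queue standing in for Kafka or RabbitMQ (a toy model — real brokers add persistence, acknowledgments, and delivery across hosts):

```python
import queue
import threading

mq = queue.Queue()   # FIFO: messages come out in the order they went in
processed = []

def consumer():
    # The consumer drains messages at its own pace, decoupled from the producer.
    while True:
        msg = mq.get()      # blocks until a message is available
        if msg is None:     # sentinel value used here to stop the consumer
            break
        processed.append(f"handled {msg}")

t = threading.Thread(target=consumer)
t.start()

# The producer returns immediately after enqueueing - asynchronous behavior.
for i in range(3):
    mq.put(f"notification-{i}")
mq.put(None)
t.join()
print(processed)
# ['handled notification-0', 'handled notification-1', 'handled notification-2']
```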

Popular implementations


Apache Kafka, Rabbit MQ

How MQ may improve the System


Let's consider the example with Notification Systems.



Pull-based approach
image
Disadvantages:

  • frequent polling of the database
  • it's not real-time

Push-based approach
image

ACID


Used for 🚩 SQL databases


  • Atomicity - Each transaction is all or nothing
  • Consistency - Any transaction will bring the database from one valid state to another
  • Isolation - Executing transactions concurrently has the same results as if the transactions were executed serially
  • Durability - Once a transaction has been committed, it will remain so

CAP


Used for 🚩 Availability and Consistency for Data


Consistency ( C ): All nodes see the same data at the same time. This means users can read or write from/to any node in the system and will receive the same data. It is equivalent to having a single up-to-date copy of the data.


Availability ( A ): Availability means every request received by a non-failing node in the system must result in a response. Even when severe network failures occur, every request must terminate.


Partition tolerance ( P ): A partition is a communication break (or a network failure) between any two nodes in the system, i.e., both nodes are up but cannot communicate with each other. A partition-tolerant system continues to operate even if there are partitions in the system.



In reality, network partitions cannot be avoided, so the theorem can really be stated as: in the presence of a network partition, a distributed system must choose either Consistency or Availability.


Example
image
image

BASE


BASE is often used to describe the properties of NoSQL databases. In comparison with the CAP Theorem, BASE chooses availability over consistency.


  • Basically available - the system guarantees availability.
  • Soft state - the state of the system may change over time, even without input.
  • Eventual consistency - the system will become consistent over a period of time, given that the system doesn't receive input during that period.

PACELC


image


The PACELC theorem states that in a system that replicates data:

  • if there is a partition (‘P’), a distributed system can tradeoff between availability and consistency (i.e., ‘A’ and ‘C’);
  • else (‘E’), when the system is running normally in the absence of partitions, the system can tradeoff between latency (‘L’) and consistency (‘C’).

High Availability for Services

What is it?


These are services/microservices that perform only business logic and are not responsible for data.

Ways

Redundancy

Definition


Redundancy is the duplication of critical components or functions of a system with the intention of increasing the reliability of the system, usually in the form of a backup or fail-safe, or to improve actual system performance.

Example


image

Replication

Definition


Replication means having a number of similar nodes running the workload together. There are no standby or passive instances. When a single or a few nodes go down, the remaining nodes bear the load of the service. Think of this as load balancing.

Example


image

High Availability Clustering

Definition


A High Availability cluster also known as the Fail-Over cluster contains a set of nodes running in conjunction with each other that ensures high availability of the service.
The nodes in the cluster are connected by a private network, called the Heartbeat network, that continuously monitors the health and status of each node in the cluster. The nodes in a cluster present a single service (not multiple).
A single state across all the nodes in a cluster is achieved with the help of shared distributed memory and a distributed coordination service like ZooKeeper.

Example


image

Definition


Data partitioning is a technique to break a big database (DB) or cache into many smaller parts. It is the process of splitting up a DB/table across multiple machines to improve the manageability, performance, availability, and load balancing of an application.

What is it?


Data is any information stored in a DB, cache, or file system.
Here we consider data distributed between different hosts.

Replication

Definition


Replication means sharing information to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

Types


Master-slave replication
image


Master-master replication
image

Data consistency between replicas


Leader-Follower
Allow only a single server (called the leader) to be responsible for data replication and to coordinate work.
image


Quorum
A quorum is the minimum number of servers on which a distributed operation needs to be performed successfully before declaring the operation’s overall success.
image
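The standard quorum sizing rule — with N replicas, W write acknowledgments and R read acknowledgments, the condition W + R > N guarantees that read and write sets overlap — can be checked in one line:

```python
def is_strongly_consistent(n, w, r):
    # W + R > N means every read quorum intersects every write quorum,
    # so a read always touches at least one replica with the latest write.
    return w + r > n

N = 3
print(is_strongly_consistent(N, w=2, r=2))  # True  - typical majority quorum
print(is_strongly_consistent(N, w=1, r=1))  # False - reads may miss the latest write
```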


Semi-sync
This ensures that a change has not only been applied locally on the database server, but also that at least one other server has received and persisted the change.

How to reach the right node?


Central registry


The Registry knows about all partitions and replicas. Any service asks the Registry for the address of the relevant node. A popular solution for the central registry is ZooKeeper.
image


Gossip protocol


Each node periodically exchanges its state information with a few random peers, so knowledge of the cluster eventually spreads to all nodes without a central registry.
image

Best practices


  1. There is a recommendation to interact with any DB through a special service that provides an agnostic API and hides the details. The service can also act as a cache layer, coordinator node (Zookeeper), and so on.
    Example
    Metadata Service
    image


  2. Also, there is a common recommendation to begin with SQL databases, because it's simpler and this topic is better researched.

Separation of Web Servers and Application Servers

General


Separating out the web layer from the application layer allows you to scale and configure both layers independently. Adding a new API results in adding application servers without necessarily adding additional web servers.
The single responsibility principle advocates for small and autonomous services that work together. Small teams with small services can plan more aggressively for rapid growth.


image

Other

Monitoring/Logging


TechStack: Elasticsearch/Logstash/Kibana
An example of how servers and monitoring/logging systems interact:
image

FrontEnd Server possible responsibilities


  • Request validation
  • Authentication/Authorization
  • TLS termination
  • Server-Side encryption
  • Caching
  • Rate limiting (throttling)
  • Request Dispatching
  • Request deduplication
  • Usage data collection

Requirements

Users/Customers

  • Who will use the System
  • How the system will be used

Scale (read and write)

  • How many read queries per second
  • How many write queries per second
  • How much data is queried per request
  • Can there be spikes in traffic
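A common way to answer these questions is a back-of-envelope estimate. All input numbers below are illustrative assumptions, not from the notes:

```python
# Hypothetical workload for a read-heavy service.
daily_active_users = 10_000_000
reads_per_user_per_day = 20
writes_per_user_per_day = 2
seconds_per_day = 86_400

read_qps = daily_active_users * reads_per_user_per_day / seconds_per_day
write_qps = daily_active_users * writes_per_user_per_day / seconds_per_day
peak_read_qps = read_qps * 3   # assume traffic spikes of ~3x the average

print(f"avg read QPS:  {read_qps:,.0f}")   # ~2,315
print(f"avg write QPS: {write_qps:,.0f}")  # ~231
print(f"peak read QPS: {peak_read_qps:,.0f}")
```

Estimates like these drive the choice of cache size, number of replicas, and partitioning strategy.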

Performance

  • What is expected write-to-read data delay
  • What is expected p99 latency for read queries

Cost

  • Should the design minimize the cost of development
  • Should the design minimize the cost of maintenance