Backend System Design

All terms

Scalability


Definition
Scalability is the capability of a system, process, or network to grow and manage increased demand.


Types: Vertical and Horizontal

Availability


Definition
Availability is the time a system remains operational to perform its required function in a specific period. It is a simple measure of the percentage of time that a system, service, or machine remains operational under normal conditions.


How to calculate
Availability (%) = (Actual operation time/Scheduled operation time) x 100%
Example - 99.99%
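The formula above can be sketched in a few lines (the uptime numbers are illustrative assumptions, not from the notes):

```python
def availability_pct(actual_uptime_hours, scheduled_hours):
    """Availability (%) = (actual operation time / scheduled operation time) x 100."""
    return actual_uptime_hours / scheduled_hours * 100

# A service scheduled for 720 hours in a month, down for ~4.3 minutes (0.072 h):
print(round(availability_pct(720 - 0.072, 720), 2))  # 99.99
```

99.99% ("four nines") corresponds to roughly 52 minutes of downtime per year.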

Reliability


Definition
Reliability is the probability that the system will meet certain performance standards, yielding correct output, for a desired time duration.


Comparison with Availability
A system can be available but not reliable. For example, availability is 99.99%, but the system is vulnerable to cyber-attacks.


How to calculate
MTBF(Mean Time Between Failures) = (total elapsed time – sum of downtime)/number of failures
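The MTBF formula, with illustrative numbers (assumed, not from the notes):

```python
def mtbf(total_elapsed_hours, total_downtime_hours, num_failures):
    """MTBF = (total elapsed time - sum of downtime) / number of failures."""
    return (total_elapsed_hours - total_downtime_hours) / num_failures

# 1,000 hours of scheduled operation, 10 hours of downtime across 4 failures:
print(mtbf(1000, 10, 4))  # 247.5 hours between failures on average
```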

Caching

Types


  1. Client caching


    Caches can be located on the client-side (OS or browser)


  2. CDN caching


  3. Web server caching


    Reverse proxies and caches such as Varnish can serve static and dynamic content directly. Web servers can also cache requests, returning responses without having to contact application servers.


  4. Database caching


  5. Application caching


    In-memory caches such as Memcached and Redis are key-value stores between your application and your data storage.

Cache Invalidation


  1. Cache-aside


    image


  2. Write-through


    image


  3. Write-behind (write-back)


    image
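The first two invalidation strategies above can be sketched with plain dicts standing in for a cache (e.g. Redis) and a database — a minimal illustration, not a production pattern:

```python
# Plain dicts stand in for the cache and the database.
cache, db = {}, {}

def read_cache_aside(key):
    # Cache-aside: the application checks the cache first and, on a miss,
    # loads the value from the database and populates the cache itself.
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write_through(key, value):
    # Write-through: every write goes to the cache and the database
    # synchronously, so the cache is never stale (at the cost of write latency).
    cache[key] = value
    db[key] = value

write_through("user:1", {"name": "Alice"})
print(read_cache_aside("user:1"))  # {'name': 'Alice'}
```

Write-behind differs only in that the database write is queued and applied asynchronously, trading durability for write latency.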

Databases

Types

SQL


A relational database (SQL type) is a collection of data items organized in tables.
🚩ACID is a set of properties of relational database transactions.


There are many techniques to scale a relational database:

  1. 🚩 master-slave replication
  2. 🚩 master-master replication
  3. 🚩 Horizontal Partitioning
  4. 🚩 Vertical Partitioning
  5. Denormalization
    Denormalization attempts to improve read performance at the expense of some write performance. Redundant copies of the data are written in multiple tables to avoid expensive joins. Some RDBMS such as PostgreSQL and Oracle support materialized views which handle the work of storing redundant information and keeping redundant copies consistent.
  6. SQL Tuning

Also, there is a popular solution, Vitess, that makes scaling relational databases simpler.

Availability and Consistency for Data

Data Partitioning

Partitioning Methods


  1. Horizontal Partitioning
    image
  2. Vertical Partitioning
    image
  3. Directory-Based Partitioning

Partitioning Criteria


  1. Key or Hash-based Partitioning


    Under this scheme, we apply a hash function to some key attributes of the entity we are storing; that yields the partition number.


  2. List partitioning


    In this scheme, each partition is assigned a list of values, so whenever we want to insert a new record, we will see which partition contains our key and then store it there.


  3. Round-robin partitioning


    This is a very simple strategy that ensures uniform data distribution. With 'n' partitions, the i-th tuple is assigned to partition (i mod n).


  4. Composite Partitioning


    Under this scheme, we combine any of the above partitioning schemes to devise a new scheme.


  5. Consistent Hashing Ring


    Key or hash-based partitioning has disadvantages: hashes can be distributed unevenly, and redistribution hurts performance badly, because adding or removing a partition forces almost all keys to be remapped.


    Consistent Hashing maps data to physical nodes and ensures that only a small set of keys move when servers are added or removed.
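The two simplest criteria above (hash-based and round-robin) can be sketched as follows; the partition count and keys are illustrative assumptions:

```python
import hashlib

N_PARTITIONS = 4

def hash_partition(key: str) -> int:
    # Key/hash-based: a hash of the key attribute yields the partition number.
    # md5 is used here only to get a stable, well-spread integer.
    # Weakness: changing N_PARTITIONS remaps almost every key -
    # exactly the problem consistent hashing addresses.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % N_PARTITIONS

def round_robin_partition(i: int) -> int:
    # Round-robin: the i-th tuple is assigned to partition (i mod n).
    return i % N_PARTITIONS

print([round_robin_partition(i) for i in range(6)])  # [0, 1, 2, 3, 0, 1]
print(hash_partition("user:42"))  # some partition in 0..3, stable per key
```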

Consistent Hashing details


General scheme
image


Virtual nodes
Vnodes help spread the load more evenly across the physical nodes of the cluster by dividing the hash ranges into smaller subranges.
image


Data replication using Consistent Hashing to ensure high availability and durability
image
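The ring, vnodes, and key lookup described above can be sketched like this (node and key names are illustrative; a real implementation would also handle replication along the ring):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hashing ring with virtual nodes (vnodes)."""

    def __init__(self, vnodes=100):
        self.vnodes = vnodes
        self.ring = []    # sorted hash positions of all vnodes
        self.nodes = {}   # hash position -> physical node

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        # Each physical node owns many small subranges of the ring,
        # which spreads the load more evenly (the vnode idea above).
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self.ring, h)
            self.nodes[h] = node

    def remove_node(self, node):
        # Only keys in this node's subranges move; all other keys stay put.
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            self.ring.remove(h)
            del self.nodes[h]

    def get_node(self, key):
        # A key belongs to the first vnode clockwise from its hash.
        h = self._hash(key)
        idx = bisect.bisect(self.ring, h) % len(self.ring)
        return self.nodes[self.ring[idx]]

ring = ConsistentHashRing()
for n in ("node-a", "node-b", "node-c"):
    ring.add_node(n)
print(ring.get_node("user:42"))  # one of node-a / node-b / node-c
```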

NoSQL (Not Only SQL)

Key-value store


Abstraction: hash table


A key-value store generally allows for O(1) reads and writes and is often backed by memory or SSD. Data stores can maintain keys in lexicographic order, allowing efficient retrieval of key ranges. Key-value stores can allow for storing of metadata with a value.


DB: Redis, Hazelcast, Riak, Voldemort, Memcached
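The two properties above — O(1) point reads/writes plus lexicographically ordered keys for range scans — can be sketched with a dict and a sorted key list (a toy model, not how any of the listed databases is implemented):

```python
import bisect

class OrderedKVStore:
    """Toy key-value store keeping keys in lexicographic order for range scans."""

    def __init__(self):
        self._keys = []   # sorted key list enables efficient range retrieval
        self._data = {}   # hash table gives O(1) point reads and writes

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def range(self, start, end):
        # Binary-search the sorted key list for the range boundaries.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]

kv = OrderedKVStore()
for k in ("user:1", "user:2", "order:7"):
    kv.put(k, k.upper())
print(kv.range("user:", "user:~"))  # [('user:1', 'USER:1'), ('user:2', 'USER:2')]
```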

Document store


Abstraction: key-value store with documents stored as values


A document store is centered around documents (XML, JSON, binary, etc), where a document stores all information for a given object. Document stores provide APIs or a query language to query based on the internal structure of the document itself. Note, many key-value stores include features for working with a value's metadata, blurring the lines between these two storage types. The data is generally semi-structured & stored in a JSON-like format.


DB: MongoDB, CouchDB, DynamoDB

Wide column store


Abstraction: nested map ColumnFamily<RowKey, Columns<ColKey, Value, Timestamp>>


A wide column store's basic unit of data is a column (name/value pair). Columns can be grouped into column families (analogous to a SQL table). Super column families further group column families. You can access each column independently with a row key, and columns with the same row key form a row. Each value contains a timestamp for versioning and for conflict resolution. This may seem similar to traditional relational databases, but rather than grouping columns together into tables, each column is stored in a separate file or region in the system's storage.
image


Hbase structure example:
image
ts - timestamp


DB: Hbase, Bigtable, Cassandra

Graph database


Abstraction: graph


In a graph database, each node is a record and each arc is a relationship between two nodes. Graph databases are optimized to represent complex relationships with many foreign keys or many-to-many relationships.
image

Time-Series databases


Time-Series databases are optimized for tracking & persisting time series data.
Time-series data is primarily used for running analytics, drawing conclusions, and making future business decisions based on the results of those analytics.


DB: InfluxDB, TimescaleDB, Prometheus

SQL or NoSQL


Reasons for SQL:

  • Structured data
  • Strict schema
  • Relational data
  • Need for complex joins
  • Transactions
  • Clear patterns for scaling
  • More established: developers, community, code, tools, etc
  • Lookups by index are very fast

Reasons for NoSQL:

  • Semi-structured data
  • Dynamic or flexible schema
  • Non-relational data
  • No need for complex joins
  • Store many TB (or PB) of data
  • Very data intensive workload
  • Very high throughput for IOPS

CAP and DB


image

Load Balancer

What is it?


A load balancer acts as the "traffic cop" sitting in front of your servers, routing client requests across all servers capable of fulfilling them. It maximizes speed and capacity utilization and ensures that no one server is overworked, which could degrade performance.
LB products: NGINX, NetScaler, etc.


image

Benefits


  • Reduced downtime
  • Scalable
  • Redundancy
  • Flexibility
  • Efficiency

Redundant Load Balancers
image

Load Balancing Algorithms


  • Round Robin – Requests are distributed across the group of servers sequentially.
  • Weighted Round Robin Method
  • Least Connections – A new request is sent to the server with the fewest current connections to clients. The relative computing capacity of each server is factored into determining which one has the least connections.
  • Least Response Time – Sends requests to the server selected by a formula that combines the fastest response time and fewest active connections.
  • Least Bandwidth Method - This method selects the server that is currently serving the least amount of traffic measured in megabits per second (Mbps).
  • Hash – Distributes requests based on a key you define, such as the client IP address or the request URL.
  • IP Hash – The IP address of the client is used to determine which server receives the request.
  • Random with Two Choices – Picks two servers at random and sends the request to the one chosen by then applying the Least Connections algorithm.
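Three of the algorithms above can be sketched in a few lines (server names are illustrative; `ip_hash` here is only stable within one process, since Python salts `hash()` per run):

```python
import itertools

class LoadBalancer:
    """Toy sketch of Round Robin, Least Connections, and IP Hash."""

    def __init__(self, servers):
        self.servers = servers
        self._rr = itertools.cycle(servers)          # round-robin cursor
        self.connections = {s: 0 for s in servers}   # active connection counts

    def round_robin(self):
        # Requests are distributed across the group of servers sequentially.
        return next(self._rr)

    def least_connections(self):
        # A new request goes to the server with the fewest current connections.
        return min(self.servers, key=lambda s: self.connections[s])

    def ip_hash(self, client_ip):
        # The same client IP is always routed to the same server.
        return self.servers[hash(client_ip) % len(self.servers)]

lb = LoadBalancer(["srv-a", "srv-b", "srv-c"])
print([lb.round_robin() for _ in range(4)])  # ['srv-a', 'srv-b', 'srv-c', 'srv-a']
lb.connections["srv-a"] = 5
print(lb.least_connections())  # 'srv-b' (fewest connections; ties broken by order)
```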

Additional benefits (could be expensive here)


  • SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations.
  • Session persistence - Issue cookies and route a specific client's requests to the same instance if the web apps do not keep track of sessions themselves

Types

Layer 4 load balancing


Layer 4 load balancers look at info at the transport layer to decide how to distribute requests. Generally, this involves the source and destination IP addresses and ports in the header, but not the contents of the packet. Layer 4 load balancers forward network packets to and from the upstream server, performing Network Address Translation (NAT).


Example: DNS Load Balancing
image

Layer 7 load balancing


Layer 7 load balancers look at the application layer to decide how to distribute requests. This can involve the contents of the header, the message, and cookies. A layer 7 load balancer terminates network traffic, reads the message, makes a load-balancing decision, then opens a connection to the selected server. For example, a layer 7 load balancer can direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers.
At the cost of flexibility, layer 4 load balancing requires less time and fewer computing resources than layer 7, although the performance impact can be minimal on modern commodity hardware.

Reverse Proxy

What is it?


A reverse proxy is a web server that centralizes internal services and provides unified interfaces to the public. Requests from clients are forwarded to a server that can fulfill them before the reverse proxy returns that server's response to the client.


image


Usually, the reverse proxy is located at the FrontEnd Server Host
image

Benefits


  • Increased security - Hide information about backend servers, blacklist IPs, limit number of connections per client
  • Increased scalability and flexibility - Clients only see the reverse proxy's IP, allowing you to scale servers or change their configuration
  • SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations
  • Compression - Compress server responses
  • Caching - Return the response for cached requests
  • Static content - Serve static content directly
    • HTML/CSS/JS
    • Photos
    • Videos
    • Etc

Load balancer vs reverse proxy


  • Deploying a load balancer is useful when you have multiple servers. Often, load balancers route traffic to a set of servers serving the same function.
  • Reverse proxies can be useful even with just one web server or application server, opening up the benefits described in the previous section.
  • Solutions such as NGINX and HAProxy can support both layer 7 reverse proxying and load balancing.

Asynchronism =>
Message Queue

Definition


A message queue is a queue that routes messages from the source to the destination (the sender to the receiver) following the FIFO (first in, first out) policy.


Message queues facilitate asynchronous behavior.


Message queues facilitate cross-module communication, which is key in service-oriented and microservices architecture. They enable communication in a heterogeneous environment, providing temporary storage for storing messages until they are processed and consumed by the consumer.
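The asynchronous, FIFO behavior described above can be sketched with Python's stdlib queue standing in for Kafka or RabbitMQ (a toy model — real brokers add persistence, acknowledgments, and delivery across hosts):

```python
import queue
import threading

mq = queue.Queue()   # FIFO: messages come out in the order they went in
processed = []

def consumer():
    # The consumer drains messages at its own pace, decoupled from the producer.
    while True:
        msg = mq.get()      # blocks until a message is available
        if msg is None:     # sentinel value used here to stop the consumer
            break
        processed.append(f"handled {msg}")

t = threading.Thread(target=consumer)
t.start()

# The producer returns immediately after enqueueing - asynchronous behavior.
for i in range(3):
    mq.put(f"notification-{i}")
mq.put(None)
t.join()
print(processed)
# ['handled notification-0', 'handled notification-1', 'handled notification-2']
```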

Popular implementations


Apache Kafka, Rabbit MQ

How MQ may improve the System


Let's consider the example with Notification Systems.



Pull-based approach
image
Disadvantages:

  • frequent polling of the database
  • it's not real-time

Push-based approach
image

ACID


Used for 🚩 SQL databases


  • Atomicity - Each transaction is all or nothing
  • Consistency - Any transaction will bring the database from one valid state to another
  • Isolation - Executing transactions concurrently has the same results as if the transactions were executed serially
  • Durability - Once a transaction has been committed, it will remain so

CAP


Used for 🚩 Availability and Consistency for Data


Consistency ( C ): All nodes see the same data at the same time. This means users can read or write from/to any node in the system and will receive the same data. It is equivalent to having a single up-to-date copy of the data.


Availability ( A ): Availability means every request received by a non-failing node in the system must result in a response. Even when severe network failures occur, every request must terminate.


Partition tolerance ( P ): A partition is a communication break (or a network failure) between any two nodes in the system, i.e., both nodes are up but cannot communicate with each other. A partition-tolerant system continues to operate even if there are partitions in the system.



In reality, network partitions cannot be avoided, so the theorem can really be stated as: in the presence of a network partition, a distributed system must choose either Consistency or Availability.


Example
image
image

BASE


BASE is often used to describe the properties of NoSQL databases. In comparison with the CAP Theorem, BASE chooses availability over consistency.


  • Basically available - the system guarantees availability.
  • Soft state - the state of the system may change over time, even without input.
  • Eventual consistency - the system will become consistent over a period of time, given that the system doesn't receive input during that period.

PACELC


image


The PACELC theorem states that in a system that replicates data:

  • if there is a partition (‘P’), a distributed system can tradeoff between availability and consistency (i.e., ‘A’ and ‘C’);
  • else (‘E’), when the system is running normally in the absence of partitions, the system can tradeoff between latency (‘L’) and consistency (‘C’).

High Availability for Services

What is it?


These are services/microservices that perform only business logic and are not responsible for data.

Ways

Redundancy

Definition


Redundancy is the duplication of critical components or functions of a system with the intention of increasing the reliability of the system, usually in the form of a backup or fail-safe, or to improve actual system performance.

Example


image

Replication

Definition


Replication means having a number of similar nodes running the workload together. There are no standby or passive instances. When a single or a few nodes go down, the remaining nodes bear the load of the service. Think of this as load balancing.

Example


image

High Availability Clustering

Definition


A High Availability cluster also known as the Fail-Over cluster contains a set of nodes running in conjunction with each other that ensures high availability of the service.
The nodes in the cluster are connected by a private network, called the Heartbeat network, that continuously monitors the health and status of each node in the cluster. The nodes in a cluster present a single service (not multiple).
A single state across all the nodes in a cluster is achieved with the help of shared distributed memory and a distributed coordination service like ZooKeeper.

Example


image

Definition


Data partitioning is a technique to break a big database (DB) or cache into many smaller parts. It is the process of splitting up a DB/table across multiple machines to improve the manageability, performance, availability, and load balancing of an application.

What is it?


Data is any information stored in a DB, cache, or file system.
Here we consider data distributed between different hosts.

Replication

Definition


Replication means sharing information to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

Types


Master-slave replication
image


Master-master replication
image

Data consistency between replicas


Leader-Follower
Allow only a single server (called the leader) to be responsible for data replication and to coordinate work.
image


Quorum
A quorum is the minimum number of servers on which a distributed operation needs to be performed successfully before declaring the operation’s overall success.
image
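The standard quorum sizing rule — with N replicas, W write acknowledgments and R read acknowledgments, the condition W + R > N guarantees that read and write sets overlap — can be checked in one line:

```python
def is_strongly_consistent(n, w, r):
    # W + R > N means every read quorum intersects every write quorum,
    # so a read always touches at least one replica with the latest write.
    return w + r > n

N = 3
print(is_strongly_consistent(N, w=2, r=2))  # True  - typical majority quorum
print(is_strongly_consistent(N, w=1, r=1))  # False - reads may miss the latest write
```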


Semi-sync
This ensures that a change has not only been applied locally on the database server, but also that at least one other server has received and persisted the change.

How to reach the right node?


Central registry


The Registry knows about all partitions and replicas. Any service asks the Registry for the address of the relevant node. A popular solution for the central registry is ZooKeeper.
image


Gossip protocol


Each node periodically exchanges its state information with a few random peers, so knowledge of the cluster eventually spreads to all nodes without a central registry.
image

Best practices


  1. There is a recommendation to interact with any DB through a special service that provides an agnostic API and hides the details. The service can also act as a cache layer, coordinator node (Zookeeper), and so on.
    Example
    Metadata Service
    image


  2. Also, there is a common recommendation to begin with SQL databases, because it's simpler and this topic is better researched.

Separation of Web Servers and Application Servers

General


Separating out the web layer from the application layer allows you to scale and configure both layers independently. Adding a new API results in adding application servers without necessarily adding additional web servers.
The single responsibility principle advocates for small and autonomous services that work together. Small teams with small services can plan more aggressively for rapid growth.


image

Other

Monitoring/Logging


TechStack: Elasticsearch/Logstash/Kibana
An example of how servers and monitoring/logging systems interact:
image

FrontEnd Server possible responsibilities


  • Request validation
  • Authentication/Authorization
  • TLS termination
  • Server-Side encryption
  • Caching
  • Rate limiting (throttling)
  • Request Dispatching
  • Request deduplication
  • Usage data collection

Requirements

Users/Customers

  • Who will use the System
  • How the system will be used

Scale (read and write)

  • How many read queries per second
  • How many write queries per second
  • How much data is queried per request
  • Can there be spikes in traffic
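A common way to answer these questions is a back-of-envelope estimate. All input numbers below are illustrative assumptions, not from the notes:

```python
# Hypothetical workload for a read-heavy service.
daily_active_users = 10_000_000
reads_per_user_per_day = 20
writes_per_user_per_day = 2
seconds_per_day = 86_400

read_qps = daily_active_users * reads_per_user_per_day / seconds_per_day
write_qps = daily_active_users * writes_per_user_per_day / seconds_per_day
peak_read_qps = read_qps * 3   # assume traffic spikes of ~3x the average

print(f"avg read QPS:  {read_qps:,.0f}")   # ~2,315
print(f"avg write QPS: {write_qps:,.0f}")  # ~231
print(f"peak read QPS: {peak_read_qps:,.0f}")
```

Estimates like these drive the choice of cache size, number of replicas, and partitioning strategy.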

Performance

  • What is expected write-to-read data delay
  • What is expected p99 latency for read queries

Cost

  • Should the design minimize the cost of development
  • Should the design minimize the cost of maintenance