Please enable JavaScript.
Coggle requires JavaScript to display documents.
Networks and Distributed Systems - Coggle Diagram
Networks and Distributed Systems
Networks
Security
Basic Principles
Confidentiality
Restricting access to authorised users
Nonrepudiation
Confirming that something happended
Integrity
Stopping other people modifying messages
Authentication
Are you who you say you are?
Obscurity
Is not a type of security.
At best it slows attacker down.
It gives a false sense of security.
Shannon's Maxim
"The enemy knows the system"
Kerckhoff's Principle
"a cryptosystem should be secure even if everything about the system, except the key, is public knowledge"
Possible Faults
Bad software
The software
you
write
The software
your suppliers
write
(supply chain attack)
Humans
Social Engineering
Phishing
Dumpster Diving
Attacker Profiles
Persona
Criminals
Corporate Actors
Governments/state-sponsored actors
Terrorists
Activists
Ex-exployeeys
Ex-lovers
Curious, bored or mischievous people
Motivation
Money
Espionage
Disruption
Propoganda
Revenge
To see if it can be done
For a laugh
Security Mechanisms
Invalid
(Not security mechanisms)
NAT (Network address translation)
Valid
(Is a security mechanism)
Firewalls
A network device that sits of the edge of a network
Inspects each packet passing through it and matches them to a set of rules.
Packets that meet the rules are passed
Packets that don't meet the rules are dropped or rejected
Operates at the Network Layer (IP & port numbers)
Types of Firewalls
Stateful Firewalls
Modern firewalls = stateful
Inspect TCP/IP header fields and keep track of connections
Have rules for related and established
e.g. allow outbound traffic to port 80 but block inbound traffic from port 80 unless it is related to existing connection
A common rule set of home firewalls is:
Deny all inbound unless related/established
Allow all outbound
Stateless Firewalls
Limited, e.g. a rule to allow web traffic to any web server would allow an attacker to do anything from port 80 inbound
Application Layer Firewalls
Firewalls that can act at the application layer
a.k.a. "Layer 7" firewalls or Application Layer Gateways
Inspect the contents of the packet to make sure that the packet that looks like HTTP on port 80 is actually HTTP
Doesn't necessarily work with encrypted traffic
Network Intrusion Detection
Detect people getting into your network or legitimate users doing malicious things
(N)IDS
Monitors a network looking for malicious activity or policy violations
Can be signature based or use anomaly-based detection
Can be used to monitor and alert only, or it may be able to take action to prevent the instrusion
Physical Security
If an attacker has physical access to a network, it is much more vulnverable
e.g. a visitor plugging into a corporate network and having access to everything
e.g. someone splicing into an external connection and sniffing/injecting traffic
Not always possible, therefore a technological solution is required
Network Access Control
Restricting what unauthorised devices have access to
MAC Address FIltering
Is the simplest "solution", but it's easy to circumvent and an administrative overhead
Port-based Network Access Control
(802.1x)
Uses the Extensible Authentication Protocol (EAP) to authenticate clients.
Blocks the switch/prot wireless client until they authenticate successfully
Involves three parties
A supplicant (client)
An authenticator (switch or AP)
An authentication server (e.g. LDAP using RADIUS)
Reply from the authentication server can include details of the VLAN that a client is put on.
e.g. students could be placed on one VLAN, staff on another, and authorised guests completely separately.
IPSEC
Details of IPSEC beyond scope of sources
Tacked on in IPv4, easier in IPv6
Essentially encryption at the Network Layer to prevent sniffage
Virtual Private Networking (VPN)
Simulates a direct connection between devices
Two main types
Site-to-Site VPN
Securely connects two networks
e.g. a branch to head office
Remote-Access VPN
Individual clients connect to a VPN server
e.g. Globalprotect
VPNs may be routed (common) or bridged (uncommon)
Common contemporary implementations use SSL, TLS, or IPSEC. Also IKE, L2TP or PPTP
Topic: Wireless
Wireless connections are inherently less secure that a wired connection as anyone in range can listen in or participate
Wireless Encryption is used to restrict access and ensure data integrity
Types of Wireless Encryption
Wired Equivalent Privacy (WEP)
The original method of encryption for 802.11 Wiresless Networks
Proved to be insecure in early 2000s
In 2005, the FBI demonstrated that a busy WEP protected network could be cracked in 3 minutes
In 2007, additional techniques for generating traffic allowed a quiet network to be cracked with a 50% chance in < 1 minute
Wi-fi Protected Access (WPA)
WPA
Introduced in 2003
An intermediate solution to the weakness of WEP
Uses Temporal Key Integrity Protocol
WPA2
Introduced in 2004
move secure & complex than WPA
Uses AES as encryption algorithm
WPA3
Introduced in 2018
Same encryption 'strength' as WPA2
Improves security of initial key exchange
WPA-Personal
Uses pre-shared keys
WPA-Enterprise
Uses 802.1x authentication
WPA Pre-Shared Key (PSK) Vulnerabilities
Weak auto-generation of ISP-set default keys
Wi-Fi Project Setup (WPS)
Intended to make it easier to connect to a WPA-protected network
User enters an 8-digit pin or presses a button on the AP for initial connection
BUT pin can be brute-forced quickly
Prevailing recommendation is to disable WPA
note from self: surely you mean [WPS]
Attacks
KRACK - October 2017
Key Reinstallation AtttACK
Replay attack that resets a number used during authentication
KrOOk - August 2019
Vulnerability in some WiFi chip-sets
About 1bn devices affected
Fixed by software update
Distributed Systems
Distributed System Models
Physical Models
Component, computer and network based model
Architectural Models
Component and relationship based model
Validated against: Availability, Performance, Adaptability, Security
Architectural Elements
The building blocks of a distributed system
Communicating Entities
Which entities are communicating?
System-oriented perspective
Lower level, what is running?
Processes
Threads
Problem-oriented perspective
Higher level can I abstract what is running?
Objects
Components
Web services
Communication Paradigms
How do these entities communicate?
Inter-process communication
Network Level
Message passing
Sockets
Remote Invocation
Communication protocols based on a two-way exchange
Request-reply protocols (e.g. HTTP)
Remote procedure calls (RPC)
Remote method invocation (RMI)
Limitations of inter-process and remote invocation
Space Coupling
Senders need to know how to reach receivers (i.e. destination endpoints)
Time Coupling
Senders and receivers need to be up and running at the same time
Indirect communication
Time and space decoupling between senders and receivers
Group communication
Publish-subscribe systems
Message queues
Tuple spaces
Distributed shared memory
Roles and Responsibilities
What roles/responsibilities do they have?
Client-Server
Clients send requests to servers and receive responses back
Clients enabled to access shared resources managed by server
Peer-to-peer
All entities involved play similar roles
Clients themselves can join the p2p network
Enables decentralised approaches
Placement
How are they mapped on to the physical distributed infrastructure?
Impacts: performance, reliability, security.
Mapping of services to multiple servers
Applications replicated across servers
Data can be replicated or partitioned
Replication
Pros:
Fault tolerance
load balancing
Cons:
Need to ensure data consistency
Partitioning
Pros:
No overhead for data consistency
Cons:
No real fault tolerance
Caching
Store recently used data objects closer to the client(s)
Caches can also be co-located with clients (e.g. web browser)
Web proxy servers can also be used to access web servers through a firewall
Architectural Patterns
Layering
Vertical organisation of services into service layers
Each higher provides a SW abstraction and hides the details of lower layers
Contains middleware that aims to mask heterogeneity and provide suitable programming abstraction
Tiering
Complementary to layering
Can organise functionality of a given layer and place this functionality into servers
Fundamental Models
All dist. systems share a number of properties
Processes communicate,
interacts
affect performance
Processes, machines and networks
fail
, affecting reliability
WIdely distributed, affecting its cyber
security
Interaction Model
Defines the different way processes interact
Performance of communication channels
Latency
From start of transmission to start of reception, measured in seconds
Bandwidth
Total amount of information that can be transmitted in a given time
Jitter
Variation in the time it takers to deliver a series of messages
Computer Clocks and Timing Events
Drift
Clocks update their counting register with different frequencies, hence they deviate over time
Drift Rate
Time difference for time unit with respect to an ideal clock
Synchronous vs Asynchronous Systems
Synchronous Distributed Systems
Assumes known upper and lower bounces on processing time, network latency and clock drift rate
Simpler design
Useful when starting out
Asynchronous Distributed Systems
Assumes not bound on process execution speed, message transmission time and clock drift rate
Harder to design
Models the internet and real systems much better than synchronous assumptions
Any solution that is valid for an async dist system is also valid for a sync one
Failure Model
Defines the possible ways a process or communication channel can fail
Types of Failure
Arbitrary Failures
Timing Failures
Process Omission Failues
Methods of deling with failure
Masking
e.g process replication to tolerate process omission failures
e.g. retransmission to tolerate channel omission failures
Conversion
e.g. use checksum to detect corrupted messages and discard them; arbitrary failure converted to channel omission failure
Security Model
Goal is to protect objects and secure processes and their interactions
Threat Model
The enemy
Threats to processes
Identity Spoofing
Malicious code
Threats to communication channels
Copy Messages
Alter Messages
Inject messages
Cryptography
Encrypt exchanged and stored data to preserve confidentiality
Authentication
Based on cryptography and message digests; digital signatures
Prove the identity of sender and receiver
Secure Channels
Authenticated parties
Provide integrity and privacy of transmitted data
Causality and Time
It is important to know the
relative order
in which events take place, or will take place, in the system.
Clock Synchronisation
Types
Mutual (or internal) synchronisation of clocks in a dist. system
As the title says, useful if an application requires a consistent view of time across all nodes of the system
External Clock Synchronisation
Using an external time source (typically the Coordinated Universal Time, UTC) to configure computers with real-time clocks.
Algorithms
Centralised Algorithms
Uses a single node, the
time server node
which is regarded as the correct time and used to update all other nodes to the time of the time server node
Passive Time Server Centralised Algorithms
Cristian's Algorithm
Time of the node is updated using the time the server node responds with + the time taken to request and respond / 2
Active Time Server Centralised Algorithms
Berkeley Algorithm
Use one master node to collect the times of all the other nodes, finds the average of those, and the sends the adjustment each node needs to make with their individual clock to me consistent with the average
Drawbacks
Unreliable
due to single point of failure
Not scalable
as too much burden on single time server
Distributed Algorithms
Each node's clock is independently synchronised with real time :
Network Time Protocol
Allows clients across the internet to be accurately synchronised to Universal Time
Tree shaped network with multiple roots
Each root connected directly to a time source
Each layer in the network synchronises with the previous one
Logical Time
Lamport Clock
Uses a monotonically increasing count as the timer
If a -> b (a happened before b) then Count(a) <= Count(b)
If C(a) <= C(b) then we cannot be sure a -> b
Vector Clock
Designed to detect causality
Failure
Types
Omission
Process/channel fails to perform intended actions
Arbitrary (Byzantine)
Wrong process/channel behaviour
Timing
(In synchronous systems)
Failure Detectors
Service that processes queries regarding process failure
Unreliable Failure Detectors
Returns hints:
unsuspected
/
suspected
Implemented by most practical systems
Reliable Failure Detectors
Return
unsuspected
(hint) /
failure
(guarantee)
Difficult to implement in practice
Critical Sections
A section of code where processes access shared resources and perform write operations
ME1 [Safety]
At most, one process can remain in CS at any time (Mutual Exclusion)
ME2 [Freedom from deadlock]
In every configuration, at least one process must be eligible to take an action and enter its CS (Another safety property)
ME3 [Progress]
Every process trying to enter its CS must eventually success (Liveness property)
Violation of this property is known as
livelock
or
starvation
Deadlock
No process is able to do steps to make a progress, thus not exit
Livelock
Each process is able to do steps but without making any progress
Correctness
Safety
At most one process executes in critical section at any time
Liveness
: Requests to enter/exit critical section eventually succeed
Fairness
Access to critical section in
happened-before ordering
of requests
Performance
Minimise number of messages sent in each entry/exit operation
Minimise
client delay
when entering/exiting critical section
Minimise
synchronisation delay
(time between one process exiting and another entering critical section)
Algorithms
Central Server Algorithm
Token Based
Ring-Based Algorithm
Token based
Multicast and Logical Clock Algorithm
Non token based
Ricart and Agrawala's Algorithm
Non token based
Maekawa's Algorithm
Quorum based
Leader Election
Election called after coordinator failure detected
Participant
A process that is engaged in an election
Correctness
Safety
Each participant has either elected no-one or the participant with the largest identifier that has not crashed
Liveness
All processes participate and eventually elected someone, or crash
Performance
Minimise total number of messages sent
Minimise
turnaround time
number of serialised message transmissions in a single run
Algorithms
Ring-based Algorithm
Assumes processes arranged in logical ring where each process knows its successor
Assumes asynchronous system
Messages:
election
,
coordinator
Bully Election Algorithm
Assumes synchronous system
Each process knows all other processes and can communicate with them
Messages:
election
,
answer
,
coordinator
Uses timeouts to detect failure
Group Communication
Types
Closed
When the size of the group remains unchanged
Open
Where membership changes from time to time
Peer-to-Peer
Where all members are equal, and each member communicates with its peers to sustain the group's activities
Hierarchical
Where one member is distinguished from the rest and communication typically takes place between this distinguished member and the rest of the group
Mutlicast
Assumptions
Group membership is known
Reliable one-to-one communication
Processes only fail by crashing
Basic Multicast
(B-multicast)
Operations
deliver(m)
delivers the message
m
to the calling process
multicast(g,m)
sends message
m
to all members of group
g
Guarantees
A correct process will eventually deliver the message
Only group members receive messages sent to group
Atomic Multicast
A multicast is
atomic
when the messages is received either by every functioning group member, or by no member at all.
Reliable Multicast
(R-multicast)
All processes in a group must receive copies of the messages sent to the group
Correctness
Integrity
A correct process delivers a message at most once and only group members receive messages sent to the group
Validity
If a correct process multicasts message
m
, it will eventually deliver
m
(ensures liveness for the sender)
Agreement
If a correct process delivers a message
m
, all other correct processes in the group will eventually deliver
m
(all or none)
Tolerates process crashes
Ordered Multicast
All messages sent to the group are delivered in the same order by each receiver
Ordering Types
FIFO Order
If a correct process sends message m then message m' to the same group, then any correct process delivers m before m'
Causal Order
If multicast(g,m) -> {causes} multicast(g,m') then any correct process deliver m before m'
implies FIFO order
Total Order
If a correct process delivers m before m', then any correct process delivers m before m'
does not imply FIFO or causal order
Implementing FIFO
Sender sends sequences numbers, receivers accept in order of sequence numbers
Implementing Total Ordering
Sequencer Process
A designated member of the group acts as sequencer, and from that point essentially FIFI sequence implementation
Distributed Agreement
Uses a 2-phase (commit) protocol, sending a timestamped message, receiving time stamped agreements. Then sends commit messages
Consensus
Consensus Problem
A group of processes must agree on a value, after one or more proposes a value
Assumptions
N processes
reliable communication
synchronous or asynchronous system
must tolerate process failures:
crash failures
byzantine failures (e.g. processes can lie)
Fault process: exhibits some specific types of fault; all other processes are correct
Requirements
Termination
Eventually each correct process sets its value
Agreement
All correct processes must decide on same value
Integrity
If all correct processes propose the same value, then any correct process that has decided must choose that value
Solutions
RTO Multicast
If only crash failures are allowed...
Solving consensus is equivalent to solving RTO-multicast
And vice versa
No solution for consensus == no solution for RTO-multicast
This is not a guaranteed solution in asynchronous systems
Summary
Assuming crash failures only, consensus in
synchronous
systems possible
Assuming arbitrary failures, consensus in
synchronous
systems only possible if N >= 3f + 1
Consensus in
asynchronous
systems impossible even with only one crash failure
Imperfect but still very good practical solutions exist
Replication and Consistency Models
Replication
Strategies
Prioritise Consistency (CP)
Need to synchronise replicas to achieve consistency, high cost
Optimistic Replication, Prioritise Availability (AP)
Possible conflicts, resolving them also has a cost
CAP Theorem
At most two of:
Consistency
Availability
Partition Tolerance
Can be achieved
Availability
The system can still be accessed despite failures
Partition Tolerance
The system still continues two function even if some of its nodes stop communicating together
Consistency
Any client's view of the system should be the same as any other's
Types
Passive Replication
Clients communicate with a single replica,
the primary
, and one or more replicas are used as backup copies.
Active Replication
Clients communicate with a group of k servers (also called replicas)
Consistency Models
Types
Weak Consistency Models
Used by AP replication approaches
No global ordering, more relaxed guarantees on consistency
Strong Consistency Models
Used by CP replication approaches
Require global ordering of updates
When updates are committed, replicas need to reach agreement on global ordering
Different models, depending on guarantees on total ordering
Models
Strict (Strong) Consistency
Requires that regardless of the number of replicas of a shared variable
x
, every process receives a response that is consistent with the real time (difficult to implement)
Linearizability
Ordering of operations is determined by time
Slightly weaker version of strict consistency
Each trace can interleave the individual reads and writes into a single total order that respects the local ordering of the reads and writes of every process
Sequential Consistency
Ordering of operations is determined by program order
Slightly weaker than above 2
Much more widely used
Requires that:
if a write x := u precedes another write x := v in a trace,
then no process reading x will read x = v before x = u
Eventual Consistency
Requires that after an unspecified time after an update, if there are no more updates, eventually all accesses will return the same value
Used when performance or availability is a high priority and temporary inconsistency is tolerable
conflicts need to be reconciled which needs consensus
Strong Eventual Consistency
Requires the use of a Conflict-Free Replicated Data Type which has a pre-defined set of operations than cannot cause conflicts
Guarantees
Eventual Delivery
Update executed at some correct replica eventually executes at all correct replicas
Termination
All update executions terminate
Strong Convergence
Correct replicas that have executed the same update have equivalent states (rather than eventually have equivalent states)
Approaches
Operation-based
On update, send operation only
On receipt, apply operation
Requires operations commute with each other
Needs recipe for reconciling non-commutative operations
State-based
On update, send full state
One receipt, merge with local state
Causal Consistency
All causally related writes must be seen by every process in the same order
Values returned by reads must be consistent with the causal order
Writes that are not causally related can be seen in any order
Quorum-based protocols
A quorum is a partial set of replicas intended to improve availability
There should be a
read quorum
of replicas and a
write quorum
of replicas that intersect
The idea is then to read a file, the client needs to assemble a read quorum of R servers
And to write to a file, the client needs to assemble a write quorum W of servers
Trace
A sequence of read (R) and write (W) operations on a shared variable x
Remote Invocation
Request-Reply Protocols
Supports roles and message exchanges in typical client-server interactions
Usually synchronous
Can be reliable
Can use UDP instead of TCP as acknowledgements are redundant since requests are followed by replies
Message identifiers
Request ID
Sender ID
Failure Model
Process Crashes
Channel omission failures
No order guarantee
Timeouts
Return an error code/exception
Retransmit
Spin Offs:
Request (R)
Request-Reply-Acknowledge (RRA)
History
Used to store previous replies so the operation doesn't have to be performed again
Remote Procedure Call (RPC)
Improving transparency by calling procedures on remote machines just as if they were local procedures
The underlying RPC system hides:
The encoding and decoding of parameters and results
The passing of messages
The preserving of the required semantics for the procedure call
Design Issues
RPC Call Semantics
Maybe Semantics
No fault tolerance measures applied
Susceptible to omission and crash failures
At-least-once Semantics
Retransmission of request messages
Crash failures and arbitrary failures
Arbitrary can occur when the server executes the request more than once, possibly storing/returning the wrong values
At-most-once Semantics
Retransmission of request messages, duplicate filtering, retransmit reply
Crash failures only
Exactly-Once Semantics
This is what is used for local procedure calls
Programming with Interfaces
Cannot directly access server variables
Parameter passing mechanisms: call-by-value vs call-by-reference
Addresses cannot be used as input/output arguments of remote calls
Transparency
ie things that can break the transparency that RPC tries to create
Remote procedure calls are more vulnerable to failure than local ones
Latency is several orders of magnitude greater than a local one
RPC does not offer call by reference
Remote Method Invocation (RMI)
Calling a method on a potentially remote object
Commonalities
between RMI and RPC
They use interfaces
They're built on top of request-reply protocols
They aim for transparency
How RMI
differs
from RPC
RMI can use OOP
All objects have unique object references and can be passed as parameters
Object Models
Conventional Object Model
Distributed Object Model
Objects physically distributed into different processes or computers in a DS
Could have Objects managed by servers with clients that invoke their methods using RMI
Could then copy a remote object locally to improve access performance
Can mask heterogeneity
Actions upon method invocation
The state of invoked object may change
Further invocations on methods in other objects may take place
A new object may be instantiated
Web Services
Definitions
Web Servers
A server accessible over the internet that provides resources to web browsers via the HTTP protocol
Web Services
Application-specific client interacts with a remote service via a functionally specialised interface over the internet
Communication Paradigms
Asynchronous communication
For time-consuming operations, replies can be received later on
E.g. processing of a booking
Synchronous communication
For interactive, fast operations, i.e. request reply
E.g. booking processing status enquiries
Loose Coupling
Goal
Minimise dependencies between different web services to avoid changing one web service requiring the update of other services
How is it achieved?
Focusing on interfaces rather than implementation
Designing simple and generic interfaces
Opting for asynchronous rather than synchronous communication
SOAP Protocol
Simple Object Access Protocol
Async/sync interactions over the internet
Defines how to use XML to represent the content of messages
Defines how messages are exchanged
RESTful Web Services
Representational State Transfer (REST)
Operations
Simple minimum uniform set of operations
GET
PUT
POST
DELETE
Focuses on resources rather than the operations allowed on them
Each resource can be accessed through a specific, unique URL
Statefullness (or lack of)
Be Stateless
i.e getPage=2 rather than getNextPage
getNextPage requires a state that know current page
Resource Representation
XML
JSON
Peer-to-peer
Description
Many hosts contribute data and computation
Peers are spread over the internet
Peers can leave or fail
Computational power, memory, disk, and bandwidth increase with the number of peers
Challenges
Churn
The independent arrival and departure by thousands or millions of peers
How to distribute data and computation
How to access distributed data/computation
How to cope with failures and high level of
churn
Case Studies
Napster
First working, large-scale application of p2p paradigm (1999)
Famous for music exchange in MP3 format
Up to millions of users simultaneously download/uploading music files
Shut down for copyright infringement in 2001
BitTorrent
Well-known p2p system for file sharing
Enables sharing large files (e.g. movies)
Principal Design Features
Files split into chunks
Chunks are replicated over peers
A "client" peer can download different chunks from different "server" peers in parallel