Architectures of Distributed Systems

Architectural Models

The architecture of a system is its structure in terms of separately specified components and their interrelationships.

Goal:

ensure that the structure will meet present and likely future demands on it.
make the system reliable, manageable, adaptable and cost-effective.

Architectural Elements

what is communicating and how those entities communicate together define a rich design space for the distributed systems developer to consider

the entities that communicate in a distributed system are typically processes, leading to the prevailing view of a distributed system as processes coupled with appropriate interprocess communication paradigms

communication paradigms:

interprocess communication: relatively low-level support for communication between processes in distributed systems, including message-passing primitives, direct access to the API offered by Internet protocols (socket programming) and support for multicast communication

remote invocation: the most common communication paradigm in distributed systems, covering a range of techniques based on a two-way exchange between communicating entities and resulting in the calling of a remote operation, procedure or method

indirect communication: communication through an intermediary rather than directly between sender and receiver

Request-reply protocols

Request-reply protocols are effectively a pattern imposed on an underlying message-passing service to support client-server computing

pairwise exchange of messages from client to server and then from server back to client, with the first message containing an encoding of the operation to be executed at the server and also an array of bytes holding associated arguments and the second message containing any results of the operation, again encoded as an array of bytes
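
The exchange described above can be sketched as follows. This is a minimal illustration, not a real protocol: the wire format (a 2-byte length prefix, the operation name, then JSON-encoded arguments), the `OPERATIONS` table and all function names are assumptions made for the example, and the "network" is a direct function call.

```python
import json
import struct

# Hypothetical operations offered by the server (illustrative names only).
OPERATIONS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def encode_request(operation: str, *args) -> bytes:
    """First message: the operation to execute plus its arguments, as bytes."""
    op = operation.encode("utf-8")
    payload = json.dumps(args).encode("utf-8")
    # Assumed layout: 2-byte length of the operation name, the name, the arguments.
    return struct.pack("!H", len(op)) + op + payload

def handle_request(message: bytes) -> bytes:
    """Server side: decode the request, run the operation, encode the result."""
    (op_len,) = struct.unpack("!H", message[:2])
    operation = message[2:2 + op_len].decode("utf-8")
    args = json.loads(message[2 + op_len:].decode("utf-8"))
    result = OPERATIONS[operation](*args)
    return json.dumps(result).encode("utf-8")

def decode_reply(message: bytes):
    """Client side: decode the second message, which carries the result."""
    return json.loads(message.decode("utf-8"))

request = encode_request("add", 2, 3)   # client -> server
reply = handle_request(request)         # server -> client
print(decode_reply(reply))
```

A real request-reply protocol would also need request identifiers, timeouts and retransmission, which this sketch leaves out.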

Remote procedure calls (RPC)

attributed to Birrell and Nelson [1984]

procedures in processes on remote computers can be called as if they are procedures in the local address space. The underlying RPC system then hides important aspects of distribution, including the encoding and decoding of parameters and results, the passing of messages and the preserving of the required semantics for the procedure call

supports client-server computing with servers offering a set of operations through a service interface and clients calling these operations directly as if they were available locally

offer (at a minimum) access and location transparency.
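
A rough sketch of how a client stub can hide the encoding of parameters and the message passing behind an ordinary-looking call. `CalculatorService`, `ClientStub` and `server_dispatch` are hypothetical names, JSON stands in for a real marshalling format, and the transport is a direct function call rather than a network connection.

```python
import json

class CalculatorService:
    """Hypothetical service interface offered by the server."""

    def add(self, a, b):
        return a + b

    def divide(self, a, b):
        return a / b

def server_dispatch(service, request: bytes) -> bytes:
    """Server side: unmarshal the call, invoke the operation, marshal the result."""
    call = json.loads(request.decode("utf-8"))
    method = getattr(service, call["method"])
    return json.dumps({"result": method(*call["args"])}).encode("utf-8")

class ClientStub:
    """Client side: makes remote operations look like local methods."""

    def __init__(self, transport):
        self._transport = transport  # callable taking request bytes, returning reply bytes

    def __getattr__(self, name):
        # Called only for attributes not found normally, i.e. the remote operations.
        def remote_call(*args):
            request = json.dumps({"method": name, "args": args}).encode("utf-8")
            reply = self._transport(request)
            return json.loads(reply.decode("utf-8"))["result"]
        return remote_call

service = CalculatorService()
# A real transport would send the bytes to another machine; here it is a local call.
calculator = ClientStub(lambda request: server_dispatch(service, request))
print(calculator.add(2, 3))
```

The caller writes `calculator.add(2, 3)` without knowing where the operation runs, which is the access and location transparency mentioned above. Failure handling and call semantics are not modelled.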

Remote method invocation (RMI)

resembles remote procedure calls but in a world of distributed objects

a calling object can invoke a method in a remote object

As with RPC, the underlying details are generally hidden from the user

supporting object identity and the associated ability to pass object identifiers as parameters in remote calls

benefit more generally from tighter integration into object-oriented languages

key techniques for indirect communication:

Group communication: Group communication is concerned with the delivery of messages to a set of recipients and hence is a multiparty communication paradigm supporting one-to-many communication.

Publish-subscribe systems: Many systems, such as financial trading systems, can be classified as information-dissemination systems, wherein a large number of producers (or publishers) distribute information items of interest (events) to a similarly large number of consumers (or subscribers). These are also called distributed event-based systems.
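
A minimal in-process sketch of the publish-subscribe idea. `EventBroker` and the topic name `stock/ACME` are hypothetical, and topic-based matching is only one possible subscription model; a real system would run the broker as a distributed service.

```python
from collections import defaultdict

class EventBroker:
    """Hypothetical broker standing in for a publish-subscribe service."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver the event to every subscriber of the topic; return the count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)

broker = EventBroker()
trader_a, trader_b = [], []          # two subscribers that simply record events
broker.subscribe("stock/ACME", trader_a.append)
broker.subscribe("stock/ACME", trader_b.append)

# The publisher does not know who, or how many, the subscribers are.
delivered = broker.publish("stock/ACME", {"price": 101.5})
print(delivered, trader_a, trader_b)
```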

Message queues: Whereas publish-subscribe systems offer a one-to-many style of communication, message queues offer a point-to-point service whereby producer processes can send messages to a specified queue and consumer processes can receive messages from the queue or be notified of the arrival of new messages in the queue. Queues therefore offer an indirection between the producer and consumer processes.
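
The point-to-point behaviour can be illustrated with a small sketch. `QueueManager` and the queue name `orders` are hypothetical, and Python's standard `queue.Queue` stands in for a real message-queuing service; notification of arrivals is not shown.

```python
import queue

class QueueManager:
    """Hypothetical stand-in for a message-queue service with named queues."""

    def __init__(self):
        self._queues = {}

    def _queue(self, name):
        # Create the named queue on first use.
        return self._queues.setdefault(name, queue.Queue())

    def send(self, name, message):
        """Producer side: append a message to the specified queue."""
        self._queue(name).put(message)

    def receive(self, name):
        """Consumer side: remove and return the oldest message, or None if empty."""
        try:
            return self._queue(name).get_nowait()
        except queue.Empty:
            return None

manager = QueueManager()
manager.send("orders", "order-1")
manager.send("orders", "order-2")

first = manager.receive("orders")    # taken by one consumer
second = manager.receive("orders")   # taken by a different consumer
empty = manager.receive("orders")    # nothing left to deliver
print(first, second, empty)
```

Unlike the publish-subscribe case, each message is consumed by exactly one receiver, and the producer only needs to know the queue, not the consumer.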

Tuple spaces: Tuple spaces offer a further indirect communication service by supporting a model whereby processes can place arbitrary items of structured data, called tuples, in a persistent tuple space and other processes can either read or remove such tuples from the tuple space by specifying patterns of interest. Since the tuple space is persistent, readers and writers do not need to exist at the same time.
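
A small sketch of pattern-based access to a tuple space. The class and method names (`TupleSpace`, `write`, `read`, `take`) and the use of `None` as a wildcard are assumptions for illustration; the operations here return `None` when nothing matches, whereas tuple space systems commonly block until a match appears.

```python
class TupleSpace:
    """Hypothetical in-memory tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []

    def write(self, tup):
        """Place a tuple in the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern):
        """Remove and return the first matching tuple."""
        for index, tup in enumerate(self._tuples):
            if self._matches(pattern, tup):
                return self._tuples.pop(index)
        return None

space = TupleSpace()
# The writer may finish long before any reader runs.
space.write(("temperature", "room1", 21))
space.write(("temperature", "room2", 19))

print(space.read(("temperature", "room2", None)))
print(space.take(("temperature", None, None)))
```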

Distributed shared memory: Distributed shared memory (DSM) systems provide an abstraction for sharing data between processes that do not share physical memory.

Architecture Styles

Organize into logically different components, and subsequently distribute those components over the various machines

(a) Layered style is used for client-server system;
(b) object-based style for distributed object systems.

Decoupling processes in space (“anonymous”) and also time (“asynchronous”) has led to alternative styles:

(a) Publish/subscribe and
(b) Shared data space (a combination of event-based and data-centered architectures)

System Architecture

Centralized Architectures

Simple/basic Client–Server Model

Characteristics:
• There are processes offering services (servers)
• There are processes that use services (clients)
• Clients and servers can be distributed across different machines
• Clients follow request-reply model/behavior with respect to using services

Multi-Tiered Architectures (Physical Distribution)

• Single-tiered: dumb terminal/mainframe configuration
• Two-tiered: client/single server configuration
• Three-tiered: each layer on a separate machine

Application Layering (logical view)

Traditional three-layered view:
• User-interface layer contains units for an application’s user interface
• Processing layer contains the functions of an application, i.e. without specific data
• Data layer contains the data that a client wants to manipulate through the application components

Also referred to as vertical distribution: placing logically different components on different machines

• Fat Client: Most of the application’s code resides on the client side
• Fat Server: Most of the application’s code resides on the server side

2-Tier Application

• Two-tier applications remain the most common client/server architecture.
• The entire application is decomposed into two sets of services.
• The client combines UI services and business services; the server provides the data services.

3-Tier Application

• It decomposes an application into three sets of services: UI, business, and data.
• Business logic is moved to an application server.
• Shared data is moved to a database server.
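
The three sets of services can be sketched as three classes that only talk to the layer directly below them. The banking scenario and the names `DataService`, `BusinessService` and `UserInterface` are hypothetical; in a real three-tier deployment each class would run on its own machine and the calls between them would be remote.

```python
class DataService:
    """Data tier: owns the shared data (a dict standing in for a database server)."""

    def __init__(self):
        self._balances = {"alice": 100}

    def get_balance(self, account):
        return self._balances[account]

    def set_balance(self, account, amount):
        self._balances[account] = amount

class BusinessService:
    """Business tier: application rules, no presentation and no storage details."""

    def __init__(self, data):
        self._data = data

    def withdraw(self, account, amount):
        balance = self._data.get_balance(account)
        if amount <= 0 or amount > balance:
            raise ValueError("invalid withdrawal")
        self._data.set_balance(account, balance - amount)
        return balance - amount

class UserInterface:
    """UI tier: turns user actions into business calls and results into text."""

    def __init__(self, business):
        self._business = business

    def handle_withdraw(self, account, amount):
        try:
            return f"New balance: {self._business.withdraw(account, amount)}"
        except ValueError as error:
            return f"Error: {error}"

ui = UserInterface(BusinessService(DataService()))
print(ui.handle_withdraw("alice", 30))
print(ui.handle_withdraw("alice", 500))
```

In the two-tier variant, `UserInterface` and `BusinessService` would both live on the client, with only `DataService` on the server.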

Decentralized Architectures

Peer-to-peer system

• A horizontal distribution where a client or server may be physically split up into logically equivalent parts, but each part is operating on its own share of the complete data set, thus balancing the load.
• Interaction between processes is symmetric: each process will act as a client and a server at the same time (acting as a servant)
• All of the processes involved in a task or activity play similar roles, interacting cooperatively as peers
• No distinction between client and server processes or the computers that they run on

Aim of P2P:
• Exploit the resources (data and hardware) in a large number of participating computers for the fulfillment of a given task or activity

Application

• Composed of a large number of peer processes running on separate computers
• Pattern of communication between them depends on application requirements

An individual computer holds only a small part of the application database and the storage, processing and communication loads are distributed across many computers and network links.
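
One way to picture each peer holding only its share of the data is hash-based placement, sketched below. The `Peer` class, the helper names and the simple modulo placement are assumptions for illustration; structured peer-to-peer systems generally use more elaborate schemes that cope with peers joining and leaving.

```python
import hashlib

class Peer:
    """Hypothetical peer that stores only the items it is responsible for."""

    def __init__(self, name):
        self.name = name
        self.store = {}

def responsible_peer(peers, key):
    """Map a key to a peer with a stable hash (assumed placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return peers[int.from_bytes(digest[:8], "big") % len(peers)]

def put(peers, key, value):
    responsible_peer(peers, key).store[key] = value

def get(peers, key):
    return responsible_peer(peers, key).store.get(key)

peers = [Peer(f"peer{i}") for i in range(4)]
for n in range(100):
    put(peers, f"file{n}", n)

# Each peer ends up with part of the data set; lookups go to the responsible peer.
print({peer.name: len(peer.store) for peer in peers})
print(get(peers, "file42"))
```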

Superpeers

Sometimes it helps to select a few nodes to do specific work; such nodes are called superpeers.

Examples:
• Peers maintaining an index (for search)
• Peers monitoring the state of the network
• Peers being able to set up connections

Hybrid Architectures

• Client-server architectures combined with decentralized architecture (peer-to-peer solutions)

Example: Edge-server architectures, which are often used for Content Delivery Networks (example: YouTube)

Example: Collaborative Distributed Systems - Combining a P2P download protocol with a client-server architecture for controlling the downloads: BitTorrent

Basic idea: Once a node has identified where to download a file from, it joins a swarm of downloaders who in parallel get file chunks from the source, but also distribute these chunks amongst each other.
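
The basic idea can be illustrated with a toy, round-based simulation. This is not the BitTorrent protocol: the function `simulate_swarm`, the one-chunk-per-round rule and the chunk selection policy are simplifying assumptions, and tracker lookup and incentive mechanisms are ignored.

```python
def simulate_swarm(num_chunks: int, num_downloaders: int) -> dict:
    """Toy model: every downloader fetches one missing chunk per round."""
    source = set(range(num_chunks))                     # the source holds every chunk
    downloaders = [set() for _ in range(num_downloaders)]
    from_source = from_peers = rounds = 0

    while any(have != source for have in downloaders):
        rounds += 1
        # What each downloader held when the round started.
        snapshot = [set(have) for have in downloaders]
        for i, have in enumerate(downloaders):
            missing = sorted(source - snapshot[i])
            if not missing:
                continue
            held_by_peers = set().union(
                *(chunks for j, chunks in enumerate(snapshot) if j != i)
            )
            available = [chunk for chunk in missing if chunk in held_by_peers]
            if available:
                have.add(available[0])                  # exchanged within the swarm
                from_peers += 1
            else:
                # Offset by i so downloaders ask the source for different chunks.
                have.add(missing[i % len(missing)])
                from_source += 1

    return {
        "rounds": rounds,
        "from_source": from_source,
        "from_peers": from_peers,
        "complete": all(have == source for have in downloaders),
    }

stats = simulate_swarm(num_chunks=4, num_downloaders=2)
print(stats)
```

In this model part of the transfers is served by other downloaders instead of the source, which is the load-sharing effect the swarm is meant to achieve.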