SWEN439 (Main Goals is to maintain a database which has the following…
SWEN439
Modern Databases
Cloud Databases (70%)
CAP (Consistency, Availability, Partition Tolerance)
BASE properties (Basically Available, Soft state, Eventually consistent): the opposite of ACID (Atomicity, Consistency, Isolation, Durability), the four main properties of transactional relational SQL databases.
B means basically available: in most cases, when you connect over the internet you will get your DB, but not always.
E means eventually consistent: if you stop updating the DB, after a while it will be consistent, but at the very moment you query it, it might be inconsistent.
Scalability: the performance of a service does not depend on the number of users and clients requesting the service.
A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.
If the system fails when the workload or the quantity of hardware added increases, it does not scale.
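One way to make "proportionally to the capacity added" concrete is to compare the speedup with the growth in hardware. The sketch below is only an illustration: the function name `scaling_efficiency` and the throughput figures are assumptions, not part of the course material.

```python
def scaling_efficiency(perf_before, nodes_before, perf_after, nodes_after):
    """Ratio of performance gain to capacity gain (1.0 = perfectly proportional)."""
    speedup = perf_after / perf_before          # how much faster the system got
    capacity_growth = nodes_after / nodes_before  # how much hardware was added
    return speedup / capacity_growth


# Hypothetical measurement: doubling from 2 to 4 nodes raised throughput
# from 1000 to 1900 requests per second.
efficiency = scaling_efficiency(1000, 2, 1900, 4)
print(f"scaling efficiency: {efficiency:.2f}")
```

A value close to 1.0 suggests the system scales; a value that drops sharply as nodes are added suggests it does not.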
Replication
Replication also needs versioning. With replication, update anomalies happen: replicas can hold several different values for the same attribute, and the DBMS cannot tell which one is correct.
Versioning helps to keep our replicas up to date, so that the latest data can be identified.
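A minimal sketch of how a version number can resolve conflicting replica values. It assumes a simple per-attribute counter (the `Versioned` class and the sample values are hypothetical); real systems often use timestamps or vector clocks instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    version: int  # assumed counter, incremented on every update


def latest(replicas):
    """Return the replica copy carrying the highest version number."""
    return max(replicas, key=lambda r: r.version)


# Three replicas disagree about the same attribute.
replicas = [
    Versioned("alice@old.example", 1),
    Versioned("alice@new.example", 3),
    Versioned("alice@mid.example", 2),
]
winner = latest(replicas)
print(winner.value, winner.version)
```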
Problems
In a networked and replicated database system, where a client may issue a read or update request to any node, nodes have to communicate in order to:
– Propagate the client's updates to all replicas
Very often, nodes communicate using a gossip protocol
The term epidemic protocol is sometimes used as a synonym for a gossip protocol
The core of the protocol involves periodic, pairwise, inter-node interactions
There is some form of randomness in the peer selection
Peers may be selected from the full set of nodes or from a smaller set of nodes (nodes hosting a replica of the same data)
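The gossip behaviour described above can be sketched as a small simulation. This is an assumption-laden illustration, not any particular database's protocol: each node holds a single version number, and in every round each node picks one random peer and both keep the newer version.

```python
import random


def gossip_round(values, rng):
    """One round: every node contacts one random peer; both keep the newer version."""
    nodes = list(values)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])  # random peer selection
        newest = max(values[node], values[peer])
        values[node] = values[peer] = newest


def rounds_until_converged(n_nodes, seed=0, max_rounds=1000):
    """Count gossip rounds until all replicas hold the same version."""
    rng = random.Random(seed)
    values = {i: 0 for i in range(n_nodes)}
    values[0] = 1  # node 0 receives the client's update (version 1)
    rounds = 0
    while len(set(values.values())) > 1 and rounds < max_rounds:
        gossip_round(values, rng)
        rounds += 1
    return rounds, values


rounds, values = rounds_until_converged(16)
print(f"all 16 replicas updated after {rounds} round(s)")
```

Until the last round finishes, a read from a node that has not yet heard the update returns the old version, which is the eventually consistent behaviour in practice.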
Horizontal Scaling: important to cloud databases because of its cost effectiveness
– A horizontally scalable cloud database can be run on cheaper commodity hardware
– As the number of users and the amount of data grow and more performance is required, more cheap nodes are added, and data and workload are distributed to the new nodes
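A rough sketch of distributing data across nodes and spreading the load when nodes are added. It assumes simple hash-modulo placement and made-up node names; this scheme moves many keys whenever the node count changes, which is why real systems usually prefer consistent hashing.

```python
import hashlib
from collections import Counter


def node_for(key, nodes):
    """Pick a node for a key using a stable hash (assumed placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def distribute(keys, nodes):
    """Count how many keys land on each node."""
    return Counter(node_for(k, nodes) for k in keys)


keys = [f"user-{i}" for i in range(10_000)]
before = distribute(keys, ["node-a", "node-b"])
after = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])
print("2 nodes:", dict(before))
print("4 nodes:", dict(after))
```

With four nodes each one holds roughly half as many keys as before, so the per-node workload drops as cheap nodes are added.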
What is cloud computing
Cloud Computing Services
Software as a Service (SaaS)
Software applications are provided to users by cloud providers, so users do not have to install them at their sites
Platform as a Service (PaaS)
A hardware or software platform (a computer, an operating system, a programming environment, run-time libraries,… ) is provided to users
Infrastructure as a Service (IaaS)
A service where users can use expensive hardware like array processor servers and network processors
Database as a Service (DaaS)
Cloud storage service where users hire storage facilities, including a DBMS, and pay only for the storage space they use
Cloud computing is based on a subscription model that is very similar to utility services like electricity, gas or water: we only pay for the service that we are using.
Users do not have to invest in any major hardware or software; they access and use them on the cloud and pay only for the resources they use
Availability: a service is available if it is always ready to use. For example, a service is highly available if it has low latency.
Cloud databases are distributed database systems that are accessed via cloud services. Most database services offer web-based consoles, which the end user can use to provision and configure database instances.
Database services consist of a database manager (DBMS) component, which controls the underlying database instances using a service API
The service API is exposed to the end users, and permits users to perform maintenance and scaling operations on their database instances
Two Architectures
Shared Nothing
where each node contains a database partition and has sole responsibility for the data it holds
Reads and writes involving data on a single node are efficient
Reads and writes involving data on multiple nodes are inefficient, since joins and constraints must be validated over multiple nodes
Nodes can be added or removed easily without affecting other nodes, making the architecture scalable
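To illustrate why single-node operations are cheap and multi-node ones are not, here is a toy shared-nothing store. The class `ShardedStore` and its integer-key placement rule are assumptions made for the example; each method also reports how many nodes it had to contact.

```python
class ShardedStore:
    """Toy shared-nothing store: each node owns one partition exclusively."""

    def __init__(self, n_nodes):
        self.partitions = [dict() for _ in range(n_nodes)]

    def _index(self, key):
        return key % len(self.partitions)  # assumed placement rule for integer keys

    def put(self, key, value):
        self.partitions[self._index(key)][key] = value

    def get(self, key):
        """Single-node read: returns (value, nodes contacted)."""
        return self.partitions[self._index(key)].get(key), 1

    def total(self):
        """Multi-node read: an aggregate over all data must visit every node."""
        result = sum(sum(p.values()) for p in self.partitions)
        return result, len(self.partitions)


store = ShardedStore(4)
for i in range(8):
    store.put(i, i * 10)
print(store.get(5))    # one node contacted
print(store.total())   # every node contacted
```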
Shared Disk
where all nodes have access to shared disks containing all database data, and all nodes share responsibility for the sole copy of the database
All nodes have access to the entire database (no middleware required)
Locking (for consistency) and logging (for recovery) introduce overheads, so it is hard to scale
Machine Layout
Depending on its performance, each physical node (PN) contains a number of virtual nodes
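A minimal sketch of the idea that a more powerful physical node hosts more virtual nodes. The capacity scores, the `vnodes_per_unit` parameter and the naming scheme are all hypothetical.

```python
def assign_virtual_nodes(capacities, vnodes_per_unit=4):
    """Map each physical node to a list of virtual nodes, proportional to its capacity.

    capacities: dict of physical node name -> relative performance score (assumed unit).
    """
    layout = {}
    for pn, capacity in capacities.items():
        layout[pn] = [f"{pn}-v{i}" for i in range(capacity * vnodes_per_unit)]
    return layout


# pn3 is assumed to be four times as powerful as pn1.
layout = assign_virtual_nodes({"pn1": 1, "pn2": 2, "pn3": 4})
for pn, vnodes in layout.items():
    print(pn, len(vnodes))
```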
Data Warehousing (Business Intelligence) (30%)
Data warehousing is about making business decisions based on a huge amount of historical data, by computing averages and other aggregate functions.
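As a small illustration of aggregate functions over historical data, the sketch below groups records by year and computes count, sum and average. The sales records are made up for the example, and `aggregate_by` is a hypothetical helper, not a warehouse API.

```python
from collections import defaultdict
from statistics import mean

# Made-up historical sales records.
sales = [
    {"year": 2021, "region": "North", "amount": 120.0},
    {"year": 2021, "region": "South", "amount": 80.0},
    {"year": 2022, "region": "North", "amount": 150.0},
    {"year": 2022, "region": "South", "amount": 90.0},
    {"year": 2022, "region": "East", "amount": 60.0},
]


def aggregate_by(rows, key, measure):
    """Group rows by `key` and compute count, sum and average of `measure`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[measure])
    return {
        k: {"count": len(v), "sum": sum(v), "avg": mean(v)}
        for k, v in groups.items()
    }


by_year = aggregate_by(sales, "year", "amount")
for year, stats in sorted(by_year.items()):
    print(year, stats)
```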