Storage and Databases 🗄
Instance stores and Amazon Elastic Block Store (Amazon EBS)
Instance Stores
Provide temporary
block-level storage
volumes
Behave like physical hard drives
Efficient storage type when working with
Apps like
databases
Enterprise
software
File systems
Disk storage that is physically attached to the host computer for an
EC2 instance
Therefore
has the same lifespan as the instance
When the
instance is terminated
, you lose any data in the instance store
Amazon Elastic Block Store (Amazon EBS)
A service that provides
block-level storage
volumes
That you can use with
Amazon EC2 instances
When you stop or terminate an
Amazon EC2 instance
All the data on the attached EBS volume remains available
To create an
EBS volume
You define the configuration
Such as
volume size
and type
Provision the
EBS volume
After you create the
EBS volume
and configure it
You can attach it to an
Amazon EC2 instance
EBS volumes
are for data that needs to persist
It is important to back up the data
You can take incremental backups of
EBS volumes
By creating
Amazon EBS snapshots
Amazon EBS
volumes are an
Availability Zone-level
resource. To attach an
EC2 instance
to an
EBS volume
, both must be in the same
Availability Zone
If you provision a two-terabyte
EBS
volume and fill it up, it doesn't automatically scale to give you more storage
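The two constraints above (same Availability Zone, fixed provisioned size) can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyVolume` and `can_attach` are hypothetical names, the zone names are example values, and nothing here calls the Amazon EBS API.

```python
from dataclasses import dataclass


@dataclass
class ToyVolume:
    """Toy model of a block volume; not the Amazon EBS API."""
    size_gib: int
    availability_zone: str
    used_gib: int = 0

    def write(self, gib: int) -> None:
        # The provisioned size is fixed: a full volume rejects further writes.
        if self.used_gib + gib > self.size_gib:
            raise OSError("volume is full; provisioned size does not grow on its own")
        self.used_gib += gib


def can_attach(volume: ToyVolume, instance_az: str) -> bool:
    # A volume can only attach to an instance in the same Availability Zone.
    return volume.availability_zone == instance_az


volume = ToyVolume(size_gib=2048, availability_zone="us-east-1a")  # 2 TiB in GiB
print(can_attach(volume, "us-east-1a"))
print(can_attach(volume, "us-east-1b"))

volume.write(2048)       # fill the volume completely
try:
    volume.write(1)      # one more GiB does not fit
except OSError as err:
    print(err)
```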
Amazon EBS snapshots
An incremental backup
Means that the
first backup taken of a volume copies all the data
Subsequent backups
Only the blocks of data that have changed
Since the
most recent snapshot
are saved
Different from
full backups
All the data in a
storage volume
Is copied each time a backup occurs
Full backup
includes
Data that has not changed since the most recent backup
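The difference between incremental and full backups can be sketched in a few lines. This is a toy in-memory model, not how Amazon EBS stores snapshots: a volume is assumed to be a dict of block index to bytes, and `take_snapshot` and `restore` are hypothetical helpers. Deleted blocks are not handled.

```python
from typing import Dict, List

Volume = Dict[int, bytes]  # block index -> block contents (toy model)


def restore(snapshots: List[Volume]) -> Volume:
    """Rebuild a volume by replaying snapshots from oldest to newest."""
    state: Volume = {}
    for snap in snapshots:
        state.update(snap)
    return state


def take_snapshot(volume: Volume, previous: List[Volume]) -> Volume:
    """Save only the blocks that changed since the most recent snapshot."""
    last_state = restore(previous)
    return {i: data for i, data in volume.items() if last_state.get(i) != data}


volume: Volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
snapshots: List[Volume] = []

snapshots.append(take_snapshot(volume, snapshots))  # first snapshot copies all blocks
volume[1] = b"data-v2"                              # one block changes
snapshots.append(take_snapshot(volume, snapshots))  # second snapshot saves that block only

print([len(s) for s in snapshots])   # [3, 1]
print(restore(snapshots) == volume)  # True
```

A full backup would instead store all three blocks on every run, including the two that never changed.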
Amazon Simple Storage Service (Amazon S3)
Service that provides
object-level storage
Store and retrieve an unlimited amount of data at any scale
Store objects in buckets, similar to a file directory
The maximum file size for an object in
Amazon S3
is 5 TB
Version objects to protect them from accidental deletion by retaining earlier versions of each object
Create multiple buckets, and store objects across different storage classes or tiers.
Create permissions to limit who can see or even access objects
Store data as objects
Amazon S3
offers unlimited storage space
Amazon S3
versioning feature to track changes to your objects over time
Object storage
Consist of
Data
Image
Video
Text document
Any other type of file
Metadata
Contains information about
What the data is
How it is used
The object size and so on
Key
Object’s unique identifier
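The three parts of an object (data, metadata, key) can be pictured with a small in-memory model. This is a sketch only: `StoredObject` and `ToyBucket` are hypothetical names and do not reflect the Amazon S3 API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StoredObject:
    key: str                      # unique identifier within the bucket
    data: bytes                   # the file contents (image, video, text, ...)
    metadata: Dict[str, str] = field(default_factory=dict)  # information about the data


class ToyBucket:
    """In-memory stand-in for a bucket; not the Amazon S3 API."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        metadata["size"] = str(len(data))  # record the object size as metadata
        self._objects[key] = StoredObject(key, data, metadata)

    def get(self, key: str) -> StoredObject:
        return self._objects[key]


bucket = ToyBucket()
bucket.put("reports/2024/summary.txt", b"quarterly totals", content_type="text/plain")
stored = bucket.get("reports/2024/summary.txt")
print(stored.key, stored.metadata)
```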
Amazon S3 storage classes
You pay only for what you use
Choose from a range of storage classes to select a fit for your business and cost needs
When selecting an
Amazon S3 storage class
, consider these two factors
How often you plan to retrieve your data
How available you need your data to be
S3 Standard
Designed for frequently accessed data
Stores data in a minimum of
three Availability Zones
S3 Standard
provides high availability for objects
Makes it a good choice for a wide range of use cases
Websites
Content distribution
Data analytics
S3 Standard
has a higher cost than other storage classes intended for
Infrequently accessed data
Archival storage
S3 Standard-Infrequent Access (S3 Standard-IA)
Ideal for infrequently accessed data
Similar to S3 Standard
but has a lower storage price and higher retrieval price
S3 Standard-IA
is ideal for data that is infrequently accessed but requires high availability when needed
S3 Standard-IA
stores data in a minimum of
three Availability Zones
S3 Standard-IA
provides high availability for objects
S3 One Zone-Infrequent Access (S3 One Zone-IA)
Stores data in a
single Availability Zone
This makes it a good storage class to consider if the following conditions apply
You want to save costs on storage
You can easily reproduce your data in the event of an
Availability Zone failure
Has a lower storage price than
S3 Standard-IA
S3 Intelligent-Tiering
Ideal for data with unknown or changing access patterns
Requires a small monthly monitoring and automation fee per object
Amazon S3
monitors objects’ access patterns
If you haven’t accessed an object for 30 consecutive days
Amazon S3
automatically moves it to the infrequent access tier,
S3 Standard-IA
If you access an object in the infrequent access tier
Amazon S3
automatically moves it to the frequent access tier,
S3 Standard
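The tier movement described above can be modelled as a tiny state machine. This is a simplified sketch, assuming a daily check and only two tiers; `TieredObject` is a hypothetical class and the real service's internals are not shown here.

```python
from datetime import date, timedelta

INACTIVITY_DAYS = 30  # threshold quoted in the notes above


class TieredObject:
    """Toy model of one object under access-pattern monitoring."""

    def __init__(self, created: date) -> None:
        self.last_access = created
        self.tier = "frequent"

    def access(self, today: date) -> None:
        # Any access moves the object back to the frequent access tier.
        self.last_access = today
        self.tier = "frequent"

    def daily_check(self, today: date) -> None:
        # After 30 consecutive days without access, move to infrequent access.
        if (today - self.last_access) >= timedelta(days=INACTIVITY_DAYS):
            self.tier = "infrequent"


start = date(2024, 1, 1)
tracked = TieredObject(created=start)

tracked.daily_check(start + timedelta(days=29))
tier_day_29 = tracked.tier          # still "frequent"
tracked.daily_check(start + timedelta(days=30))
tier_day_30 = tracked.tier          # now "infrequent"
tracked.access(start + timedelta(days=45))
tier_after_access = tracked.tier    # back to "frequent"

print(tier_day_29, tier_day_30, tier_after_access)
```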
S3 Glacier
Low-cost storage designed for data archiving
Able to retrieve objects within a few minutes to hours
You might use this storage class to
Store archived customer records
Store and archive older photos and video files
S3 Glacier Deep Archive
Lowest-cost object storage class ideal for archiving
Able to retrieve objects within 12 hours
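The two selection factors (how often you retrieve data, how available it must be) can be turned into a rough decision helper. This is a heuristic sketch built only from the notes above; `suggest_storage_class` and its parameters are hypothetical, and real decisions should also weigh pricing and retrieval fees.

```python
def suggest_storage_class(access: str, multi_az: bool = True,
                          can_wait_12_hours: bool = False) -> str:
    """Suggest a storage class name.

    access: 'frequent', 'infrequent', 'unknown', or 'archive'.
    multi_az: whether the data must survive an Availability Zone failure.
    can_wait_12_hours: for archives, whether retrieval within 12 hours is acceptable.
    """
    if access == "frequent":
        return "S3 Standard"
    if access == "unknown":
        return "S3 Intelligent-Tiering"
    if access == "infrequent":
        return "S3 Standard-IA" if multi_az else "S3 One Zone-IA"
    if access == "archive":
        return "S3 Glacier Deep Archive" if can_wait_12_hours else "S3 Glacier"
    raise ValueError(f"unknown access pattern: {access!r}")


print(suggest_storage_class("frequent"))
print(suggest_storage_class("infrequent", multi_az=False))
print(suggest_storage_class("archive", can_wait_12_hours=True))
```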
Amazon Elastic File System (Amazon EFS)
A managed file system
Keep existing file systems in place but let
AWS
do all the heavy lifting of the scaling and the replication
EFS
allows you to have multiple instances accessing the data in
EFS
at the same time
It scales up and down as needed without you needing to do anything to make that scaling happen
Amazon EFS
can have multiple instances reading and writing from it at the same time
File system for Linux
It is also a
regional resource
Meaning any
EC2 instance
in the Region can write to the
EFS file system
As you write more data to
EFS
, it automatically scales
File storage
Multiple clients can access data that is stored in shared file folders
A storage server uses block storage with a local file system to organize files
Clients access data through file paths
Ideal for use cases in which a
large number of services and resources need to access the same data at the same time
Scalable file system used with
AWS Cloud services
and on-premises resources
As you add and remove files,
Amazon EFS
grows and shrinks automatically
It can scale on demand to petabytes without disrupting applications
Amazon Relational Database Service (Amazon RDS)
Relational databases
Data is stored in a way that relates it to other pieces of data
For example
, an inventory management system
Each record in the database would include data for a single item, such as
Product name
Size
Price
Brand, etc
ID
Use
structured query language (SQL)
to store and query data
Allows data to be stored in an easily
Understandable
Consistent
Scalable way
Business owners can write a
SQL query
to identify all the customers whose most frequently purchased product is a given item
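A query of the kind described above might look like the following. This is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database rather than Amazon RDS; the table names, columns, and sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, size TEXT,
                           price REAL, brand TEXT);
    CREATE TABLE purchases (customer TEXT,
                            product_id INTEGER REFERENCES products(id));
    INSERT INTO products VALUES (1, 'Medium roast', '12 oz', 7.5, 'Acme'),
                                (2, 'Dark roast', '12 oz', 8.0, 'Acme');
    INSERT INTO purchases VALUES ('ana', 1), ('ana', 1), ('ana', 2),
                                 ('ben', 2), ('ben', 2), ('ben', 1),
                                 ('cy', 1);
    """
)

# Count purchases per customer and product, then keep the customers whose
# highest count belongs to the product being asked about.
QUERY = """
WITH counts AS (
    SELECT customer, product_id, COUNT(*) AS n
    FROM purchases
    GROUP BY customer, product_id
)
SELECT c.customer
FROM counts AS c
JOIN products AS p ON p.id = c.product_id
WHERE p.name = ?
  AND c.n = (SELECT MAX(n) FROM counts WHERE customer = c.customer)
ORDER BY c.customer
"""

top_buyers = [row[0] for row in conn.execute(QUERY, ("Medium roast",))]
print(top_buyers)  # ['ana', 'cy']
```

The join between `purchases` and `products` is the kind of relationship a relational database is designed to express.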
A service that enables you to run relational databases in the
AWS Cloud
Amazon RDS
is a managed service that automates tasks such as
Hardware provisioning
Database setup
Patching
Backups
You can integrate
Amazon RDS
with other services to fulfill your business and operational needs, such as
Using
AWS Lambda
to query your database from a serverless application
Amazon RDS
provides a number of different security options
Many
Amazon RDS
database engines offer
Encryption at rest
(protecting data while it is stored)
Encryption in transit
(protecting data while it is being sent and received)
Amazon RDS
is built for business analytics
Amazon RDS database engines
Amazon RDS
is available on six database engines
Which
optimize for memory, performance, or input/output (I/O)
Supported
database engines
include
Amazon Aurora
PostgreSQL
MySQL
MariaDB
Oracle Database
Microsoft SQL Server
Amazon Aurora
An enterprise-class relational database
Compatible with
MySQL
and
PostgreSQL
relational databases
It is up to
five times faster than standard MySQL
databases
It is up to
three times faster than standard PostgreSQL
databases
Helps to reduce your database costs by reducing unnecessary input/output (I/O) operations
While
ensuring that your database resources remain reliable and available
Consider
Amazon Aurora
if your workloads require high availability
It
replicates six copies of your data
Across
three Availability Zones
And
continuously backs up your data to Amazon S3
Amazon DynamoDB
Nonrelational databases
In a
nonrelational database
, you create tables
A table is
a place where you can store and query data
Sometimes referred to as
“NoSQL databases”
Use structures other than rows and columns to organize data
One
type of structural approach for nonrelational databases
is
key-value pairs
Data is organized into items
(keys)
And items have attributes
(values = different features of your data)
In a
key-value database
You can
add or remove attributes from items in the table
at any time
Not every item in the table has to have the same attributes
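The flexibility of key-value items can be shown with plain dictionaries. This is an in-memory sketch, not the DynamoDB API; the keys and attribute names are made up for illustration.

```python
from typing import Any, Dict

# A toy key-value table: each key maps to one item, and an item is a
# free-form set of attributes.
table: Dict[str, Dict[str, Any]] = {}

table["user#1"] = {"name": "John Doe", "address": "123 Any Street",
                   "favorite_drink": "Medium latte"}
table["user#2"] = {"name": "Mary Major", "birthday": "1994-07-05"}

# Attributes can be added to or removed from a single item at any time.
table["user#2"]["address"] = "100 Main Street"
del table["user#1"]["favorite_drink"]

# Items do not need to share the same attribute set.
attribute_sets = {key: sorted(item) for key, item in table.items()}
print(attribute_sets)
```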
A key-value database service
It
delivers single-digit millisecond performance at any scale
DynamoDB
is serverless
Which means that you do not have to
Provision
Patch
Manage servers
You also do not have to
Install
Maintain
Operate software
DynamoDB
automatically scales
Adjusts for changes in capacity while maintaining consistent performance
Suitable choice for use
cases that require high performance while scaling
DynamoDB
is used where you do not need complex join functionality
Amazon Redshift
A data warehousing service
You can use for big data analytics
Massively scalable
Together with
Amazon Redshift Spectrum
You can directly run a single
SQL query
against exabytes of unstructured data stored in data lakes
When it comes to handling the massive data sets of business intelligence workloads
Redshift
uses a variety of innovations that allow you to achieve up to
10 times higher performance
than traditional databases
When you need big data
Business Intelligence
solutions
Redshift
allows you to get started with
a single API call
It offers the ability to collect data from many sources
Helps you to understand relationships and trends across your data
AWS Database Migration Service (AWS DMS)
Enables you to migrate
Relational databases
Nonrelational databases
Other types of data stores
AWS DMS
migrates data between a source and a target database
The
source database remains fully operational during the migration
Minimizing downtime to applications that rely on that database
The
source and target databases do not have to be of the same type
Homogeneous database migrations
From
MySQL
To
Amazon RDS
for
MySQL
From
Microsoft SQL Server
To
Amazon RDS
for
SQL Server
From
Oracle
To
Amazon RDS
for
Oracle
Schema structures, data types
, and
database code
are
compatible between source and target
The source database can be located on-premises or run on
Amazon EC2 instances
Or it can be an
Amazon RDS database
The target itself can be a database in
Amazon EC2
or
Amazon RDS
Create a migration task
with connections to the source and target databases
AWS Database Migration Service
takes care of the rest
Heterogeneous database migrations
Schema structures, data types
, and
database code
are
different between source and target
First you need to convert them using the
AWS Schema Conversion Tool
This will
convert the source schema and code to match that of the target database
The second step is to use
DMS
to migrate data from the source database to the target database
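The two-step flow above (convert the schema, then migrate the data) can be sketched with plain Python structures. This is only an illustration: `TYPE_MAP`, `convert_schema`, and `migrate_rows` are hypothetical, and they do not represent how the AWS Schema Conversion Tool or AWS DMS work internally.

```python
from typing import Dict, List

# Hypothetical type mapping, for illustration only; a real conversion tool
# handles far more cases (and database code, not just column types).
TYPE_MAP: Dict[str, str] = {"NUMBER": "NUMERIC", "VARCHAR2": "VARCHAR"}


def convert_schema(source_schema: Dict[str, str]) -> Dict[str, str]:
    """Step 1: translate source column types into target column types."""
    return {column: TYPE_MAP.get(col_type, col_type)
            for column, col_type in source_schema.items()}


def migrate_rows(rows: List[dict], target_schema: Dict[str, str]) -> List[dict]:
    """Step 2: copy each row into the shape defined by the target schema."""
    return [{column: row[column] for column in target_schema} for row in rows]


source_schema = {"id": "NUMBER", "name": "VARCHAR2", "created": "DATE"}
source_rows = [{"id": 1, "name": "widget", "created": "2024-01-01"}]

target_schema = convert_schema(source_schema)            # step 1
target_rows = migrate_rows(source_rows, target_schema)   # step 2
print(target_schema)
print(target_rows)
```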
Other use cases for
DMS
Development and test database migrations
When you want developers to test against production data
Without affecting production users
Use
DMS
to migrate a copy of your production database
To your dev or test environments
Either
once-off
or
continuously
Database consolidation
When you have several databases and want to consolidate them into one central database
Continuous database replication
When you use
DMS
to perform continuous data replication
Could be for disaster recovery
Or because of geographic separation
Additional database services
Amazon DocumentDB
A document database service that supports
MongoDB
workloads. (
MongoDB
is a document database program)
Amazon Neptune
A graph database service
Use
Amazon Neptune
to build and run applications that work with highly connected datasets
Such as
Recommendation engines
Fraud detection
Knowledge graphs
Amazon Quantum Ledger Database (Amazon QLDB)
A ledger database service
Use
Amazon QLDB
to review a complete history of all the changes that have been made to your application data
Amazon Managed Blockchain
A service that you can use to create and manage blockchain networks with open-source frameworks
Blockchain
Is a distributed ledger system that lets multiple parties run transactions and share data without a central authority
Amazon ElastiCache
A service that adds caching layers on top of your databases to help improve the read times of common requests
It supports two types of data stores
Redis
Memcached
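The idea of a caching layer in front of a database can be sketched as follows. This is a toy model under stated assumptions: a dict stands in for Redis or Memcached, `CachingLayer` and `slow_database_lookup` are hypothetical names, and no ElastiCache API is involved.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingLayer:
    """Check the cache first; on a miss, load from the database and store the result."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            self.hits += 1            # served from the cache
            return entry[1]
        self.misses += 1              # fall back to the database
        value = self._load(key)
        self._entries[key] = (time.monotonic() + self._ttl, value)
        return value


db_calls: List[str] = []


def slow_database_lookup(key: str) -> str:
    db_calls.append(key)              # record each trip to the database
    return f"row-for-{key}"


cache = CachingLayer(slow_database_lookup)
first = cache.get("product:42")       # miss: goes to the database
second = cache.get("product:42")      # hit: served from the cache
print(first == second, cache.hits, cache.misses, len(db_calls))
```

Repeated reads of the same key reach the database only once until the entry expires, which is where the read-time improvement comes from.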
Amazon DynamoDB Accelerator (DAX)
An in-memory cache for
DynamoDB
It helps improve response times from single-digit milliseconds to microseconds