Storage and Databases 🗄

- - - - Behave like physical hard drives
        
        Efficient storage type when working with
        
        Apps like databases
        
        Enterprise software
        
        File systems
    - - Therefore has the same lifespan as the instance
        
        When the instance is terminated, you lose any data in the instance store
  - - - That you can use with Amazon EC2 instances
        
        If you stop or terminate an Amazon EC2 instance
        
        All the data on the attached EBS volume remains available
    - - You define the configuration
        
        Such as volume size and type
        
        Provision the EBS volume
        
        After the EBS volume is created and configured
        
        It can be attached to an Amazon EC2 instance
    - - It is important to back up the data
        
        You can take incremental backups of EBS volumes
        
        By creating Amazon EBS snapshots
    - - If you provision a two terabyte EBS volume and fill it up, it doesn't automatically scale to give you more storage
  - - - Means that the first backup taken of a volume copies all the data
        
        Subsequent backups
        
        Only the blocks of data that have changed
        
        Since the most recent snapshot are saved
      - Different from full backups
        
        All the data in a storage volume
        
        Is copied each time a backup occurs
        
        Full backup includes
        
        Data that has not changed since the most recent backup
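
The difference between incremental and full backups described above can be sketched with a toy model. This is only an illustration, not how Amazon EBS is implemented: the volume is assumed to be a dictionary of block index to contents, and the function names `full_backup` and `incremental_backup` are hypothetical. Deleted blocks are not handled.

```python
# Toy model: a volume is a dict mapping block index -> block contents.

def full_backup(volume):
    """Copy every block, whether or not it changed since the last backup."""
    return dict(volume)

def incremental_backup(volume, last_state):
    """Copy only blocks that differ from the state at the most recent snapshot."""
    return {i: data for i, data in volume.items() if last_state.get(i) != data}

volume = {0: "boot", 1: "logs-v1", 2: "db-v1"}

# The first snapshot has nothing to compare against, so it copies all blocks.
first_snapshot = incremental_backup(volume, last_state={})
state_at_last_snapshot = dict(volume)

# Only one block changes before the next backup.
volume[1] = "logs-v2"

second_snapshot = incremental_backup(volume, state_at_last_snapshot)
second_full = full_backup(volume)

print(len(first_snapshot), len(second_snapshot), len(second_full))
```

In this sketch the second incremental snapshot stores a single block, while a second full backup would store all three again.
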
- - - - Store objects in buckets, similar to a file directory
      - The maximum file size for an object in Amazon S3 is 5 TB
      - Version objects, to protect them from accidental deletion, always keeping the last version of an object
      - Create multiple buckets, and store objects across different storage classes or tiers.
      - Create permissions to limit who can see or even access objects
      - Store data as objects
      - Amazon S3 offers unlimited storage space
      - Amazon S3 versioning feature to track changes to your objects over time
  - - - Data
        
        Image
        
        Video
        
        Text document
        
        Any other type of file
      - Metadata
        
        Contains information about
        
        What the data is
        
        How it is used
        
        The object size and so on
      - Key
        
        Object’s unique identifier
  - - - Choose from a range of storage classes to select a fit for your business and cost needs
        
        When selecting an Amazon S3 storage class, consider these two factors
        
        How often you plan to retrieve your data
        
        How available you need your data to be
    - - Designed for frequently accessed data
      - Stores data in a minimum of three Availability Zones
      - S3 Standard provides high availability for objects
        
        Makes it a good choice for a wide range of use cases
        
        Websites
        
        Content distribution
        
        Data analytics
      - S3 Standard has a higher cost than other storage classes intended for
        
        Infrequently accessed data
        
        Archival storage
    - - Ideal for infrequently accessed data
      - Similar to S3 Standard but has a lower storage price and higher retrieval price
      - S3 Standard-IA is ideal for data that is infrequently accessed but requires high availability when needed
      - S3 Standard-IA stores data in a minimum of three Availability Zones
      - S3 Standard-IA provides high availability for objects
    - - Stores data in a single Availability Zone
        
        This makes it a good storage class to consider if the following conditions apply
        
        You want to save costs on storage
        
        You can easily reproduce your data in the event of an Availability Zone failure
      - Has a lower storage price than S3 Standard-IA
    - - Ideal for data with unknown or changing access patterns
      - Requires a small monthly monitoring and automation fee per object
      - Amazon S3 monitors objects’ access patterns
        
        If you haven’t accessed an object for 30 consecutive days
        
        Amazon S3 automatically moves it to the Infrequent Access tier
        
        If you access an object in the infrequent access tier
        
        Amazon S3 automatically moves it to the Frequent Access tier
    - - Low-cost storage designed for data archiving
      - Able to retrieve objects within a few minutes to hours
      - You might use this storage class to
        
        Store archived customer records
        
        Store and archive older photos and video files
    - - Lowest-cost object storage class ideal for archiving
      - Able to retrieve objects within 12 hours
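
The access-pattern monitoring noted above for data with unknown or changing access patterns can be sketched as a small state machine. This is a simplified model under stated assumptions, not the S3 API: the class `TrackedObject`, its methods, and the example key are hypothetical, and only the 30-day threshold comes from the notes.

```python
from datetime import date, timedelta

# From the notes: 30 consecutive days without access triggers the move.
INACTIVITY_THRESHOLD = timedelta(days=30)

class TrackedObject:
    """Hypothetical stand-in for an object whose access pattern is monitored."""

    def __init__(self, key, today):
        self.key = key
        self.tier = "frequent"
        self.last_access = today

    def access(self, today):
        # Accessing the object returns it to the frequent access tier.
        self.last_access = today
        self.tier = "frequent"

    def monitor(self, today):
        # Periodic check: demote after 30 consecutive days without access.
        if today - self.last_access >= INACTIVITY_THRESHOLD:
            self.tier = "infrequent"

start = date(2024, 1, 1)
obj = TrackedObject("reports/q1.csv", start)

obj.monitor(start + timedelta(days=29))
tier_day_29 = obj.tier

obj.monitor(start + timedelta(days=30))
tier_day_30 = obj.tier

obj.access(start + timedelta(days=31))
tier_after_access = obj.tier

print(tier_day_29, tier_day_30, tier_after_access)
```
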
- - - - It scales up and down as needed without you needing to do anything to make that scaling happen
    - - Meaning any EC2 instance in the Region can write to the EFS file system
        
        As you write more data to EFS, it automatically scales
  - - - A storage server uses block storage with a local file system to organize files
        
        Clients access data through file paths
- - - - Each record in the database would include data for a single item, such as
        
        Product name
        
        Size
        
        Price
        
        Brand, etc
        
        ID
    - - Allows data to be stored in an easily
        
        Understandable
        
        Consistent
        
        Scalable way
  - - - Hardware provisioning
      - Database setup
      - Patching
      - Backups
    - - Using AWS Lambda to query your database from a serverless application
    - - Many Amazon RDS database engines offer
        
        Encryption at rest (protecting data while it is stored)
        
        Encryption in transit (protecting data while it is being sent and received)
  - - - Which optimize for memory, performance, or input/output (I/O)
        
        Supported database engines include
        
        Amazon Aurora
        
        PostgreSQL
        
        MySQL
        
        MariaDB
        
        Oracle Database
        
        Microsoft SQL Server
  - - - Compatible with MySQL and PostgreSQL relational databases
        
        It is up to five times faster than standard MySQL databases
        
        It is up to three times faster than standard PostgreSQL databases
      - Helps to reduce your database costs by reducing unnecessary input/output (I/O) operations
        
        While ensuring that your database resources remain reliable and available
      - Consider Amazon Aurora if your workloads require high availability
        
        It replicates six copies of your data
        
        Across three Availability Zones
        
        And continuously backs up your data to Amazon S3
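
The relational record described at the start of this branch (one row per item, with columns such as product name, size, price, brand, and ID) can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a local stand-in for a relational engine, not Amazon RDS itself; the table name and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database as a stand-in for a managed relational engine.
conn = sqlite3.connect(":memory:")

# Every record shares the same columns, which keeps the data consistent.
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT, size TEXT, price REAL, brand TEXT)"
)

# Sample rows are made up for this example.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Medium roast coffee", "12 oz", 5.30, "Example Brand"),
        (2, "Dark roast coffee", "12 oz", 7.55, "Example Brand"),
    ],
)

# SQL queries select records by the values in their columns.
rows = conn.execute(
    "SELECT name, price FROM products WHERE price < ? ORDER BY id", (6,)
).fetchall()

print(rows)
conn.close()
```
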
- - - - A table is a place where you can store and query data
    - - Use structures other than rows and columns to organize data
        
        One type of structural approach for nonrelational databases is key-value pairs
        
        Data is organized into items (keys)
        
        And items have attributes (values = different features of your data)
    - - You can add or remove attributes from items in the table at any time
      - Not every item in the table has to have the same attributes
  - - - DynamoDB is serverless
        
        Which means that you do not have to
        
        Provision
        
        Patch
        
        Manage servers
        
        You also do not have to
        
        Install
        
        Maintain
        
        Operate software
      - DynamoDB automatically scales
        
        Adjusts to changes in capacity while maintaining consistent performance
        
        Suitable choice for use cases that require high performance while scaling
      - DynamoDB is used where you do not need complex join functionality
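
The key-value structure described in this branch, where items are looked up by key and need not share the same attributes, can be sketched with plain dictionaries. This is a conceptual model only and does not use the DynamoDB API; `put_item` and `get_item` here are hypothetical local helpers, and the sample data is invented.

```python
# Toy key-value table: each key maps to that item's attributes.
table = {}

def put_item(key, **attributes):
    """Store an item under its key with whatever attributes it has."""
    table[key] = dict(attributes)

def get_item(key):
    """Look up an item by key; returns None if the key is absent."""
    return table.get(key)

# Items in the same table do not need the same attributes.
put_item("user-1", name="Ana", city="Lisbon")
put_item("user-2", name="Ben")

# Attributes can be added or removed at any time.
table["user-2"]["email"] = "ben@example.com"
del table["user-1"]["city"]

print(get_item("user-1"))
print(get_item("user-2"))
```

There is no schema to migrate when an attribute is added, which is the flexibility the notes point to. The trade-off is that queries relating several tables (joins) are not part of this model.
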
- - - - Massively scalable
        
        In cooperation with Amazon Redshift Spectrum
        
        You can directly run a single SQL query against exabytes of unstructured data stored in data lakes
        
        When it comes to handling the massive datasets of business intelligence workloads
        
        Redshift uses a variety of innovations that allow you to achieve up to 10 times higher performance than traditional databases
        
        When you need big data Business Intelligence solutions
        
        Redshift allows you to get started with a single API call
- - - - Minimizing downtime to applications that rely on that database
    - - Homogeneous database migrations
        
        From MySQL
        
        To Amazon RDS for MySQL
        
        From Microsoft SQL Server
        
        To Amazon RDS for SQL Server
        
        From Oracle
        
        To Amazon RDS for Oracle
        
        Schema structures, data types, and database code are compatible between source and target
        
        The source database can be located on premises or run on Amazon EC2 instances
        
        Or it can be an Amazon RDS database
        
        The target itself can be a database in Amazon EC2 or Amazon RDS
        
        Create a migration task with connections to the source and target databases
        
        AWS Database Migration Service takes care of the rest
      - Heterogeneous database migrations
        
        Schema structures, data types, and database code are different between source and target
        
        First you need to convert them using the AWS Schema Conversion Tool
        
        This will convert the source schema and code to match that of the target database
        
        The second step is to use DMS to migrate data from the source database to the target database
        
        Other use cases for DMS
        
        Development and test database migrations
        
        When you want developers to test against production data
        
        Without affecting production users
        
        Use DMS to migrate a copy of your production database
        
        To your dev or test environments
        
        Either once or continuously
        
        Database consolidation
        
        When you have several databases and want to consolidate them into one central database
        
        Continuous database replication
        
        When you use DMS to perform continuous data replication
        
        Could be for disaster recovery
        
        Because of geographic separation
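
The two migration paths above differ in whether a schema conversion step comes first. A minimal sketch of that decision follows, under a simplifying assumption: a migration counts as homogeneous when source and target use the same engine name. The function `migration_steps` is hypothetical and does not call AWS DMS or the AWS Schema Conversion Tool.

```python
def migration_steps(source_engine, target_engine):
    """Return the ordered steps for a migration between two database engines.

    Assumption for this sketch: matching engine names mean compatible
    schema structures, data types, and database code.
    """
    steps = []
    if source_engine.lower() != target_engine.lower():
        # Heterogeneous: convert schema and code before moving any data.
        steps.append("convert schema and code with AWS Schema Conversion Tool")
    # Both paths finish by migrating the data with DMS.
    steps.append("migrate data with AWS DMS")
    return steps

homogeneous = migration_steps("MySQL", "MySQL")
heterogeneous = migration_steps("Oracle", "PostgreSQL")

print(homogeneous)
print(heterogeneous)
```
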
- - - - Use Amazon Neptune to build and run applications that work with highly connected datasets
        
        Recommendation engines
        
        Fraud detection
        
        Knowledge graphs
  - - - Use Amazon QLDB to review a complete history of all the changes that have been made to your application data
  - - - Blockchain
        
        Is a distributed ledger system that lets multiple parties run transactions and share data without a central authority
  - - - It supports two types of data stores
        
        Redis
        
        Memcached
  - - - It helps improve response times from single-digit milliseconds to microseconds