S3 Object Lock

Use it to store objects using a write once, read many (WORM) model

It can help you prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely

Use it to meet regulatory requirements that require WORM storage, or to add an extra layer of protection against object changes and deletion

Governance Mode

⭐ Users can't overwrite or delete an object version or alter its lock settings unless they have special permissions

Protects objects against being deleted by most users, but you can still grant some users permission to alter the retention settings or delete the object if necessary

Compliance Mode

⭐ A protected object version can't be overwritten or deleted by any user, including the root user in your AWS account.

When an object is locked in compliance mode, its retention mode can't be changed and its retention period can't be shortened.

Compliance mode ensures an object version can't be overwritten or deleted for the duration of the retention period

⭐ Retention Periods

A retention period protects an object version for a fixed amount of time. When you place a retention period on an object version, Amazon S3 stores a timestamp in the object version's metadata to indicate when the retention period expires

After the retention period expires, the object version can be overwritten or deleted unless you also placed a legal hold on the object version

⭐ Legal Holds

S3 Object Lock also enables you to place a legal hold on an object version

Like a retention period, a legal hold prevents an object version from being overwritten or deleted

However, a legal hold doesn't have an associated retention period and remains in effect until removed. Legal holds can be freely placed and removed by any user who has the s3:PutObjectLegalHold permission
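
The rules above (governance mode, compliance mode, retention periods, legal holds) can be sketched as a small decision function. This is only an in-memory model for revision, not the S3 API: the `LockedVersion` class and `can_delete` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LockedVersion:
    """Hypothetical in-memory model of one object version under Object Lock."""
    mode: Optional[str] = None              # "GOVERNANCE", "COMPLIANCE", or None
    retain_until: Optional[datetime] = None
    legal_hold: bool = False


def can_delete(version: LockedVersion, now: datetime, has_bypass_permission: bool = False) -> bool:
    """Return True if the rules in these notes would allow deleting this version."""
    # A legal hold blocks deletion regardless of any retention period.
    if version.legal_hold:
        return False
    # No retention period, or an expired one, no longer protects the version.
    if version.mode is None or version.retain_until is None or now >= version.retain_until:
        return True
    # Governance mode: only users with special permissions may bypass the lock.
    if version.mode == "GOVERNANCE":
        return has_bypass_permission
    # Compliance mode: nobody, not even the root user, can delete before expiry.
    return False
```

For example, a compliance-mode version stays protected until its timestamp passes even for a user holding the bypass permission, while a legal hold keeps protecting the version after the retention period has expired.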

Glacier Vault Lock

Allows you to easily deploy and enforce compliance controls for individual S3 Glacier vaults with a Vault Lock policy

You can specify controls, such as WORM, in a Vault Lock policy and lock the policy from future edits. Once locked, the policy can no longer be changed

Exam tips

Use S3 Object Lock to store objects using a write once, read many (WORM) model

Object locks can be applied to individual objects or across the bucket as a whole

Object Lock comes in two modes

Governance mode

Compliance mode

S3 Performance

What is a prefix in S3?

ex: mybucketname/folder1/subfolder/myfile.jpg

prefix: /folder1/subfolder
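
A minimal sketch of how the prefix falls out of the example path: it is everything between the bucket name and the object name. The function name `split_s3_path` is made up for this illustration.

```python
def split_s3_path(path: str) -> tuple[str, str, str]:
    """Split 'bucket/some/prefix/file' into (bucket, prefix, object name)."""
    # The first path component is the bucket; the rest is the object key.
    bucket, _, key = path.partition("/")
    # The prefix is the key without its final component (the file name).
    prefix, _, name = key.rpartition("/")
    return bucket, "/" + prefix, name


print(split_s3_path("mybucketname/folder1/subfolder/myfile.jpg"))
```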

S3 Performance

S3 has extremely low latency. You can get the first byte out of S3 within 100-200 ms

You can also achieve a high number of requests

3,500 PUT/COPY/POST/DELETE requests per second per prefix

5,500 GET/HEAD requests per second per prefix

KMS Request Rates

Limitations

If you are using SSE-KMS to encrypt your objects in S3, you must keep in mind the KMS limits

When you upload a file, you will call GenerateDataKey in the KMS API

When you download a file, you will call Decrypt in the KMS API

S3 Limitation when using KMS

Uploading/downloading will count toward the KMS quota

The quota is Region-specific: it's either 5,500, 10,000, or 30,000 requests per second

Currently, you cannot request a quota increase for KMS
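
The per-prefix limits and the KMS quota combine into a simple upper bound: reads scale with the number of prefixes, but with SSE-KMS each download also costs one KMS call. A rough sketch of that arithmetic follows; the function name and the idea of passing the quota as a parameter are assumptions for illustration.

```python
from typing import Optional

PUT_PER_PREFIX = 3_500   # PUT/COPY/POST/DELETE requests per second per prefix
GET_PER_PREFIX = 5_500   # GET/HEAD requests per second per prefix


def max_get_rate(prefixes: int, kms_quota: Optional[int] = None) -> int:
    """Upper bound on GET requests per second spread across `prefixes` prefixes."""
    s3_limit = prefixes * GET_PER_PREFIX
    # With SSE-KMS every download also calls Decrypt, so KMS can become the bottleneck.
    return s3_limit if kms_quota is None else min(s3_limit, kms_quota)


print(max_get_rate(2))                      # two prefixes, no KMS involved
print(max_get_rate(4, kms_quota=10_000))    # four prefixes, capped by the KMS quota
```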

Multipart Uploads

Recommended for files over 100 MB

Required for files over 5 GB

Parallelize uploads (increases efficiency)
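
To make the thresholds concrete, here is a small sketch that classifies an upload by size and splits it into parts that could be uploaded in parallel. The 100 MB default part size and both function names are assumptions for this example, not values S3 mandates.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def needs_multipart(size: int) -> str:
    """Classify an upload of `size` bytes using the thresholds in these notes."""
    if size > 5 * GB:
        return "required"
    if size > 100 * MB:
        return "recommended"
    return "optional"


def plan_parts(size: int, part_size: int = 100 * MB) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) for each part; parts are numbered from 1."""
    parts = []
    offset = 0
    while offset < size:
        # The last part may be shorter than part_size.
        length = min(part_size, size - offset)
        parts.append((len(parts) + 1, offset, length))
        offset += length
    return parts
```

Each tuple describes an independent slice of the file, which is what lets the parts be sent concurrently and retried individually.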

Downloads

S3 Byte-Range Fetches

Parallelize downloads by specifying byte ranges

If there's a failure in the download, it's only for a specific byte range

Can be used to speed up downloads

Can be used to just download partial amounts of the file (e.g., header information)
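
A byte-range fetch is an ordinary GET with an HTTP `Range` header. The sketch below only builds the header values for each chunk (it does not perform any download); `range_headers` is a name chosen for this example.

```python
def range_headers(size: int, chunk: int) -> list[str]:
    """Build HTTP Range header values covering an object of `size` bytes."""
    # Range bounds are inclusive, so each chunk ends at start + chunk - 1
    # (or at the last byte of the object for the final chunk).
    return [
        f"bytes={start}-{min(start + chunk, size) - 1}"
        for start in range(0, size, chunk)
    ]


print(range_headers(10, 4))
```

Each value can be sent on its own request, so a failed chunk is retried alone, and asking for just the first range is enough to read header information.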

Exam tips

What is a prefix in S3?

ex: mybucketname/folder1/subfolder/myfile.jpg

prefix: /folder1/subfolder

You can also achieve a high number of requests

3,500 PUT/COPY/POST/DELETE requests per second per prefix

5,500 GET/HEAD requests per second per prefix

You can get better performance by spreading your reads across different prefixes

Ex: if you are using two prefixes, you can achieve 11,000 requests per second

If using SSE-KMS to encrypt your objects in S3, you must keep in mind the KMS limits

Uploading/downloading will count toward the KMS quota

The quota is Region-specific: it's either 5,500, 10,000, or 30,000 requests per second

Currently, you cannot request a quota increase for KMS

Use multipart uploads to increase performance when uploading files to S3

Should be used for any file over 100 MB and must be used for any file over 5 GB

Use S3 byte-range fetches to increase performance when downloading files from S3

S3 Select

What is it?

Enables applications to retrieve only a subset of data from an object by using simple SQL expressions

By using S3 Select to retrieve only the data needed by your application, you can achieve drastic performance increases. In many cases, you can get as much as a 400% improvement
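
As a rough illustration of what "simple SQL expressions" looks like in practice, the sketch below builds the keyword arguments in the shape that boto3's `select_object_content` call accepts. The helper name, bucket, key, and column names are hypothetical, and the object is assumed to be a CSV file with a header row.

```python
def build_select_request(bucket: str, key: str, columns: list[str], where: str = "") -> dict:
    """Build keyword arguments for an S3 Select query over a CSV object."""
    # S3 Select refers to the object as the table "S3Object"; "s" is an alias.
    expression = f"SELECT {', '.join('s.' + c for c in columns)} FROM S3Object s"
    if where:
        expression += f" WHERE {where}"
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Assumption: a CSV object whose first line is a header row.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


request = build_select_request("my-bucket", "data/people.csv", ["name", "city"], "s.city = 'Paris'")
print(request["Expression"])
```

With boto3 installed and credentials configured, the dictionary could be passed as `s3_client.select_object_content(**request)`; only the matching rows and columns would then travel over the network.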

Glacier Select

Some companies in highly regulated industries (e.g., financial services, healthcare, and others) write data directly to Amazon Glacier to satisfy compliance needs like SEC Rule 17a-4 or HIPAA

Many S3 users have lifecycle policies designed to save on storage costs by moving their data into Glacier when they no longer need to access it on a regular basis

🏅 Glacier Select allows you to run SQL queries against Glacier directly

Exam Tips

Remember that S3 Select is used to retrieve only a subset of data from an object by using simple SQL expressions

Get data by Rows or Columns using simple SQL expressions

Save money on data transfer and increase speed

AWS Organizations & Consolidated Billing

What is AWS Organizations

AWS Organizations is an account management service that enables you to consolidate multiple AWS accounts into an organization that you create and centrally manage

Advantages of Consolidated Billing

One bill for multiple AWS accounts

Very easy to track charges and allocate costs

Volume pricing discount

Some best practices with AWS Organizations

Always enable multi-factor authentication on the root account

Always use a strong and complex password on the root account

The paying account should be used for billing purposes only. Do not deploy resources into the paying account

Enable/disable AWS services using Service Control Policies (SCPs), either on OUs or on individual accounts

S3 - Cross Account Access

3 different ways to share S3 buckets across accounts

Using Bucket Policies & IAM (applies across the entire bucket). Programmatic Access Only

Using Bucket ACLs & IAM (individual objects). Programmatic Access Only

Cross-account IAM Roles. Programmatic AND Console access
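
As an example of the first option, a bucket policy that lets another account read a bucket might look roughly like the document generated below. The function name, the `Sid`, and the bucket and account values are placeholders, and the chosen actions are just one plausible read-only set.

```python
import json


def cross_account_read_policy(bucket: str, account_id: str) -> str:
    """Return a bucket policy (JSON) granting read access to another AWS account."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CrossAccountRead",
                "Effect": "Allow",
                # The other account as a whole; its own IAM policies decide which users may use this.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # ListBucket applies to the bucket ARN, GetObject to the objects under it.
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(cross_account_read_policy("example-bucket", "111122223333"))
```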

Cross Region Replication

Versioning must be enabled on both the source and destination buckets

Files in an existing bucket are not replicated automatically

All subsequent updated files will be replicated automatically

Delete markers are not replicated

Deleting individual versions or delete markers will not be replicated

Understand what Cross Region Replication is at a high level

S3 Transfer Acceleration

What is it?

Utilises the CloudFront Edge Network to accelerate your uploads to S3

Instead of uploading directly to your S3 bucket, you can use a distinct URL to upload directly to an edge location, which will then transfer that file to S3. The distinct URL looks like: abc.s3-accelerate.amazonaws.com
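
The accelerate URL follows directly from the bucket name, as this small sketch shows. The function name is made up for the example; the check reflects the requirement that bucket names used with Transfer Acceleration must not contain periods.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration hostname for a bucket."""
    # Transfer Acceleration needs a DNS-compliant bucket name without periods.
    if "." in bucket:
        raise ValueError("bucket names containing '.' cannot use Transfer Acceleration")
    return f"{bucket}.s3-accelerate.amazonaws.com"


print(accelerate_endpoint("abc"))
```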

AWS DataSync

Used to move large amounts of data from on-premises to AWS

Used with NFS and SMB compatible file systems

Replication can be done hourly, daily, or weekly

Install the DataSync agent to start the replication

Can be used to replicate EFS to EFS

CloudFront

What is a CDN?

A content delivery network (CDN) is a system of distributed servers (network) that deliver webpages and other web content to a user based on the geographic locations of the user, the origin of the webpage, and a content delivery server

Key Terminology

Edge Location

This is the location where content will be cached. This is separate from an AWS Region/AZ

Origin

This is the origin of all the files that the CDN will distribute. This can be an S3 Bucket, an EC2 Instance, an Elastic Load Balancer or Route 53

Distribution

This is the name given to the CDN, which consists of a collection of Edge Locations

CloudFront can be used to deliver your entire website, including dynamic, static, streaming, and interactive content using a global network of edge locations. Requests for your content are automatically routed to the nearest edge location, so content is delivered with the best possible performance

Web Distribution

Typically used for Websites

RTMP

Used for Media Streaming

Edge locations are not just READ only

you can write to them too

Objects are cached for the life of the TTL (Time to Live)

You can clear cached objects, but you will be charged

CloudFront Signed URLs and Cookies

CloudFront Signed URLs

Signed Cookies

A signed cookie is for multiple files

1 cookie = multiple files

Policy

can include

URL expiration

IP ranges

Trusted signers (which AWS accounts can create signed URLs)

A signed URL is for individual files

1 file = 1 URL

Can have different origins. Does not have to be EC2

Key-pair is account wide and managed by the root user

Can utilize caching features

Can filter by date, path, IP address, expiration, etc.
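
The expiration and IP-range restrictions live in a small JSON policy that is attached to the signed URL or cookie. A sketch of building such a custom policy is below; it deliberately stops before the signing step (which needs the private key), and the function name and the example distribution domain are placeholders.

```python
import json
from typing import Optional


def custom_policy(resource: str, expires_epoch: int, source_ip: Optional[str] = None) -> str:
    """Build a CloudFront custom policy statement as compact JSON."""
    # URL expiration: requests are allowed only before this Unix timestamp.
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    # Optional IP range restriction, given in CIDR notation.
    if source_ip is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_ip}
    policy = {"Statement": [{"Resource": resource, "Condition": condition}]}
    # Whitespace is stripped from the policy before it is signed.
    return json.dumps(policy, separators=(",", ":"))


print(custom_policy("https://d111111abcdef8.cloudfront.net/videos/*", 1767225600, "192.0.2.0/24"))
```

A wildcard resource such as `/videos/*` is what lets one signed cookie cover multiple files, while a signed URL names a single file.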

S3 Signed URL

Issues a request as the IAM user who creates the presigned URL

Limited lifetime

Exam tips

Use signed URLs/cookies when you want to secure content so that only the people you authorize are able to access it

A signed URL is for individual files. 1 file = 1 URL

A signed cookie is for multiple files. 1 cookie = multiple files

If your origin is EC2, then use CloudFront

Snowball

Import to S3

Export from S3

S3 Versioning

Stores all versions of an object (including all writes and even if you delete an object)

Great backup tool

Once enabled, Versioning cannot be disabled, only suspended

Integrates with Lifecycle rules

Versioning's MFA Delete capability, which uses multi-factor authentication, can be used to provide an additional layer of security

Lifecycle Management

Automates moving your objects between the different storage tiers

Can be used in conjunction with versioning

Can be applied to current versions and previous versions
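
A lifecycle rule that covers both current and previous versions might be expressed roughly as below, in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The function name, rule ID, and day counts are illustrative assumptions.

```python
def lifecycle_rule(prefix: str, ia_after_days: int, glacier_after_days: int) -> dict:
    """Build a lifecycle configuration that tiers objects down over time."""
    return {
        "Rules": [
            {
                "ID": "tier-down",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Current versions: Standard -> Standard-IA -> Glacier.
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                # Previous (noncurrent) versions go straight to Glacier.
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = lifecycle_rule("logs/", ia_after_days=30, glacier_after_days=90)
print(config["Rules"][0]["Transitions"])
```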

S3 Security & Encryption

You can set up access control to your buckets using

Bucket Policies

Access Control Lists

By default, all newly created buckets are PRIVATE

S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket

These logs can be sent to another bucket, even a bucket in another account

The Basics

Encryption At Rest (Server Side) is achieved by

S3 Managed Keys - SSE-S3

AWS Key Management Service, Managed Keys - SSE-KMS

Server Side Encryption With Customer Provided Keys - SSE-C

Client Side Encryption

Encryption In Transit is achieved by

SSL/TLS
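
The three server-side options listed above are selected per request through HTTP headers. The sketch below builds those headers for each mode; the helper name `sse_headers` is invented for this example, and it only constructs headers rather than sending anything.

```python
import base64
import hashlib


def sse_headers(mode: str, customer_key: bytes = b"") -> dict[str, str]:
    """Return the S3 request headers that select a server-side encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        return {"x-amz-server-side-encryption": "aws:kms"}
    if mode == "SSE-C":
        if len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        # With SSE-C the key travels with every request, plus an MD5 digest as an integrity check.
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key": base64.b64encode(customer_key).decode(),
            "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
                hashlib.md5(customer_key).digest()
            ).decode(),
        }
    raise ValueError(f"unknown mode: {mode}")
```

Because SSE-C sends the key itself in a header, it only makes sense over the SSL/TLS connection mentioned above.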

S3 Pricing Tiers

What are the different Tiers

S3 Standard

S3 - IA

S3 One Zone - IA

S3 Intelligent Tiering

S3 Glacier

S3 Glacier Deep Archive

What makes up the cost of S3

Storage

Request and Data Retrievals

Data Transfer

Management & Replication

Storage Gateway

What is it?

A service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure. The service enables you to securely store data in the AWS cloud for scalable and cost-effective storage

AWS Storage Gateway's software appliance is available for download as a virtual machine (VM) image that you install on a host in your datacenter. Storage Gateway supports either VMware ESXi or Microsoft Hyper-V. Once you've installed your gateway and associated it with your AWS account through the activation process, you can use the AWS Management Console to create the storage gateway option that is right for you

Types of Storage Gateway

Volume Gateway (iSCSI)

File Gateway (NFS & SMB)

Tape Gateway (VTL)

Files are stored as objects in your S3 buckets, accessed through a Network File System (NFS) mount point

Ownership, permissions, and timestamps are durably stored in S3 in the user-metadata of the object associated with the file

Once objects are transferred to S3, they can be managed as native S3 objects, and bucket policies such as versioning, lifecycle management, and cross-region replication apply directly to objects stored in your bucket

What is Volume Gateway?

The volume interface presents your applications with disk volumes using the iSCSI block protocol

Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots

Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize your storage charges

Types

Stored Volumes

Cached Volumes

Stored volumes let you store your primary data locally, while asynchronously backing up that data to AWS.

Stored volumes provide your on-premises applications with low-latency access to their entire datasets, while providing durable, off-site backups.

You can create storage volumes and mount them as iSCSI devices from your on-premises application servers

Data written to your stored volumes is stored on your on-premises storage hardware.

This data is asynchronously backed up to Amazon Simple Storage Service (Amazon S3) in the form of Amazon Elastic Block Store (Amazon EBS) snapshots.

1 GB - 16 TB in size for Stored Volumes

Cached volumes let you use Amazon S3 as your primary data storage while retaining frequently accessed data locally in your storage gateway

Cached volumes minimize the need to scale your on-premises storage infrastructure, while still providing your applications with low-latency access to their frequently accessed data.

You can create storage volumes up to 32 TB in size and attach to them as iSCSI devices from your on-premises application servers.

Your gateway stores data that you write to these volumes in Amazon S3 and retains recently read data in your on-premises storage gateway's cache and upload buffer storage.

1 GB - 32 TB in size for Cached Volumes

What is Tape Gateway?

Offers a durable, cost-effective solution to archive your data in the AWS cloud.

The VTL interface it provides lets you leverage your existing tape-based backup application infrastructure to store data on virtual tape cartridges that you create on your tape gateway.

Each tape gateway is preconfigured with a media changer and tape drives, which are available to your existing client backup application as iSCSI devices.

You add tape cartridges as you need to archive your data. Supported by NetBackup, Backup Exec, Veeam ...

Exam tips

File Gateway

For flat files, stored directly on S3

Volume Gateway

Stored Volumes

Entire Dataset is stored on site and is asynchronously backed up to S3

Cached Volumes

Entire Dataset is stored on S3 and the most frequently accessed data is cached on site

Gateway Virtual Tape Library

Athena vs Macie

Athena ?

Interactive query service which enables you to analyse and query data located in S3 using standard SQL

Serverless, nothing to provision, pay per query / per TB scanned

No need to set up complex Extract/Transform/Load (ETL) process

Works directly with data stored in S3

What can Athena be used for?

Can be used to query log files stored in S3, e.g. ELB logs, S3 access logs, etc.

Generate business reports on data stored in S3

Analyse AWS cost and usage reports

Run queries on click-stream data

Macie ?

What is PII

Personal data used to establish an individual's identity

This data could be exploited by criminals, used in identity theft and financial fraud

Home address, email address, SSN

Passport number, driver's license number

D.O.B, phone number, bank account, credit card number

What is Macie

Security service which uses Machine Learning and NLP (Natural Language Processing) to discover, classify and protect sensitive data stored in S3

Uses AI to recognize if your S3 objects contain sensitive data such as PII

Dashboards, reporting and alerts

Works directly with data stored in S3

Can also analyze CloudTrail logs

Great for PCI-DSS and preventing ID theft

S3 & IAM Summary

IAM

Consists of

Users

Groups

Roles

Policies

So far

IAM is universal. It does not apply to regions at this time

The "root account" is simply the account created when you first set up your AWS account. It has complete Admin access

New Users have NO permissions when first created

New Users are assigned Access Key ID & Secret Access Keys when first created

These are not the same as a password. You cannot use the Access Key ID & Secret Access Key to log in to the console. You can, however, use them to access AWS via the APIs and command line

You only get to view these once. If you lose them, you have to regenerate them. So, save them in a secure location.

Always set up Multifactor Authentication on your root account

You can create and customize your own password rotation policies

S3

S3 is Object-based: i.e. allows you to upload files

Files can be from 0 bytes to 5TB

There is unlimited storage

Files are stored in Buckets

S3 is a universal namespace. That is, names must be unique globally

Not suitable for installing an operating system on

Successful uploads will generate an HTTP 200 status code

By default, all newly created buckets are PRIVATE. You can set up access control to your buckets using

Bucket Policies

Access Control Lists

S3 buckets can be configured to create access logs which log all requests made to the S3 bucket. These logs can be sent to another bucket, even a bucket in another account

The key fundamentals of S3 are

Key (This is simply the name of the object)

Value (This is simply the data and is made up of a sequence of bytes)

Version ID (Important for versioning)

Metadata (Data about data you are storing)

Sub resources

Access Control Lists

Torrent

.....

S3・101

Guarantees

Built for 99.99% availability for the S3 platform

Amazon guarantees 99.9% availability

Amazon guarantees 11 x 9s durability for S3 information

Features

⭐Tiered Storage Available

Lifecycle Management

Versioning

Encryption

MFA Delete

Secure your data

Access Control List

Bucket Policies

Data consistency

Strong read-after-write consistency for PUTs of new objects

Strong consistency also applies to overwrite PUTs and DELETEs (since December 2020; before that they were eventually consistent)

You are charged for S3 in the following ways

Storage

Requests

Storage Management Pricing

Data Transfer Pricing

Transfer Acceleration

Cross Region Replication Pricing

S3 Transfer Acceleration

Enables fast, easy, and secure transfers of files over long distances between your end users and an S3 bucket

Takes advantage of Amazon CloudFront's globally distributed edge locations

As data arrives at an edge location, it is routed to Amazon S3 over an optimized network path

Restricting Bucket Access

Bucket Policies

Applies across the whole bucket

Object Policies

Applies to individual files

IAM Policies to Users & Groups

Applies to Users & Groups

S3 Storage Classes

S3 Glacier Deep Archive

Amazon S3's lowest-cost storage class, for data where a retrieval time of 12 hours is acceptable

S3 Glacier

Can reliably store any amount of data at costs that are competitive with or cheaper than on-premises solutions. Retrieval times are configurable from minutes to hours

A secure, durable and low-cost storage class for data archiving

S3 Intelligent Tiering

Designed to optimize cost by automatically moving data to the most cost-effective access tier

without performance impact or operational overhead

S3 One Zone - IA

For when you want a lower-cost option for infrequently accessed data

but do not require the multiple Availability Zone data resilience

S3 - IA

Lower fee than S3 Standard

but you are charged a retrieval fee

for data that is accessed less frequently, but requires rapid access when needed

S3 Standard

Designed to sustain the loss of 2 facilities concurrently

Stored redundantly across multiple devices in multiple facilities

99.999999999% durability (11 x 9s)

99.99% availability
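
To get a feel for what these percentages mean, the following sketch converts an availability figure into minutes of downtime per year and an 11 x 9s durability figure into expected object losses per year. Both function names are made up for this worked example, and the loss estimate uses the simplifying assumption that each object is lost independently.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes per year a service may be unavailable at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def expected_annual_losses(object_count: int, nines: int = 11) -> float:
    """Expected objects lost per year when durability has `nines` nines in total."""
    # 11 nines of durability means an annual loss probability of 10^-11 per object.
    return object_count * 10 ** -nines


print(round(downtime_minutes_per_year(99.99), 2))   # about 52.56 minutes per year
print(expected_annual_losses(10_000_000))           # about 0.0001 objects per year
```

In other words, with ten million objects stored you would expect to lose a single object roughly once every 10,000 years.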

basics

Not suitable to install an operating system on

Bucket names

S3 is a universal namespace

Names must be unique globally

Unlimited storage

Object based

Allows you to upload files

Files can be from 0 bytes to 5 TB

Files are stored in Buckets

A successful upload returns an HTTP 200 status code

Objects consist of

Sub resources

Torrent

Access Control List

Metadata

Data about data you are storing

Version Id

important for versioning

Value

made up of sequence of bytes

simply the data

Key

name of the object

Think of objects just as files

What is S3

The data is spread across multiple devices and facilities

Object based storage

safe place to store files

IAM

Key Features

Centralised control of your AWS account

Shared Access to your AWS account

Granular Permissions

Identity Federation (Active Directory, Facebook, Linked ...)

Multifactor Authentication

Provides temporary access for users/devices and services where necessary

Allows you to set up your own password rotation policy

Integrates with many different AWS services

Supports PCI DSS Compliance

Key Terminology

Users

End users

people

employees of organization

Groups

A collection of users

Users in a group inherit the permissions of the group

Policies

Made up of documents called policy documents

Written in a format called JSON

Give permissions as to what a User/Group/Role is able to do
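
For reference, a policy document is just a JSON object with a version and a list of statements. The sketch below builds one as a Python dictionary and prints it as JSON; the bucket name and the chosen actions are placeholders for illustration.

```python
import json

# A minimal identity-based policy: allow reading and writing objects in one bucket.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",                                # Allow or Deny
            "Action": ["s3:GetObject", "s3:PutObject"],       # what may be done
            "Resource": "arn:aws:s3:::example-bucket/*",      # on which resources
        }
    ],
}

print(json.dumps(policy_document, indent=2))
```

Attaching this document to a user, group, or role is what grants the permissions; the document itself does not name who receives them.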

Roles

You create roles and then assign them to AWS resources