Compute
EC2
Auto Scaling Group (ASG)
Scaling Strategy
Target Tracking
- Set a target value for a metric (e.g. average CPU at 50%); the ASG adds or removes instances to keep the metric near that target
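
As a rough illustration of target tracking, the sketch below estimates the capacity needed to bring a metric back to its target. It assumes the metric scales inversely with the number of instances; the function name and parameters are hypothetical and this is not the exact algorithm AWS uses.

```python
import math

def target_tracking_capacity(current_capacity, metric_value, target_value,
                             min_size, max_size):
    """Estimate the capacity that brings the average metric back to target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Assumption: the per-instance metric drops proportionally as instances are added.
    desired = math.ceil(current_capacity * metric_value / target_value)
    # Never go outside the ASG's configured size limits.
    return max(min_size, min(max_size, desired))

# 4 instances at 80% average CPU with a 50% target
print(target_tracking_capacity(4, 80.0, 50.0, min_size=1, max_size=10))
```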
Scheduled Actions
- Scale at scheduled times based on known patterns (peak hours, etc.)
Simple / Step
- Add/remove instances by a manually defined amount in response to external triggers (e.g. a CloudWatch alarm)
Simple
- Define one policy that controls scaling
- As a result, it can respond to only one external trigger
- Has a 'cooldown' period: after a scaling action, the ASG waits before taking the next one (300 seconds by default)
Step
- A policy defines multiple step adjustments, each covering a range of how far the alarm metric is past its threshold
- Scaling can be more fine-grained because the adjustment depends on the size of the breach
- Can configure a warm-up period; an instance that is still warming up is not counted toward the ASG's aggregated metrics
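
Step scaling picks a different adjustment depending on how far the metric is past the alarm threshold. A minimal sketch of that lookup follows; the step table and function name are hypothetical, and the bounds are offsets from the threshold.

```python
def step_adjustment(metric_value, alarm_threshold, steps):
    """Return the capacity change for the step that contains the breach size.

    steps: list of (lower_bound, upper_bound, adjustment) tuples, where the
    bounds are offsets from the alarm threshold and upper_bound=None means
    "no upper limit".
    """
    breach = metric_value - alarm_threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0  # metric is below the threshold: no scaling action

# Hypothetical steps for a "CPU > 60%" alarm
STEPS = [(0, 10, 1), (10, 20, 2), (20, None, 4)]

for cpu in (50, 65, 75, 95):
    print(cpu, step_adjustment(cpu, 60, STEPS))
```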
Predictive Scaling
- Use historical data to forecast usage
- Enough history must be accumulated first (at least 24 hours of data)
Lifecycle
Lifecycle Hooks
- Optionally, hooks can be added to instance launch and termination, giving more thorough control over the instances in the ASG
Health Check
- ASG periodically checks the health of its EC2 instances
- Manual configuration is needed (the endpoint to health-check)
- Should check only the status of the instance itself, not its dependencies
Updating Instances
Doubling
- Simply increase the ASG capacity to twice the desired size, so both old and new versions run side by side
- Once all new instances are running, decrease the capacity back to the desired size so the legacy instances are removed
ALB Redirection
- Create a new ASG and add it to the existing ALB
- Traffic shifts gradually from the old ASG to the new ASG
- Once all traffic flows into the new ASG, the old ASG is removed
Instance Refresh
- Instances are gradually terminated and replaced to avoid service interruption
Warm-up Time
- The time needed for a replacement instance to be ready for service
Min-Healthy Percentage
- The minimum percentage of the desired capacity that must stay healthy during the refresh
- If the number of healthy instances would fall below this threshold, the refresh pauses until enough instances are healthy again
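
To make the effect of the min-healthy percentage concrete, the sketch below estimates how many instances could be replaced per batch. It assumes a simplified model in which at least one instance is always replaced per batch; the function is hypothetical and does not reproduce the exact behavior of Instance Refresh.

```python
def refresh_batches(desired_capacity, min_healthy_percentage):
    """Split a refresh into batches that respect the min-healthy percentage."""
    if not 0 <= min_healthy_percentage <= 100:
        raise ValueError("min_healthy_percentage must be between 0 and 100")
    # Instances that must stay in service (integer ceiling of the percentage).
    must_stay = (desired_capacity * min_healthy_percentage + 99) // 100
    # Assumption: at least one instance is replaced at a time, even at 100%.
    batch_size = max(1, desired_capacity - must_stay)
    batches, remaining = [], desired_capacity
    while remaining > 0:
        size = min(batch_size, remaining)
        batches.append(size)
        remaining -= size
    return batches

print(refresh_batches(10, 90))  # ten batches of one instance
print(refresh_batches(10, 50))  # two batches of five instances
```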
DNS Redirection
- Create a new ALB and an ASG
- A Route 53 weighted record redirects part of the traffic to the new ALB
- Once the new ALB with its ASG is ready, retire the old one
- Manual testing is also possible by accessing the new ALB directly
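
The weighted-record idea can be sketched as plain data. The dictionaries below follow the shape of a Route 53 change batch for weighted records; the domain names, set identifiers, and the use of CNAME records (rather than alias records) are assumptions for illustration.

```python
def weighted_change(name, set_id, target_dns, weight, ttl=60):
    """Build one weighted-record change (all names here are hypothetical)."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "CNAME",
            "SetIdentifier": set_id,  # distinguishes records sharing a name
            "Weight": weight,
            "TTL": ttl,
            "ResourceRecords": [{"Value": target_dns}],
        },
    }

def traffic_shares(changes):
    """Return the expected fraction of traffic per record: weight / sum of weights."""
    weights = {c["ResourceRecordSet"]["SetIdentifier"]: c["ResourceRecordSet"]["Weight"]
               for c in changes}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one record needs a non-zero weight")
    return {set_id: w / total for set_id, w in weights.items()}

change_batch = {"Changes": [
    weighted_change("app.example.com", "old-alb", "old-alb.example.com", 90),
    weighted_change("app.example.com", "new-alb", "new-alb.example.com", 10),
]}
shares = traffic_shares(change_batch["Changes"])
print(shares)
```

Raising the weight of the new record step by step moves more traffic to the new ALB until the old record can be dropped.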
Launch Configuration
- Configuration template used by ASG to launch EC2 instances
- AMI, instance type, key pair, security groups, block device, etc.
Instance Tenancy
- Define how to distribute EC2 instances across physical hardware
types
Dedicated (dedicated instance)
- Run on single-tenant hardware
- If an instance fails, the replacement may be placed on any of the available single-tenant hardware
Host (dedicated host)
- A physical server fully dedicated to one customer
- If an instance fails, the replacement launches on the same hardware
Shared (default)
- Multiple AWS accounts share the same physical hardware
Tenancy attribute of VPC
- The VPC can set a default tenancy that the ASG respects
- If the launch configuration does not specify tenancy but the VPC does, the VPC's tenancy is used
EC2 Instance
Pricing
Dedicated Hosts
- Pay per physical hardware running the instances.
- Can bring per-socket, per-core, per-VM software licenses
Reserved
- Commit to an instance type, family, and term, optionally paying upfront, in exchange for a discount
Savings Plans
- EC2 costs are reduced in exchange for committing to a consistent amount of usage
- The commitment term is 1 or 3 years
Spot
- Set a maximum price and run the instance only while the Spot price is below it
- A 2-minute warning is given before the instance is interrupted
Dedicated Instances
- Pay per instance running on single-tenant hardware
Capacity Reservation
- Reserve capacity for EC2 instances in a specific AZ
Types
R (Memory)
- Focused on more memory allocation
C (Processing)
- Focused on more processing power
G (GPU)
- Special unit equipped with GPU
- Usually for video rendering and ML
T2/T3 (Burstable)
- CPU can burst above the baseline during periods of high demand
- Most affordable option
unlimited
- Bursting can continue even after the earned CPU credits run out (the extra usage is charged)
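
The credit mechanics behind burstable instances can be sketched with a small simulation. It assumes one CPU credit equals one vCPU at 100% for one minute; the earn rate and vCPU count in the example are illustrative values, and the throttling model is deliberately simplified.

```python
def simulate_credits(hourly_utilization, vcpus, earn_per_hour,
                     start_balance=0.0, unlimited=False):
    """Return (final_balance, surplus_credits) after each hour of utilization (%)."""
    balance = start_balance
    surplus = 0.0  # credits spent beyond the balance (only in unlimited mode)
    for util in hourly_utilization:
        # Credits spent this hour: utilization % across all vCPUs for 60 minutes.
        spent = util * vcpus * 60 / 100
        balance += earn_per_hour - spent
        if balance < 0:
            if unlimited:
                surplus += -balance  # extra usage that would be billed
            # Simplification: in standard mode the instance is throttled instead.
            balance = 0.0
    return balance, surplus

# Illustrative instance: 2 vCPUs earning 12 credits per hour
print(simulate_credits([5, 5], vcpus=2, earn_per_hour=12))
print(simulate_credits([50], vcpus=2, earn_per_hour=12, unlimited=True))
```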
D (Storage)
- Best performance on sequential I/O access
A (Graviton)
- Custom processor designed by AWS
- Not available for Windows
- ARM based
Graviton 2
- Up to 40% better price performance compared to 5th-gen x86-based instances
I (Throughput)
- Best performance on random access I/O
M (General Purpose)
- Balance between memory and processing
Spot Fleet
- Collection of Spot (or On-demand) instances
- Set the price target per instance or overall price target
- Cannot use custom AMI
Lowest Price
- Choose the pool that costs least
- It may be hard to find a suitable pool with enough capacity
Diversified
- Instances distributed across all pools
- Instance types may differ between instances
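
The difference between the two allocation strategies can be sketched as follows. The pool names and prices are made up, and the even split used for the diversified case is a simplification of what Spot Fleet actually does.

```python
def allocate(pools, target_capacity, strategy):
    """Distribute target_capacity across Spot pools.

    pools: dict mapping a pool name to its current Spot price (hypothetical).
    """
    if not pools:
        raise ValueError("at least one pool is required")
    if strategy == "lowestPrice":
        # Put everything into the single cheapest pool.
        cheapest = min(pools, key=pools.get)
        return {cheapest: target_capacity}
    if strategy == "diversified":
        # Spread instances as evenly as possible across all pools.
        names = sorted(pools)
        base, extra = divmod(target_capacity, len(names))
        return {name: base + (1 if i < extra else 0)
                for i, name in enumerate(names)}
    raise ValueError(f"unknown strategy: {strategy}")

POOLS = {
    "c5.large/us-east-1a": 0.035,
    "m5.large/us-east-1b": 0.040,
    "r5.large/us-east-1c": 0.050,
}
print(allocate(POOLS, 10, "lowestPrice"))
print(allocate(POOLS, 10, "diversified"))
```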
Hibernate
- Save RAM contents to EBS and hibernate
- The cost is the same as stopping the instance
- Must be enabled when the EC2 instance is first launched
Docker Service
Elastic Container Service (ECS)
- Containers are configured based on the 'task definition'
- Each container must be registered to a 'service' to be provisioned
- The underlying infrastructure (EC2, Spot, etc.) to run the container is provisioned automatically
Configuration
ECS Task
- Definition of one or more containers: image, resource requirements, and network configuration
- CPU quota/count, RAM, storage options, etc.
ECS Service
- Definition of things like # of tasks to run, replicas, restarting policy, load balancing, etc.
ECS IAM Roles
- Default roles that can be assumed by any EC2 instances of the ECS Cluster.
- Required to make API calls to ECS, send logs to CloudWatch, etc.
Role: Instance Profile
- The actual role that any EC2 instance of the ECS Cluster can assume
- Can be assigned to the ECS Cluster when creating the cluster
Role: Task IAM Role
- Role assumed by the ECS task itself, instead of by the EC2 instance
- Can be assigned to the ECS Task when creating a new task
ECS Cluster
- Logical grouping of all EC2 instances deployed by the ECS
ALB Integration
- Dynamic Port Mapping can be used to map the ingress port to any of the EC2 instances
- Each instance can have 65535 ports open, so theoretically, a single instance can host 65535 containers
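
Dynamic port mapping is requested in the task definition by setting the host port to 0 in bridge mode, so each container gets an ephemeral host port that the ALB target group tracks. The fragment below is a sketch of that part of a task definition as a Python dictionary; the family, container name, and image are placeholders, and required fields such as memory are omitted.

```python
import json

def container_definition(name, image, container_port):
    """Container definition fragment that asks for a dynamic host port."""
    return {
        "name": name,
        "image": image,
        "portMappings": [
            # hostPort 0 lets the host pick a free ephemeral port per container.
            {"containerPort": container_port, "hostPort": 0, "protocol": "tcp"}
        ],
    }

task_definition = {
    "family": "web",          # placeholder name
    "networkMode": "bridge",  # dynamic mapping relies on bridge mode
    "containerDefinitions": [container_definition("web", "nginx:latest", 80)],
}
print(json.dumps(task_definition, indent=2))
```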
Auto Scaling
- Average CPU/RAM usage is tracked via CloudWatch out of the box
- Depending on the ECS Task/Service definition, ECS will automatically provision/terminate instances (also out of the box)
Spot Instance Support
- Easily manage spot instances with less operational overhead
Classic Spot Instances
- Spot instances are provisioned manually, and ECS deploys containers to the instances as necessary
- As Spot instances may not be available all the time, they are not suitable for tasks that require high reliability
Fargate Spot Instances
- Serverless Spot capacity is provisioned automatically, and ECS deploys the containers to it
- The region and instance type are fixed, so pricing is more predictable while being more reliable
Elastic Container Registry (ECR)
- The Docker image repository natively supported by AWS
- Images can be replicated to multiple regions for better availability/reliability
- By default, images go through 'Basic Scanning' to find known CVEs
Elastic Kubernetes Service (EKS)
- Managed Kubernetes cluster provided by AWS
- Nodes are created/terminated automatically by EKS, and nodes are in the ASG
- If needed, nodes can be manually provisioned and registered to EKS cluster later
App Runner
- Sets up all the needed infrastructure without operational overhead
- Once configured, only the container image or source code is required
- Route 53 has built-in support for redirecting traffic to instances run by App Runner in different regions
- Natively connects with DynamoDB
Container Networking
None
- No external network connectivity; only container-to-container communication
Bridge (default)
- Docker's built-in virtual network, which sits between the containers and the internet
Host
- The EC2 instance's ENI maps ports directly to the containers on it
- Containers bypass Docker's virtual network
- Dynamic Port Mapping is not allowed
awsvpc
- Each task gets its own ENI
- Default (and required) mode for Fargate
AWS Batch
- Automatically provisions instances with ECS
- Uses pre-built images from ECR
- More suitable for jobs that need a large amount of computing resources
Managed Compute Environment
- Minimum and maximum vCPU can be set
- AWS Batch automatically provisions instances within that limit
Unmanaged Compute Environment
- The user provisions all instances manually
Multi-node Mode
- Automatically distributes tasks across multiple compute nodes
- Spot Instances are not supported
AWS Lambda
Resource Limits
CPU/RAM
- 1769 MB of RAM comes with the equivalent of 1 vCPU, and it can scale up to 6 vCPUs with 10240 MB of RAM.
Execution Time
- Execution is terminated after 15 minutes at most
Deployment
- 50 MB zipped or 250 MB unzipped source code
- Container image can be up to 10GB
Concurrency
- 1000 concurrent executions per account (default limit, per region)
- Beyond 1000, additional invocations are throttled
- Use reserved concurrency to cap the number of concurrent executions for a function
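
The throttling rule can be sketched as a simple admission check. This is a simplified model: it ignores how reserved concurrency is subtracted from the shared pool, and the function and parameter names are hypothetical.

```python
def admit(in_flight_account, in_flight_function, account_limit=1000, reserved=None):
    """Return True if a new invocation may start, False if it is throttled."""
    if reserved is not None:
        # A function with reserved concurrency is capped by its own reservation.
        return in_flight_function < reserved
    # Otherwise the function competes for the account-wide limit.
    return in_flight_account < account_limit

print(admit(999, 10))                # room left in the account limit
print(admit(1000, 10))               # account limit reached: throttled
print(admit(200, 50, reserved=50))   # reservation exhausted: throttled
```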
Networking
- No public IP addresses allowed
- If placed outside of the Lambda service VPC, it can only go in a private subnet
Versioning
Linear
- Traffic is shifted little by little every N minutes
Canary
- Only a portion of traffic is shifted first; all remaining traffic is shifted after the set period of time
All at once
- Immediately shift all traffic to the new version
Rollback
- Linear or Canary can set a rollback option, so if something goes wrong with the new version, the last successful version is used automatically
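
The three shifting strategies can be compared by generating their traffic schedules. The sketch below returns (minute, percent of traffic on the new version) pairs; the function name, strategy labels, and default values are assumptions for illustration.

```python
def shift_schedule(strategy, step_percent=10, interval_minutes=1):
    """Return [(minute, percent_on_new_version), ...] for a shifting strategy."""
    if strategy == "all_at_once":
        return [(0, 100)]
    if strategy == "canary":
        # One small portion first, then everything after the interval.
        return [(0, step_percent), (interval_minutes, 100)]
    if strategy == "linear":
        # Equal increments every interval until 100% is reached.
        schedule, percent, minute = [], 0, 0
        while percent < 100:
            percent = min(100, percent + step_percent)
            schedule.append((minute, percent))
            minute += interval_minutes
        return schedule
    raise ValueError(f"unknown strategy: {strategy}")

print(shift_schedule("canary", step_percent=10, interval_minutes=5))
print(shift_schedule("linear", step_percent=25, interval_minutes=10))
```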
Deployment
Default
- Lambda is placed in the VPC of Lambda service account
- Thus, Lambda cannot access services inside other VPCs
In VPC
- Lambda can optionally be placed inside your own VPC
- To access services outside of the VPC, a NAT gateway or a VPC endpoint is needed
Invocation
Synchronous
- Caller blocks until the task is done
- Callers: CLI, SDK, API Gateway
Asynchronous
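- Caller invokes the task and does not wait for the result
- In case of failure, the invocation is retried automatically (up to 3 attempts in total); if it still fails, the event is sent to the Dead Letter Queue
- All tasks in this invocation mode must be idempotent, since a retry can run the same event more than once
- Callers: S3, SNS, EventBridge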
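
Because asynchronous invocations can be retried, a handler should produce the same outcome when it sees the same event twice. A minimal sketch of that pattern follows, assuming each event carries a unique "id" field (hypothetical); the in-memory dictionary stands in for a durable store, since memory does not survive across execution environments.

```python
processed = {}     # event id -> result (stand-in for a durable store)
side_effects = []  # stand-in for a real side effect, e.g. a database write

def handler(event, context=None):
    """Process an event once; a retry of the same event returns the saved result."""
    event_id = event["id"]  # hypothetical unique key carried by the event
    if event_id in processed:
        return processed[event_id]  # duplicate delivery: skip the side effect
    side_effects.append(event_id)
    result = {"status": "done", "total": sum(event["items"])}
    processed[event_id] = result
    return result

first = handler({"id": "evt-1", "items": [1, 2, 3]})
retry = handler({"id": "evt-1", "items": [1, 2, 3]})
print(first == retry, len(side_effects))
```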
Amazon Machine Image (AMI)
- Image used to launch EC2 instances (VMs)
- Instances can only use an AMI from the same region