Architecting for the Cloud
Differences between Cloud and Traditional Environments
Global, Available, and Scalable Capacity
Access to a broad set of services
Built-in security
Architecting for Cost
IT Assets as Provisioned Resources
Design Principles
Scalability - economies of scale
Scale vertically - increase in specs for individual resource
Scale horizontally - increase number of resources
Use Cases
Stateless applications - scale horizontally
Distribute load to multiple nodes
ELB - push model
Route 53 DNS round robin
Store tasks/data in a queue (SQS) or as streaming data (Kinesis) - pull model - multiple compute resources pull and consume these
Stateless Components - keep state outside the component: DynamoDB for state info, S3 or EFS for larger temporary files, AWS Step Functions for workflow state
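The pull model above can be sketched locally. This is only an illustration: Python's in-process queue.Queue stands in for SQS, and run_workers and the squaring "work" are hypothetical names, not part of any AWS API.

```python
import queue
import threading


def run_workers(tasks, num_workers=3):
    """Pull model: stateless workers take tasks from a shared queue."""
    q = queue.Queue()  # local stand-in for an SQS queue (assumption)
    for task in tasks:
        q.put(task)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                item = q.get_nowait()  # each worker pulls its next task
            except queue.Empty:
                return  # queue drained, worker exits
            outcome = item * item  # placeholder for real processing
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the workers hold no state of their own, adding capacity is just a matter of raising num_workers; the order of results is not guaranteed.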
Stateful Components
May be able to scale horizontally by distributing to nodes with session affinity - however, existing sessions don't benefit directly from additional resources, and session affinity cannot be guaranteed
Implement session affinity - for HTTP/HTTPS traffic use the sticky sessions feature of an Application Load Balancer
Client Side load balancing
Distributed Processing - if processing very large amounts of data, distribute to multiple nodes
Offline batch jobs - use distributed data processing engines like AWS Glue, Batch, or Hadoop
Real-time processing of streaming data - Amazon Kinesis partitions data into shards, which can be consumed by multiple instances
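A rough sketch of how records might be spread across shards. It assumes, as I understand Kinesis Data Streams to work, that the partition key is hashed with MD5 into a 128-bit number and that each shard owns a contiguous range of that space. The even split and the function names are illustrative assumptions.

```python
import hashlib

HASH_SPACE = 2 ** 128  # size of the MD5 hash key space


def shard_ranges(num_shards):
    """Split the hash key space into contiguous, evenly sized ranges."""
    step = HASH_SPACE // num_shards
    ranges = []
    for i in range(num_shards):
        start = i * step
        # The last shard absorbs any remainder so the space is fully covered.
        end = HASH_SPACE - 1 if i == num_shards - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key, ranges):
    """Return the index of the shard whose range contains the key's hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("ranges do not cover the hash key space")
```

Records with the same partition key always land on the same shard, so each consumer instance can process its shard independently.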
Disposable Resources Instead of Fixed Servers
Instantiating Compute resources
Bootstrapping - scripts that install software and bring resource to specific state - parameterize configuration details that vary between environments
Set up with user data scripts, cloud-init directives, Chef or Puppet, or custom scripts
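As a small illustration of parameterizing a bootstrap script, the sketch below renders a shell script from a template. The file path /etc/app.env, the variable names, and render_user_data are hypothetical; a real script would go on to install software.

```python
from string import Template

# Hypothetical bootstrap script; only the values that differ between
# environments are left as placeholders.
USER_DATA_TEMPLATE = Template("""#!/bin/bash
set -e
echo "ENVIRONMENT=$environment" >> /etc/app.env
echo "DB_HOST=$db_host" >> /etc/app.env
""")


def render_user_data(environment, db_host):
    """Fill in the environment-specific values of the bootstrap script."""
    return USER_DATA_TEMPLATE.substitute(environment=environment, db_host=db_host)
```

The same template can then produce scripts for dev, staging, and production, with only the arguments changing.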
Golden Image - launch from snapshot - can customize an EC2 instance and save as Amazon Machine Image AMI
Can use Golden Image for RDS DB or EBS to start with pre-populated data
Containers (Docker)
Use Amazon Elastic Container Service (ECS), Elastic Beanstalk, or Fargate to manage them
Infrastructure as Code - CloudFormation templates
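A minimal sketch of what "infrastructure as code" can look like: a CloudFormation template built as a Python dictionary and serialized to JSON. The logical name ContentBucket, the tag, and build_template are illustrative assumptions. Deploying the result would be a separate step.

```python
import json


def build_template(environment):
    """Return a small CloudFormation template describing one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Static content bucket for {environment}",
        "Resources": {
            "ContentBucket": {  # hypothetical logical resource name
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {"Status": "Enabled"},
                    "Tags": [{"Key": "Environment", "Value": environment}],
                },
            }
        },
    }


def template_json(environment):
    """Serialize the template so it can be stored in version control."""
    return json.dumps(build_template(environment), indent=2)
```

Because the template is plain data, the same definition can be reviewed, versioned, and reused for every environment.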
Automation
Serverless Management and Deployment
Automate deployment pipeline - CodePipeline, CodeBuild, CodeDeploy
Infrastructure Management and Deployment
Elastic Beanstalk - deploy web apps, service handles details
EC2 Auto Recovery - CloudWatch alarm that monitors and automatically recovers if instance becomes impaired
Systems Manager - software inventory, apply OS patches, create system image
Auto-Scaling - EC2, DynamoDB, ECS, EKS
Alarms and Events
CloudWatch Alarms - send a Simple Notification Service (SNS) message when a metric goes beyond a threshold - the message can then kick off a Lambda function, be delivered to an SQS queue, or be posted to an HTTP/HTTPS endpoint
CloudWatch Events - near real-time stream of events describing changes in the AWS resources - can route to Lambda function, Kinesis stream, or SNS topic
Lambda scheduled event
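The alarm idea can be sketched in a simplified form: the alarm fires when the metric stays above a threshold for a number of consecutive periods, and then invokes a notification action. This is a local approximation, not the CloudWatch API; the function names and the notify callback (standing in for an SNS topic) are assumptions.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return "ALARM", "OK", or "INSUFFICIENT_DATA" for a metric series."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Simplification: every recent datapoint must breach the threshold.
    return "ALARM" if all(value > threshold for value in recent) else "OK"


def evaluate_and_notify(datapoints, threshold, evaluation_periods, notify):
    """Evaluate the alarm and call notify (stand-in for SNS) when it fires."""
    state = alarm_state(datapoints, threshold, evaluation_periods)
    if state == "ALARM":
        notify(f"metric above {threshold} for {evaluation_periods} periods")
    return state
```

In a real setup the notify step would publish to a topic, and subscribers such as a Lambda function or an SQS queue would react to it.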
WAF Security Automations - can administer through APIs, making responses to incidents easy and fast
Loose Coupling - reduce interdependencies, so a failure won't cascade
Well-defined interfaces - microservices architecture - Amazon API Gateway allows devs to create APIs easily
Service Discovery
ELB is a possibility, or enable service discovery
Asynchronous integration - integrate through storage layer
Asynchronous Integration - no immediate response, just an ack that request received
One component generates events, another consumes them, with some data store as intermediary (SQS queue, Kinesis, Step Functions, etc)
Best Practices
Fail Gracefully
Failed requests - retry (exponential backoff with jitter) or store for later processing
Front-end interfaces: provide alternative or cached content
Route 53 gives ability to send users to a backup website instead
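The retry strategy mentioned above can be sketched as exponential backoff with "full jitter": each wait is a random duration between zero and an exponentially growing cap. The function name and default values are illustrative assumptions, and the sleep parameter exists only to make the sketch easy to test.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation, retrying failures with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The cap doubles with each attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random time in [0, cap] so that many
            # clients retrying at once do not synchronize.
            sleep(random.uniform(0, cap))
```

A request that still fails after the last attempt raises its error, at which point the caller could store it for later processing instead.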
Use More Services rather than Servers
Use AWS Managed Services - cheaper because you don't pay for underutilized resources and don't need to provision redundant infrastructure
Can combine with S3 for static content and produce a "Serverless Multi-Tier Architecture"
Databases - able to choose the right DB technology for the job rather than going with what you already have infrastructure for
Scalability - scale vertically or horizontally with read replicas; for write operations consider DB partitioning or sharding
High Availability - recommend deploying standby instance in another AZ
Workloads that need no joins or complex transactions - NoSQL - scales horizontally
DynamoDB
DynamoDB Accelerator (DAX) - cache
Replicates across three datacenters in a region
Scalable - automatically partitions
For large binary files, save data on S3 and hold metadata in db
Data Warehouses
Amazon Redshift - less than 1/10 the cost of traditional solutions
Redshift Spectrum - run analysis against data in S3
Not great for high-concurrency workloads
High Availability - continuously backed up to S3, recommend multi-node clusters
Search - enables databases to be queried which are not precisely structured
CloudSearch - little configuration, scales automagically
Amazon Elasticsearch Service - more control over config details
Graph Databases - Neptune
Data Lake - for massive amounts of data in central location
High Availability
Redundancy
Stand-by redundancy
Active Redundancy
Use Multi-AZ
S3 and DynamoDB already store redundant copies across multiple AZs in a region; RDS provides the same through Multi-AZ deployments
Detect failure
ELB or Route 53 to detect issues and reroute
Autoscaling to replace unhealthy nodes
EC2 auto-recovery feature
Design good health checks
Typical app, health checks on ELB
Durable Data Storage
Synchronous replication - downside is that a transaction isn't acknowledged until saved to all replicas - limit replicas
Doesn't replace backups
Quorum based replication - a minimum number of nodes must participate
Use versioning on S3
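Quorum-based replication can be illustrated with a toy in-memory store. With N replicas, a write quorum W and a read quorum R chosen so that W + R > N, every read overlaps at least one replica that saw the latest successful write. The class, its version counter, and the "available" argument are assumptions made for the sketch.

```python
class QuorumStore:
    """Toy replicated key-value store with read and write quorums."""

    def __init__(self, n, w, r):
        if w + r <= n:
            raise ValueError("need w + r > n so reads overlap writes")
        self.replicas = [dict() for _ in range(n)]
        self.w = w
        self.r = r
        self.version = 0  # simple global version counter (assumption)

    def write(self, key, value, available):
        """Write to the reachable replicas; fail without a write quorum."""
        if len(available) < self.w:
            return False
        self.version += 1
        for index in available:
            self.replicas[index][key] = (self.version, value)
        return True

    def read(self, key, available):
        """Read from the reachable replicas and return the newest value."""
        if len(available) < self.r:
            raise RuntimeError("read quorum not reached")
        answers = [self.replicas[i][key] for i in available
                   if key in self.replicas[i]]
        if not answers:
            return None
        return max(answers, key=lambda entry: entry[0])[1]
```

With n=3, w=2, r=2, a value written to replicas 0 and 1 is still visible to a read that reaches only replicas 1 and 2.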
Fault Isolation
Shuffle sharding - group redundant nodes into "shards"
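One way to picture shuffle sharding: each customer is mapped to a small combination of nodes, so two customers rarely share exactly the same set and a problem caused by one customer touches only its own shard. The hashing scheme and names below are illustrative assumptions, not a description of any AWS implementation.

```python
import hashlib
from itertools import combinations


def shuffle_shard(customer_id, nodes, shard_size=2):
    """Deterministically pick a combination of nodes for a customer."""
    # Every possible shard is one combination of shard_size nodes.
    all_shards = list(combinations(sorted(nodes), shard_size))
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest, "big") % len(all_shards)
    return all_shards[index]
```

With 8 nodes and a shard size of 2 there are 28 possible shards, compared with only 4 if the nodes were split into fixed pairs.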
Optimize for Cost
Benchmark application and select right sized resources based on needs
Continuously reevaluate based on usage and new options available - use tagging
Use elasticity - use auto-scaling for EC2, automate turning off non-prod workloads when not in use
Assess what could be implemented on Lambda so you're not paying for idle resources
Replace EC2 workloads with managed services
Use purchasing options - On-Demand Instances, Reserved Instances, Spot Instances
Use Caching
ElastiCache
DynamoDB Accelerator (DAX)
Edge caching on the CloudFront CDN
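The caching options above share one basic pattern, sketched here as a small read-through cache with a time-to-live. It is a local illustration only: TTLCache, the loader function, and the injectable clock are assumptions, and a managed cache would replace the in-memory dictionary.

```python
import time


class TTLCache:
    """Serve values from memory and reload them once they expire."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self.loader = loader      # fetches the value from the slow source
        self.ttl = ttl_seconds
        self.clock = clock        # injectable so expiry can be tested
        self._entries = {}        # key -> (value, expiry time)

    def get(self, key):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]       # cache hit: the source is not touched
        value = self.loader(key)  # cache miss or expired: reload
        self._entries[key] = (value, now + self.ttl)
        return value
```

A short TTL keeps data fresher at the cost of more trips to the source; a long TTL does the opposite.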
Security
Instead of storing service account credentials in a file, you can use IAM short-term credentials - rotated
Federated access
Security as Code - template that defines a Golden Environment
CloudFormation deploys resources in alignment with
Real-time auditing - AWS Config, Trusted Advisor, Amazon Inspector
App logging - CloudWatch; API logging - CloudTrail