AWS (Technical Benefits of Cloud Computing (Automation – “Scriptable…
AWS
Cloud Concepts
- Building Scalable Architectures
Best Practices
- Design for failure and nothing will fail
- Questions to ask on software failure
- What if the cache keys grow beyond memory limit of an instance?
- What if downstream service times out or returns an exception?
- What happens to my application if a dependent service changes its interface?
- Mechanisms to handle that failure
- Build process threads that resume on reboot
- Allow the state of the system to re-sync by reloading messages from queues
- Have a coherent backup and restore strategy for your data and automate it
- Keep pre-configured and pre-optimized virtual images so that process threads can resume and system state can re-sync from queues on launch/boot
- Avoid in-memory sessions or stateful user context, move that to data stores
- AWS specific tactics for implementing this best practice
- Failover gracefully using Elastic IPs: Elastic IP is a static IP that is dynamically re-mappable. You can quickly remap and failover to another set of servers so that your traffic is routed to the new servers. It works great when you want to upgrade from old to new versions or in case of hardware failures.
- Utilize multiple Availability Zones: Availability Zones are conceptually like logical datacenters. By deploying your architecture to multiple Availability Zones, you can ensure high availability. Utilize Amazon RDS Multi-AZ [21] deployment functionality to automatically replicate database updates across multiple Availability Zones.
- Maintain an Amazon Machine Image so that you can restore and clone environments very easily in a different Availability Zone. Maintain multiple database slaves across Availability Zones and set up hot replication.
- Utilize Amazon CloudWatch (or various real-time open source monitoring tools) to get more visibility and take appropriate actions in case of hardware failure or performance degradation. Set up an Auto Scaling group to maintain a fixed fleet size so that it replaces unhealthy Amazon EC2 instances with new ones.
- Utilize Amazon EBS and set up cron jobs so that incremental snapshots are automatically uploaded to Amazon S3 and data is persisted independent of your instances.
- Utilize Amazon RDS and set the retention period for backups, so that it can perform automated backups.
- Questions to ask on hardware failure
- What happens if a node in your system fails?
- How do you recognize that failure?
- How do I replace that node?
- What kind of scenarios do I have to plan for?
- What are my single points of failure?
- If a load balancer is sitting in front of an array of application servers, what if that load balancer fails?
- If there are master and slaves in your architecture, what if the master node fails?
- How does the failover occur and how is a new slave instantiated and brought into sync with the master?
- Rule of thumb: Be a pessimist when designing architectures in the cloud; assume things will fail. In other words, always design, implement and deploy for automated recovery from failure.
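One of the software-failure questions above asks what happens when a downstream service times out or returns an exception. A minimal sketch of one possible answer is to retry with exponential backoff and then degrade to a fallback value. The names `call_with_retries` and `FlakyService`, and the retry policy itself, are assumptions made for illustration and do not come from any AWS SDK.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, fallback=None,
                      sleep=time.sleep):
    """Call operation(); on failure, back off exponentially and retry.

    Returns `fallback` if every attempt fails (a hypothetical policy
    chosen for this sketch, not a prescribed one).
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Last attempt failed: give up and degrade gracefully.
            if attempt == attempts - 1:
                return fallback
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))


class FlakyService:
    """Hypothetical downstream dependency that times out a fixed number of times."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TimeoutError("downstream service timed out")
        return "fresh data"


if __name__ == "__main__":
    service = FlakyService(failures_before_success=2)
    print(call_with_retries(service.fetch, fallback="cached data"))
```

The `sleep` parameter is injectable so the backoff schedule can be checked without real waiting. A production design would likely also cap the delay and add jitter.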
- Use Amazon SQS as buffers between components
- Design every component such that it exposes a service interface, is responsible for its own scalability in all appropriate dimensions, and interacts with other components asynchronously
- Use Amazon SQS to isolate components
- Bundle the logical construct of a component into an Amazon Machine Image so that it can be deployed more often
- Make your applications as stateless as possible. Store session state outside of the component (in Amazon SimpleDB, if appropriate)
- Which business component or feature could be isolated from the current monolithic application and run standalone?
- And then how can I add more instances of that component without breaking my current system and at the same time serve more users?
- How much effort will it take to encapsulate the component so that it can interact with other components asynchronously?
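The idea of using a queue as a buffer between components can be sketched in a few lines. The example below is only an in-process illustration: Python's standard `queue.Queue` stands in for an Amazon SQS queue, and the `producer`, `consumer`, and `run_pipeline` names are hypothetical. With a real SQS queue the two components would run on separate instances and talk to the queue over the network.

```python
import queue
import threading

SENTINEL = None  # tells the consumer that no more messages are coming


def producer(buffer, orders):
    """Front-end component: enqueue work and move on without waiting for results."""
    for order in orders:
        buffer.put(order)
    buffer.put(SENTINEL)


def consumer(buffer, results):
    """Back-end component: pull messages at its own pace and process them."""
    while True:
        message = buffer.get()
        if message is SENTINEL:
            break
        results.append(message.upper())  # placeholder for real processing


def run_pipeline(orders):
    buffer = queue.Queue()  # in-process stand-in for an Amazon SQS queue
    results = []
    worker = threading.Thread(target=consumer, args=(buffer, results))
    worker.start()
    producer(buffer, orders)
    worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["order-1", "order-2"]))
```

Because the producer only ever touches the queue, the consumer can be slow, restarted, or replaced without the producer noticing, which is the isolation property the bullets above describe.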