Decoupling Workflows - Coggle Diagram
Things to Know
SQS
Nothing lasts forever: messages stored in SQS persist for up to 14 days; this retention period is configurable (the default is 4 days)
Queues aren't bi-directional: if you need communication to return to the message producer, you will need a second queue
Does order matter? If message ordering is important, make sure to select an SQS FIFO queue
SQS can duplicate messages: this happens only once in a while; check for a misconfigured visibility timeout or a consumer failing to make the DeleteMessage API call
SNS
Proactive notification = SNS: any time a question asks about email, text, or any type of push-based notification, think about SNS
CloudWatch & SNS: scenarios that mention getting a notification from a CloudWatch alarm should immediately make you look for SNS in the answer
AWS Batch
Queued workloads: if you see a question about batch workloads requiring queues, think of AWS Batch
On-demand alternative to AWS Lambda: questions about an alternative solution to AWS Lambda due to runtime requirements likely involve AWS Batch
Long-running, batched workloads: anything batch-related that runs for more than 15 minutes will likely involve AWS Batch
Amazon MQ
Specific messaging protocols: if you see messaging protocols like JMS, AMQP, MQTT, OpenWire, or STOMP, then Amazon MQ has to be in the answer; SQS and SNS do not support these protocols
Managed message broker: if there is a mention of a managed broker service, think of Amazon MQ
AWS Step Function
Different workflow decision requirements: whenever the solution requires different states or logic during workflows (e.g. condition checks, failure checks, wait periods), think of Step Functions
If a question refers to a lengthy wait period of up to 1 year, your answer is Step Functions
Amazon AppFlow
SaaS data integrations: for architectures requiring simplified data ingestion from external SaaS applications into AWS
Bi-directional: can be bi-directional in certain use cases; data can be ingested into AWS services depending on configuration
A typical question refers to third-party SaaS data that needs to live within S3 on a regular basis
API Gateway
No need to go in depth. Just remember it acts as a secure front door for external requests into your environment/application
Simple Queue Service
SQS
A messaging queue that allows asynchronous processing of work: one resource writes a message to an SQS queue, and another resource retrieves that message from SQS
Pull-based Messaging
- Producers put message in a queue
- Consumers pick messages from the queue at their own pace (aka when ready)
Dead-Letter Queue
(DLQ)
If a message can't be consumed successfully, you can send it to a dead-letter queue (DLQ). Dead-letter queues let you isolate problematic messages to determine why they are failing.
When you designate a queue as a source queue, a DLQ is not created automatically. You must first create a queue to designate as the DLQ, and its queue type (standard or FIFO) must match that of the source queue. You can associate the same DLQ with more than one source queue.
The Maximum receives value determines when a message will be sent to the DLQ. If the ReceiveCount for a message exceeds the maximum receive count for the queue, Amazon SQS moves the message to the associated DLQ (with its original message ID).
It is just a normal queue used for a special function: collecting messages that fail to be processed (that are rejected), so that you can analyse them.
Set up a CloudWatch alarm to monitor queue depth; if the DLQ starts to fill up, something is wrong with message processing
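The redrive rule above (ReceiveCount exceeding the maximum receive count moves a message to the DLQ) can be sketched with a minimal in-memory model; this is an illustration of the behaviour, not the AWS SDK, and all names are made up:

```python
# Toy model of the SQS DLQ redrive rule: when a message's receive count
# exceeds max_receive_count, it is moved to the dead-letter queue
# (keeping its original message ID) instead of being redelivered.

from collections import deque

class QueueWithDLQ:
    def __init__(self, max_receive_count):
        self.max_receive_count = max_receive_count
        self.messages = deque()   # each entry: [message_id, body, receive_count]
        self.dlq = []             # problematic messages, isolated for analysis

    def send(self, message_id, body):
        self.messages.append([message_id, body, 0])

    def receive(self):
        """Deliver the next message; move it to the DLQ once it has been
        received more than max_receive_count times without being deleted."""
        while self.messages:
            msg = self.messages.popleft()
            msg[2] += 1                      # ReceiveCount
            if msg[2] > self.max_receive_count:
                self.dlq.append(msg)         # redrive to the DLQ
                continue
            self.messages.append(msg)        # not deleted -> visible again later
            return msg
        return None

q = QueueWithDLQ(max_receive_count=3)
q.send("m-1", "poison message")
for _ in range(4):                           # consumer keeps failing to delete
    q.receive()
print([m[0] for m in q.dlq])                 # -> ['m-1']
```

After three failed processing attempts the fourth receive moves the message to the DLQ, which is exactly the point where the CloudWatch alarm on DLQ depth would fire.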
Type of Queues
Standard
- Nearly unlimited throughput
- At-least-once delivery (occasional duplicates are possible)
- Best-effort ordering (you can build ordering into your consumer app, but it is up to you)
FIFO
- First-in, first-out delivery with exactly-once processing
- Limited throughput: 300 messages/second without batching, 3,000 with batching
SQS is a distributed message queuing service for storing messages in transit between systems and applications. SQS allows you to decouple your application components from their application state so that any application server failure does not result in application data loss
API
- CreateQueue (MessageRetentionPeriod)
- DeleteQueue
- PurgeQueue: delete all the messages in the queue
- SendMessage (DelaySeconds)
- ReceiveMessage
- MaxNumberOfMessages: default 1, max 10
- ReceiveMessageWaitTimeSeconds: Long Polling, min 0, max 20sec
- DeleteMessage
- ChangeMessageVisibility: change a message's visibility timeout
- Batch API for SendMessage, DeleteMessage, ChangeMessageVisibility
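The calls listed above can be illustrated with a small in-memory stand-in that uses a logical clock; this is a behavioural sketch (not boto3), and the class and message IDs are invented for the example:

```python
# Toy model of the core SQS calls: SendMessage (with DelaySeconds),
# ReceiveMessage, DeleteMessage, and the visibility timeout that hides
# a message from other consumers while it is being processed.

class ToySQS:
    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.now = 0                       # logical clock, in seconds
        self.store = {}                    # message_id -> {"body", "visible_at"}
        self.next_id = 0

    def send_message(self, body, delay_seconds=0):        # SendMessage
        self.next_id += 1
        mid = f"msg-{self.next_id}"
        self.store[mid] = {"body": body,
                           "visible_at": self.now + delay_seconds}
        return mid

    def receive_message(self, max_number_of_messages=1):  # ReceiveMessage
        out = []
        for mid, m in self.store.items():
            if m["visible_at"] <= self.now:
                # received messages become invisible for the timeout window
                m["visible_at"] = self.now + self.visibility_timeout
                out.append((mid, m["body"]))
                if len(out) == max_number_of_messages:
                    break
        return out

    def delete_message(self, mid):                        # DeleteMessage
        self.store.pop(mid, None)

q = ToySQS(visibility_timeout=30)
q.send_message("delayed", delay_seconds=10)
assert q.receive_message() == []       # DelaySeconds: not visible yet
q.now = 10
(mid, body), = q.receive_message()     # visible now; hidden for 30s
assert q.receive_message() == []       # invisible while being processed
q.delete_message(mid)                  # consumer confirms completion
```

If the consumer crashed instead of calling delete, the message would reappear once the visibility timeout expired, which is the duplication scenario flagged in the exam tips above.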
AWS Batch
What is AWS Batch
Automatically provisions and scales: capable of provisioning accurately sized compute resources based on the number of jobs submitted, and optimizes workload distribution
Batched workloads: allows you to run batch computing workloads within AWS on EKS, ECS, and Fargate
Key Components
Job queues: jobs get submitted to specific queues and reside there until they can be scheduled to run in a compute environment
Jobs: units of work that are submitted (e.g. shell scripts, executables, Docker images)
AWS Batch vs. AWS Lambda
Runtime limitations: Lambda supports a limited set of runtimes, while AWS Batch uses Docker, so any runtime is supported
Disk space: Lambda is limited in disk space, and using EFS requires the Lambda function to live in a VPC
Compute Environments
Managed
- AWS Manages capacity and instance types
- Compute spec defined at env creation time
- ECS instances launched in VPC subnets
- Defaults to the most recent, approved ECS AMI
- You can use your own AMI
- Leverage: Fargate, Fargate Spot resources, EC2, Spot EC2 resources and EKS resources
Unmanaged
- You manage your own resources entirely
- AMI must meet ECS AMI spec
- You manage everything
- Less commonly used
- Good choice for extremely complex or specific needs
- Leverage EC2 and Spot EC2 resources
Job definition
Orchestration Types
- Fargate (ECS)
- Recommended, 30sec provisioning
- Platform Version = LINUX, WINDOWS
- Ephemeral Storage [GB]
- Execution Role for ECS container and AWS Fargate agents
- Managed only
- EC2 (ECS)
- Recommended if you require: > 16 vCPUs, > 120GiB Memory, GPU, custom AMI, linuxParameters or very large scale workload
- Multi-node parallel (enabled/disabled)
- Managed and Unmanaged
- EKS
- Recommended only if you need to use K8s
- EKS pod properties: Service account name, host network, DNS policy
- Commons across Orchestration Type:
- Container configuration: container image, command (optional)
- Environment configuration: Job role, vCPU, Memory, Env Variables, Secrets
Job types
- Single Node Job
- Array Job
- Shares common job parameters (job definition, vCPUs, and memory)
- Runs as a collection of related, yet separate, basic jobs that may be distributed on many hosts and may run concurrently
- Multi Node Parallel Job
- A single job that specifies how many nodes to create for the job
- Leverages multiple EC2 / ECS instances at the same time
- 1 main node and many child nodes
- Node Group is an identical group of job nodes that all share the same container properties (up to 5 node groups per job)
- Does not work with Spot Instances
- Not supported on UNMANAGED compute environments
- Works better if your EC2 launch mode is a "cluster" placement group
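As a concrete illustration of an array job (shared job definition, fanned out into related child jobs), here is a hedged sketch that builds the request body for Batch's SubmitJob API as plain data; the queue and job-definition names are placeholders, and the helper function is invented for the example:

```python
# Build a SubmitJob-style request body for an AWS Batch array job.
# An array job shares one job definition, vCPU, and memory spec;
# arrayProperties.size fans it out into separate child jobs.

import json

def array_job_request(name, queue, job_definition, size):
    # AWS Batch array jobs allow a size between 2 and 10,000
    if not (2 <= size <= 10_000):
        raise ValueError("array size must be between 2 and 10,000")
    return {
        "jobName": name,
        "jobQueue": queue,                # jobs wait here until scheduled
        "jobDefinition": job_definition,  # shared container/vCPU/memory spec
        "arrayProperties": {"size": size},
    }

req = array_job_request("nightly-etl", "my-queue", "etl-jobdef:3", size=100)
print(json.dumps(req, indent=2))
```

In a real deployment this dict would be passed to the SubmitJob API (for example via an SDK); each child job then receives its own index so it can work on its slice of the input.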
AWS Step Functions
Integrated AWS Services (example):
- Lambda
- AWS Batch
- DynamoDB
- ECS / Fargate
- SNS
- SQS
- EventBridge
- API Gateway
- Step Functions
States and State Machine
States are elements in your state machines that are referred to by a name that is unique in the state machine namespace
Flexible: leverage states to make decisions based on input, perform certain actions, or pass output
Execution Types
Standard
- Long-running workflows (up to 1 year)
- Rates up to 2,000 executions per second
- Price based on per state transition
- Execution history in the console and in CloudWatch Logs for 90 days
- Execution Model : Exactly-once Execution
Express
- Up to 5 minutes
- Can run high-event-rate workloads at up to 100,000 executions per second
- Example of use case: IoT event stream ingestion
- Price based on number of executions, durations and memory consumed
- Execution Model:
- At-least-once - Asynchronous (doesn't wait for completion; result in CloudWatch Logs)
- At-most once - Synchronous (wait for completion)
Key Elements
Executions are instances where you run your workflows in order to perform your tasks. Each workflow has executions
States Types:
- Pass passes any input directly to output - no work done
- Task single unit of work performed (e.g. Lambda, Batch and SNS)
- Task State (push) integration with AWS services/resources
- Activity Task (pull): an Activity Worker running on compute (e.g. EC2, ECS task, on-premises) pulls work from Step Functions and returns SendTaskSuccess/SendTaskFailure. Uses the TimeoutSeconds and HeartbeatSeconds settings to control how long a task can wait
- Choice adds branching logic to state machines
- Wait creates a specific time delay within the state machine
- Succeed stop executions successfully
- Fail stops executions and marks them as failed
- Parallel runs parallel branches of executions within state machines
- Map runs a set of steps for each element of an input array
A serverless orchestration service meant for event-driven task execution using AWS services for business applications
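The state types above can be tied together in an Amazon States Language (ASL) definition; this sketch builds one as plain Python data exercising Choice, Wait, Task, Succeed, and Fail. The state names, field values, and Lambda ARN are placeholders invented for the example:

```python
# A small ASL state machine definition: branch on input (Choice),
# optionally delay (Wait), call a Lambda (Task), and end in a
# terminal Succeed or Fail state.

import json

definition = {
    "Comment": "Branch on input, wait, then finish",
    "StartAt": "CheckOrderValue",
    "States": {
        "CheckOrderValue": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.amount", "NumericGreaterThan": 1000,
                 "Next": "WaitForApproval"}
            ],
            "Default": "AutoApprove",
        },
        "WaitForApproval": {"Type": "Wait", "Seconds": 3600,
                            "Next": "AutoApprove"},
        "AutoApprove": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:approve",
            "Next": "Done",
            # failure checks: route any error to the Fail state
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Failed"}],
        },
        "Done": {"Type": "Succeed"},
        "Failed": {"Type": "Fail", "Error": "ApprovalError"},
    },
}

print(json.dumps(definition, indent=2)[:80])
```

The JSON produced here is the shape you would hand to the CreateStateMachine API; a Standard workflow could stretch the Wait state out to the 1-year limit mentioned above.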
API Gateway
- A fully managed service that allows you to easily publish, create, maintain, monitor, and secure your APIs
- It allows you to put a safe "front door" on your application
- The preferred method to get API calls into your app and AWS environment
Feature to Know
Ease of use: it is simple to get started; easily build the calls that will kick off other AWS services (e.g. Lambda) or HTTP/HTTPS endpoints in your account
Security: API Gateway allows you to easily protect your endpoints by attaching a WAF. It supports IAM and Amazon Cognito for auth
Versioning: it allows versioning of your APIs; manage different environments (stages such as dev, test, prod) with their different versions, manage different credentials for each stage/version, and open or restrict access to specific stages/versions
API Type
WebSocket API - Lambda, HTTP, AWS Services
REST API - Lambda, HTTP, AWS Services
HTTP API
- Lambda Proxy, HTTP Proxy, VPC Link Proxy
- No usage plans / API keys
- No data mapping
- No resource policies
- No WAF
- No edge-optimized/private deployment
REST Private - Lambda, HTTP, AWS Services
(only accessible from within a VPC)
Traffic throttling and quotas: you can define the maximum number of requests per second an API (globally and per service call) can receive (rates and bursts), as well as metering plans for an API's allowed level of traffic. Defined in a usage plan
Caching It is possible to cache API responses to incoming requests to take the load off the backend service, as cached responses to an API with the same query can be answered from the cache
Integration type
AWS Service
- Define Service, Resource, IAM Role
- Define Method request / response - dictates how clients request the method and how the response is provided
- Define Integration request / response - allows to define Mapping templates
VPC Link
- Integrate with a resource that isn't accessible over the public internet
- Creates ENIs in the VPC
- VPC link proxy integration
- Send the request to your HTTP endpoint without customizing the integration request or integration response
- VPC link
- Define Method request / response - dictates how clients request the method and how the response is provided
- Define Integration request / response - allows to define Mapping templates
HTTP
- HTTP proxy integration
- Send the request to HTTP endpoint without customizing the integration request or response (no mapping template)
- HTTP request is passed to the backend
- HTTP response from the backend is forwarded by API Gateway
- Can add HTTP headers (e.g. an API key)
- HTTP (custom) integration
- Define Method request / response - dictates how clients request the method and how the response is provided
- Define Integration request / response - allows to define Mapping templates
Mock
- Generate a response based on API Gateway mappings and transformations
- Do not hit backend
Lambda function
- Lambda proxy integration
- Define HTTP method POST, Function ARN, IAM role with perm. to invoke it
- Send the request to your Lambda function as a structured event:
- The HTTP request is transformed into a JSON event including all request elements (method, URL, headers, cookies ...)
- The Lambda JSON output (statusCode, headers, body, isBase64Encoded) is transformed into the HTTP response
- Lambda (custom) integration
- Requires all proxy steps + request/response data mapping
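The Lambda proxy contract described above (the function returns statusCode / headers / body, and API Gateway turns that JSON into the HTTP response) can be sketched with a toy translator; this mimics the behaviour for illustration and is not API Gateway's actual implementation:

```python
# Toy translation of a Lambda proxy integration result into HTTP
# response parts, mirroring the statusCode/headers/body/isBase64Encoded
# contract that API Gateway expects from the function.

import base64
import json

def to_http_response(lambda_output):
    """Return (status, headers, body) for a Lambda proxy result.
    A malformed result (missing statusCode) is the classic cause of a
    502 Bad Gateway from API Gateway."""
    if "statusCode" not in lambda_output:
        return 502, {}, "Bad Gateway: malformed Lambda proxy output"
    body = lambda_output.get("body", "")
    if lambda_output.get("isBase64Encoded"):
        body = base64.b64decode(body).decode()   # binary payload support
    return lambda_output["statusCode"], lambda_output.get("headers", {}), body

# A well-formed handler result:
result = {"statusCode": 200,
          "headers": {"Content-Type": "application/json"},
          "body": json.dumps({"ok": True})}
status, headers, body = to_http_response(result)
assert status == 200 and json.loads(body) == {"ok": True}

# A malformed result surfaces as 502, as noted in the error section below:
assert to_http_response({"body": "oops"})[0] == 502
```

This is also why the 502 Bad Gateway entry in the error list points at incompatible output from a Lambda proxy backend.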
Throttling & Quotas
Configure throttling and quotas for your APIs to help protect them from being overwhelmed by too many requests
Both throttles and quotas are applied on a best-effort basis and should be thought of as targets rather than guaranteed request ceilings
Throttling Types
Per-account limits are applied to all APIs in an account in a specified Region. The account-level rate limit can be increased upon request. These limits can't be higher than the AWS throttling limits
Per-API, per-stage throttling limits are applied at the API method level for a stage. You can configure the same settings for all methods, or configure different throttle settings for each method. Note that these limits can't be higher than the AWS throttling limits
AWS throttling limits are applied across all accounts and clients in a region. These limit settings exist to prevent your API—and your account—from being overwhelmed by too many requests. These limits are set by AWS and can't be changed by a customer
Per-client throttling limits are applied to clients that use API keys associated with your usage plan as client identifier. Note that these limits can't be higher than the per-account limits
- Order of application:
- Per-client per-method set in API usage plan
- Per-method throttling limits that you set for an API stage
- Account-level throttling per Region
- AWS Regional throttling
Limits
- Account Limit
- API Gateway throttles requests at 10,000 rps across all APIs
- Soft limit that can be increased upon request
- Stage limit & Method limits
- With Usage Plans you can set per client limit
- In case of throttling => 429 Too Many Requests (retriable error)
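Rate-and-burst throttling of the kind described above is commonly modelled as a token bucket; here is a minimal sketch of that idea (an illustration of the concept, not API Gateway's internal algorithm), where the bucket holds up to `burst` tokens and refills at `rate` per second:

```python
# Token-bucket model of rate/burst throttling: a request with no token
# available is rejected with 429 Too Many Requests (a retriable error).

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate          # steady-state requests per second
        self.burst = burst        # maximum instantaneous burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429                # throttled: client should retry with backoff

bucket = TokenBucket(rate=10, burst=5)             # 10 rps, burst of 5
codes = [bucket.allow(now=0.0) for _ in range(6)]  # 6 requests at once
print(codes.count(429))                            # -> 1 (burst absorbed 5)
```

Six simultaneous requests against a burst of 5 means the sixth gets a 429, matching the throttling behaviour listed above; spacing requests out lets the bucket refill at the configured rate.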
Usage Plan
A usage plan specifies who can access one or more deployed API stages and methods—and optionally sets the target request rate to start throttling requests
The plan uses API keys to identify API clients and who can access the associated API stages for each key.
API keys
API keys are alphanumeric string values that you distribute to application developer customers to grant access to your API
You can use API keys together with Lambda authorizers, IAM roles, or Amazon Cognito to control access to your APIs
A throttling limit sets the target point at which request throttling should start. This can be set at the API or API method level
A quota limit sets the target maximum number of requests with a given API key that can be submitted within a specified time interval. You can configure individual API methods to require API key authorization based on usage plan configuration
API Gateway Error
Server Errors (5xx)
- 502 Bad Gateway
- usually for incompatible output from a Lambda proxy integration backend
- occasionally for out-of-order invocations due to heavy loads
- 503 Service Unavailable
- 504 Integration Failure
- Endpoint Request Timed-out Exception
- API Gateway requests time out after a hard maximum of 29 seconds
Client Errors (4xx)
- 400 Bad Request
- 403 Access Denied
- Authorization failure
- WAF filtered
- 429 Quota exceeded
Amazon AppFlow
AppFlow Overview
Ingest Data - Pulls data records from third-party SaaS vendors and stores them in AWS services like S3 and Redshift
Bi-directional data transfers with limited combinations (not all sources can be mapped to all AWS services, and vice versa)
Terms & Concepts
Flows are transfers of data between a source and a destination; a variety of SaaS applications are supported
Triggers define how the flow is started:
- Run on demand
- Run on event
- Run on schedule
Exam Tips
Watch for solutions needing easy (managed) and fast transfer of SaaS or third-party vendor data into AWS services
Example: an application needs to reference large amounts of SaaS data regularly and the data needs to be accessed from within S3
Simple Email Service
SES
- Fully managed service to send emails securely, globally and at scale
- Allows inbound/outbound emails
- Reputation dashboard, performance insights, anti-spam feedback
- Provides statistics such as email deliveries, bounces, feedback loop results, and email opens
- Supports DomainKeys Identified Mail (DKIM) and Sender Policy Framework (SPF)
- Flexible IP deployment: shared, dedicated, and customer-owned IPs
- Send emails from your application using the AWS Console, APIs, or SMTP
- Supports custom email header fields and many MIME types
- Use cases:
- transactional
- marketing
- bulk email communications
- HTML formatted email
Configuration Sets
- Groups of rules that you can apply to the emails you send
- Helps customize and analyze your email send events
- Event destination:
- Kinesis Data Firehose: receives metrics (numbers of sends, deliveries, opens, clicks, bounces, and complaints) for each email
- SNS - immediate feedback on bounce and complaint information
- IP pool management: use IP pools to send particular types of email (e.g. one IP pool for marketing mail and another for transactional mail)
SMTP
- Connect directly to SES SMTP interface from your applications, or configure your existing email server to use this interface as an SMTP relay
- Need to generate SMTP credentials first to use the interface
- Connecting to an Amazon SES SMTP endpoint
- STARTTLS is a means of upgrading an existing unencrypted connection to an encrypted connection
- TLS Wrapper is a means of initiating an encrypted connection without first establishing an unencrypted connection (client's responsibility to connect to the endpoint using TLS)
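A hedged sketch of sending through the SES SMTP interface: the message itself is built with the standard-library email package, and the commented-out smtplib section shows where the STARTTLS upgrade described above would happen. The addresses, custom header, and credentials are placeholders; the endpoint shown is the us-east-1 SES SMTP endpoint:

```python
# Build a multipart email (plain-text fallback + HTML part) with custom
# headers, as supported by SES, and show where SMTP relay via STARTTLS fits.

from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Order confirmation"
msg["X-Campaign-Id"] = "welcome-2024"   # SES supports custom header fields
msg.set_content("Plain-text fallback")
msg.add_alternative("<h1>Thanks for your order!</h1>", subtype="html")

# Sending (requires SMTP credentials generated in the SES console first):
# import smtplib
# with smtplib.SMTP("email-smtp.us-east-1.amazonaws.com", 587) as s:
#     s.starttls()                      # upgrade the plain connection to TLS
#     s.login(SMTP_USERNAME, SMTP_PASSWORD)
#     s.send_message(msg)

print(msg.get_content_type())           # -> multipart/alternative
```

Using port 465 with an implicit TLS connection instead would be the "TLS Wrapper" approach described above, where the client connects over TLS from the start.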
Amazon MQ
Key Facts
Variety: supports multiple programming languages, operating systems, and the messaging protocols JMS, AMQP, MQTT, OpenWire, and STOMP
- Always favour loose coupling over tight coupling; for this reason you never want one EC2 instance talking directly to another EC2 instance.
- Instead you want a scalable, highly available, managed service in between.
- The bare minimum is an ELB (but it doesn't queue requests); better decoupling is provided by SQS, SNS, and API Gateway.