Monitoring, Logging and Remediation
CloudWatch
CloudWatch Alarms
Key Concepts:
- Alarms: you can create an alarm on any metric, including the estimated charges on your AWS bill (you need to enable billing alerts)
- Thresholds: static and anomaly detection, used to trigger alarms and actions to be taken if an alarm state is reached
- CloudWatch can be used to monitor your service quotas / limits and notify you if you are about to reach the limit (for a subset of services)
- AWS Health publishes health events to EventBridge, where a rule matches them and invokes a target (send an SNS notification, send a message to SQS, trigger Lambda)
AWS Health -> EventBridge -> SNS/SQS/Lambda
- Composite alarms determine their states by monitoring the states of other alarms. You can use composite alarms to reduce alarm noise
- Use Case:
- An alarm sends an SNS notification or executes an Auto Scaling policy if CPU utilization exceeds 90% on an EC2 instance for more than 5 minutes
Creating Quota/Limits Alarm:
- Use CloudWatch alarms to notify you automatically whenever a specified quota reaches a percentage of the maximum or reaches the maximum level. Once you have created an alarm use the CloudWatch console to configure notifications
- Alarm threshold: from 50% to 100% of the applied quota value
- Alarm name: required
Creating CloudWatch Alarm:
- Metric
- Statistic - avg, sum, max, min, percentile, trimmed mean, and others
- Period - when evaluating the alarm, each period is aggregated into one data point
- Conditions
- Threshold type:
- Static: greater, greater/equal, lower/equal, lower
- Anomaly detection:
- Outside of the band, Greater than the band, Lower than the band
- Anomaly detection threshold: based on a standard deviation. Higher number means thicker band, lower number means thinner band
- Datapoints to alarm: (N out of M) the number (N) of datapoints (periods) within the evaluation period (the last M periods) that must be breaching for the alarm to go to the ALARM state
- Missing data treatment:
- notBreaching – Missing data points are treated as "good" and within the threshold
- breaching – Missing data points are treated as "bad" and breaching the threshold
- ignore – The current alarm state is maintained
- missing – If all data points in the alarm evaluation range are missing, the alarm transitions to INSUFFICIENT_DATA.
- Notification
- Alarm state trigger (Define the alarm state that will trigger this action):
- In alarm: the metric or expression is outside of the defined threshold
- OK: the metric or expression is within the defined threshold
- Insufficient data: the alarm has just started or not enough data is available
- Send a notification to SNS topic (new, existing or other account), you can have multiple notifications
- Auto Scaling action: choose the Resource type EC2 ASG (simple or step scaling policy) or ECS Service, you can have multiple actions
- EC2 action: Stop, Terminate, Reboot, or Recover (Recover is available only for certain EC2 instance types)
- Ticket action
- Investigation action: triggers SSM Incident Manager or CloudWatch Application Signals to start an automated triage process
- Systems Manager action:
- Create OpsItem: this will create an OpsItem within OpsCenter with the specified severity and category
- Create incident: this will start an incident using the response plan as a template
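The "N out of M" rule and the missing data treatments above can be sketched as a small function. This is a deliberately simplified model for a static "greater than" threshold, not CloudWatch's actual evaluation logic (which looks at a wider evaluation range); the function name and signature are hypothetical.

```python
from typing import Optional, Sequence


def evaluate_alarm(
    datapoints: Sequence[Optional[float]],
    threshold: float,
    n: int,
    m: int,
    treat_missing: str = "missing",
    previous_state: str = "OK",
) -> str:
    """Return ALARM, OK or INSUFFICIENT_DATA for the last m periods (simplified)."""
    window = list(datapoints[-m:])          # the last M periods; None = missing
    present = [v for v in window if v is not None]
    missing_count = len(window) - len(present)
    breaching = sum(1 for v in present if v > threshold)

    if treat_missing == "breaching":
        breaching += missing_count          # missing points count as "bad"
    elif treat_missing == "notBreaching":
        pass                                # missing points count as "good"
    elif treat_missing == "ignore":
        if not present:
            return previous_state           # keep the current alarm state
    elif treat_missing == "missing":
        if not present:
            return "INSUFFICIENT_DATA"      # nothing to evaluate
    else:
        raise ValueError(f"unknown treat_missing value: {treat_missing}")

    return "ALARM" if breaching >= n else "OK"
```

For example, `evaluate_alarm([95, 50, 96, 97], threshold=90, n=3, m=4)` goes to ALARM because three of the last four periods exceed 90.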
Metrics
Dimension
- Is a name/value pair that is part of the identity of a metric
- Dimensions describe specific characteristics of what you're measuring - think of them as categories
- Whenever you add a unique name/value pair to one of your metrics, you are creating a new variation of that metric
- For example, many Amazon EC2 metrics publish InstanceId as a dimension name, and the actual instance ID as the value for that dimension
- You can assign up to 30 dimensions to a metric.
With the CloudWatch Agent:
- In custom metrics CLI:
- aws cloudwatch put-metric-data --dimensions MyName1=MyValue1,MyName2=MyValue2
- aws cloudwatch get-metric-statistics --dimensions Name=MyName1,Value=MyValue1 Name=MyName2,Value=MyValue2
- In CloudWatch agent configuration JSON file:
- append_dimensions with only the following options:
"ImageId":"${aws:ImageId}", "InstanceId":"${aws:InstanceId}", "InstanceType":"${aws:InstanceType}", "AutoScalingGroupName":"${aws:AutoScalingGroupName}"
Aggregate Metrics
- You can aggregate statistics for your EC2 instances that have detailed monitoring enabled
- Instances that use basic monitoring are not included
- You can aggregate the metrics for AWS resources across multiple resources
- Metrics are completely separate between Regions, but you can use metric math to aggregate similar metrics across Regions
- Roll-up Retention: As data gets older, CloudWatch automatically "rolls it up" into larger buckets. E.g. 1-minute kept for 15 days, then aggregated into 5-minute
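The roll-up schedule can be written down as a lookup. The sketch below assumes the published CloudWatch retention tiers (sub-minute data for 3 hours, 1-minute for 15 days, 5-minute for 63 days, 1-hour for 455 days); the function name is hypothetical.

```python
from typing import Optional


def finest_period_seconds(age_days: float, high_resolution: bool = False) -> Optional[int]:
    """Finest granularity (in seconds) still available for data of a given age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if high_resolution and age_days <= 3 / 24:
        return 1          # sub-minute data points are kept for 3 hours
    if age_days <= 15:
        return 60         # 1-minute data points are kept for 15 days
    if age_days <= 63:
        return 300        # then rolled up to 5-minute data points
    if age_days <= 455:
        return 3600       # then rolled up to 1-hour data points
    return None           # older data has expired
```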
Metric Math
- Query multiple CloudWatch metrics and use math expressions to create new time series based on these metrics
- Lets you aggregate and transform metrics from multiple accounts and Regions
- Create alarms based on Metric Math expressions
- Visualize the resulting time series on the CloudWatch console and add them to dashboards
- Anomaly detection alarms can be created on single metrics and on the outputs of metric math expressions
- Example: divide the Lambda Errors metric by the Lambda Invocations metric to get an error rate
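The error-rate example can be expressed as a list of metric data queries. This is a minimal sketch that only builds the request structure in the shape used by the GetMetricData API; the helper name and the query ids are illustrative choices, and nothing is sent to AWS.

```python
def lambda_error_rate_queries(function_name: str, period: int = 300) -> list:
    """Build GetMetricData queries: two raw metrics plus one math expression."""

    def stat(query_id: str, metric_name: str) -> dict:
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": period,
                "Stat": "Sum",
            },
            "ReturnData": False,  # inputs only; do not return these series
        }

    return [
        stat("errors", "Errors"),
        stat("invocations", "Invocations"),
        {
            "Id": "error_rate",
            "Expression": "100 * errors / invocations",  # refers to the ids above
            "Label": "Error rate (%)",
            "ReturnData": True,
        },
    ]
```

With boto3 installed and credentials configured, the list could be passed as `MetricDataQueries` to `get_metric_data` together with `StartTime` and `EndTime`.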
Metrics Insights
- Use SQL to group and filter thousands of metrics across tags or regions instantly
Namespace
- Is a container for CloudWatch metrics that isolates metrics from different applications or services
- Metrics in different namespaces are isolated from each other to prevent accidental aggregation across applications
- There is no default namespace - you must specify one for each data point you publish
- AWS service namespaces follow the convention AWS/service (e.g., AWS/Lambda, AWS/EC2)
- Custom namespaces should avoid starting with "AWS/" and must be fewer than 256 characters
- Default EC2 host-level metrics: CPU, network, disk and status check
- A metric is uniquely identified by:
- namespace
- metric name
- dimensions (the complete set of dimension name/value pairs, up to 30 per metric)
- Metric data is retained for up to 15 months (455 days), at progressively coarser granularity as it ages
- You can retrieve data from any EC2 or ELB instance, even after it has been terminated
- By default EC2 sends metric data to CloudWatch at 5-minute intervals
- For an additional charge, you can enable detailed monitoring that sends metrics at 1-minute intervals
- Dashboards are global (not tied to a single Region)
- Dashboards can include graphs from different AWS accounts and regions
Custom Metrics
- Use API call PutMetricData or put-metric-data command
- Add Dimensions to segment metrics (instanceId, environment)
- Metric resolution
- Set with --storage-resolution: 60 (standard, the default) or 1 (high resolution)
- Standard resolution has 1-minute granularity
- High-resolution metrics can be read with periods of 1, 5, 10, or 30 seconds
- Accepts timestamps up to 2 weeks in the past and 2 hours in the future with --timestamp
- If you do not set --timestamp, CloudWatch uses the time the metric data was received
- The Auto Scaling group automatically adds a tag to instances with a key of aws:autoscaling:groupName and a value of the Auto Scaling group name
- In Metrics Explorer, create a CloudWatch dashboard filtered on the tag aws:autoscaling:groupName to get an overview of all EC2 instances in the ASG, including newly launched ones
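The custom metric rules above (dimension limit, storage resolution, timestamp window) can be checked before publishing. The sketch below builds one entry in the shape of the PutMetricData `MetricData` parameter; the helper name and its validation messages are hypothetical, and nothing is sent to AWS.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def build_metric_datum(
    name: str,
    value: float,
    dimensions: Dict[str, str],
    unit: str = "Count",
    storage_resolution: int = 60,
    timestamp: Optional[datetime] = None,
    now: Optional[datetime] = None,
) -> dict:
    """Validate and build one MetricData entry for PutMetricData."""
    now = now or datetime.now(timezone.utc)
    if len(dimensions) > 30:
        raise ValueError("a metric can have at most 30 dimensions")
    if storage_resolution not in (1, 60):
        raise ValueError("storage resolution must be 1 (high) or 60 (standard)")

    datum = {
        "MetricName": name,
        "Value": value,
        "Unit": unit,
        "StorageResolution": storage_resolution,
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
    }
    if timestamp is not None:
        # Accepted window: 2 weeks in the past to 2 hours in the future.
        if not (now - timedelta(weeks=2) <= timestamp <= now + timedelta(hours=2)):
            raise ValueError("timestamp outside the accepted window")
        datum["Timestamp"] = timestamp
    return datum
```

If no timestamp is given, the key is left out so that CloudWatch assigns the time of receipt.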
CloudWatch Logs
- Centralizes logs from applications (e.g. Apache logs), systems (e.g. EC2), and AWS services (e.g. Route53, CloudTrail, ...)
- View, Search, Filter. Search based on error code and messages (e.g. 404 status in Apache logs)
- Notifications. Receive a notification whenever the rate of errors exceeds a threshold you specify
- Monitor Log Files - Monitor and troubleshoot your app using existing system and app log files
- Customize for your application - Monitor your logs in near real-time for specific phrases, values or patterns. To do this you need to use the CloudWatch Agent
Terminology:
- Log Event - Event message and timestamp
- Log Streams - Sequence of log events from the same source, e.g. an Apache log from a specific host. Must belong to a Log Group
- Log Group - Group Log Streams together, centrally manage retention, monitoring and access control settings. No limit on the number of log streams in a log group
- Usually refer to Log Group in IAM Permissions (for humans) but you also can refer to specific Log Stream (for service-to-service)
- Define encryption (default AWS managed key, you can set KMS key)
- Define retention
Example: 2 EC2 instances run Apache, and each instance sends its log events to its own log stream. These log streams are part of the same log group, so that access control, retention, etc. are managed centrally
Retention Settings:
- By default logs are kept indefinitely
- You can set your retention period from 1 day - 10 years
- Expired log events are automatically deleted
- Retention settings can be applied to an entire Log Group
Metric Filter:
- Monitor events in a Log Group as they are sent to CloudWatch Logs. You can monitor and count specific terms or extract values from log events and associate the results with a metric. Filter for Warning, Errors, HTTP status codes, etc.
- Create Metric Filter at the Log Group level for any Log Stream
- Create Filter Pattern: when a metric filter matches a term, it increments the metric's count. For example, you can create a metric filter that counts the number of times the word ERROR occurs in your log events.
- Metrics details:
- Metric Namespaces: let you group similar metrics
- Metric Name: identifies this metric, and must be unique within the namespace
- Metric Value: is the value published to the metric name when a Filter Pattern match occurs. Valid metric values are: floating point number (1, 99.9, etc.), numeric field identifiers ($1, $2, etc.), or named field identifiers (e.g. $requestSize for delimited filter pattern or $.status for JSON-based filter pattern - dollar ($) or dollar dot ($.) followed by alphanumeric and/or underscore (_) characters).
- Default Value (optional): is published to the metric when the pattern does not match. If you leave this blank, no value is published when there is no match
- Unit (optional)
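To illustrate how a filter pattern, metric value and default value interact, here is a simplified local model. It only handles a single case-sensitive term and evaluates one batch of log messages; the function is hypothetical and does not reproduce the full CloudWatch Logs pattern syntax.

```python
from typing import List, Optional


def apply_metric_filter(
    messages: List[str],
    term: str,
    metric_value: float = 1.0,
    default_value: Optional[float] = None,
) -> List[float]:
    """Return the values that would be published for one batch of log events."""
    matches = [m for m in messages if term in m]   # case-sensitive term match
    if matches:
        return [metric_value] * len(matches)        # one value per matching event
    # No match: publish the default value if one is configured, else nothing.
    return [] if default_value is None else [default_value]
```

Summing the returned values gives the count of matching events, for example the number of lines containing ERROR.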
CloudWatch Logs Insights
- Interactive query and analysis for data stored in CloudWatch Logs
- Query and filter logs directly
- Generate visualization e.g. bar graph, line graph or pie chart
Run queries to:
- filter logs (with basic bar chart on top of results)
- create visualization charts (line, stacked area, bar and pie)
- export results in various formats (Markdown, CSV, JSON, XLSX) either by copying to clipboard or downloading the table
- add your result to a dashboard
Examples:
- Lambda
- View latency statistics for 5-minute intervals
- Determine the amount of overprovisioned memory
- Find the most expensive requests
- VPC Flow Logs
- CloudTrail
- Common queries
- 25 most recently added log events
- Number of exceptions logged every 5 minutes
- Route53
- AWS AppSync
- NAT Gateway
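A few of the sample queries listed above, written in the Logs Insights query language and wrapped in a helper that builds parameters in the shape of the StartQuery API. The dictionary keys and the helper name are illustrative; nothing is sent to AWS.

```python
from datetime import datetime, timedelta, timezone

# Query strings in the CloudWatch Logs Insights query language.
SAMPLE_QUERIES = {
    "recent_events": "fields @timestamp, @message | sort @timestamp desc | limit 25",
    "exceptions_per_5m": (
        "filter @message like /Exception/ "
        "| stats count(*) as exceptionCount by bin(5m) "
        "| sort exceptionCount desc"
    ),
    "lambda_latency_5m": (
        'filter @type = "REPORT" '
        "| stats avg(@duration), max(@duration), min(@duration) by bin(5m)"
    ),
}


def start_query_params(log_group: str, query_name: str,
                       start: datetime, end: datetime) -> dict:
    """Build StartQuery parameters; times are converted to epoch seconds."""
    return {
        "logGroupName": log_group,
        "startTime": int(start.timestamp()),
        "endTime": int(end.timestamp()),
        "queryString": SAMPLE_QUERIES[query_name],
    }
```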
CloudWatch Agent
CloudWatch Agent CLI basic operations:
- Install: sudo yum install amazon-cloudwatch-agent
- Create configuration file: sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
- If you store the configuration file locally: /opt/aws/amazon-cloudwatch-agent/bin/config.json
- If you store the configuration file in SSM Parameter Store answer yes when prompted in the wizard
- Start:
- sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
- sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:configuration-parameter-store-name
- Stop: sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop
- Notes:
- You can mix config files and ssm parameters
- You can pass multiple config files and SSM parameters
- You can also append configuration to a running agent. You can use -a append-config instead of -a fetch-config
The unified CloudWatch agent enables you to do the following:
- Collect internal system-level metrics from Amazon EC2 instances across operating systems. These can include in-guest metrics, in addition to the standard metrics for EC2 instances
- Collect system-level metrics from on-premises servers. These can include servers in a hybrid environment and servers not managed by AWS
- Custom metrics:
- You can define your own metrics using the AWS CLI or API
- Retrieve custom metrics from your applications or services using the StatsD (Linux/Windows) and collectd (Linux) protocols
- Collect logs from Amazon EC2 instances and on-premises servers, running either Linux or Windows Server
Key Facts:
- The default namespace for metrics collected by the CloudWatch agent is CWAgent, although you can specify a different namespace when you configure the agent
- Metrics collected by the CloudWatch agent are billed as custom metrics
- Supported on x86-64 and ARM64 (Amazon Linux 2, Ubuntu 18.04, 20.04, RHEL 7.6, SLES 15)
- Your Amazon EC2 instances must have outbound internet access to send data to CloudWatch or CloudWatch Logs (you must allow list the CloudWatch and CloudWatch Logs public endpoints for the appropriate Regions), or you can set up CloudWatch / CloudWatch Logs VPC endpoints powered by PrivateLink
- If you're using SSM to install the agent or Parameter Store to store your configuration file, you must allow list the SSM endpoints for the appropriate Regions
- You can send metrics and logs to a different AWS Account
Agent Installation:
- You can download and install the CloudWatch agent manually using the command line, or you can integrate it with SSM. In any case:
- Create IAM roles or users that enable the agent to collect metrics from the server and optionally to integrate with AWS Systems Manager
- Policies to be used with a local configuration file: CloudWatchAgentServerPolicy and AmazonSSMManagedInstanceCore (install and configure with SSM)
- Policies to be used to store configuration in Parameter Store: CloudWatchAgentAdminPolicy. The permissions for writing to Parameter Store provide broad access. This role shouldn't be attached to all your servers, and only administrators should use it. After you create the agent configuration file and copy it to Parameter Store, you should detach this role from the instance and use CloudWatchAgentServerRole instead.
- Download the agent package
- Modify the CloudWatch agent configuration file and specify the metrics that you want to collect
- Install and start the agent on your servers. As you install the agent on an EC2 instance, you attach the IAM role that you created in step 1. As you install the agent on an on-premises server, you specify a named profile that contains the credentials of the IAM user that you created in step 1
Use SSM to install and configure CW agent:
- Download the CW agent using SSM:
- Run Command > Command document = AWS-ConfigureAWSPackage > Target (your selection) > Action= Install > Name = AmazonCloudWatchAgent > Version = latest
- Start the CW agent using SSM:
- Run Command > Command document = AmazonCloudWatch-ManageAgent > Target (your selection) > Action= Configure > Optional Configuration Source = ssm > Optional Configuration Location = name of the agent configuration file that you created and saved to SSM Parameter Store > Optional Restart list = yes
Common scenarios with the CloudWatch agent:
- Run the CloudWatch agent as a different user
- Default is root on Linux and Local System (SYSTEM) on Windows
- Add custom dimensions to metrics collected by the CloudWatch agent
- Use multiple CloudWatch agent configuration files
- Aggregate or roll up metrics collected by the CloudWatch agent
- Collect high-resolution metrics with the CloudWatch agent
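Several of these scenarios map to fields in the agent configuration file. The sketch below generates such a file from Python; the namespace, user name, chosen measurements and the log path arguments are placeholder assumptions, and the exact set of fields should be checked against the agent configuration reference.

```python
import json


def build_agent_config(log_path: str, log_group: str) -> dict:
    """Build a CloudWatch agent configuration covering the scenarios above."""
    return {
        "agent": {
            "metrics_collection_interval": 60,
            "run_as_user": "cwagent",                 # run as a non-root user
        },
        "metrics": {
            "namespace": "MyApp/Agent",               # overrides the CWAgent default
            "append_dimensions": {                    # extra dimensions on every metric
                "InstanceId": "${aws:InstanceId}",
                "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            },
            "aggregation_dimensions": [["AutoScalingGroupName"]],  # roll-up
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                "cpu": {
                    "measurement": ["cpu_usage_idle"],
                    "totalcpu": True,
                    "metrics_collection_interval": 10,  # below 60 s = high resolution
                },
            },
        },
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": log_path,
                            "log_group_name": log_group,
                            "log_stream_name": "{instance_id}",
                        }
                    ]
                }
            }
        },
    }


config_json = json.dumps(build_agent_config("/var/log/httpd/access_log", "apache-access"), indent=2)
```

The resulting JSON could be saved as the local config.json or stored in SSM Parameter Store.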
Dashboard
- Dashboards are global
- Dashboards can include graphs from different AWS accounts and regions
- You can share CloudWatch dashboards with people who do not have an AWS account
- Share by Email/Password - AWS sends them a username and a temporary password to a dedicated, restricted login portal.
- Share Publicly - Generates a unique, non-guessable URL. Anyone with the link can view the dashboard without any login.
- Share via SSO Provider - You integrate your SSO provider (via Amazon Cognito). Users log in using their standard company credentials.
- Export Dashboard as a JSON file to:
- backup your dashboards
- move them to another AWS account
- manage them as "IaC": populate DashboardBody property in the CloudFormation AWS::CloudWatch::Dashboard resource
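As an illustration of the exported JSON, the sketch below builds a one-widget dashboard body as a string, which is the form expected by the DashboardBody property. The widget layout, title and instance ID are placeholder assumptions.

```python
import json


def dashboard_body(instance_id: str, region: str) -> str:
    """Return a dashboard body (JSON string) with a single CPU graph."""
    body = {
        "widgets": [
            {
                "type": "metric",
                "x": 0, "y": 0, "width": 12, "height": 6,
                "properties": {
                    # [namespace, metric name, dimension name, dimension value]
                    "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", instance_id]],
                    "period": 300,
                    "stat": "Average",
                    "region": region,
                    "title": "EC2 CPU",
                },
            }
        ]
    }
    return json.dumps(body)
```

Because each widget carries its own region, one dashboard can mix graphs from several Regions.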
CloudTrail
- Records user activity (AWS API) in your AWS Accounts
- Logs both console and AWS CLI activity, but not SSH or RDP sessions
- Enabled by default
- Supports almost every AWS service; the unsupported ones are mostly services without public APIs
- Logs: who, when, what, where, source IP, parameters and response
- By default Event History keeps log for 90 days
- Organization trail logs all events for all AWS accounts in an organization (must be created in the management account)
- Enable for all accounts in my organization options (management or delegated administrator account only)
- Choose a bucket belonging to any account, but the bucket policy must grant CloudTrail permission to write to it
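The "who, when, what, where" fields can be pulled out of a CloudTrail record with a few lookups. The field names below are standard CloudTrail record fields; the helper and the sample record are hypothetical.

```python
def summarize_record(record: dict) -> dict:
    """Extract who / when / what / where / source IP from one CloudTrail record."""
    identity = record.get("userIdentity", {})
    return {
        "who": identity.get("arn", identity.get("type", "unknown")),
        "when": record.get("eventTime"),
        "what": f'{record.get("eventSource")}:{record.get("eventName")}',
        "where": record.get("awsRegion"),
        "source_ip": record.get("sourceIPAddress"),
    }


# Hypothetical record, trimmed to the fields used above.
sample = {
    "eventTime": "2024-01-01T12:00:00Z",
    "eventSource": "ec2.amazonaws.com",
    "eventName": "StopInstances",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "203.0.113.10",
    "userIdentity": {"type": "IAMUser", "arn": "arn:aws:iam::111122223333:user/alice"},
}
summary = summarize_record(sample)
```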
Use Case:
- Incident Investigation: after-the-fact investigation
- Security Analysis: near-real-time security analysis of user activity
- Compliance: can be used to help you meet industry, regulatory compliance and audit requirements
Keeping logs longer than 90 days:
- Create a trail: when you create a trail in the console, logs are saved indefinitely to an S3 bucket
- Secure by Default: Encrypted using Server Side Encryption. Log integrity validation means logs are digitally signed, so you can detect if a log was changed or deleted
- All Regions: by default (in AWS Console), a trail created in the console will apply to all regions (recommended). Use AWS CLI to log events in a single region
Near Real-Time:
- After making an API call, it can take up to 15 minutes for the call to appear in CloudTrail
- CloudTrail publishes logs to S3 approximately every 5 minutes
- Overall it can take 15 to 20 minutes for a call to appear in the logs
Creating a Trail:
- Storage location: New or Existing S3 bucket
- Log file encryption is Enabled by default with SSE-KMS, and you need to provide a New or Existing KMS key. If disabled, SSE-S3 is used
- Log file validation is Enabled by default
- SNS notification is Not Enabled by default; when Enabled, it requires a New or Existing SNS topic. A notification is sent for every log file delivery, not for every event
- CloudWatch Logs integration, to monitor your trail logs and notify you when specific activity occurs, is Not Enabled by default; when Enabled, it requires a New or Existing Log Group
- CloudTrail supports sending data, Insights, and management events to CloudWatch Logs
- You can then define metric filters and alerts
- Events:
- Management events: capture (control plane) management operations performed on your AWS resources (options: Read, Write, Exclude AWS KMS events, Exclude Amazon RDS Data API events)
- Data events: log the resource operations performed on or within a resource (data plane operations)
- DynamoDB: PutItem, DeleteItem, and UpdateItem on Table
- S3: GetObject, DeleteObject, and PutObject on buckets and objects in buckets
- AWS Lambda function execution activity (the Invoke API)
- Insights events: identify unusual write activity, errors, or user behavior in your account (API call rate, API error rate)
- Network activity events: information about resource operations performed on a resource within a VPC endpoint
AWS Config
Dashboards:
- Resource Inventory: for AWS and non-AWS resources
- Compliance Status:
- Rules (compliant and non-compliant)
- Resources (compliant and non-compliant)
- Noncompliant rules by noncompliant resource count
Example Use Cases:
- EC2 must not have public IP, discovers noncompliant instances, perform automatic remediation (e.g. stop the non-compliant instance)
- Configuration Monitoring: continuous monitor (you can trigger rule re-evaluation)
- Desired State: evaluate resource configuration against the desired state defined by rules
- Notification if a resource deviates from desired state:
- event to EventBridge (default)
- SNS (remediation action)
- Automatic Remediation:
- Remediation actions are run using AWS Systems Manager Automation
- Triggers action (SSM) that you define against non-compliant resources
- SSM Automation can send SNS notification
- You can create a custom SSM Automation Document (runbook) to invoke Lambda
- Change History:
- stored in an S3 bucket created for you
- optionally send ALL configuration changes to SNS topic
- Integrated with:
- IAM
- EC2
- EBS
- ELB
- CloudFormation
- CloudFront
- CloudTrail
- KMS
- RDS
- S3
- Security Groups
- SNS
- VPC
Terminology:
- Rule: the desired configuration for a specific resource
- Managed Rules: more than 180 AWS-provided managed rules for pre-defined common best practices. Examples:
- s3-bucket-public-read-prohibited
- desired-instance-type
- cloud-trail-encryption-enabled
- ec2-ebs-encryption-by-default
- required-tags
- Conformance Pack: a set of rules and remediation actions that can be deployed and managed as one. AWS provides over 100 pre-built templates for major frameworks, including:
- Operational Best Practices (PCI-DSS, HIPAA, SOC 2, FedRAMP)
- Well-Architected Framework Pillars (Security, Cost Optimization, Reliability)
- Service-Specific Packages (DynamoDB Best Practices, S3 Security)
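To make the idea of a rule concrete, here is a minimal sketch of the check behind a required-tags style rule. It assumes the configuration item exposes its tags as a key/value map under `tags`; the function is illustrative and is not the managed rule's actual implementation.

```python
from typing import Iterable


def evaluate_required_tags(configuration_item: dict, required_keys: Iterable[str]) -> str:
    """Return COMPLIANT if every required tag key is present on the resource."""
    tags = configuration_item.get("tags") or {}
    missing = [key for key in required_keys if key not in tags]
    return "NON_COMPLIANT" if missing else "COMPLIANT"
```

For example, an instance tagged only with `Owner` is non-compliant against the required keys `Owner` and `CostCenter`.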
Configure Rule:
- Evaluation Mode: determines when resources will be evaluated
- Proactive evaluation: pre-provisioning
- Detective evaluation: post-provisioning
- Both
- Trigger type:
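- When configuration changes: evaluates resources when a matching resource is created, changed, or deleted
- Periodic: evaluates resources at a frequency that you choose (for example, every 24 hours)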
- Scope of changes:
- All changes
- Resources (Resource type, identifier)
- Tags
- Parameters: (key and value) define attributes for which your resources are evaluated; for example, a required tag or S3 bucket
- General settings (Recorder)
- Resource types to record
- Record all current and future resources supported in this region (optionally include global resources)
- Record all current and future resource types with exclusions
- Record specific resource types
- Data retention period
- Retain AWS Config data for 7 years (2557 days)
- Set a custom retention period for configuration items recorded (30 days - 7 years)
- AWS Config role
- Use an existing AWS Config service-linked role
- Choose a role from your account
- Delivery method
- Amazon S3 bucket (New, Existing, other Account)
- Amazon SNS topic (New, Existing, other Account) - Stream configuration changes and notifications to an Amazon SNS topic
- EventBridge - AWS Config sends detailed information about the configuration changes and notifications to EventBridge
- Rules Compliance Change
- Configuration Item Change
- Rules Re-evaluation Status
- Configuration Snapshot (a complete record of all resources) Delivery Status
- Configuration History Delivery Status
Remediation with SSM
Other Remediation Action (under the hood still use SSM)
- Notification publishing to SNS topic
- Delete unused resources (e.g. EBS, Elastic IP, SG, ...)
- Enable encryption on an S3 bucket
- Disable Public Access for a Security Group
Remediation Action
- Select rule > Action > Manage remediation
- Select remediation method
- Automatic remediation: the remediation action gets triggered automatically when the resources in scope become noncompliant
- Manual remediation: manually choose to remediate the noncompliant resources
- Remediation action details: the execution of remediation actions is achieved using SSM
- Rate Limits: specify the percentage of resources against which SSM documents are executed at a time and also the percentage of failed SSM executions for which the entire batch is marked as failed
- Parameter: each parameter has either a static value or a dynamic value. If you choose a parameter from the Resource ID dropdown list, the RESOURCE_ID value is passed to the selected parameter. You can enter values for all the other keys. If you do not choose a parameter from the Resource ID dropdown list, you can enter values for each key. Example: The ARN of the role that allows Automation to perform the actions on your behalf
- Resource ID parameter: pass the resource ID of noncompliant resources to a remediation action by choosing a parameter that is dependent on the resource type. The parameters available in the dropdown list depend on the selected remediation action
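The remediation settings above (static vs dynamic parameters, rate limits, automatic mode) can be sketched as a configuration in the shape used by the PutRemediationConfigurations API. The SSM document name, its parameter names and the numeric limits are assumptions for illustration; nothing is sent to AWS.

```python
def remediation_configuration(rule_name: str, automation_role_arn: str) -> dict:
    """Build one remediation configuration for a rule on S3 buckets."""
    return {
        "ConfigRuleName": rule_name,
        "TargetType": "SSM_DOCUMENT",
        "TargetId": "AWS-EnableS3BucketEncryption",   # assumed Automation runbook
        "Parameters": {
            # Static value: the role Automation assumes to act on your behalf.
            "AutomationAssumeRole": {"StaticValue": {"Values": [automation_role_arn]}},
            # Dynamic value: the ID of the noncompliant resource is passed in.
            "BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
            "SSEAlgorithm": {"StaticValue": {"Values": ["AES256"]}},
        },
        "Automatic": True,                # automatic rather than manual remediation
        "MaximumAutomaticAttempts": 5,
        "RetryAttemptSeconds": 60,
        "ExecutionControls": {
            "SsmControls": {
                "ConcurrentExecutionRatePercentage": 10,  # share of resources at a time
                "ErrorPercentage": 20,                    # failure share that fails the batch
            }
        },
    }
```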
Aggregator
An aggregator is an AWS Config resource type that collects AWS Config configuration and compliance data from the following:
- Multiple accounts and multiple regions
- Single account and multiple regions
- An organization in AWS Organizations and all the accounts in that organization which have AWS Config enabled
- Use an aggregator to view the resource configuration and compliance data recorded in AWS Config
- Aggregators provide a read-only view into the source accounts and regions that the aggregator is authorized to view
- Aggregators do not provide mutating access into the source account or region. For example, this means that you cannot deploy rules through an aggregator or pull snapshot files from the source account or region through an aggregator
Multi-account and Multi-region data aggregation in AWS Config allows you to aggregate AWS Config configuration and compliance data from multiple accounts and regions into a single account. Useful for central IT administrators to monitor compliance for multiple AWS accounts in the enterprise
EventBridge
Scheduled events run a Rule on a schedule (e.g. reboot an instance every month at the same time)
CloudWatch Events is now EventBridge. EventBridge is the preferred way to manage your events. CloudWatch Events and EventBridge are the same underlying service and API, but EventBridge provides more features. Changes you make in either CloudWatch or EventBridge will appear in each console
Event-driven architecture (event = change in state). Various AWS services send events to EventBridge, which matches them against Rules that route to Targets that take Actions (e.g. shut down an EC2 instance that is marked non-compliant, trigger a Lambda function in response to an event, or send an SNS notification when a certain event is found in CloudTrail)
Use Case (not good examples):
- AWS Config detects EC2 with unencrypted EBS volume. An event is generated and sent to EventBridge, which triggers a Rule that invokes an action to send you an email using SNS
- CloudWatch detects an EC2 with 99% CPU Utilization, an event is generated and sent to EventBridge, which triggers a rule that invokes an action to send you an email using SNS
Key Concepts
EventBridge Rule
Rule detail
- Event Bus (default vs custom): receives events from resources that emit events: 1) AWS services in your account or in other accounts, 2) SaaS partner services, and 3) your own custom applications. When an event bus receives an event, EventBridge checks whether the event matches the conditions of the rules associated with that event bus
- Rule type
- Rule with an event pattern
- Schedule
- Cron expression: a fine-grained schedule that runs at a specific time, such as 8:00 a.m. PST on the first Monday of every month
- Rate expression: a schedule that runs at a regular rate (Minutes, Hours, Day), such as every 10 minutes
- Event source
- AWS events or EventBridge partner events - Events sent from AWS services or EventBridge partners
- Other - custom events or events sent from more than one source, e.g. events from AWS services and partners
- All events - all events sent to your account
- Creation method - Event Pattern
- Use schema - use an Amazon EventBridge schema to generate the event pattern
- Select schema from Schema registry
- Enter schema
- Use pattern form - use a template provided by EventBridge to create an event pattern
- Event source - AWS service or EventBridge partner as source
- AWS service - the name of the AWS service as the event source
- Event type - the type of events as the source of the matching pattern
- Custom pattern (JSON editor) - write a pattern in JSON
- Target(s)
- Target types
- EventBridge event bus also in a different Account
- EventBridge API Destination - HTTP endpoints that you can invoke
- AWS service
- Enable Target input transformation to customize the text from an event before EventBridge send to the target
- Part of the matched event - specify the part of the event text that is sent
- Constant (JSON text)
- Input transformer - specify how to change some of the event text before passing it to the target.
- Input path - Define key-value pairs to extract event values. Use JSON path to reference fields in your event, and store those field values in variables
- Template - Define how to format the event information retrieved by the input path, prior to it being sent to the target (JSON or text)
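How a custom JSON event pattern is compared with an event can be sketched in a few lines. This simplified matcher only supports exact-value matching (lists of allowed values and nested objects), not the richer operators such as prefix or numeric matching; the function name is hypothetical.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Return True if every field in the pattern is satisfied by the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: recurse into the corresponding part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        else:
            # A list of allowed values: any one of them may match.
            actual_values = actual if isinstance(actual, list) else [actual]
            if not any(value in expected for value in actual_values):
                return False
    return True


pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
```

Fields that appear in the event but not in the pattern are ignored, so `matches(pattern, event)` is true here.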
Bus
- An Event Rule works with an Event Bus
- Default Bus (AWS services events)
- Partner specific Event bus (Partner events)
- Custom buses (Custom application events)
- Event buses receive events from a variety of sources and match them to rules in your account
- Create Archives with encryption (AWS owned key or KMS key) and retention period (indefinite vs custom days)
- Replay archives
EventBridge Pipes
- Sources: receive events from a variety of sources, including DynamoDB, Kinesis, and SQS
- Filtering: define an event pattern to filter the events that are sent through the pipe
- Enrichment: transform your event or pull additional data into it using Lambda, Step Functions, or an API
- Target: send your event to an AWS service, an event bus, or an API destination
EventBridge Scheduler
- A schedule invokes a target one-time or at regular intervals defined by a cron or rate expression
- Provides one-time and recurring scheduling independent of event buses and rules. You can create a schedule to invoke targets such as a Lambda function
- Does not rely on an event bus
Schedule pattern:
- One-time schedule
- Recurring schedule (Cron vs Rate such as Scheduled Rules)
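The three schedule patterns correspond to three expression formats: `at(...)`, `rate(...)` and `cron(...)`. The helpers below are a hypothetical sketch that only formats the expression strings.

```python
from datetime import datetime


def one_time(when: datetime) -> str:
    """One-time schedule, e.g. at(2025-03-01T08:00:00)."""
    return f"at({when.strftime('%Y-%m-%dT%H:%M:%S')})"


def rate(value: int, unit: str) -> str:
    """Recurring rate schedule; unit is 'minute', 'hour' or 'day'."""
    if unit not in ("minute", "hour", "day") or value < 1:
        raise ValueError("value must be >= 1 and unit one of minute/hour/day")
    # The unit is singular for 1 and plural otherwise.
    return f"rate({value} {unit if value == 1 else unit + 's'})"


# Recurring cron schedule: 08:00 on the first Monday of every month.
# Fields: minutes hours day-of-month month day-of-week year.
FIRST_MONDAY_8AM = "cron(0 8 ? * 2#1 *)"
```

In the cron form, `2#1` means the first occurrence of weekday 2 (Monday, since Sunday is 1) in the month.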
Integrations
Partner event source:
- Select partner event source
- Copy your AWS account information
- Create event bus for the selected partner (in partner tool/service)
- Associate the partner event source with an event bus in your account
API destinations:
- API destinations are third-party partner targets that you can invoke using an HTTPS endpoint
- Make use of Connections for EventBridge to use in connecting to a public or private API
- Authorization method and credentials
- network connectivity for both public and private APIs
Rules:
- A single rule can route to up to 5 targets, all of which are processed in parallel
- A rule can customize an event before it is sent to the target, by passing only certain parts or by overwriting it with a constant
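The customization step can be illustrated with a small model of an input transformer: an input path map extracts values from the event, and a template places them into the text sent to the target. The sketch supports only simple dotted paths such as `$.detail.state`; the function names are hypothetical.

```python
import re


def resolve_path(event: dict, path: str):
    """Resolve a simple '$.a.b' path against the event (no arrays or wildcards)."""
    if not path.startswith("$."):
        raise ValueError("only '$.'-prefixed dotted paths are supported")
    value = event
    for part in path[2:].split("."):
        value = value[part]
    return value


def transform(event: dict, input_paths: dict, template: str) -> str:
    """Fill <name> placeholders in the template with values from the event."""
    values = {name: resolve_path(event, path) for name, path in input_paths.items()}
    # Placeholders without a matching variable are left untouched.
    return re.sub(r"<(\w+)>", lambda m: str(values.get(m.group(1), m.group(0))), template)


message = transform(
    {"detail": {"instance-id": "i-0abc", "state": "stopped"}},
    {"instance": "$.detail.instance-id", "state": "$.detail.state"},
    "Instance <instance> is now <state>",
)
```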
Targets:
- There are over 15 AWS services available as event targets, including: Lambda, SQS, SNS, Kinesis Data Streams, Kinesis Data Firehose, SSM, Step Functions, CloudWatch Logs log groups
- To deliver event data to a target, EventBridge needs an IAM role with permission to access the target resource
EventBridge Archive and Replay Events
- create an archive of events so that you can easily replay them at a later time
- determine which events are sent to the archive by specifying an event pattern
- retention period
Fail to send events to Target:
- Retry: by default, when an event is not received (by the target) retry sending the event for 24 hours and up to 185 times
- Dead-letter queue: to avoid losing events after they fail to be delivered to a target, configure a dead-letter queue and send all failed events to it
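Retry behavior and the dead-letter queue are configured per target. The sketch below builds one target entry in the shape used by the PutTargets API; the validation ranges reflect the defaults quoted above (185 attempts, 24 hours), and the helper name and ARNs are placeholders.

```python
def build_target(target_id: str, arn: str, dlq_arn: str,
                 max_attempts: int = 185, max_age_seconds: int = 86400) -> dict:
    """Build a rule target with an explicit retry policy and dead-letter queue."""
    if not 0 <= max_attempts <= 185:
        raise ValueError("max_attempts must be between 0 and 185")
    if not 60 <= max_age_seconds <= 86400:
        raise ValueError("max_age_seconds must be between 60 and 86400 (24 hours)")
    return {
        "Id": target_id,
        "Arn": arn,
        "RetryPolicy": {
            "MaximumRetryAttempts": max_attempts,
            "MaximumEventAgeInSeconds": max_age_seconds,
        },
        # Events that still fail after the retries are sent to this SQS queue.
        "DeadLetterConfig": {"Arn": dlq_arn},
    }
```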
AWS Health Dashboard
- Service health - view the current and historical status of all AWS services
- Open and recent issues - currently open and recently resolved service issues
- Service history - any AWS service issue in the last 12 months
- (Personal) Your account health with important events affecting your AWS resources
- Open and recent issues
- Scheduled changes - upcoming events and ongoing events from the past seven days that might affect your AWS infrastructure, such as scheduled maintenance activities
- Other notifications - ongoing events from the past seven days that might affect your AWS account, such as certificate rotations, billing notifications, and security vulnerabilities
- Event log
Organization Health - use the AWS Health Dashboard to get a centralized view for health events in your AWS organization
Integrations:
- Amazon EventBridge
- AWS Health Aware (customize AWS Health Alerts for Organizational and Personal AWS Accounts)
Service to Review:
- CloudWatch
- CloudTrail
- SSM [Session Logging?]
- EventBridge
- Config
- Health Dashboard