Monitoring
AWS X-Ray
- AWS X-Ray is a service that collects data about the requests your application serves, and provides tools to view, filter, and gain insights into that data so you can identify issues and opportunities for optimization.
- For any traced request to your application, you can also see data about calls that your application makes to downstream AWS resources, microservices, databases, and web APIs.
AWS services that are integrated with X-Ray add tracing headers to incoming requests, send trace data to X-Ray, or run the X-Ray daemon. For example, Lambda sends trace data and runs the X-Ray daemon on its workers, which makes it simpler to use the X-Ray SDK.
Instrumenting your application involves sending trace data for incoming and outbound requests and other events within your application, along with metadata about each request:
- Auto instrumentation – instrument your application with zero code changes, typically via configuration changes, adding an auto-instrumentation agent, or other mechanisms.
- Library instrumentation – make minimal application code changes to add pre-built instrumentation targeting specific libraries or frameworks, such as the AWS SDK, Apache HTTP clients, or SQL clients.
- Manual instrumentation – add instrumentation code to your application at each location where you want to send trace information.
There are several SDKs, agents, and tools that can be used to instrument your application for X-Ray tracing:
- AWS Distro for OpenTelemetry for Go, Java, JS, Python, .NET with auto-instrumentation for Java and Python
- AWS X-Ray SDKs for Go, Java, Node.js, Python, .NET, Ruby with auto-instrumentation for Java
Exam Concepts
- Segments: data containing the resource name and request details
- Subsegments: parts of a segment that provide more granular timing data
- Service Graph: a graphical representation of the services that interact in requests
- Traces: a trace ID tracks the path of a request, and a trace collects all the segments generated by that request
- Tracing header: an extra HTTP header containing the sampling decision and the trace ID
- The tracing header is named X-Amzn-Trace-Id
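The tracing header described above can be illustrated with a minimal parsing sketch. It assumes the usual `Root=...;Parent=...;Sampled=...` layout of the header value; the function name `parse_trace_header` and the IDs in the sample value are hypothetical and only for illustration.

```python
def parse_trace_header(value: str) -> dict:
    """Split an X-Amzn-Trace-Id header value into its key=value fields."""
    fields = {}
    for part in value.split(";"):
        part = part.strip()
        if not part or "=" not in part:
            continue  # skip empty or malformed pieces
        key, _, val = part.partition("=")
        fields[key] = val
    return fields


# Illustrative header value; the IDs are made-up examples.
example = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
parsed = parse_trace_header(example)

trace_id = parsed["Root"]                  # the trace ID shared by all segments
is_sampled = parsed.get("Sampled") == "1"  # the sampling decision
```

In practice the X-Ray SDKs and integrated services read and propagate this header for you; the sketch only shows what the header carries.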
Keywords: app insight, response time of downstream resources, HTTP response analysis
Integrations with: EC2, Lambda, ECS, API Gateway, SQS, SNS and Elastic Beanstalk
CloudWatch
Features
Application Metrics: by installing the CloudWatch agent, you can get information from inside your EC2 instances
Alarms: use metrics to alert you when something goes wrong. There are no default alarms
- Metric
- Period
- Conditions
- Threshold: Static vs Anomaly Detection
- Alarm Condition: >, >=, <=, <
- Datapoints to alarm (number of datapoints within the evaluation period that must be breaching to trigger the alarm)
- Notification
- Actions
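The interplay of threshold, alarm condition, and datapoints to alarm can be sketched as a small local simulation. This is a simplified model under stated assumptions, not CloudWatch's actual evaluation logic: it ignores missing data and the insufficient-data state, and the function name `alarm_state` and the sample CPU values are hypothetical.

```python
import operator

# The four alarm conditions listed above, mapped to Python comparisons.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<=": operator.le,
    "<": operator.lt,
}


def alarm_state(datapoints, threshold, comparison, evaluation_periods, datapoints_to_alarm):
    """Return "ALARM" if enough of the most recent datapoints breach the threshold."""
    breaches = COMPARATORS[comparison]
    window = datapoints[-evaluation_periods:]  # most recent periods only
    breaching = sum(1 for value in window if breaches(value, threshold))
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Hypothetical CPU utilization values, one per period.
cpu = [40, 85, 90, 60, 95]

# "2 out of 3": two of the last three datapoints must be above 80.
state = alarm_state(cpu, threshold=80, comparison=">",
                    evaluation_periods=3, datapoints_to_alarm=2)
```

Raising `datapoints_to_alarm` to 3 for the same data would keep the alarm in `OK`, which is the kind of threshold adjustment the alarm settings are meant to tune.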
System Metrics: metrics you get out of the box. The more managed the service is, the more metrics you get
Each metric data point must be marked with a timestamp. The timestamp can be up to two weeks in the past and up to two hours into the future. If you do not provide a timestamp, CloudWatch creates a timestamp for you based on the time the data point was received.
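The timestamp rule above can be expressed as a short check. This is only a local sketch of the stated window (two weeks back, two hours ahead); the helper names `timestamp_is_accepted` and `effective_timestamp` are hypothetical and not part of any AWS SDK.

```python
from datetime import datetime, timedelta, timezone

MAX_PAST = timedelta(weeks=2)    # oldest timestamp allowed
MAX_FUTURE = timedelta(hours=2)  # furthest future timestamp allowed


def timestamp_is_accepted(ts, now=None):
    """Check whether a data point timestamp falls inside the allowed window."""
    now = now or datetime.now(timezone.utc)
    return now - MAX_PAST <= ts <= now + MAX_FUTURE


def effective_timestamp(ts, received_at):
    """Mirror the default: with no timestamp, the receive time is used."""
    return ts if ts is not None else received_at


# Fixed reference time so the example is reproducible.
now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
ok_past = timestamp_is_accepted(now - timedelta(days=10), now)
too_old = timestamp_is_accepted(now - timedelta(days=15), now)
too_far_ahead = timestamp_is_accepted(now + timedelta(hours=3), now)
```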
Metric math enables you to query multiple CloudWatch metrics and use math expressions to create new time series based on these metrics
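As a hedged illustration of metric math, the sketch below builds a query list in the shape accepted by the `MetricDataQueries` parameter of the CloudWatch `GetMetricData` API (for example via boto3's `get_metric_data`). The load balancer dimension value and the function name `build_error_rate_queries` are assumptions for illustration; no AWS call is made here.

```python
def build_error_rate_queries(load_balancer: str, period: int = 300) -> list:
    """Build two metric queries plus one math expression combining them."""

    def metric_query(query_id, metric_name):
        return {
            "Id": query_id,  # referenced by the expression below
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/ApplicationELB",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
                },
                "Period": period,
                "Stat": "Sum",
            },
            "ReturnData": False,  # inputs only; do not return these series
        }

    return [
        metric_query("errors", "HTTPCode_Target_5XX_Count"),
        metric_query("requests", "RequestCount"),
        {
            "Id": "error_rate",
            "Expression": "100 * errors / requests",  # the new time series
            "Label": "5XX error rate (%)",
            "ReturnData": True,
        },
    ]


# Hypothetical load balancer dimension value.
queries = build_error_rate_queries("app/my-alb/1234567890abcdef")
```

The list would be passed as `MetricDataQueries` together with a start and end time; only the `error_rate` series is returned because the two inputs set `ReturnData` to `False`.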
Types of Metrics
Default: these metrics are provided out of the box and do not require any additional work on your part. E.g. CPU Utilization, Network Throughput
Custom: these metrics must be provided by the CloudWatch agent installed on the EC2 instance's operating system. E.g. EC2 Memory Utilization, EBS Disk Space Utilization
- CloudWatch is a monitoring and observability platform designed to give users insight into their AWS architecture
- It lets you monitor multiple levels of your applications and identify potential issues
- CloudWatch does not aggregate data across regions
Resolution
Standard
- 1 minute
- Default metrics produced by AWS services
High Resolution
- 1 second
- Provides more insight into your application’s sub-minute activity (not the same as EC2 detailed monitoring, which publishes metrics at 1-minute intervals)
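The two resolutions can be illustrated with a sketch of a custom metric data point in the shape used by the `MetricData` parameter of `PutMetricData`, where `StorageResolution` is 60 for standard and 1 for high resolution. The helper name `build_datum` and the metric name `QueueDepth` are hypothetical; nothing is sent to AWS here.

```python
from datetime import datetime, timezone


def build_datum(name, value, high_resolution=False, timestamp=None):
    """Build one metric data point for a PutMetricData request."""
    datum = {
        "MetricName": name,
        "Value": value,
        "Unit": "Count",
        # 1 = high resolution (1 second), 60 = standard resolution (1 minute)
        "StorageResolution": 1 if high_resolution else 60,
    }
    if timestamp is not None:
        datum["Timestamp"] = timestamp  # otherwise the receive time is used
    return datum


standard = build_datum("QueueDepth", 12)
high_res = build_datum(
    "QueueDepth", 12, high_resolution=True,
    timestamp=datetime(2024, 1, 15, 12, 0, 5, tzinfo=timezone.utc),
)
```

With boto3, such a datum would typically be sent with `put_metric_data(Namespace=..., MetricData=[datum])`, where the namespace is one you choose for your custom metrics.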
Not everything should go through CloudWatch. For instance, compliance with AWS standards and configurations should be tracked with AWS Config
CloudWatch Logs
CloudWatch Logs Terms
Log Group: a collection of log streams. For example, you'd group all your Apache web server logs across hosts together (you don't care about an individual EC2 instance or httpd process, but about the cluster or the service you provide)
Features
CloudWatch Logs Insights: an interactive solution that lets you query all your logs using a SQL-like query language
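To make the Logs Insights query language concrete, here is a hedged sketch that assembles the parameters for a query. The query text uses standard Logs Insights commands (`fields`, `filter`, `sort`, `limit`), and the parameter names match the CloudWatch Logs `StartQuery` API; the log group name and the helper `build_query_params` are hypothetical.

```python
# Find the 20 most recent log events whose message contains "ERROR".
ERROR_QUERY = "\n".join([
    "fields @timestamp, @message",
    "| filter @message like /ERROR/",
    "| sort @timestamp desc",
    "| limit 20",
])


def build_query_params(log_group: str, start_epoch: int, end_epoch: int) -> dict:
    """Assemble StartQuery parameters; times are epoch seconds."""
    if end_epoch <= start_epoch:
        raise ValueError("end_epoch must be after start_epoch")
    return {
        "logGroupName": log_group,
        "startTime": start_epoch,
        "endTime": end_epoch,
        "queryString": ERROR_QUERY,
    }


# Hypothetical log group and a one-hour time range.
params = build_query_params("/my-app/apache", 1705320000, 1705323600)
```

With boto3, these parameters would be passed to the Logs client's `start_query`, and the results fetched afterwards with `get_query_results` using the returned query ID.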
Filter Patterns: look for specific terms (e.g. HTTP 400 Error). Every occurrence becomes a data point, so you can monitor trends in those data points
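The idea of turning each matching log line into a data point can be sketched locally. This is a plain substring match for illustration only; real CloudWatch filter patterns have their own syntax, and the function name `count_matches_per_minute` and the sample log events are hypothetical.

```python
from collections import Counter


def count_matches_per_minute(events, term):
    """Count log events containing `term`, bucketed by minute."""
    counts = Counter()
    for timestamp, message in events:
        if term in message:
            minute = timestamp[:16]  # "YYYY-MM-DDTHH:MM" from an ISO timestamp
            counts[minute] += 1
    return dict(counts)


# Hypothetical access log events as (ISO timestamp, message) pairs.
events = [
    ("2024-01-01T10:00:05", "GET /a 200"),
    ("2024-01-01T10:00:40", "GET /b 400"),
    ("2024-01-01T10:01:10", "GET /c 400"),
    ("2024-01-01T10:01:30", "POST /d 400"),
]

trend = count_matches_per_minute(events, " 400")
```

Each per-minute count plays the role of a metric data point, so a rising count of 400 responses would show up as a trend you can graph or alarm on.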
- CloudWatch Logs is a tool that allows you to monitor, store, and access log files from a variety of different sources
- It gives you the ability to query and analyze logs to look for potential issues or data that is relevant to you
- Commonly used with EC2, on-premises servers, RDS, Lambda, and CloudTrail
- The go-to solution for log collection and analytics when real-time processing is not required
- If you do not need to analyze your logs, send them to S3
- If you need real-time log analytics, use Amazon Kinesis
Agent Based
- The CloudWatch agent must be installed and configured. It is not automatic
- The EC2 instance running the agent needs to have permission (IAM Role attached) to push logs into CloudWatch Logs
CloudWatch AuthN & AuthZ
Managing Access Control:
- Dashboard Permissions
- IAM identity-based policies
- Service-linked roles
A permissions policy describes who has access to what:
- Identity-Based Policies
- Resource-Based Policies
There are no CloudWatch ARNs for you to use in an IAM policy. Use an * (asterisk) instead as the resource when writing a policy to control access to CloudWatch actions
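A minimal sketch of an identity-based policy that follows the asterisk guidance above, built as a Python dictionary and serialized to JSON. The selection of read-only actions is an assumption chosen for illustration, not a recommended policy.

```python
import json

# Read-only access to a few CloudWatch actions, with "*" as the resource.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetMetricData",
                "cloudwatch:ListMetrics",
                "cloudwatch:DescribeAlarms",
            ],
            "Resource": "*",  # asterisk instead of a specific ARN
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```

The resulting JSON document is what you would attach to an IAM user, group, or role as an identity-based policy.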
CloudWatch Dashboard
All dashboards are global, not region-specific
Customizable home pages in the CloudWatch console that you can use to monitor your resources, even those spread across different regions
Share your dashboards with users with no access to your AWS account via:
- Share a single dashboard and designate specific email addresses and passwords
- Share a single dashboard publicly, so that anyone who has the link can view
- Share all dashboards in your account and specify a third-party SSO provider for dashboard access
Amazon Managed Grafana
Fully managed Grafana service for secure data visualization that lets you instantly query, correlate, and visualize your operational metrics, logs, and traces from different sources
Key Facts
Managed Service: AWS manages setup, scaling, and maintenance for all workspaces
Grafana Made Easy: deploy, operate, and scale Grafana
Data Sources: integrates with several sources, including CloudWatch, Amazon Managed Service for Prometheus, Amazon OpenSearch Service, Amazon Timestream, AWS X-Ray, and many more
Use Cases
Container Metrics Visualization: connect to data sources like Prometheus to visualize metrics from EKS, ECS, or your self-managed Kubernetes cluster
Exam Tips
Questions to ask in the Exam:
1) What is the best tool to monitor with?
2) Is that metric available by default?
3) Where can I find those logs?
4) Do I need to adjust my alarm threshold?