Datadog Event
Xymon Alerts
Timings of reports
Existing Alerts
Page based Monitoring
Email Alert
SMS Alert (6 Min)
Overnight SMS alert - Longer
Confirm BB alert - Export - Alerting rules - Info files regex
Xymon Settings For BB - (Groups, Levels)
Production on one page? Other stuff on another page
Major incident
https://confluence.its.uq.edu.au/confluence/display/governance/Incident+Management+Procedure
MIM manager - Second
Phone Opsgenie and in Confluence :fist-2:
Manual Update - ITS Centre
Via Outlook template
SMS Senior managers
Updates every 30 min
or provide a time
SME to get updates or Manager or Nominated person
MIM request Communication path
MIM creates an incident
Selects team or individual
Communicate with tech to determine who will be communicated to
Argue with Technical Owner - if it's a P1 or Do - F-- Off
CRM ticket at P1 level - First :fist:
Opsgenie :princess::skin-tone-5:
https://itsuq.app.opsgenie.com/
or App on phone
WarRoom :gun:
Decide on what channel will be used
Zoom incident management Channel
Team
Teams ??
Nominate People who need to be added
Opsgenie documentation
https://confluence.its.uq.edu.au/confluence/display/IIM/Role+-+First+Responder?
Scheduled Updates
on call
https://confluence.its.uq.edu.au/confluence/display/ITSIOAM/Who+is+On-Call
What is a P1
Any impact to BB
Outage
Degradation
Multiple Reports
Nature of Incident
Any Impact to WSA ??
Service owner or failed service or Downstream
No understanding of What Dependencies look like
Define what a P1 is normally or during exams
COVID Process
Degradation for Students 500+
Tell the MIM - Student impact
Technical Owner of Service
Should put up Service Notification
Service view once and then as required
Update Business Owner
Team
Helpdesk
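
The update cadence noted under Major incident (updates every 30 min, or provide a time) could be sketched roughly as below. This is only an illustration: the function name next_update_due, the promised_time parameter and the DEFAULT_INTERVAL constant are hypothetical and are not part of Opsgenie, Xymon or any ITS tooling.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: the default gap between major-incident updates is the
# 30 minutes mentioned in the notes.
DEFAULT_INTERVAL = timedelta(minutes=30)


def next_update_due(last_update: datetime,
                    promised_time: Optional[datetime] = None) -> datetime:
    """Return when the next incident update is due.

    If a specific time was promised in the last update, that time wins;
    otherwise the next update is due one default interval after the last.
    """
    if promised_time is not None:
        if promised_time <= last_update:
            raise ValueError("promised_time must be later than last_update")
        return promised_time
    return last_update + DEFAULT_INTERVAL


if __name__ == "__main__":
    sent = datetime(2021, 6, 1, 9, 0)
    print(next_update_due(sent))                                  # default cadence
    print(next_update_due(sent, datetime(2021, 6, 1, 11, 0)))     # promised time
```

Keeping the promised time as an optional override mirrors the "or provide a time" note: the 30-minute cadence applies only when no explicit time has been given.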