PM backlog

Check whether any processes went down

Check for newly created rules

Check whether processing has completely stopped or has only slowed down by checking the audit_files_processed table.

If processing has stopped:
1. Check the server logs and pstacks to see where PM is stuck. A restart helps if the issue is at the dispatcher because of a problem with a file that is stuck.
2. If PM goes down again after the restart, in-memory counter corruption could be the reason. An in-memory cleanup followed by a complete restart helps.

If processing is slow:
1. Check for any recent rule creation or modification, and check the iam_rules_participated table for rules with higher participation.
2. Check the log mode of the server logs.
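
The stopped-versus-slow decision above can be sketched in a few lines. This is only an illustration: it assumes the hourly counts have already been pulled from audit_files_processed (the query itself is not shown, since the table's columns are not given here), and the function name and the `slow_ratio` threshold are hypothetical choices, not part of the product.

```python
from statistics import median


def classify_throughput(hourly_counts, slow_ratio=0.5):
    """Classify the latest hour as 'stopped', 'slow', or 'normal'.

    hourly_counts: files processed per hour, oldest first.
    slow_ratio: hypothetical threshold; the latest hour counts as slow
    when it falls below this fraction of the median of earlier hours.
    """
    if len(hourly_counts) < 2:
        raise ValueError("need at least one baseline hour plus the latest hour")
    *baseline, latest = hourly_counts
    if latest == 0:
        return "stopped"
    base = median(baseline)
    if base > 0 and latest < slow_ratio * base:
        return "slow"
    return "normal"
```

A result of "stopped" points to the server logs and pstacks, while "slow" points to the rule and log-mode checks.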

check SC/TC status

Check memory utilization of app servers

Check other processes if SC/TC are running

If yes, restart Vidhaatha

If not, check the server logs

If not, restart SC/TC

Check the processing trend (use the hourly/daily query)

NM logs

Check whether any rules were recently modified

Any environment issue

Identify the process causing the backlog using pstack

Check for errors in the logs

Memory and CPU utilization

Check the recent logs to find out why both went down

Use the top command to check whether any process is using excessive memory

Check for an increase in the inflow of files to PM (to be checked at Spark)

Collect pstacks for further investigation

Restart the ones that have failed

Check whether the backlog and the load are higher for participated records

Collect recent logs for the processes that went down

Check which stream is taking the longest to clear its backlog; based on the logs and the requirement, a record processor or dispatcher can be added
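
When several pstacks have been collected, a quick tally of the innermost frame of each thread can show where threads are piling up. The sketch below assumes gdb-style pstack output as produced on Linux, where each thread's innermost frame is a line starting with `#0`; other platforms format pstack differently. The function name is hypothetical.

```python
import re
from collections import Counter

# Innermost frame, e.g. "#0  0x00007f12 in pthread_cond_wait () from ..."
# The address part is optional because some frames omit it.
FRAME0_RE = re.compile(r"^#0\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)")


def top_frames(pstack_text):
    """Count how many threads have each function as their innermost frame."""
    counts = Counter()
    for line in pstack_text.splitlines():
        match = FRAME0_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `top_frames(text).most_common(5)` lists the functions where most threads are sitting; a large count on a wait or lock function suggests waiting threads rather than busy ones.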

PM backlog

1) Rule Modification

2) Reload configuration

3) Mount point issue

4) I/O operation issue - server logs and system logs

5) DB slowness

-- AWR report during the time of the issue

-- long running queries

-- Invalid sessions

-- Long-running sessions

6) Vidhaatha logs and Zookeeper logs for any communication issue or slowness

7) Slog and server logs: look for errors, repeated warnings, DB ORA errors

8) In-memory counter corruption or shared memory corruption

9) Pstacks: check for waiting threads, long-running threads, OCI issues

10) Any patches

11) High level of logging enabled at NM

12) NM is down, thread stuck at logging

13) Port communication issue, connection to DB

14) Huge number of sessions open

15) Server-level slowness

16) Load increase affecting the participation
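
For item 7, counting the ORA error codes in the slog and server logs makes it easier to see which database errors repeat. The sketch below relies only on the standard Oracle code shape (`ORA-` followed by five digits); the function name is hypothetical, and the layout of the log lines themselves is not assumed.

```python
import re
from collections import Counter

# Oracle error codes are written as "ORA-" plus five digits, e.g. ORA-00060.
ORA_RE = re.compile(r"\bORA-\d{5}\b")


def count_ora_errors(lines):
    """Return a Counter of ORA error codes found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(ORA_RE.findall(line))
    return counts
```

Because the function takes any iterable of lines, an open log file can be passed directly, and `most_common()` on the result lists the codes from most to least frequent.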