Anomaly Detection

identification of rare events or suspicious activity

What are anomalies?

Noise, outliers, deviations, and exceptions

Classify

Network anomalies

An anomaly occurs when network behaviour deviates from normal

Web Application

Suspicious application behavior that might indicate a security issue or attack

Why is it so important?

Business Risk

IT Security

Cloud Traffic

Challenges

Data Quality Issues

Sensitivity

Training

Types of Anomalies

Point

Contextual

Collective

How ML Models can be used for anomaly detection

Isolation Forest

Unsupervised, nonparametric technique (a tree-based ensemble, similar to random forest)

Anomalies are a minority compared to normal data

Anomalies tend to be isolated quickly, so they have the shortest average path length
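
The shortest-average-path idea above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names, the sample data, and the defaults (100 trees, subsample of up to 256 points) are assumptions chosen for the example. Each tree splits on a random feature at a random threshold, and points that end up alone after only a few splits receive a score close to 1.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively split the points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # identical values cannot be split
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }

def path_length(point, node, depth=0):
    """Number of splits needed to reach a leaf, plus a correction for unsplit leaves."""
    if "size" in node:
        return depth + average_path_length(node["size"])
    child = "left" if point[node["dim"]] < node["split"] else "right"
    return path_length(point, node[child], depth + 1)

def anomaly_scores(data, queries, n_trees=100, sample_size=256, seed=0):
    """Score in (0, 1); values near 1 mean the point is isolated after few splits."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(data, size), 0, max_depth, rng) for _ in range(n_trees)]
    norm = average_path_length(size)
    scores = []
    for q in queries:
        mean_path = sum(path_length(q, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores

# Hypothetical data: 100 points around the origin plus one far-away point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)] + [(10.0, 10.0)]
inlier_score, outlier_score = anomaly_scores(data, [(0.0, 0.0), (10.0, 10.0)])
```

In this sketch the far-away point should score noticeably higher than the point at the centre of the cluster, because random splits separate it from the rest of the data almost immediately.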

Local Outlier Factor

Density-based method (shares concepts with density-based clustering)

Determines whether a data point is an outlier, i.e. deviates beyond the normal

by calculating the local density deviation of that point relative to its neighbours
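
A rough sketch of the local density comparison described above, in plain Python. The function names and the sample grid are assumptions for illustration, and the sketch simplifies the usual definition by taking exactly k neighbours (ties in the k-distance are ignored) and by assuming there are no duplicate points. A score near 1 means the point is about as dense as its neighbours; a score well above 1 marks an outlier.

```python
import math

def k_nearest(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], points[j]), j) for j in range(len(points)) if j != i
    )
    return dists[:k]

def local_outlier_factor(points, k=3):
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour
    k_dist = [nb[-1][0] for nb in neighbours]
    # Local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: average density of the neighbours divided by the point's own density
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]

# Hypothetical data: a 3x3 grid of normal points plus one distant point (index 9).
points = [(float(x), float(y)) for x in range(3) for y in range(3)] + [(10.0, 10.0)]
lof_scores = local_outlier_factor(points, k=3)
most_anomalous = lof_scores.index(max(lof_scores))
```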

Auto Encoder

Unsupervised model

Two neural networks

Encoder

Decoder

Approaches of Anomaly Detection

Unsupervised (clustering)

Supervised

Semi-supervised

Algorithm

Angle-Based Outlier Detector

Data points with low variation in their angles to other points are treated as outliers

K-Nearest Neighbour Detector

Isolation Forest

Histogram-Based Outlier Detection

One-Class SVM
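
Two of the detectors listed above are small enough to sketch in plain Python. The angle-based part is a simplified, unweighted variant (it takes the plain variance of the angles, without the distance weighting used in the published angle-based outlier factor), and the k-nearest-neighbour part scores each point by the distance to its k-th neighbour. Function names and the sample data are assumptions for illustration.

```python
import math
from itertools import combinations
from statistics import pvariance

def angle_variance(points, i):
    """Variance of the angles at points[i] between every pair of other points.

    A point far from the data sees all other points in roughly the same
    direction, so its angle variance is low.
    """
    p = points[i]
    vectors = [[a - b for a, b in zip(q, p)] for j, q in enumerate(points) if j != i]
    angles = []
    for u, v in combinations(vectors, 2):
        norm = math.hypot(*u) * math.hypot(*v)
        if norm == 0:  # skip exact duplicates of p
            continue
        cosine = sum(a * b for a, b in zip(u, v)) / norm
        angles.append(math.acos(max(-1.0, min(1.0, cosine))))
    return pvariance(angles)

def kth_neighbour_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbour (larger = more isolated)."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]

# Hypothetical data: a 3x3 grid of normal points plus one distant point (index 9).
points = [(float(x), float(y)) for x in range(3) for y in range(3)] + [(10.0, 10.0)]
indices = range(len(points))
abod = [angle_variance(points, i) for i in indices]
knn = [kth_neighbour_distance(points, i, k=3) for i in indices]

abod_outlier = min(indices, key=abod.__getitem__)  # lowest angle variance
knn_outlier = max(indices, key=knn.__getitem__)    # largest k-th neighbour distance
```

Note the opposite directions of the two scores: the angle-based detector flags the lowest value, the neighbour-distance detector flags the highest.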

Types of Classification

What is a classifier?

Assigns a label to each data input

Radial Basis Function Kernel

Naive Bayes

Decision Tree

Breaks a dataset into smaller and smaller subsets

Entropy and Information gain
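
As a small illustration of how a decision tree might choose a split, the sketch below computes entropy and information gain for a list of class labels. The labels and the candidate splits are made-up examples; the function names are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder

# Hypothetical labels for four connections, and two candidate splits.
labels = ["normal", "normal", "attack", "attack"]
perfect_split = [["normal", "normal"], ["attack", "attack"]]
useless_split = [["normal", "attack"], ["normal", "attack"]]

gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves each child as mixed as the parent gains nothing, so the tree would prefer the first.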

Ensemble Methods

Team of Models

Gradient Boost Classification

It is an ensemble learning strategy that trains a series of weak models, each one attempting to correctly predict the observations the previous model got wrong

Step 1: Make initial predictions with a simple decision tree
Step 2: Calculate the residuals (actual value minus predicted value)
Step 3: Build another decision tree that predicts the error (residual) from all the independent variables
Step 4: Update the original predictions with the new predictions multiplied by the learning rate
Step 5: Repeat steps 2 to 4 for a set number of iterations
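
The five steps above can be sketched for a toy regression problem with a single feature. This is a minimal illustration under stated assumptions: the "simple decision tree" is a one-split stump, the loss is squared error, and the data, function names, and defaults (50 rounds, learning rate 0.1) are invented for the example.

```python
def mean(values):
    return sum(values) / len(values)

def fit_stump(xs, targets):
    """One-split tree on a single feature, chosen to minimise squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lm, rm = mean(left), mean(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    initial = fit_stump(xs, ys)                      # step 1: initial predictions
    preds = [initial(x) for x in xs]
    stumps = []
    for _ in range(n_rounds):                        # step 5: repeat steps 2 to 4
        residuals = [y - p for y, p in zip(ys, preds)]   # step 2: actual - predicted
        stump = fit_stump(xs, residuals)             # step 3: tree that predicts the residual
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]  # step 4
    return lambda x: initial(x) + learning_rate * sum(s(x) for s in stumps)

def mse(model, xs, ys):
    return mean([(y - model(x)) ** 2 for x, y in zip(xs, ys)])

# Hypothetical training data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]

mse_initial = mse(fit_stump(xs, ys), xs, ys)
mse_boosted = mse(gradient_boost(xs, ys), xs, ys)
```

With squared-error loss the residual is the negative gradient of the loss, which is why fitting each new tree to the residuals counts as a gradient step.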

Information Theory for anomaly detection

Entropy

Conditional Entropy

Relative Conditional Entropy

Information Gain

Information cost
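
As one example of these measures, conditional entropy tells how much uncertainty about a label remains once a feature is known. The sketch below is an illustration only: the protocol and label values are invented, and the function names are assumptions.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def conditional_entropy(pairs):
    """H(Y | X) for a list of (x, y) observations."""
    total = len(pairs)
    by_x = defaultdict(list)
    for x, y in pairs:
        by_x[x].append(y)
    # Weight the entropy of Y within each x-group by how often that x occurs
    return sum(len(ys) / total * entropy(ys) for ys in by_x.values())

# Hypothetical (protocol, label) observations.
informative = [("tcp", "normal"), ("tcp", "normal"), ("udp", "attack"), ("udp", "attack")]
uninformative = [("tcp", "normal"), ("tcp", "attack"), ("udp", "normal"), ("udp", "attack")]

h_informative = conditional_entropy(informative)
h_uninformative = conditional_entropy(uninformative)
```

When the feature fully determines the label the conditional entropy drops to zero; when the feature says nothing about the label it stays equal to the entropy of the label itself.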

Step 1: Classify data as normal, abnormal, or belonging to a particular class of intrusion
Step 2: Construct the classifier

Simple Ensemble

Advanced Ensemble

Node

Leaf

Why use this algorithm?

Decision trees usually mimic human thinking when making a decision

The logic behind a decision tree is easy to understand because it has a tree-like structure

Primarily used for classification

Gradient boosting updates the model by fitting each new learner to the negative gradient of the loss function with respect to the predicted output. It is robust and can use a wide range of base learners, such as decision trees and linear models

Used for text classification, even with high-dimensional training sets

It is a probabilistic classifier

Advantages: a fast and easy algorithm for predicting the class of a dataset

Applications

Credit scoring

Medical data classification

Real-time predictions

Text Classification (Spam Filtering and Sentiment Analysis)
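
A probabilistic text classifier of the kind described above can be sketched as a multinomial Naive Bayes spam filter. This is a minimal illustration: the training messages, the function names, and the choice of add-one (Laplace) smoothing are assumptions for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs is a list of (text, label) pairs; returns the counts needed to classify."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(model, text):
    """Pick the label with the highest log prior plus summed log word likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:  # ignore words never seen in training
                continue
            # Add-one smoothing so an unseen word does not zero out the class
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training messages.
training = [
    ("win free prize now", "spam"),
    ("free money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status meeting", "ham"),
]
model = train_naive_bayes(training)
first = classify(model, "free prize")
second = classify(model, "status meeting")
```

Training only counts words per class, which is why the method stays fast even when the vocabulary is large.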

Perceptron in ML

It is a building block of an artificial neural network (ANN)

How does it work?

types

single layer

Multilayer perceptron model
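
A single-layer perceptron can be sketched in a few lines: it computes a weighted sum of its inputs plus a bias, outputs 1 if the sum is positive, and nudges the weights whenever the output is wrong. The function names, the learning rate, and the AND-gate training data are assumptions chosen for illustration.

```python
def predict(weights, bias, inputs):
    """Step activation: 1 if the weighted sum plus bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=10, learning_rate=1.0):
    """Perceptron learning rule: move the weights by (target - output) * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Logical AND is linearly separable, so a single-layer perceptron can learn it.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
outputs = [predict(weights, bias, inputs) for inputs, _ in and_gate]
```

A single layer can only separate classes with a straight line (or hyperplane); problems such as XOR need the multilayer model.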

How does IDS work?

Monitors network

Analyze

Compares network activity with a set of rules

Investigate the alert and review the activity logs
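
The monitor, analyze, compare, and alert flow above might look like the following rule-based sketch. Everything here is hypothetical: the event fields (`src`, `dest_port`, `failed_logins`), the two rules, and the thresholds are invented for illustration and do not correspond to any particular IDS product.

```python
# Each rule is a (name, predicate) pair; the predicate inspects one event.
RULES = [
    ("too many failed logins", lambda event: event["failed_logins"] > 5),
    ("connection to blocked port", lambda event: event["dest_port"] in {23, 445}),
]

def inspect(events, rules=RULES):
    """Compare every monitored event against every rule and collect alerts."""
    alerts = []
    for event in events:
        for name, matches in rules:
            if matches(event):
                alerts.append({"rule": name, "event": event})
    return alerts

# Hypothetical monitored activity.
events = [
    {"src": "10.0.0.5", "dest_port": 443, "failed_logins": 0},
    {"src": "10.0.0.9", "dest_port": 23, "failed_logins": 0},
    {"src": "10.0.0.7", "dest_port": 22, "failed_logins": 12},
]
alerts = inspect(events)
triggered = [alert["rule"] for alert in alerts]
```

Each alert keeps the original event, so an analyst investigating it can go back to the matching activity log entry.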