Data Mining
Data Warehousing
OLAP
Roll-Up OLAP Operations
Meaning -> Aggregate data up a concept hierarchy, e.g. merge cities into countries, to view a dimension at a coarser level
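A minimal sketch of a roll-up, assuming a single sales measure and a city-to-country hierarchy. The data and the name `roll_up` are hypothetical and only illustrate the aggregation step.

```python
from collections import defaultdict

# Hypothetical city-level facts and a city -> country concept hierarchy.
city_sales = {"Paris": 120, "Lyon": 80, "Berlin": 150, "Munich": 50}
city_to_country = {
    "Paris": "France", "Lyon": "France",
    "Berlin": "Germany", "Munich": "Germany",
}

def roll_up(facts, hierarchy):
    """Aggregate a measure from one level of a dimension to its parent level."""
    totals = defaultdict(int)
    for member, value in facts.items():
        totals[hierarchy[member]] += value
    return dict(totals)

country_sales = roll_up(city_sales, city_to_country)
print(country_sales)
```

The city level disappears from the result; only the coarser country level remains.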
Data Cube
Measures
Distributive
count(), sum(), min(), max()
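A measure is distributive when it can be computed on partitions of the data and the partial results combined into the overall result. The sketch below illustrates this for the four functions listed; `summarise` and `combine` are hypothetical helper names.

```python
def summarise(values):
    """Compute the four distributive measures for one partition."""
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
    }

def combine(a, b):
    """Merge two partial summaries without revisiting the raw data."""
    return {
        "count": a["count"] + b["count"],  # counts are combined by adding
        "sum": a["sum"] + b["sum"],
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
    }

part1, part2 = [4, 9, 1], [7, 3]
merged = combine(summarise(part1), summarise(part2))
print(merged)
```

The merged summary matches what `summarise` returns on the concatenated data, which is what lets a data cube pre-aggregate these measures cell by cell.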
Data Analysis
Correlation
Coefficient close to zero: no linear relationship
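As an illustration, assuming the Pearson correlation coefficient is the one meant here, the sketch below computes it by hand. The function name `pearson` and the sample data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

linear = pearson([1, 2, 3, 4], [2, 4, 6, 8])            # y = 2x
quadratic = pearson([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x^2
print(linear, quadratic)
```

The second data set has a coefficient of zero even though y is fully determined by x, so a value near zero rules out a linear relationship, not every relationship.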

Classification
k-nearest neighbours
One-nearest neighbour
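A minimal sketch of k-nearest-neighbour classification, assuming Euclidean distance and majority voting; with `k=1` it reduces to the one-nearest-neighbour rule. The function name `knn_predict` and the training data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=1):
    """Predict a label for `query` from a list of (point, label) pairs."""
    # Sort the training examples by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    # Majority vote; ties go to the label encountered first among the neighbours.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"),
    ((5.0, 5.0), "b"), ((5.5, 4.5), "b"),
]
print(knn_predict(train, (1.1, 1.0), k=1))
print(knn_predict(train, (5.2, 5.0), k=3))
```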

Model evaluation
Categorising predictions
\(F_1\) Score
\( \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \)
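The formula can be evaluated directly from prediction counts. This sketch assumes the usual definitions precision = TP / (TP + FP) and recall = TP / (TP + FN); the function name `f1_score` and the counts are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

score = f1_score(tp=8, fp=2, fn=4)
print(round(score, 4))
```

Because it is a harmonic mean, the score is pulled towards the lower of precision and recall.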
Precision - Recall Curve

Confusion Matrix
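A small sketch of how the four cells of a binary confusion matrix can be tallied, assuming labels are encoded as 1 (positive) and 0 (negative). The function name and the example labels are hypothetical.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count TP, FP, FN and TN for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
cells = confusion_matrix(actual, predicted)
print(cells)
```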

Clustering
k-means clustering
Initialise

Assign
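A minimal sketch of k-means, assuming the standard Lloyd-style loop: the initialise and assign steps named here are followed by an update step (moving each centroid to the mean of its cluster), which is an assumption not shown in the diagram. Initial centroids are passed in by the caller so the run is deterministic; `k_means` and the data are hypothetical.

```python
import math

def k_means(points, centroids, iterations=10):
    """Cluster `points` (tuples) starting from the given initial centroids."""
    for _ in range(iterations):
        # Assign: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update (assumed step): move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(centroids)
```

In practice the initial centroids are often chosen at random, so different runs can converge to different clusterings.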

Distance function
Euclidean distance
\( e(x, x') = ||x - x'|| = \sqrt{\sum_{i=1}^n (x_i - x_i')^2} \)
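The formula translates almost term by term into code. The sketch below uses a hypothetical function name `euclidean`; Python's standard library offers the same computation as `math.dist`.

```python
import math

def euclidean(x, x_prime):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prime)))

d = euclidean((0, 0), (3, 4))
print(d)
```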
Outliers
Detection
Parametric Methods
Grubbs' Test
\( G > \frac{N-1}{\sqrt{N}} \sqrt{\frac{t_{\alpha/(2N),N-2}^2}{N - 2 + t_{\alpha/(2N),N-2}^2}} \)
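A sketch of both sides of the inequality, assuming the usual two-sided statistic G = max|x_i - mean| / s with s the sample standard deviation. The critical value of the t-distribution is not computed here; the caller is assumed to look up \(t_{\alpha/(2N),N-2}\) in a table and pass it as `t_crit`. Both function names are hypothetical.

```python
import math
import statistics

def grubbs_statistic(data):
    """G: largest absolute deviation from the mean, in sample standard deviations."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (N - 1 denominator)
    return max(abs(x - mean) for x in data) / sd

def grubbs_threshold(n, t_crit):
    """Right-hand side of the test; t_crit is supplied by the caller."""
    return (n - 1) / math.sqrt(n) * math.sqrt(t_crit ** 2 / (n - 2 + t_crit ** 2))

g = grubbs_statistic([1, 2, 3, 4, 100])
print(g)
```

The most extreme value is flagged as an outlier when `g` exceeds `grubbs_threshold(len(data), t_crit)` for the chosen significance level.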
Mahalanobis distance
\(MDist(x, \bar{x}) = (x - \bar{x})^T S^{-1} (x - \bar{x})\)
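A minimal sketch of the quadratic form above, restricted to two-dimensional data so the covariance matrix S can be inverted by hand without a linear-algebra library. The function name and the example covariance matrices are hypothetical.

```python
def mahalanobis_2d(x, mean, cov):
    """Evaluate (x - mean)^T S^{-1} (x - mean) for a 2x2 covariance matrix S."""
    (a, b), (c, d) = cov
    det = a * d - b * c  # must be non-zero for S to be invertible
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))

identity_case = mahalanobis_2d((3, 4), (0, 0), [[1, 0], [0, 1]])
scaled_case = mahalanobis_2d((2, 1), (0, 0), [[4, 0], [0, 1]])
print(identity_case, scaled_case)
```

With an identity covariance the value equals the squared Euclidean distance; a larger variance along one axis shrinks the contribution of deviations along that axis.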