One liner
-
R square
- R-square is the proportion of the variation in the dependent variable explained by the regression model.
- Goodness of fit of a regression model.
- Limitation: R-square never decreases when independent variables are added, even if they are irrelevant.
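A minimal sketch of how R-square can be computed for a simple linear fit, assuming small made-up x/y arrays; it uses NumPy's polyfit and the identity R² = 1 - SS_res / SS_tot.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit y = a*x + b by least squares, then measure goodness of fit.
a, b = np.polyfit(x, y, deg=1)
y_hat = a * x + b

ss_res = np.sum((y - y_hat) ** 2)      # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)   # total variation
r_squared = 1.0 - ss_res / ss_tot
print(r_squared)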
-
Correlation values close to 1 or -1 are good and tell you that two quantitative variables (e.g., weight and size) are strongly related.
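A small illustration of a correlation value close to +1, using hypothetical size/weight numbers and NumPy's corrcoef.

import numpy as np

size = np.array([150.0, 160.0, 170.0, 180.0, 190.0])   # e.g. height in cm (made-up values)
weight = np.array([50.0, 58.0, 66.0, 75.0, 84.0])      # weight in kg (made-up values)

r = np.corrcoef(size, weight)[0, 1]   # Pearson correlation; close to +1 => strongly related
print(r)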
Distance Measures
L1 norm / Manhattan distance / Taxicab norm
All components weighted equally
L1 norm of a vector: sum of the absolute values of its components
L2 norm / Euclidean norm
Sensitive to outliers / some dimensions are weighted more heavily because the vector components are squared
L2 norm of a vector: Euclidean distance, the square root of the sum of squared components
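A quick sketch contrasting the two distances on a pair of made-up 3-D points; the NumPy one-liners mirror the definitions above.

import numpy as np

p = np.array([1.0, 2.0, 3.0])
q = np.array([4.0, 0.0, 3.0])

l1 = np.sum(np.abs(p - q))            # Manhattan distance: sum of absolute differences
l2 = np.sqrt(np.sum((p - q) ** 2))    # Euclidean distance: root of summed squared differences
# Equivalently: np.linalg.norm(p - q, ord=1) and np.linalg.norm(p - q, ord=2)
print(l1, l2)   # squaring in L2 lets large component differences dominate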
-
Cosine Distance and Cosine similarity
Often used in high-dimensional spaces to reduce the effect of the curse of dimensionality, since it depends on direction rather than magnitude.
Better to apply row normalization (scale each row to unit length) before using it.
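A minimal sketch of cosine similarity and cosine distance on two made-up vectors, with row normalization (scaling each vector to unit length) done first, as the note above suggests.

import numpy as np

a = np.array([3.0, 0.0, 4.0])
b = np.array([1.0, 2.0, 2.0])

a_unit = a / np.linalg.norm(a)    # row normalization: unit length
b_unit = b / np.linalg.norm(b)

cos_sim = np.dot(a_unit, b_unit)  # 1 = same direction, 0 = orthogonal
cos_dist = 1.0 - cos_sim
print(cos_sim, cos_dist)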
-
Correlation
-
Spearman's rank correlation coefficient
Uses ranks rather than actual values in the Pearson correlation formula.
Outliers do not affect it much.
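A small sketch using SciPy's pearsonr and spearmanr on made-up data with one outlier, to show that the rank-based coefficient is barely affected.

import numpy as np
from scipy.stats import pearsonr, spearmanr

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([2, 4, 6, 8, 10, 1000])   # last value is an outlier

print(pearsonr(x, y)[0])    # pulled around by the outlier
print(spearmanr(x, y)[0])   # uses ranks, so it stays at 1.0 for this monotonic data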
-
Learning algorithms
Unstable algorithms (small changes in the training data can change the fitted model a lot; see the sketch after this list)
- Neural network
- Decision tree
- Regression tree
Stable algorithms (little affected by small changes in the training data)
- K-nearest neighbour
- SVM
- Linear regression
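A rough sketch of what "unstable" means here: refit a decision tree and a linear regression on bootstrap resamples and compare how much their predictions move. The dataset and the two scikit-learn models are illustrative assumptions, not part of the original notes.

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)
x_test = np.linspace(0, 10, 50).reshape(-1, 1)

def prediction_spread(model):
    preds = []
    for _ in range(30):                       # bootstrap resamples
        idx = rng.integers(0, len(X), len(X))
        preds.append(model.fit(X[idx], y[idx]).predict(x_test))
    return np.mean(np.var(preds, axis=0))     # average prediction variance across resamples

print("tree  :", prediction_spread(DecisionTreeRegressor()))   # higher -> unstable
print("linear:", prediction_spread(LinearRegression()))        # lower  -> stable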
-
Bias and Variance
Variance
High Variance:
- Overfitting
- Training error << test error
-
Bias
High bias:
- Underfitting
- Oversimplification
- High training error
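A short sketch contrasting the two failure modes with polynomial fits of different degrees on made-up noisy data (the degrees and data are assumptions): degree 1 tends to show high bias, degree 12 high variance.

import numpy as np

rng = np.random.default_rng(1)
x_train = np.sort(rng.uniform(0, 3, 20))
x_test = np.sort(rng.uniform(0, 3, 20))

def true_f(x):
    return np.sin(2 * x)

y_train = true_f(x_train) + rng.normal(scale=0.2, size=20)
y_test = true_f(x_test) + rng.normal(scale=0.2, size=20)

for degree in (1, 12):                         # 1 = too simple, 12 = very flexible
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # degree 1:  underfits -> high training error (high bias)
    # degree 12: overfits  -> training error typically << test error (high variance)
    print(degree, train_err, test_err)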