Classification Logistic regression - Coggle Diagram
Classification
Logistic regression
When the dependent variable (the target) is a category rather than a number, we use a classification technique.
Logistic regression looks for the logit model that best fits our data in order to solve classification problems.
Sigmoid curve
y = f(a + bx)
Logistic Regression or Logit function
Logistic regression models the probabilities for classification problems with two possible outcomes. It is an extension of the linear regression model to classification problems.
Value from 0 to 1
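A minimal sketch of the sigmoid curve y = f(a + bx) described above, assuming f is the standard logistic function 1 / (1 + e^-z). The names sigmoid and predict_proba are illustrative and do not come from the diagram.

```python
import math

def sigmoid(z: float) -> float:
    """Squash any real number into the range 0 to 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x: float, a: float, b: float) -> float:
    """y = f(a + b*x), where f is the sigmoid."""
    return sigmoid(a + b * x)

# a is the intercept and b the slope of the linear part (hypothetical values).
p = predict_proba(3.0, a=-1.0, b=2.0)
```

Whatever the values of a, b and x, the result stays between 0 and 1, which is why it can be read as a probability.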
Finding the best curve
The best a and b of the formula
Minimizing Log Loss or Cross Entropy
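One way to find the best a and b is sketched below: plain gradient descent on the average log loss. This is an illustration under assumptions, not necessarily how any particular library fits the model; the learning rate, step count and toy data are made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(xs, ys, a, b, eps=1e-12):
    """Average cross entropy between labels ys (0 or 1) and predictions."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Clip the probability so log() never sees exactly 0 or 1.
        p = min(max(sigmoid(a + b * x), eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)

def fit(xs, ys, lr=0.1, steps=2000):
    """Gradient descent on the log loss, starting from a = b = 0."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(a + b * x) - y for x, y in zip(xs, ys)]
        grad_a = sum(errors) / n
        grad_b = sum(e * x for e, x in zip(errors, xs)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Toy data (invented): larger x tends to go with class 1, with some overlap.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
a, b = fit(xs, ys)
```

After fitting, log_loss(xs, ys, a, b) is lower than the loss at the starting point a = b = 0, and the slope b is positive because class 1 becomes more likely as x grows.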
Odds
Probability of the event divided by the probability of no event: odds = p / (1 - p).
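A short sketch of how odds relate to probabilities. The logarithm of the odds is the logit, and applying the sigmoid to the logit recovers the probability. Function names are illustrative.

```python
import math

def odds(p: float) -> float:
    """Probability of the event divided by probability of no event."""
    return p / (1.0 - p)

def log_odds(p: float) -> float:
    """The logit: natural log of the odds."""
    return math.log(odds(p))

def prob_from_log_odds(z: float) -> float:
    """Inverse of the logit, i.e. the sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

# A probability of 0.8 corresponds to odds of about 4 to 1.
o = odds(0.8)
```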
Advantages
give probabilities
extends easily to multiple classes
quick to train
Disadvantages
linear boundaries
assumes variables are independent
coefficient interpretation is difficult
Multiple Dimensions
Thresholds
Value used to convert probabilities to classes (classification)
Depending on the cost of each kind of misclassification, the threshold is drawn in a different place
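A minimal sketch of applying a threshold to predicted probabilities. The probabilities and threshold values are invented for illustration.

```python
def classify(probs, threshold=0.5):
    """Turn probabilities into class labels: 1 if p >= threshold, else 0."""
    return [1 if p >= threshold else 0 for p in probs]

probs = [0.1, 0.4, 0.55, 0.8]

default_labels = classify(probs)        # [0, 0, 1, 1]
cautious_labels = classify(probs, 0.3)  # [0, 1, 1, 1]
```

Lowering the threshold flags more cases as positive, which suits situations where missing a positive is more costly than raising a false alarm.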
Confusion Matrix
Each threshold yields a different confusion matrix
A confusion matrix is a performance measurement for machine learning classification.
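The sketch below counts the four cells of a binary confusion matrix at a given threshold, to show how the counts shift as the threshold moves. The labels and probabilities are made up for the example.

```python
def confusion_matrix(y_true, probs, threshold):
    """Count TP, FP, TN, FN for binary labels at one threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

y_true = [0, 0, 1, 1, 0, 1]
probs = [0.2, 0.6, 0.4, 0.9, 0.1, 0.7]

at_050 = confusion_matrix(y_true, probs, 0.5)  # TP=2, FP=1, TN=2, FN=1
at_030 = confusion_matrix(y_true, probs, 0.3)  # TP=3, FP=1, TN=2, FN=0
```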
Performance measures
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Sensitivity or Recall or TPR = TP / (TP + FN)
Precision = TP / (TP + FP)
Specificity or TNR = TN / (TN + FP)
F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
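The measures above can be computed directly from the four confusion matrix counts, as in this sketch. Returning 0.0 when a denominator is zero is a convention chosen for the example, and the counts are invented.

```python
def metrics(tp, tn, fp, fn):
    """Compute the standard performance measures from confusion matrix counts."""
    def ratio(num, den):
        # Convention for this sketch: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # sensitivity, TPR
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "recall": recall,
        "precision": precision,
        "specificity": ratio(tn, tn + fp),  # TNR
        "f1": ratio(2 * precision * recall, precision + recall),
    }

# Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN.
m = metrics(tp=40, tn=45, fp=5, fn=10)
```

With these counts, accuracy is 0.85, recall 0.8, specificity 0.9, precision 40/45 and F1 80/95.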
ROC Curves
AUC
Area under the ROC curve, a value from 0 to 1 (often quoted as a percentage)
The larger the area, the better the model
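A sketch of one way to compute AUC without drawing the curve: the area under the ROC curve equals the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. This pairwise form is chosen here for clarity rather than speed, and the data are invented.

```python
def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1, 0, 1]
scores = [0.2, 0.6, 0.4, 0.9, 0.1, 0.7]

area = auc(y_true, scores)  # 8 of 9 pairs are ranked correctly
```

A model that ranks every positive above every negative gets 1.0, while random scoring gives about 0.5.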
Probit Regression
Uses the cumulative distribution function of the normal distribution as the link
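For comparison with the sigmoid, a minimal sketch of the probit idea: the same linear part a + bx is passed through the standard normal cumulative distribution function instead. The names and parameter values are illustrative assumptions.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_proba(x: float, a: float, b: float) -> float:
    """Probit model: probability = Phi(a + b*x)."""
    return normal_cdf(a + b * x)

# Hypothetical coefficients, only to show the output is a probability.
p = probit_proba(1.0, a=-0.5, b=1.0)
```

Like the sigmoid, the normal CDF is an S-shaped curve that gives 0.5 at zero and stays between 0 and 1.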