Chapter 7: Decision Analytics Thinking I : (Evaluating Classifiers (The…
Ex: Binary Classification -> Positive or Negative
Plain Accuracy and Its Problems
Accuracy = (# of correct decisions made) / (total # of decisions made) = 1 - error rate
Classification Accuracy: sometimes used informally to mean any general measure of classifier performance; in these notes it has the specific meaning defined above
Pros: popular metric and easy to measure
Cons: too simplistic for applications of data mining techniques to real business problems
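As a minimal sketch of the accuracy formula, assuming predictions and true classes are available as two equal-length lists (the labels and the helper name `accuracy` are made up for illustration):

```python
def accuracy(predicted, actual):
    """Fraction of decisions where the prediction matches the true class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one decision")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels, for illustration only.
actual = ["p", "n", "n", "p", "n"]
predicted = ["p", "n", "p", "p", "n"]

acc = accuracy(predicted, actual)   # 4 of 5 decisions are correct
error_rate = 1 - acc
```

The single number hides which kind of mistake was made, which is the weakness the rest of these notes address.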
The Confusion Matrix
Separates out the decisions made by the classifier, making explicit how one class is being confused for another.
For a problem involving n classes: it is an n x n matrix with the columns labeled with actual classes and the rows labeled with predicted classes.
Errors of the Classifier
False Positives: negative instances classified as positive
False Negatives: positive instances classified as negative
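The four cells of a two-class confusion matrix can be tallied roughly as follows. This is a sketch with made-up decisions; it borrows the Y/N (predicted) and p/n (actual) notation used later in these notes, and `confusion_counts` is a hypothetical helper name.

```python
from collections import Counter


def confusion_counts(predicted, actual):
    """Tally each (predicted, actual) pair; one entry per confusion-matrix cell."""
    return Counter(zip(predicted, actual))


# Hypothetical decisions: predictions are "Y"/"N", true classes are "p"/"n".
predicted = ["Y", "Y", "N", "N", "Y", "N"]
actual = ["p", "n", "p", "n", "p", "n"]

counts = confusion_counts(predicted, actual)
true_positives = counts[("Y", "p")]
false_positives = counts[("Y", "n")]   # negatives classified as positive
false_negatives = counts[("N", "p")]   # positives classified as negative
true_negatives = counts[("N", "n")]
```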
Problems with Unbalanced Classes
As class distribution becomes more skewed, evaluation based on accuracy breaks down
Accuracy is the wrong thing to measure
Two classifiers can have the same accuracy yet operate differently. Classifier A: falsely predicts that customers will churn when they will not. Classifier B: falsely predicts that customers will not churn when in fact they will.
Better Model? Classifier B!
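A small sketch may help show how two classifiers with equal accuracy on a balanced sample can diverge once the classes are skewed. The per-class hit rates below are hypothetical, chosen only so that A makes false positive errors and B makes false negative errors; `accuracy_at_prior` is an illustrative helper, not something defined in the chapter.

```python
def accuracy_at_prior(tp_rate, tn_rate, positive_prior):
    """Accuracy when a fraction `positive_prior` of the instances is positive."""
    return positive_prior * tp_rate + (1 - positive_prior) * tn_rate


# Hypothetical (true positive rate, true negative rate) for each classifier.
# A never misses a churner but raises false alarms; B does the opposite.
classifiers = {"A": (1.0, 0.6), "B": (0.6, 1.0)}

# Balanced sample: half the customers churn.
balanced = {name: accuracy_at_prior(tpr, tnr, 0.5)
            for name, (tpr, tnr) in classifiers.items()}

# Skewed population: assume only 10% of customers churn.
skewed = {name: accuracy_at_prior(tpr, tnr, 0.1)
          for name, (tpr, tnr) in classifiers.items()}
```

Under these assumed rates both classifiers score about 0.80 on the balanced sample, while on the skewed population A drops to about 0.64 and B rises to about 0.96. Accuracy alone says nothing about which kind of error each one makes.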
Problems with Unequal Costs and Benefits
Classification accuracy as a metric makes no distinction between false positive and false negative errors
The two errors are very different, have different costs, and should be counted separately
Solution: estimate the cost or benefit of each decision a classifier can make; aggregated, these give the expected profit (or expected benefit or expected cost)
Generalizing Beyond Classification
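For regression models, R^2 is an important measure of performance, but as with accuracy, it should be checked against the goal of the application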
A Key Analytical Framework: Expected Value
Decomposes data-analytic thinking into:
1.) the structure of the problem
2.) the elements of the analysis that can be extracted from the data
3.) the elements of the analysis that need to be acquired from other sources
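Expected value: the weighted average of the values of the different possible outcomes, where the weight given to each value is its probability of occurrence.
Equation: EV = p(o1)·v(o1) + p(o2)·v(o2) + p(o3)·v(o3) + ... where each oi is a possible outcome, p(oi) is its probability, and v(oi) is its value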
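The weighted average described above can be sketched in a few lines. The probabilities and values are invented for illustration, and `expected_value` is a hypothetical helper name.

```python
import math


def expected_value(outcomes):
    """Sum of probability * value over (probability, value) pairs.

    The pairs are assumed to cover every possible outcome, so the
    probabilities must sum to 1.
    """
    outcomes = list(outcomes)
    total_p = sum(p for p, _ in outcomes)
    if not math.isclose(total_p, 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(p * v for p, v in outcomes)


# Hypothetical: a 5% chance of an outcome worth 99 and a 95% chance of losing 1.
ev = expected_value([(0.05, 99.0), (0.95, -1.0)])
```

With these assumed numbers the expected value is about 4, positive even though the favorable outcome is rare.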
Using Expected Value to Frame Classifier Use
Expected value provides a framework for deciding how to act on each instance; the probability of each outcome is estimated from historical data
Using Expected Value to Frame Classifier Evaluation
Compare models by how well each performs, measured by its expected value
Evaluating: the probabilities can be estimated from the tallies in the confusion matrix by computing the rates of errors and correct decisions
Count(h,a): each cell of the confusion matrix contains a count of the number of decisions corresponding to the combination of (predicted, actual)
We reduce these counts to rates or estimated probabilities p(h,a) by dividing each count by the total number of instances: p(h,a) = count(h,a) / T, where T is the total number of instances
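Turning counts into rates might look like the sketch below. The counts are hypothetical, cells are keyed as (predicted, actual), and `rates_from_counts` is an illustrative name.

```python
def rates_from_counts(counts):
    """Divide each confusion-matrix count by the total number of instances."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return {cell: count / total for cell, count in counts.items()}


# Hypothetical counts keyed by (predicted, actual).
counts = {("Y", "p"): 30, ("Y", "n"): 10, ("N", "p"): 20, ("N", "n"): 40}

rates = rates_from_counts(counts)   # estimated p(h, a) for each cell
```

Because every instance falls into exactly one cell, the resulting rates sum to 1.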
Costs and Benefits
Correct Classifications (true positives and true negatives): correspond to the benefits b(Y,p) and b(N,n), respectively
Incorrect Classifications (false positives and false negatives): correspond to the "benefits" b(Y,n) and b(N,p), respectively, which may well be costs (negative benefits) and are then referred to as costs c(Y,n) and c(N,p)
Important Note: while the probabilities can be estimated from data, the costs and benefits often cannot
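Combining the two ingredients, expected profit is the sum over cells of p(h,a) times b(h,a). The sketch below assumes both are given as dictionaries keyed by (predicted, actual). All numbers are invented, and the benefit values in particular would have to come from business knowledge rather than from the data.

```python
def expected_profit(rates, benefits):
    """Sum of p(h, a) * b(h, a) over all confusion-matrix cells."""
    if set(rates) != set(benefits):
        raise ValueError("rates and benefits must cover the same cells")
    return sum(rates[cell] * benefits[cell] for cell in rates)


# Hypothetical estimated probabilities p(h, a), keyed by (predicted, actual).
rates = {("Y", "p"): 0.3, ("Y", "n"): 0.1, ("N", "p"): 0.2, ("N", "n"): 0.4}

# Hypothetical benefits b(h, a); the false positive carries a cost (negative benefit).
benefits = {("Y", "p"): 99.0, ("Y", "n"): -1.0, ("N", "p"): 0.0, ("N", "n"): 0.0}

profit = expected_profit(rates, benefits)
```

Two models evaluated with the same benefit table can then be compared on `profit` instead of on accuracy.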
Common way of expressing expected profit
Factor out the probabilities of seeing each class, referred to as the class priors
Rule of basic probability: p(x,y) = p(y) · p(x|y)
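Applying the product rule p(x, y) = p(y) p(x | y) to each cell of the confusion matrix gives the factored form sketched below. Bold p and n denote the actual classes, and b is used throughout, with b(Y,n) and b(N,p) typically negative (that is, costs).

```latex
\begin{align*}
\text{Expected profit}
  &= p(Y,\mathbf{p})\, b(Y,\mathbf{p}) + p(N,\mathbf{p})\, b(N,\mathbf{p})
   + p(N,\mathbf{n})\, b(N,\mathbf{n}) + p(Y,\mathbf{n})\, b(Y,\mathbf{n}) \\
  &= p(\mathbf{p}) \left[ p(Y \mid \mathbf{p})\, b(Y,\mathbf{p})
                        + p(N \mid \mathbf{p})\, b(N,\mathbf{p}) \right] \\
  &\quad + p(\mathbf{n}) \left[ p(N \mid \mathbf{n})\, b(N,\mathbf{n})
                              + p(Y \mid \mathbf{n})\, b(Y,\mathbf{n}) \right]
\end{align*}
```

The bracketed terms depend only on how the classifier behaves within each class, so the same model can be re-evaluated under a different class mix by changing only the priors.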
Evaluation, Baseline Performance, and Implications for Investments in Data
Important Note: consider carefully what would be a reasonable baseline against which to compare model performance
Good Baseline: Majority Classifier
Naive classifier that always chooses the majority class of the training dataset
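A majority-class baseline takes only a few lines. The sketch below uses invented label lists; `majority_class` is a hypothetical helper name.

```python
from collections import Counter


def majority_class(training_labels):
    """Return the most frequent class in the training labels."""
    if not training_labels:
        raise ValueError("need at least one training label")
    return Counter(training_labels).most_common(1)[0][0]


# Hypothetical skewed data: 10% positives in both training and test sets.
train_labels = ["n"] * 90 + ["p"] * 10
test_labels = ["n"] * 18 + ["p"] * 2

baseline = majority_class(train_labels)   # always predict this class
baseline_accuracy = sum(label == baseline for label in test_labels) / len(test_labels)
```

On this assumed data the baseline reaches 0.9 accuracy without learning anything, so a model must beat that figure to be worth its cost.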