Provost - Chapter 7
Expected Value
Remember to address questions: What is the goal? What is important in the application? Are we assessing the results of data mining appropriately given the goal?
The expected value is the weighted average of the values of the different possible outcomes, where the weight given to each value is its probability of occurrence.
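Written out as a formula, the weighted average described above might look like the following. The symbols are notation assumed here for illustration: o_i is a possible outcome, p(o_i) its probability, and v(o_i) its value.

```latex
EV = p(o_1)\,v(o_1) + p(o_2)\,v(o_2) + \dots = \sum_i p(o_i)\,v(o_i)
```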
Comparing models
The set of decisions made by a model needs to be evaluated on a set of examples before one model is compared to another.
Each model will make some decisions better than the other model - what matters is how well each model does IN AGGREGATE
Costs & Benefits
The cost-benefit matrix specifies, for each (predicted, actual) pair, the cost or benefit of making such a decision.
Probabilities can be estimated from data, but the costs and benefits often cannot.
Costs and benefits depend on external information provided via analysis of the consequences of decisions in the context of the specific business problem.
A false positive occurs when we classify a consumer as a likely responder and therefore target her, but she does not respond (Benefit is negative - money spent)
A false negative is a consumer who was predicted not to be a likely responder (so she wasn't offered the product), but who would have bought it if offered (Nothing gained, nothing lost)
A true negative is a consumer who was not offered a deal and who would not have bought it even if it had been offered (Nothing gained, nothing lost)
The cost-benefit matrix is multiplied cell by cell with the matrix of probabilities, and the products are summed into a final value = total expected profit
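A minimal sketch of that calculation, assuming a two-class targeting problem. The counts and the dollar values below are made-up illustrative numbers, not taken from these notes, and the function name `expected_profit` is hypothetical. Each cell's probability is estimated as its share of the test-set counts, then multiplied by the matching cost or benefit.

```python
# Hypothetical confusion-matrix counts on a test set, keyed by (predicted, actual).
# "Y"/"N" = predicted respond / not respond; "p"/"n" = actual positive / negative.
counts = {
    ("Y", "p"): 56,  # true positives
    ("Y", "n"): 7,   # false positives
    ("N", "p"): 5,   # false negatives
    ("N", "n"): 42,  # true negatives
}

# Hypothetical cost-benefit matrix with the same keys (values are assumptions).
benefit = {
    ("Y", "p"): 99.0,  # targeted and responded: profit from the sale
    ("Y", "n"): -1.0,  # targeted but did not respond: money spent
    ("N", "p"): 0.0,   # not targeted: nothing gained, nothing lost
    ("N", "n"): 0.0,   # not targeted: nothing gained, nothing lost
}


def expected_profit(counts, benefit):
    """Sum of p(predicted, actual) * b(predicted, actual) over all cells."""
    total = sum(counts.values())
    return sum((n / total) * benefit[cell] for cell, n in counts.items())


print(f"Expected profit per consumer: {expected_profit(counts, benefit):.2f}")
```

The result is an expected profit per instance. Comparing two models on the same test set and the same cost-benefit matrix then comes down to comparing these two numbers.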
Class priors - the probabilities of seeing each class. Factoring these out of the expected value calculation allows us to separate the influence of class imbalance from the fundamental predictive power of the model.
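One way to write the factoring, using the rule p(x, y) = p(y) p(x | y). The notation is assumed here for illustration: Y and N are the predicted classes, p and n the actual classes, and b(., .) the entries of the cost-benefit matrix.

```latex
\text{Expected profit}
  = p(\mathbf{p})\,\bigl[\, p(Y \mid \mathbf{p})\, b(Y, \mathbf{p}) + p(N \mid \mathbf{p})\, b(N, \mathbf{p}) \,\bigr]
  + p(\mathbf{n})\,\bigl[\, p(N \mid \mathbf{n})\, b(N, \mathbf{n}) + p(Y \mid \mathbf{n})\, b(Y, \mathbf{n}) \,\bigr]
```

The priors p(p) and p(n) carry the class imbalance, while the conditional terms inside the brackets describe how well the model does within each class.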
Evaluating Classifiers
Good Outcome - Referred to as negative: the instance is unworthy of attention (e.g. in fraud detection, only legitimate activity, so nothing is detected)
A classification model takes an instance for which we do not know the class and predicts its class.
Evaluation, Baseline Performance, and Implications for Investments in Data
It is important to consider carefully what would be a reasonable baseline against which to compare model performance.
^ This is important for the data science team in order to understand whether they are indeed improving performance
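As an illustration of a baseline, one common choice (an assumption here, since the notes do not name a specific one) is a classifier that always predicts the majority class seen in training. The function name `majority_baseline_accuracy` and the labels are hypothetical.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Always predict the most frequent training class; return (class, test accuracy)."""
    majority_class, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for y in test_labels if y == majority_class)
    return majority_class, correct / len(test_labels)


# Made-up labels: "n" = non-responder, "p" = responder.
train = ["n"] * 9 + ["p"]
test = ["n", "n", "p", "n"]
print(majority_baseline_accuracy(train, test))
```

A model is only worth further investment if it clearly beats a baseline of this kind. On heavily imbalanced data the majority baseline can already score a high accuracy, so the comparison may be more informative on expected profit than on accuracy.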