Decision Analytic Thinking
Goals
What am I trying to prove?
What am I trying to make sense of?
Find reasonable baseline against which to compare model performance
persistence: the weather tomorrow is going to be whatever it was today
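A persistence baseline can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the helper name `persistence_forecast` and the weather labels are made up for the example.

```python
def persistence_forecast(series):
    """Forecast each day as a copy of the previous day's observation.

    Hypothetical helper: returns forecasts for days 2..n of `series`.
    """
    return series[:-1]


# Illustrative data only, not taken from the notes.
observed = ["sun", "sun", "rain", "rain", "sun"]

predicted = persistence_forecast(observed)  # forecasts for days 2..5
actual = observed[1:]                       # what really happened on days 2..5

hits = sum(p == a for p, a in zip(predicted, actual))
baseline_accuracy = hits / len(actual)
print(baseline_accuracy)
```

Any model worth keeping should beat the score this baseline achieves on the same data.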
Evaluate Classifiers
How do we determine how well model works?
Accuracy
Classifier accuracy: sometimes used informally for any general measure of classifier performance; technically, the proportion of correct decisions.
= 1 - error rate
common evaluation metric in data mining because it reduces classifier performance to a single number.
Often not the right thing to measure (e.g. unbalanced classes, errors with unequal costs)
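As a small sketch of the relation accuracy = 1 - error rate, assuming labels are held in plain Python lists (the function name and the data are illustrative):

```python
def accuracy(actual, predicted):
    """Proportion of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Illustrative labels only.
actual = [1, 0, 0, 1, 0]
predicted = [1, 0, 1, 1, 0]

acc = accuracy(actual, predicted)  # 4 of 5 decisions are correct
error_rate = 1 - acc
print(acc, error_rate)
```

Both numbers collapse every kind of mistake into one figure, which is why they say nothing about which errors were made.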
Confusion Matrix
for a problem involving n classes: an n x n matrix (columns: actual classes; rows: predicted classes)
Main diagonal: correct decisions
Why isn't my data the same as hers?
classes are unbalanced
Is the data the same?
Did you collaborate along the way?
Off-diagonal entries: errors (false positives / false negatives)
unusual data skews the model
Differentiating errors
false pos + false neg = errors
carry different weights
false pos for cancer vs false neg for cancer
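A confusion matrix with the orientation used in these notes (rows: predicted, columns: actual) might be built like this. The function name, class labels, and data are assumptions for illustration. Some libraries use the transposed orientation, so check before comparing outputs.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Return an n x n list of lists: rows = predicted class, columns = actual class."""
    counts = Counter(zip(predicted, actual))
    return [[counts[(p, a)] for a in classes] for p in classes]


# Illustrative labels only.
actual = ["pos", "pos", "neg", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "neg", "pos"]

matrix = confusion_matrix(actual, predicted, ["pos", "neg"])

# Main diagonal holds correct decisions; the rest are errors.
true_pos, false_pos = matrix[0]
false_neg, true_neg = matrix[1]
print(matrix)
```

Keeping false positives and false negatives in separate cells is what allows them to be weighted differently later.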
Majority Classifier
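A majority classifier always predicts the most common class. The sketch below, with made-up labels and a hypothetical helper name, shows why it is a useful baseline when classes are unbalanced. For brevity it scores the baseline on the same labels it was fitted to.

```python
from collections import Counter


def majority_class(train_labels):
    """Return the most frequent label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


# Illustrative unbalanced data: 9 negatives, 1 positive.
labels = ["neg"] * 9 + ["pos"]

guess = majority_class(labels)
majority_accuracy = sum(label == guess for label in labels) / len(labels)
print(guess, majority_accuracy)
```

Here the baseline reaches high accuracy without ever finding a positive case, so a model's accuracy only means something relative to this figure.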
Evaluate Regression
What is mean squared error?
Is there a better metric?
gives us a value, not a class
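Mean squared error is the average of the squared differences between predicted and actual values. A minimal sketch, with illustrative numbers and an assumed function name:

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only.
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]

mse = mean_squared_error(actual, predicted)  # (1 + 0 + 4) / 3
print(mse)
```

Squaring penalizes large misses more heavily than small ones, which may or may not match what the business cares about.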
Expected Value
weighted avg of values of different outcomes
weight is probability
business values/rules normally acquired
probability can be inferred
Measures how well each model does IN AGGREGATE
DataSet
Induction Algorithm
Model
Confusion Matrix
Expected Rates
Expected Values
Cost/Benefit info
can't be estimated from data
Cost Benefit Matrix
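The chain from confusion matrix to expected value can be sketched as follows: convert counts to rates, then weight each cell's cost or benefit by its rate. All numbers here are hypothetical, and the cost/benefit values in particular would come from business knowledge, not from the data.

```python
def expected_value(confusion_counts, cost_benefit):
    """Probability-weighted average of cell values.

    Both arguments are n x n lists with rows = predicted, columns = actual.
    """
    total = sum(sum(row) for row in confusion_counts)
    return sum(
        (count / total) * value
        for count_row, value_row in zip(confusion_counts, cost_benefit)
        for count, value in zip(count_row, value_row)
    )


# Hypothetical counts: [[TP, FP], [FN, TN]].
confusion_counts = [[30, 10], [20, 40]]

# Hypothetical values per decision: benefit 99 for a true positive,
# cost 1 for a false positive, nothing when predicting negative.
cost_benefit = [[99.0, -1.0], [0.0, 0.0]]

ev = expected_value(confusion_counts, cost_benefit)  # 0.3 * 99 + 0.1 * (-1)
print(ev)
```

Running the same calculation for each candidate model gives one comparable number per model.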
Class priors
pitfalls
Precision and Recall
Sensitivity and Specificity
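These four measures all come from the cells of a two-class confusion matrix. A minimal sketch using the standard definitions, with an assumed function name and illustrative counts:

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard rates derived from two-class confusion matrix counts."""
    return {
        "precision": tp / (tp + fp),    # of predicted positives, how many are right
        "recall": tp / (tp + fn),       # of actual positives, how many are found
        "sensitivity": tp / (tp + fn),  # same quantity as recall
        "specificity": tn / (tn + fp),  # of actual negatives, how many are found
    }


# Illustrative counts only.
metrics = binary_metrics(tp=30, fp=10, fn=20, tn=40)
print(metrics)
```

Recall and sensitivity are two names for the same rate, so the two pairs differ only in their second member.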
training portion of data taken from dataset
used to create model
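Taking a training portion from a dataset might look like the sketch below. The 80/20 fraction, the fixed seed, and the helper name are assumptions for illustration, not something the notes specify.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and cut it into training and held-out parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Illustrative dataset: ten row identifiers.
train, test = train_test_split(range(10))
print(len(train), len(test))
```

The training part is what the induction algorithm sees; the held-out part is kept back for evaluating the resulting model.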