Larson Chapters 2-7
Ch 2: Automating Machine Learning
AutoML: any machine learning system that automates the repetitive tasks required for effective machine learning
Saves time on process design
Improves efficiency when monitoring hundreds of different algorithms and their hyperparameters
Context-specific tools
implemented within another system or for a specific purpose
General platforms
designed for general purpose machine learning
Open source
Tools are developed by and for data scientists
Programming knowledge usually required
AutoML knowledge usually required
Not the best for visualizing results and making decisions
Commercial
8 Criteria for Machine Learning excellence (2.4)
Accuracy
Productivity
Ease of use
Understanding and learning
Resource availability
Process transparency
Generalization
Recommend actions
Ch 4: Acquire Subject Matter Expertise
Helps explain what features mean
Helps set realistic expectations for model performance
Suggests ideas for data collection
Involve subject matter experts as early in the project as possible
Ch 6: Define Prediction Target
Sometimes data needs to "mature" before collection, as is the case with the Lending Club data
Without a target, there is simply no way for humans or machines to learn what associations drive an outcome
Classification (our focus)
predicts the category to which a new case belongs
carefully examine the possible states in which a variable can exist and consider whether they can be simplified
Regression
if simplified into quantitative categories, or buckets, a regression target can become a classification one
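The bucketing idea above can be sketched in a few lines. This is only an illustration: the function name bucket_target, the cut points, and the labels are all assumptions chosen for the example, not values from the book.

```python
from bisect import bisect_right


def bucket_target(value, edges, labels):
    """Map a numeric target value to a category label.

    edges must be sorted ascending; labels needs one more entry than edges.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    # bisect_right counts how many edges are <= value, which is the bucket index.
    return labels[bisect_right(edges, value)]


# Hypothetical example: an interest rate (percent) turned into three classes.
EDGES = [10.0, 15.0]               # assumed cut points, for illustration only
LABELS = ["low", "medium", "high"]

rates = [7.5, 10.0, 12.3, 18.9]
classes = [bucket_target(r, EDGES, LABELS) for r in rates]
print(classes)
```

A value that falls exactly on a cut point goes to the upper bucket here; that is a design choice of this sketch, and the opposite convention would work equally well as long as it is applied consistently.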
Ch 7: Success, Risk, and Continuation
Identify success criteria
Who will use the model?
Is management on board with the project?
Can the model drivers be visualized?
How much value can the model produce?
Foresee risks
Model
Ethical
Cultural
Environmental
Decide whether to continue
Hallmarks of analytics competitors
Fact-based decision-making is part of the organizational culture
Copious data collection and analysis are performed, and results are shared both internally and with partners and customers
The top management team recognizes and drives use of analytics and innovative measures of success
A “test and learn” culture exists where opportunities to learn from data and test hypotheses are a natural part of the workplace environment
Analytics is used not only for core capabilities, but also in related functions like Marketing and Human Resources
Ch 5: Decide on Unit of Analysis
A unit of analysis is the what, who, where, and when of our analysis
For each unit of analysis, there can be numerous outcomes that we might want to predict
It helps to identify the prediction target first; the unit of analysis then usually becomes clear
The unit of analysis tells you which feature should serve as the identity (ID) column
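One way to sanity-check a chosen unit of analysis is to confirm that its identity column has exactly one row per unit. The sketch below assumes a small, made-up loan table; the helper duplicated_ids and the column names loan_id and borrower_id are hypothetical.

```python
from collections import Counter


def duplicated_ids(rows, id_column):
    """Return the ID values that appear on more than one row, sorted."""
    counts = Counter(row[id_column] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical data: if the unit of analysis is one loan, loan_id should be unique.
loans = [
    {"loan_id": 101, "borrower_id": 1, "amount": 5000},
    {"loan_id": 102, "borrower_id": 1, "amount": 12000},
    {"loan_id": 103, "borrower_id": 2, "amount": 8000},
]

loan_dupes = duplicated_ids(loans, "loan_id")          # loan-level check
borrower_dupes = duplicated_ids(loans, "borrower_id")  # borrower-level check
print(loan_dupes, borrower_dupes)
```

In this toy table the loan-level check finds no repeats, while the borrower-level check flags the borrower with two loans, which suggests the rows describe loans rather than borrowers.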
Ch 3: Specify Business Problem/Opportunity
Any proposed project should be evaluated against three criteria:
Does the project statement specify actions that should result from the project?
How could solving this problem impact the bottom line?
Is the project statement presented in the language of business?