Larsen - Ch. 2-7 (p.30-70)
Ch2
Machine Learning Life Cycle
Define Project Objectives
Acquire and explore data
Model data
Interpret & communicate
Implement, Document, & maintain
Not a linear process - future discoveries or interpretations may upend the original objectives
Key productivity increasers
Exploratory data analysis - Return all stats + relationships
Feature engineering - Clean the data, deal with NULLS, etc
Choose an accurate algorithm - Then parameter tuning
Model diagnostics - Evaluate the top models and probability cutoffs
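Sketch of the first two productivity steps above (exploratory stats, then dealing with NULLs). This is only an illustration under assumptions: the rows, column names, and the helper names `summarize` and `impute_median` are made up, and median imputation is just one possible way to handle missing values, not necessarily what an AutoML tool does.

```python
from statistics import mean, median

# Hypothetical toy rows; None marks a missing value (a NULL).
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 45, "income": None},
    {"age": 29, "income": 48000},
]


def summarize(rows, column):
    """Exploratory step: count, missing count, mean, and median of one column."""
    present = [r[column] for r in rows if r[column] is not None]
    return {
        "count": len(present),
        "missing": len(rows) - len(present),
        "mean": mean(present),
        "median": median(present),
    }


def impute_median(rows, column):
    """Feature-engineering step: replace missing values with the column median."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


age_summary = summarize(rows, "age")
cleaned = impute_median(impute_median(rows, "age"), "income")
```

Summarizing before imputing keeps the missing-value count visible, which ties in with the process-transparency criterion (what manipulated/cleaned the data?).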
This is not automatic analysis
Worthy questions and model evaluation skills are required
Python is used for general ML platforms
Learn: Auto-Sklearn
8 Criteria for AutoML Excellence
Understanding and Learning
Explain your findings, front and back
Process Transparency
Instill trust in the system - what manipulated/cleaned the data?
Ease of Use
Open source
Generalizable across Contexts
Sample down, handle any amount of data, across any field
Productivity
Accurate and quick
Recommend Actions
Build context in so that a decision can be made
Accuracy
Critical. This must have smart selection and ranking algorithms
Resource Availability
Integrate with already fetched/obtainable data
Ch5
UNIT OF ANALYSIS
The four Ws: Who, Where, What, When (nix the Why and the How)
Examples:
Who was readmitted into hospice?
Where will the crime take place?
What ad did the user click on?
When will the machine break down?
Determining
Examine the prediction target
Discover the logical unit of analysis - what the target goal hinges on
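Sketch of what settling on a unit of analysis can look like in practice, using the "Who was readmitted?" example: the raw data has one row per visit, but the target hinges on the patient, so rows get collapsed to one per patient. Everything here is assumed for illustration: the visit log, the field names, and the helper `to_unit_of_analysis`.

```python
from collections import defaultdict

# Hypothetical visit log: several rows per patient.
visits = [
    {"patient_id": "p1", "days_in_care": 3, "readmitted": False},
    {"patient_id": "p1", "days_in_care": 5, "readmitted": True},
    {"patient_id": "p2", "days_in_care": 2, "readmitted": False},
]


def to_unit_of_analysis(visits):
    """Collapse visit-level rows into one row per patient (the assumed unit)."""
    grouped = defaultdict(list)
    for visit in visits:
        grouped[visit["patient_id"]].append(visit)
    return [
        {
            "patient_id": pid,
            "n_visits": len(rows),
            "total_days": sum(r["days_in_care"] for r in rows),
            "ever_readmitted": any(r["readmitted"] for r in rows),
        }
        for pid, rows in grouped.items()
    ]


patients = to_unit_of_analysis(visits)
```

If the question were instead about each individual visit, the visit rows would already be the unit and no aggregation would be needed.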
Ch3
Start with a Business Problem
"I want to predict....."
Grounded in requisite data, backed and approved by stakeholders
Three questions:
How can solving this issue impact the bottom line? It's all about that profit
Does the project statement specify actions that should result from the project?
Is the project statement presented in the language of business?
Ch4
Subject Matter Expertise - And why it's so crucial
The business problem can be about anything - any subject in business
Having knowledge about [accounting/supply chain/medicine] can prevent unseen obstacles
Or can let logical problems be solved without ML - thus cleaning up the data/results
Also you want to educate yourself on the company/workings of the sector
Especially for when it comes time to present the data!
"Hospice care" example
An expert can also assist with data collection
No model will be on the nose
Ch6
Prediction target
Finalized column that shows the predicted target behavior
Worth checking: the data's date and the time since collection for the set
Perhaps omit incomplete rows - EX: Loan payback
Targets are required
As this is how computers (and humans!) 'learn' the data through associations
Titanic data
Two kinds
Classification
Predicts the category an entry will be placed into
Divorced? T/F?
Regression
Target numerical values
How many years were they together?
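Sketch of the two target kinds using the divorce example above, plus the "omit incomplete rows" idea. The records, field names, and the helper `target_kind` are hypothetical, and the type check is a rough heuristic (integer-coded categories would be misread as regression).

```python
# Hypothetical records; None means the outcome is not yet known.
couples = [
    {"id": 1, "divorced": True, "years_together": 7.5},
    {"id": 2, "divorced": False, "years_together": 22.0},
    {"id": 3, "divorced": None, "years_together": None},
]

# Rows without a known target cannot be learned from, so leave them out.
labeled = [c for c in couples if c["divorced"] is not None]

classification_target = [c["divorced"] for c in labeled]       # category: T/F
regression_target = [c["years_together"] for c in labeled]     # numeric value


def target_kind(values):
    """Rough guess at problem type from the target's Python type."""
    # bool is checked first because bool is a subclass of int in Python.
    if all(isinstance(v, (bool, str)) for v in values):
        return "classification"
    if all(isinstance(v, (int, float)) for v in values):
        return "regression"
    return "unknown"


kinds = (target_kind(classification_target), target_kind(regression_target))
```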
Ch7
Success criteria
Factors:
Is management supporting the project?
Can the model drivers be visualized?
Who will use the model?
How much value can the model produce?
Value can only be determined after the data has been cleaned and a model has been considered and evaluated
Risks
Try playing Devil's advocate
Consider facing issues in obtaining data
Types:
Model risks
Ethical risks
Privacy
Cultural risks
Environmental risks
or any biases at play during the collection
Then utilize AutoML
Be wary of black swan data changes
e.g. the May 2010 stock market "Flash Crash"
And then always consider if the project should rightfully move forward