Machine Learning
Learning Methods
Supervised Learning
Regression
Linear regression
Econometrics Approach
Definition
Branch of economics that develops and uses statistical methods for estimating economic relationships
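As a minimal sketch of what "estimating a relationship" can look like, the snippet below fits a one-variable linear regression by ordinary least squares. The function name `fit_simple_ols` and the income/consumption numbers are made up for illustration and do not come from the diagram.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing squared residuals for y ~ a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data (arbitrary units), chosen only to show the call.
income = [10.0, 20.0, 30.0, 40.0]
consumption = [8.0, 15.0, 22.0, 29.0]
intercept, slope = fit_simple_ols(income, consumption)
print(intercept, slope)
```

Here the slope is read as the estimated change in consumption per unit change in income, under the assumption that a straight line is an adequate model.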
Classification
Types
Decision Trees
Definitions
Decision trees are ML algorithms that progressively divide a data set into smaller groups based on a descriptive feature, until the groups are small enough to be described by a single label
In general
The prediction of the algorithm at each terminal node is the category holding the majority of its data points - the most commonly occurring class
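The majority rule at a terminal node can be sketched as follows. This is a single threshold split, not a full tree-building algorithm; the helper names `split_on_threshold` and `leaf_prediction`, and the toy rows, are hypothetical.

```python
from collections import Counter


def split_on_threshold(rows, feature_index, threshold):
    """Divide (features, label) rows into two groups on one feature."""
    left = [r for r in rows if r[0][feature_index] <= threshold]
    right = [r for r in rows if r[0][feature_index] > threshold]
    return left, right


def leaf_prediction(rows):
    """Return the most commonly occurring class among the rows in a leaf."""
    labels = [label for _, label in rows]
    return Counter(labels).most_common(1)[0][0]


# Hypothetical toy data: one numeric feature, a yes/no label.
rows = [((1.0,), "no"), ((2.0,), "no"), ((3.0,), "yes"),
        ((4.0,), "yes"), ((5.0,), "no")]
left, right = split_on_threshold(rows, feature_index=0, threshold=2.5)
print(leaf_prediction(left), leaf_prediction(right))
```

Note that the right-hand leaf still contains one "no" row; the leaf predicts "yes" because that class is the majority there.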
Logistic Regression
Method
Instead of minimizing the average loss, we maximize the likelihood of the training data under our model (maximum likelihood estimation)
The likelihood function describes the joint probability of the observed data as a function of the parameters of the model
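A minimal sketch of maximum likelihood estimation for a one-feature logistic regression, using plain gradient ascent on the log-likelihood. The learning rate, step count, function names, and toy data are all assumptions made for this illustration; real libraries typically use more robust optimizers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w, b, xs, ys):
    """Log of the joint probability of the labels ys given parameters (w, b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit(xs, ys, lr=0.05, steps=2000):
    """Gradient ascent on the log-likelihood, starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the log-likelihood: sum of (label - predicted probability).
        errors = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs))
        grad_b = sum(errors)
        w += lr * grad_w
        b += lr * grad_b
    return w, b


# Hypothetical toy data; the classes overlap, so the maximum is finite.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]
w, b = fit(xs, ys)
print(log_likelihood(0.0, 0.0, xs, ys), log_likelihood(w, b, xs, ys))
```

The fitted parameters should give a higher log-likelihood than the starting point, which is the sense in which they explain the training data better.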
Evaluation Metrics
Precision
When my model says an item is relevant, how likely is it that the item really is relevant? Precision = TP / (TP + FP)
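Precision can be computed directly from true and predicted labels. In this sketch, 1 means "relevant" and 0 means "not relevant"; returning 0.0 when the model predicts no positives is a convention chosen here, not something the diagram specifies.

```python
def precision(y_true, y_pred):
    """Fraction of predicted positives that are actually positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    if tp + fp == 0:
        return 0.0  # assumption: define precision as 0 with no positive predictions
    return tp / (tp + fp)


# Hypothetical labels: the model flags three items, two of them correctly.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
print(precision(y_true, y_pred))
```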
ROC
Definition
A graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.
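To make "as its discrimination threshold is varied" concrete, the sketch below computes the points of a ROC curve by sweeping the threshold over every distinct score. The function name `roc_points` and the scores are hypothetical; each point is a (false positive rate, true positive rate) pair.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs, from the strictest threshold to the loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        # Predict positive whenever the score reaches the threshold.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores for three positives and three negatives.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
print(points)
```

Plotting these points with the false positive rate on the x-axis and the true positive rate on the y-axis gives the ROC curve.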
AUC
Definition
Area under the ROC curve - overall ability of the classifier to discriminate between positive and negative classes
Key characteristics
Interpretability - the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance
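The ranking interpretation can be checked directly by comparing every positive score with every negative score. This brute-force sketch counts ties as half a win, a common convention assumed here; the function name and data are hypothetical.

```python
def auc_by_ranking(y_true, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical scores: 8 of the 9 positive/negative pairs are ordered correctly.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_by_ranking(y_true, scores)
print(auc)
```

This pairwise loop is quadratic in the number of samples, so it is meant to illustrate the definition rather than to be used on large data sets.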
Cross-Validation
Definition
Resampling procedure to evaluate ML models by training multiple models on different subsets of the available data and assessing their performance on complementary subsets
Types
K-Fold Cross-Validation
Process
Repeat the process K times, each time with a different validation fold
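A minimal sketch of how the K folds can be generated, assuming the data is split into contiguous folds without shuffling (a simplification; shuffling first is common in practice). The generator name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (training_indices, validation_indices) once per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size


folds = list(k_fold_indices(10, 3))
for training, validation in folds:
    # A model would be trained on `training` and scored on `validation` here.
    print(validation)
```

Each sample lands in exactly one validation fold, so averaging the K validation scores uses every sample once for evaluation.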
Unsupervised Learning
Clustering
Types
Partition-based
K-Means clustering
In general
Start by picking k, the number of clusters
Getting the k right
Example: pick one point at random, then k-1 other points, each as far away as possible
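One way to read "each as far away as possible" is farthest-first selection: each new starting point is the one whose distance to its nearest already-chosen point is largest. The sketch below assumes that reading; the function names and the fixed `seed` parameter are illustrative choices.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def farthest_first_centroids(points, k, seed=0):
    """Pick one point at random, then k-1 more, each far from those chosen so far."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Score each point by its distance to the nearest chosen centroid,
        # then take the point with the largest such distance.
        next_point = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(next_point)
    return centroids


# Hypothetical 2-D points forming three loose groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (20.0, 0.0)]
centroids = farthest_first_centroids(points, k=3)
print(centroids)
```

Because outliers are by definition far from everything, this rule tends to select them, which is one reason randomized variants are often preferred.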
Populating clusters
1 - For each point, place it in the cluster whose current centroid is nearest
2 - After all points are assigned, update the locations of the centroids of the k clusters
In plain English
1. Assignment step: assign each observation to the cluster whose mean yields the least within-cluster sum of squares. Since each term of that sum is a squared Euclidean distance, this is intuitively the cluster with the nearest mean.
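The assignment step, together with the centroid update that follows it, can be sketched as a short loop. The starting centroids are passed in rather than chosen by the function, and leaving an empty cluster's centroid where it was is a choice made for this sketch; all names and data are hypothetical.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def update(points, assignments, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == j]
        if not members:
            new_centroids.append(old)  # assumption: an empty cluster keeps its centroid
            continue
        mean = tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        new_centroids.append(mean)
    return new_centroids


def k_means(points, centroids, max_iter=100):
    """Alternate the two steps until the centroids stop moving."""
    for _ in range(max_iter):
        assignments = assign(points, centroids)
        new_centroids = update(points, assignments, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assignments


# Hypothetical 2-D points with two obvious groups and hand-picked starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (9.0, 10.0)]
final_centroids, assignments = k_means(points, [(1.0, 1.0), (9.0, 9.0)])
print(final_centroids, assignments)
```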