Workflow of a Machine Learning Project
Gathering Data
Data pre-processing
Most real-world data is messy. Common problems include:
Missing data
Noisy data
Inconsistent data
Types of Data
Numeric e.g. income, age
Categorical e.g. gender, nationality
Ordinal e.g. low/medium/high
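The three data types above are usually prepared differently before modelling. The following is a minimal sketch in plain Python, assuming hypothetical records with the fields `income`, `nationality` and `risk` (none of these names come from the diagram): numeric values are kept as they are, ordinal values are mapped to ordered integers, and categorical values are one-hot encoded.

```python
# Hypothetical records; field names and values are made up for illustration.
records = [
    {"income": 42000.0, "nationality": "FR", "risk": "low"},
    {"income": 58000.0, "nationality": "DE", "risk": "high"},
    {"income": 51000.0, "nationality": "FR", "risk": "medium"},
]

# Ordinal data has a meaningful order, so it can be mapped to ordered integers.
ORDINAL_LEVELS = {"low": 0, "medium": 1, "high": 2}


def encode(records):
    """Turn each record into a flat list of numbers.

    Layout of each row: [income, risk level, one-hot nationality columns...]
    """
    # Categorical data has no order, so each category gets its own 0/1 column.
    categories = sorted({r["nationality"] for r in records})
    rows = []
    for r in records:
        one_hot = [1 if r["nationality"] == c else 0 for c in categories]
        rows.append([r["income"], ORDINAL_LEVELS[r["risk"]]] + one_hot)
    return categories, rows


categories, rows = encode(records)
for row in rows:
    print(row)
```

One-hot encoding is used for the categorical field because assigning integers to nationalities would suggest an order that does not exist.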
How can data pre-processing be performed?
Conversion of data
Ignoring the missing values
Filling the missing values
Machine learning (predicting missing values from the existing data)
Outlier detection
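Three of the pre-processing steps above (ignoring missing values, filling missing values, and detecting outliers) can be sketched with the Python standard library. This is only an illustration under stated assumptions: the `ages` list is made up, missing entries are represented as `None`, filling uses the mean of the observed values, and outliers are flagged with the common 1.5 × IQR rule. Other fill strategies and outlier rules are equally valid.

```python
import statistics

# Hypothetical column of ages; None marks a missing value.
ages = [23.0, 25.0, None, 31.0, 27.0, 29.0, None, 24.0, 26.0, 95.0]


def drop_missing(values):
    """Ignore missing values: keep only the observed entries."""
    return [v for v in values if v is not None]


def fill_missing_with_mean(values):
    """Fill missing values with the mean of the observed entries."""
    mean = statistics.mean(drop_missing(values))
    return [mean if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


observed = drop_missing(ages)
filled = fill_missing_with_mean(ages)
outliers = iqr_outliers(observed)
print(observed, filled, outliers)
```

Note that a single extreme value such as 95 also pulls the mean upward, so in practice outliers are often handled before missing values are filled, or the median is used instead.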
Researching the model that will be best for the type of data
Supervised Learning
Categories
Classification
A classification problem is one where the target variable is categorical (i.e. the output can be assigned to one of a set of classes).
Models
K-Nearest Neighbor
Naive Bayes
Decision Trees/Random Forest
Support Vector Machine (SVM)
Logistic Regression
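As an illustration of one of the classification models listed above, here is a minimal K-Nearest Neighbor sketch in plain Python. The function name `knn_predict`, the toy points and the labels `"a"` and `"b"` are assumptions made for this example. It classifies a query point by majority vote among its `k` closest training points, using Euclidean distance.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbours."""
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated groups in two dimensions.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
print(knn_predict(train_points, train_labels, (5.1, 5.0)))
```

The target here is categorical (a class label), which is what makes this a classification task rather than a regression task.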
Regression
A regression problem is one where the target variable is continuous (i.e. the output is numeric).
Models
Linear Regression
Support Vector Regression (SVR)
Decision Trees/Random Forest
Gaussian Process Regression (GPR)
Ensemble Methods
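To make the regression case concrete, the sketch below fits the simplest model in the list above, linear regression with one input, using the closed-form least-squares solution. The function name `fit_line` and the toy data (generated from y = 2x + 1) are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # continuous (numeric) output
print(slope, intercept, prediction)
```

The prediction is a number rather than a class label, which is the defining property of a regression problem.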
Unsupervised Learning
Categories
Clustering
A set of inputs is divided into groups. Unlike in classification, the groups are not known beforehand.
Models
Gaussian mixtures
K-Means Clustering
Boosting
Hierarchical Clustering
Spectral Clustering
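As an example of clustering, the following is a minimal K-Means sketch (Lloyd's algorithm) in plain Python. The function name `kmeans`, the toy points, and the choice to pass the starting centroids explicitly (to keep the example deterministic) are assumptions. Practical implementations usually pick starting centroids randomly or with a seeding scheme.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Toy data: no labels are given, only the points themselves.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied at any point. The two groups emerge from the distances alone, which is what distinguishes clustering from classification.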
Association
Training and testing the model on data
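Training and testing on data usually means holding out part of the data so the model is tested on examples it has not seen. A minimal sketch of such a split, assuming a hypothetical helper named `train_test_split`, a 25% test fraction and a fixed seed for reproducibility (all illustrative choices):

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    shuffled = rows[:]  # copy so the caller's list is not modified
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # stand-in for 20 data records
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before splitting avoids accidentally placing, for example, all records of one class in the test set when the data is stored in sorted order.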
Evaluation
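Which evaluation metric applies depends on the type of problem. As a hedged sketch, two common choices are accuracy for classification and mean squared error for regression. The function names and toy values below are assumptions for illustration, and many other metrics exist.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted numeric values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification: three of four labels are predicted correctly.
acc = accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])

# Regression: only the last prediction is off, by 2.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(acc, mse)
```

Both metrics should be computed on held-out test data, not on the data used for training.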