Machine Learning
What?
an evolving branch of computational algorithms designed to emulate human intelligence by learning from the surrounding environment
Where
pattern recognition
Computer Vision
Spacecraft Engineering
computational biology, biomedical and medical applications
entertainment
Finance
Ecology
Why
produce models that can analyze bigger, more complex data and deliver faster, more accurate results
identifying profitable opportunities
avoiding unknown risks
How
Data
Training data: 70% to 80% of the dataset
Test data (new input data): the remaining 30% to 20%
Machine Learning algorithm (Training)
Trained Model
Supervised
Unsupervised
Evaluation and Prediction
If the predictions and the expected results don't match, the algorithm is retrained until the data scientist gets the desired outcome
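The workflow above (split the data, train, then evaluate on held-out test data) can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the helper names (`split_dataset`, `train_threshold`, `accuracy`), the toy data, and the one-feature threshold "model" are all assumptions made for the example, not something the outline prescribes.

```python
import random

def split_dataset(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def train_threshold(train_rows):
    """'Train' a toy model: the midpoint between the two class means."""
    zeros = [x for x, label in train_rows if label == 0]
    ones = [x for x, label in train_rows if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(threshold, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for x, label in rows if int(x > threshold) == label)
    return hits / len(rows)

# Hypothetical, well-separated toy data: (feature, label) pairs.
data = [(float(i), 0) for i in range(10)] + [(float(i), 1) for i in range(20, 30)]

train, test = split_dataset(data, train_fraction=0.8)  # 80% / 20% split
threshold = train_threshold(train)                     # training step
test_accuracy = accuracy(threshold, test)              # evaluation step
```

If `test_accuracy` were too low, the retraining step described above would mean changing the model or its training data and repeating the loop.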
Supervised Learning
What
requires the data scientist to train the algorithm with both labeled inputs and desired outputs
Where
Bioinformatics
Speech Recognition
Spam Detection
sentiment analysis
Object Detection
weather forecasting and pricing prediction
Why
When
Classification
A classification problem is when the output variable is a category, such as "Apple" or "Orange", or "disease" and "no disease".
Regression
A regression problem is when the output variable is a real value, such as "dollars" or "weight".
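To make the difference in output types concrete, here is a small standard-library sketch. The regression part fits a least-squares line and returns a real number; the classification part uses a one-nearest-neighbour lookup and returns a category. The function names and the toy numbers are illustrative assumptions only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nearest_neighbor_label(samples, query):
    """Label of the training sample closest to the query (classification)."""
    closest = min(samples, key=lambda s: abs(s[0] - query))
    return closest[1]

# Regression: hypothetical weights -> price in dollars (price = 2 * weight + 1).
weights = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(weights, prices)
predicted_price = slope * 5.0 + intercept  # a real value

# Classification: hypothetical one-feature measurements with category labels.
fruits = [(7.0, "Apple"), (7.5, "Apple"), (9.0, "Orange"), (9.5, "Orange")]
predicted_fruit = nearest_neighbor_label(fruits, 9.2)  # a category
```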
When
classification
Is this credit card transaction fraudulent or not? Is this email spam or not? Machine learning is a great tool when you need to divide objects (for example clients or products) into two or more pre-defined groups.
clustering
ML discovers patterns in chaos. It enables those who use it to find parallels between data points and divide objects into similar groups (clusters). Importantly, there is no need to define the groups in advance.
regression
It's like predicting the future. On the basis of an input from a dataset (usually historical data plus other factors), ML estimates the most likely numeric value of a particular quantity, such as stock or real estate prices, consumer behaviour, etc.
dimensionality reduction
In an ocean of information, ML can choose which data are the most significant and how they can be summarised. In practice, it is applied in such fields as photo processing and text analysis.
allows collecting data and producing outputs based on previous experience
helps to optimize performance criteria using experience
helps to solve various types of real-world computational problems
Unsupervised Learning
What?
trains the algorithm on unlabeled inputs only; no desired outputs are provided, and the algorithm has to find structure and patterns in the data on its own
Where
anomaly detection
recommendation engines
Customer segmentation
medical imaging, etc.
Why
Labeling and classifying big data by hand can be challenging.
It is less complex to set up than a supervised learning task, because no one is required to label the data or interpret the associated labels.
When
Association
An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y.
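A rule such as "people that buy X also tend to buy Y" is usually scored by its support and confidence. The sketch below computes both for a handful of made-up shopping baskets; the function names and the basket contents are assumptions for illustration, and a full algorithm such as Apriori would additionally search for the candidate rules.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

# Rule: people that buy bread also tend to buy butter.
rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
```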
Clustering
A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior
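As a rough sketch of how such groupings can be found, the following standard-library code runs the k-means assignment and update steps on one-dimensional data. The spending figures, the starting centers, and the function name `k_means_1d` are hypothetical; real customer data would have more features and would need a sensible way of choosing the number of clusters.

```python
def k_means_1d(points, centers, iterations=10):
    """k-means (Lloyd's algorithm) on 1-D points from the given start centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical monthly spend per customer.
spend = [10.0, 12.0, 11.0, 95.0, 100.0, 105.0]
centers, clusters = k_means_1d(spend, centers=[0.0, 50.0])
```

No group labels are supplied anywhere: the low-spend and high-spend groups emerge from the data alone.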
Dimensionality reduction
a learning technique used when the number of features (or dimensions) in a given dataset is too high. It reduces the number of data inputs to a manageable size while preserving as much of the information in the data as possible
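One common way to do this is to project the data onto its direction of greatest variance, which is the idea behind PCA. The sketch below reduces two-dimensional points to one dimension using power iteration on the 2x2 covariance matrix. It is a simplified illustration with hypothetical function names and toy points, and it assumes the first feature actually varies (otherwise the starting vector would need to change).

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (mean, unit direction of greatest variance) for 2-D rows."""
    n = len(rows)
    mean = [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]
    centered = [(r[0] - mean[0], r[1] - mean[1]) for r in rows]
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    # Power iteration: repeated multiplication converges to the
    # dominant eigenvector of the covariance matrix.
    v = (1.0, 0.0)
    for _ in range(iterations):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(w[0], w[1])
        v = (w[0] / norm, w[1] / norm)
    return mean, v

def project(rows, mean, direction):
    """Reduce each 2-D row to one coordinate along `direction`."""
    return [(r[0] - mean[0]) * direction[0] + (r[1] - mean[1]) * direction[1]
            for r in rows]

# Hypothetical points lying on the line y = x: one dimension is enough.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
mean, direction = first_principal_component(points)
reduced = project(points, mean, direction)
```

Because the toy points lie exactly on a line, the single reduced coordinate keeps all of their spread; on real data some variance is lost.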
Linear Regression
Logistic Regression
Polynomial Regression
SVM
Decision Trees
Random Forest
KNN
Naïve Bayes
Stochastic Gradient Descent
K-means
Hierarchical Clustering
DBSCAN
Apriori
Principal Component Analysis (PCA)
Singular Value Decomposition (SVD), etc