Machine Learning

What?

is an evolving branch of computational algorithms that are designed to emulate human intelligence by learning from the surrounding environment

Where

pattern recognition

Computer Vision

Spacecraft Engineering

Computational biology, biomedical and medical applications

entertainment

Finance

Ecology

Why

produce models that can analyze bigger, more complex data and deliver faster, more accurate results

identifying profitable opportunities

avoiding unknown risks

How

Data

Training data: 70% or 80% of the dataset

New input data (test data): the remaining 30% or 20%

Machine Learning algorithm (Training)

Trained Model

Supervised

Unsupervised

Evaluation and Prediction

If the prediction and results don’t match, the algorithm is re-trained multiple times until the data scientist gets the desired outcome
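The split, train, and evaluate steps above can be sketched in plain Python. This is a minimal illustration, not a prescribed workflow: the data set, the `train_threshold` "algorithm", and the 80/20 split are all assumptions made up for the example.

```python
import random

def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def train_threshold(train_rows):
    """Toy 'training': pick the threshold halfway between the two class means."""
    zeros = [x for x, label in train_rows if label == 0]
    ones = [x for x, label in train_rows if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(threshold, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for x, label in rows if int(x > threshold) == label)
    return correct / len(rows)

# Hypothetical labeled data: values below 50 are class 0, the rest are class 1.
data = [(x, int(x >= 50)) for x in range(100)]

train_rows, test_rows = train_test_split(data, train_fraction=0.8)
model = train_threshold(train_rows)          # the "trained model" is one number
test_accuracy = accuracy(model, test_rows)   # evaluated on unseen test data only
print(len(train_rows), len(test_rows), round(test_accuracy, 2))
```

If the test accuracy is too low, the usual next step is to change the algorithm, its settings, or the training data and train again, always keeping the test rows out of training.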

Supervised Learning

What

requires the data scientist to train the algorithm with both labeled inputs and desired outputs

Where

Bioinformatics

Speech Recognition

Spam Detection

sentiment analysis

Object Detection

weather forecasting and pricing prediction

Why

When

Classification

A classification problem is when the output variable is a category, such as "Apple" or "Orange", or "disease" or "no disease".

Regression

A regression problem is when the output variable is a real value, such as “dollars” or “weight”.
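The difference between the two problem types shows up in the type of the prediction. The sketch below uses two simple techniques chosen only for illustration (a least-squares line for regression, a 1-nearest-neighbor rule for classification); the weights, prices, and fruit diameters are made-up numbers.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

def nearest_neighbor_label(samples, query):
    """1-nearest-neighbor: return the label of the closest training sample."""
    closest = min(samples, key=lambda s: abs(s[0] - query))
    return closest[1]

# Regression: made-up (weight, price in dollars) pairs that follow price = 2 * weight + 1.
weights = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(weights, prices)
predicted_price = slope * 5.0 + intercept   # output is a real value

# Classification: made-up (diameter in cm, fruit) pairs.
fruits = [(7.0, "Apple"), (7.5, "Apple"), (9.0, "Orange"), (9.5, "Orange")]
predicted_fruit = nearest_neighbor_label(fruits, 9.2)   # output is a category
```

Both models are trained on labeled examples; only the kind of label (number or category) differs.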

When

classification

Is this credit card transaction fraudulent or not? Is this email spam or not? Machine learning is a great tool when you need to divide objects (for example clients or products) into two or more pre-defined groups.

clustering

ML discovers patterns in chaos. It enables those who use it to find parallels between data points and divide objects into similar groups (clusters). Importantly, there is no need to define the groups in advance.

regression

It's like predicting the future. On the basis of an input from a dataset (usually historical data plus other factors), ML estimates the most likely numeric value of a particular quantity, such as stock or real estate prices or consumer behaviour.

dimensionality reduction

In an ocean of information, ML can choose which data are the most significant and how they can be summarised. In practice, it is applied in such fields as photo processing and text analysis.

allows you to collect data or produce a data output from previous experience.

Helps to optimize performance criteria using experience

helps to solve various types of real-world computation problems.

Unsupervised Learning

What?

trains the algorithm on unlabeled data: no desired outputs are provided, and the algorithm has to find structure or patterns in the inputs on its own

Where

anomaly detection

recommendation engines

Customer segmentation

medical imaging, etc.

Why

Classifying big data can be challenging.

It is less complex to set up than a supervised learning task, because no one is required to label the data or interpret the associated labels.

When

Association

An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y.
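A rule such as "people who buy X also tend to buy Y" is usually scored with two numbers, support and confidence. The sketch below computes both by hand on a made-up set of shopping baskets; the function names are illustrative and do not come from any library.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

# Rule: bread -> butter
rule_support = support(baskets, {"bread", "butter"})          # 2 of 4 baskets
rule_confidence = confidence(baskets, {"bread"}, {"butter"})  # 2 of 3 bread baskets
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds, instead of checking one hand-picked rule.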

Clustering

A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior
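As an illustration of grouping customers by purchasing behavior, here is a minimal one-dimensional k-means sketch. The spending figures and the starting centers are assumptions chosen for the example, and real data would normally have more than one feature.

```python
def k_means_1d(values, centers, iterations=10):
    """Lloyd's algorithm in one dimension, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical monthly spend per customer; no labels are supplied.
monthly_spend = [10, 12, 11, 95, 100, 105]
centers, clusters = k_means_1d(monthly_spend, centers=[0, 50])
```

The two groups (low and high spenders) are discovered from the values alone; the result can depend on the starting centers and on the number of clusters requested.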

Dimensionality reduction

is a learning technique used when the number of features (or dimensions) in a given dataset is too high. It reduces the number of data inputs to a manageable size while preserving as much of the data's integrity as possible
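A small sketch of the idea, using the first principal component (the technique behind PCA) to reduce two features to one. The points are made up, and the power-iteration approach is just one simple way to find the direction; libraries typically use a full eigen or singular value decomposition.

```python
import math

def first_principal_component(points, iterations=100):
    """Return (mean, unit direction) of greatest variance for 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    # Assumes the data varies along x or is correlated with it; otherwise
    # the start vector (1, 0) would be mapped to zero.
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        wx, wy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(wx, wy)
        vx, vy = wx / norm, wy / norm
    return (mean_x, mean_y), (vx, vy)

def project(points, mean, direction):
    """Reduce each 2-D point to one coordinate along the direction."""
    return [
        (x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
        for x, y in points
    ]

# Hypothetical data: two features that carry the same information.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
mean, direction = first_principal_component(points)
reduced = project(points, mean, direction)   # one number per point instead of two
```

Because the two features here are perfectly correlated, the single reduced coordinate keeps all of the variation in the data; with noisier data some information is lost.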

Linear Regression

Logistic Regression

polynomial regression

SVM

Decision Trees

Random Forest

KNN

Naïve Bayes

Stochastic Gradient Descent

K-means

Hierarchical Clustering

DBSCAN

Apriori

Principal Component Analysis (PCA)

Singular Value Decomposition (SVD), etc.