Data Science Roadmap

Computer Science

Mathematics

Linear Algebra

Matrix

Calculus

Vectors

Tensors

Optimization

Statistics

Probability Theory

Discrete Distribution

Continuous Distribution

Hypothesis Testing

Bayes' Theorem

Normal Distribution

Random Variables

ANOVA

Simulations

Statistical Quality Control

Programming Languages

Research

Data Storage

Computing Infrastructure

Data formats

PC/laptop

cluster

Cloud

GPU/CPU

Data Exploration

Domain Knowledge

Software

OS

Integrated Development Environment (IDE)

Windows

Mac

Linux

Data Visualization

Dashboards

Data Presentation

Data type

video

audio

image

text

time series

Regression

Set Theory

Number Theory

Algebra

Algorithms

Machine Learning

R

Python

MATLAB

Julia

C/C++

JavaScript

Bash

SQL

Web Tools

Program Design and Patterns

Virtualization

Virtual Machines

Containers

Version Control

HTML

CSS

Deep Learning

Generative Adversarial Networks (GAN)

Convolutional Neural Networks (CNN)

Recurrent Neural Networks

Unsupervised Learning

Dimensionality Reduction

Linear Discriminant Analysis (LDA)

Principal Component Analysis (PCA)

Clustering

Density-based Clustering

DBSCAN

Partitional Clustering

PAM

CLARA

K-means

Hierarchical Clustering

Median

Average Linkage

Complete Linkage

Centroid

Ward

Single Linkage

Supervised Learning

Regression

Random Forest

Support Vector Machine (SVM)

Decision Tree

Linear Regression

Ridge

Lasso

Classification

Support Vector Machine (SVM)

Naive Bayes

Logistic Regression

Decision Tree

K-Nearest Neighbors (KNN)

Random Forest

Databases

Neural Networks

Auto-Encoders

Elastic Net

t-SNE

UMAP

Multidimensional scaling (MDS)

Data Manipulation

Explainable AI