Data Science Roadmap
Computer Science
Mathematics
Linear Algebra
Matrix
Calculus
Vectors
Tensors
Optimization
Statistics
Probability Theory
Discrete Distribution
Continuous Distribution
Hypothesis Testing
Bayes' Theorem
Normal Distribution
Random Variables
ANOVA
Simulations
Statistical Quality Control
Programming Languages
Research
Data Storage
Computing Infrastructure
Data formats
PC/laptop
Cluster
Cloud
GPU/CPU
Data Exploration
Domain Knowledge
Software
OS
Integrated Development Environment (IDE)
Windows
Mac
Linux
Data Visualization
Dashboards
Data Presentation
Data type
video
audio
image
text
time series
Regression
Set Theory
Number Theory
Algebra
Algorithms
Machine Learning
R
Python
Matlab
Julia
C/C++
JavaScript
Bash
SQL
Web Tools
Program Design and Patterns
Virtualization
Virtual Machines
Containers
Version Control
HTML
CSS
Deep Learning
Generative Adversarial Networks (GAN)
Convolutional Neural Networks (CNN)
Recurrent Neural Networks (RNN)
Unsupervised Learning
Dimensionality Reduction
Linear Discriminant Analysis (LDA)
Principal Component Analysis (PCA)
Clustering
Density-based Clustering
DBSCAN
Partitional clustering
PAM
CLARA
K-means
Hierarchical clustering
Median
Average Linkage
Complete-Linkage
Centroid
Ward
Single-Linkage
Supervised Learning
Regression
Random Forest
Support Vector Machine (SVM)
Decision Tree
Linear Regression
Ridge
Lasso
Classification
Support Vector Machine (SVM)
Naive Bayes
Logistic Regression
Decision Tree
K-nearest neighbors (KNN)
Random Forest
Databases
Neural Networks
Auto-Encoders
Elastic Net
t-SNE
UMAP
Multidimensional scaling (MDS)
Data Manipulation
Explainable AI