Machine Learning with Tensorflow
Tensorflow
Tensor
Types:
0D Tensor
Eg: np.array(10)
Vectors (1D Tensor)
Eg: np.array([1, 2, 3, 4])
Matrices (2D Tensor)
Eg: np.array([ [1,2,3,4],[5,6,7,8], [9,10,11,12], [13,14,15,16] ])
ND Tensor
An N-dimensional tensor can be made by packing multiple (N-1)-dimensional tensors into a single array.
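A minimal NumPy sketch of this packing (values are illustrative):
import numpy as np
m = np.array([[1, 2], [3, 4]])   # 2D tensor, shape (2, 2)
t3 = np.array([m, m, m])         # 3D tensor built from three 2D tensors, shape (3, 2, 2)
print(t3.ndim, t3.shape)         # 3 (3, 2, 2)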
MNIST Dataset
Selecting single sample
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Batches of Data
Slice: [0:x1], [x1:x2], [x2:x3],...
Selecting multiple samples (Slice)
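A sketch of selecting a single sample and batch slices from MNIST:
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
sample = x_train[0]         # single sample, shape (28, 28)
batch_1 = x_train[0:128]    # first batch of 128 samples, shape (128, 28, 28)
batch_2 = x_train[128:256]  # next batch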
Common Dataset Examples
2D Tensors
Shape: (Samples, Features)
Example: Vectors
3D Tensors
Shape: (Samples, Timesteps, Features)
Example: Timeseries, Sequence
4D Tensors
Shape: (Samples, Channels, Height, Width)
Example: Images
5D Tensors
Shape: (Samples, Frames, Channels, Height, Width)
Example: Video
Data container, n-dimensional array
Attributes:
Dimensions(ndim)
Shape(shape)
Data Type(dtype)
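Checking the three attributes on a NumPy tensor (toy values):
import numpy as np
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(x.ndim)    # 2
print(x.shape)   # (2, 4)
print(x.dtype)   # int64 (platform dependent)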
Mathematical Foundations
Key Operations in Neural Network
tf.keras.layers.Dense(128, activation='relu')
output = relu(dot(w, input) + b)
np.dot() is used for dot product
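A minimal NumPy sketch of the dense-layer computation above (sizes and values are illustrative):
import numpy as np
def relu(x):
    return np.maximum(x, 0)
w = np.random.randn(128, 784)     # illustrative weight matrix
b = np.zeros(128)                 # bias vector
x = np.random.randn(784)          # one input sample
output = relu(np.dot(w, x) + b)   # output = relu(dot(w, input) + b), shape (128,)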
Broadcasting Tensors
Reshape
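A quick illustrative sketch of broadcasting and reshape in NumPy:
import numpy as np
a = np.ones((3, 4))
b = np.array([1, 2, 3, 4])
c = a + b              # b is broadcast across the 3 rows -> shape (3, 4)
d = a.reshape(4, 3)    # same 12 values, new shape (4, 3)
e = a.reshape(-1)      # flatten to shape (12,)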
Practical
Optimization Algorithms
Variation of Gradient Descent
Momentum Based, etc.
Regularization
Dropouts
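A sketch of dropout as a Keras layer (layer sizes are illustrative):
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),   # randomly zeroes 50% of activations during training
    tf.keras.layers.Dense(10, activation='softmax')])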
Data Pipeline
tf.data
API
Data Source
dataset = tf.data.Dataset.from_tensors()
dataset = tf.data.Dataset.from_tensor_slices()
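A quick sketch contrasting the two constructors (data values are illustrative):
import tensorflow as tf
t = tf.constant([[1, 2], [3, 4]])
ds1 = tf.data.Dataset.from_tensors(t)        # one element: the whole (2, 2) tensor
ds2 = tf.data.Dataset.from_tensor_slices(t)  # two elements: [1, 2] and [3, 4]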
TF Record
tf.data.TFRecordDataset()
Create a Python iterator
it = iter(dataset)
next(it).numpy()
functions
dataset.reduce()
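An illustrative iterator-and-reduce sketch (values are toy data):
import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5])
it = iter(dataset)
print(next(it).numpy())                                        # 1
total = dataset.reduce(0, lambda state, value: state + value)  # sum of all elements
print(total.numpy())                                           # 15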
Dataset Structure
A dataset contains elements
Each element can be one of several types
Tensor
Sparse Tensor
Ragged Tensor
Tensor Array
tf.TypeSpec
Dataset.element_spec
tf.data.Dataset.zip()
Batching Dataset Elements
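A small batching sketch:
import tensorflow as tf
dataset = tf.data.Dataset.range(10)
batched = dataset.batch(4)        # last batch is smaller unless drop_remainder=True
for b in batched:
    print(b.numpy())              # [0 1 2 3], [4 5 6 7], [8 9]
print(batched.element_spec)       # TensorSpec(shape=(None,), dtype=tf.int64, name=None)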
Week 1:
Some Early Concepts
ML
ML: Input + Output -> Rules; then use the rules in applications
Therefore, Training Phase and Prediction Phase/Inference Phase.
Therefore, we need data:
Features
Numeric
Categorical
Convert to Numerical
One Hot Encoding
Embedding
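A one-hot encoding sketch (category indices are illustrative):
import tensorflow as tf
labels = tf.constant([0, 2, 1])         # e.g. red=0, green=1, blue=2
one_hot = tf.one_hot(labels, depth=3)   # shape (3, 3), one 1.0 per row
print(one_hot.numpy())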
Labels
Present : Supervised Learning
Discrete Labels: Classification Problem
Continuous Labels: Regression Problems
Absent : Unsupervised Learning
Algorithms: Input + Rules -> Output
Steps in Machine Learning:
Data Preprocessing
Feature Normalization
z-score
min-max
Log Transformation, Square root transform, etc.
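Illustrative NumPy versions of these transforms (x is an assumed 1D feature array):
import numpy as np
x = np.array([1.0, 2.0, 3.0, 10.0])
z_score = (x - x.mean()) / x.std()              # zero mean, unit variance
min_max = (x - x.min()) / (x.max() - x.min())   # scaled to [0, 1]
log_x = np.log1p(x)                             # log transform, log(1 + x)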
Model Building
Linear Regression
Polynomial Regression
Logistic Regression
Neural Network
Layers
Input
Hidden
output
Parameters
Activation Functions
Loss Function: J(w,b)
For regression problems, e.g., MSE
For classification problems, e.g., Cross-Entropy
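A sketch of both losses using Keras built-ins (toy values):
import tensorflow as tf
mse = tf.keras.losses.MeanSquaredError()
print(mse([1.0, 2.0], [1.5, 1.5]).numpy())     # 0.25
cce = tf.keras.losses.CategoricalCrossentropy()
print(cce([[0., 1.]], [[0.2, 0.8]]).numpy())   # -log(0.8) ≈ 0.223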
Loss Surface
Optimization
Loss Visualization vs. Model Visualization
Minimize J(w,b) w.r.t. w and b (parameters).
Gradient Descent
Initialize random weights/parameters
Repeat until convergence:
Predict the output
Calculate the loss
Calculate the gradient of the loss w.r.t. the parameters
Update the weights (consider the learning rate)
Variations
Batch Gradient Descent
all 'n' data samples are passed at once
Mini-Batch Gradient Descent
batches of size 'k' are used; therefore there are n/k batches in one epoch (see the NumPy sketch after this list)
Steps:
Draw a set of k samples
Randomly initialize parameters
Repeat until convergence:
Predict the output
Calculate Loss
Calculate gradient of loss w.r.t. parameters
update weights simultaneously
Stochastic Gradient Descent
k = 1
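A hedged NumPy sketch of mini-batch gradient descent on a toy linear-regression problem (data and hyperparameters are illustrative):
import numpy as np
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 3.0 * x + 2.0 + 0.1 * rng.normal(size=1000)    # synthetic data
w, b = 0.0, 0.0                                    # initial parameters
lr, k = 0.1, 32                                    # learning rate, batch size
for epoch in range(20):                            # n/k batches per epoch
    idx = rng.permutation(len(x))
    for start in range(0, len(x), k):
        xb, yb = x[idx[start:start + k]], y[idx[start:start + k]]
        y_pred = w * xb + b                        # predict the output
        grad_w = 2 * np.mean((y_pred - yb) * xb)   # gradient of MSE w.r.t. w
        grad_b = 2 * np.mean(y_pred - yb)          # gradient of MSE w.r.t. b
        w, b = w - lr * grad_w, b - lr * grad_b    # simultaneous update
print(w, b)                                        # close to 3.0 and 2.0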
Few Important Points
Convergence
The gradient becomes (close to) zero, so the parameters no longer update.
Fixed number of steps
Epoch
1 epoch means one full pass over all the data points/samples in the dataset.
Diagnosing
How do I know if my model is learning?
Monitor Learning curve
Adjust Learning Rate based on observation from learning curve
Evaluation
Scenarios
Solution:
Analyse the Loss vs. Model Complexity curve
The point where the loss is lowest for both the validation set and the training set is the sweet spot.
Note: model complexity here means the number of parameters.
Underfitting
Increase the complexity by adding more parameters
And correspondingly add more features
Overfitting
Get more data
Reduce Complexity
Regularization
L2- Regularization/Ridge Regularization
L1-Regularization
Elastic Net Regularization
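A sketch of attaching regularizers to a Keras layer (penalty strengths are illustrative):
import tensorflow as tf
layer = tf.keras.layers.Dense(
    64, activation='relu',
    kernel_regularizer=tf.keras.regularizers.l2(0.01))  # L2/ridge: adds 0.01 * sum(w^2) to the loss
# L1: tf.keras.regularizers.l1(0.01); Elastic Net: tf.keras.regularizers.l1_l2(l1=0.01, l2=0.01)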
Underfitting
Just Right Fit
Overfitting
Train-Test Split
80% - 20%, usually.
Measures of performance
Regression
Mean Squared Error
Mean Absolute Error
Classification
First, construct the confusion matrix, then calculate Precision, Recall, and F1 Score (see the sketch below).
Accuracy: not a good measure when classes are imbalanced
ROC curve
PR Curve
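A toy confusion-matrix sketch for binary classification (labels are illustrative):
import numpy as np
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
tp = np.sum((y_pred == 1) & (y_true == 1))            # 3
fp = np.sum((y_pred == 1) & (y_true == 0))            # 1
fn = np.sum((y_pred == 0) & (y_true == 1))            # 1
precision = tp / (tp + fp)                            # 0.75
recall = tp / (tp + fn)                               # 0.75
f1 = 2 * precision * recall / (precision + recall)    # 0.75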
Caveats
A model expects the train and test data to come from the same distribution.
Week 2
Variations
Deep Learning Refresher
Neural Networks
Logistic Regression as NN
Linear Regression as Neural Network
Building Blocks
Input Layer
Hidden Layer
Output Layer
Neurons
Linear Combination
Activation Function
Number of Parameters
Number of Edges + Number of Biases
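Worked example: a dense layer with 784 inputs and 128 units has 784*128 edges plus 128 biases = 100,480 parameters. A quick Keras check (layer sizes are illustrative):
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation='relu')])   # 784*128 + 128 = 100480 params
model.summary()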
Concepts
Automatic Feature Learning / Selection
Reason for late development
Availability of Large Datasets
Availability of Powerful hardware: GPUs and TPUs
Development of advanced Algorithms and techniques to make learning Deep Networks possible.
Feed Forward Neural Network
Network
Loss
MSE for Regression
Cross Entropy for Classification
Optimization
RMS Prop
Adam
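Illustrative Keras usage of these optimizers (learning rates are assumptions):
import tensorflow as tf
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mse')
# RMSProp variant: tf.keras.optimizers.RMSprop(learning_rate=0.001)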
ML Pipeline:
Data
Model Building
Training
Model Selection
Prediction
How to understand a new ML Model:
5 Questions
What training data?
What model?
What loss function?
What training algorithm?
What evaluation measure?
Backpropagation
(Practical implementation of Gradient Descent)
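A minimal tf.GradientTape sketch of one backprop/gradient-descent step (toy values):
import tensorflow as tf
w = tf.Variable(2.0)
b = tf.Variable(0.0)
x, y = 3.0, 7.0                          # one toy sample
with tf.GradientTape() as tape:
    y_pred = w * x + b
    loss = (y_pred - y) ** 2             # squared error
grads = tape.gradient(loss, [w, b])      # gradients computed via backpropagation
w.assign_sub(0.01 * grads[0])            # gradient-descent update
b.assign_sub(0.01 * grads[1])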