Machine Learning with Tensorflow
Tensorflow
Tensor
Types:
0D Tensor
Eg: np.array(10)
Vectors (1D Tensor)
Eg: np.array([1, 2, 3, 4])
Matrices (2D Tensor)
Eg: np.array([ [1,2,3,4],[5,6,7,8], [9,10,11,12], [13,14,15,16] ])
ND Tensor
An N-dimensional tensor can be made by packing multiple (N-1)-dimensional tensors into a single array.
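A minimal NumPy sketch of this packing (values are illustrative):
import numpy as np
m = np.array([[1, 2], [3, 4]])   # 2D tensor, shape (2, 2)
t3 = np.array([m, m, m])         # 3D tensor built from three 2D tensors, shape (3, 2, 2)
print(t3.ndim, t3.shape)         # 3 (3, 2, 2)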
MNIST Dataset
Selecting single sample
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Batches of Data
Slice: [0:x1], [x1:x2], [x2:x3],...
Selecting multiple samples (Slice)
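A sketch of selecting a single sample and batch slices from MNIST:
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
sample = x_train[0]         # single sample, shape (28, 28)
batch_1 = x_train[0:128]    # first batch of 128 samples, shape (128, 28, 28)
batch_2 = x_train[128:256]  # next batch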
Common Dataset Examples
2D Tensors
Shape: (Samples, Features)
Example: Vectors
3D Tensors
Shape: (Samples, Timesteps, Features)
Example: Timeseries, Sequence
4D Tensors
Shape: (Samples, Channels, Height, Width)
Example: Images
5D Tensors
Shape: (Samples, Frames, Channels, Height, Width)
Example: Video
Data container, n-dimensional array
Attributes:
Dimensions(ndim)
Shape(shape)
Data Type(dtype)
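Checking the three attributes on a NumPy tensor (toy values):
import numpy as np
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(x.ndim)    # 2
print(x.shape)   # (2, 4)
print(x.dtype)   # int64 (platform dependent)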
Mathematical Foundations
Key Operations in Neural Network
tf.keras.layers.Dense(128, activation='relu')
output = relu(dot(w, input) + b)
np.dot() is used for dot product
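A minimal NumPy sketch of the dense-layer computation above (sizes and values are illustrative):
import numpy as np
def relu(x):
    return np.maximum(x, 0)
w = np.random.randn(128, 784)     # illustrative weight matrix
b = np.zeros(128)                 # bias vector
x = np.random.randn(784)          # one input sample
output = relu(np.dot(w, x) + b)   # output = relu(dot(w, input) + b), shape (128,)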
Broadcasting Tensors
Reshape
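A quick illustrative sketch of broadcasting and reshape in NumPy:
import numpy as np
a = np.ones((3, 4))
b = np.array([1, 2, 3, 4])
c = a + b              # b is broadcast across the 3 rows -> shape (3, 4)
d = a.reshape(4, 3)    # same 12 values, new shape (4, 3)
e = a.reshape(-1)      # flatten to shape (12,)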
Practical
Optimization Algorithms
Variation of Gradient Descent
Momentum Based, etc.
Regularization
Dropouts
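A sketch of dropout as a Keras layer (layer sizes are illustrative):
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),   # randomly zeroes 50% of activations during training
    tf.keras.layers.Dense(10, activation='softmax')])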
Data Pipeline
tf.data
API
Data Source
dataset = tf.data.Dataset.from_tensors()
dataset = tf.data.Dataset.from_tensor_slices()
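A quick sketch contrasting the two constructors (data values are illustrative):
import tensorflow as tf
t = tf.constant([[1, 2], [3, 4]])
ds1 = tf.data.Dataset.from_tensors(t)        # one element: the whole (2, 2) tensor
ds2 = tf.data.Dataset.from_tensor_slices(t)  # two elements: [1, 2] and [3, 4]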
TF Record
tf.data.TFRecordDataset()
Create a Python iterator
it = iter(dataset)
next(it).numpy()
functions
dataset.reduce()
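An illustrative iterator-and-reduce sketch (values are toy data):
import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5])
it = iter(dataset)
print(next(it).numpy())                                        # 1
total = dataset.reduce(0, lambda state, value: state + value)  # sum of all elements
print(total.numpy())                                           # 15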
Dataset Structure
A dataset contains elements
Each element can be one of several types
Tensor
Sparse Tensor
Ragged Tensor
Tensor Array
tf.TypeSpec
Dataset.element_spec
tf.data.Dataset.zip()
Batching Dataset Elements
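A small batching sketch:
import tensorflow as tf
dataset = tf.data.Dataset.range(10)
batched = dataset.batch(4)        # last batch is smaller unless drop_remainder=True
for b in batched:
    print(b.numpy())              # [0 1 2 3], [4 5 6 7], [8 9]
print(batched.element_spec)       # TensorSpec(shape=(None,), dtype=tf.int64, name=None)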
Week 1:
Some Early Concepts
ML
ML: Input + Output -> Rules; then use the rules in applications
Therefore, Training Phase and Prediction Phase/Inference Phase.
Therefore, we need data:
Features
Numeric
Categorical
Convert to Numerical
One Hot Encoding
Embedding
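A one-hot encoding sketch (category indices are illustrative):
import tensorflow as tf
labels = tf.constant([0, 2, 1])         # e.g. red=0, green=1, blue=2
one_hot = tf.one_hot(labels, depth=3)   # shape (3, 3), one 1.0 per row
print(one_hot.numpy())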
Labels
Present : Supervised Learning
Discrete Labels: Classification Problem
Continuous Labels: Regression Problems
Absent : Unsupervised Learning
Algorithms: Input + Rules -> Output
Steps in Machine Learning:
Data Preprocessing
Feature Normalization
z-score
min-max
Log Transformation, Square root transform, etc.
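Illustrative NumPy versions of these transforms (x is an assumed 1D feature array):
import numpy as np
x = np.array([1.0, 2.0, 3.0, 10.0])
z_score = (x - x.mean()) / x.std()              # zero mean, unit variance
min_max = (x - x.min()) / (x.max() - x.min())   # scaled to [0, 1]
log_x = np.log1p(x)                             # log transform, log(1 + x)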
Model Building
Linear Regression
Polynomial Regression
Logistic Regression
Neural Network
Layers
Input
Hidden
output
Parameters
Activation Functions
Loss Function: J(w,b)
For regression problems, e.g., MSE
For classification problems, e.g., Cross-Entropy
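A sketch of both losses using Keras built-ins (toy values):
import tensorflow as tf
mse = tf.keras.losses.MeanSquaredError()
print(mse([1.0, 2.0], [1.5, 1.5]).numpy())     # 0.25
cce = tf.keras.losses.CategoricalCrossentropy()
print(cce([[0., 1.]], [[0.2, 0.8]]).numpy())   # -log(0.8) ≈ 0.223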
Loss Surface
Optimization
Loss Visualization vs. Model Visualization
Minimize J(w,b) w.r.t. w and b (parameters).
Gradient Descent
Initialize random weights/parameters
Repeat until convergence:
Predict the output
Calculate the loss
Calculate the gradient of the loss w.r.t. the parameters
Update the weights (consider the learning rate)
Variations
Batch Gradient Descent
all 'n' data samples are passed at once
Mini-Batch Gradient Descent
batches of size 'k' are used; therefore there are n/k batches in one epoch (see the NumPy sketch after this list)
Steps:
Draw a set of k samples
Randomly initialize parameters
Repeat until convergence:
Predict the output
Calculate Loss
Calculate gradient of loss w.r.t. parameters
update weights simultaneously
Stochastic Gradient Descent
k = 1
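A hedged NumPy sketch of mini-batch gradient descent on a toy linear-regression problem (data and hyperparameters are illustrative):
import numpy as np
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 3.0 * x + 2.0 + 0.1 * rng.normal(size=1000)    # synthetic data
w, b = 0.0, 0.0                                    # initial parameters
lr, k = 0.1, 32                                    # learning rate, batch size
for epoch in range(20):                            # n/k batches per epoch
    idx = rng.permutation(len(x))
    for start in range(0, len(x), k):
        xb, yb = x[idx[start:start + k]], y[idx[start:start + k]]
        y_pred = w * xb + b                        # predict the output
        grad_w = 2 * np.mean((y_pred - yb) * xb)   # gradient of MSE w.r.t. w
        grad_b = 2 * np.mean(y_pred - yb)          # gradient of MSE w.r.t. b
        w, b = w - lr * grad_w, b - lr * grad_b    # simultaneous update
print(w, b)                                        # close to 3.0 and 2.0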
Few Important Points
Convergence
The gradient becomes (close to) zero, so the parameters no longer update.
Fixed number of steps
Epoch
1 epoch means one full pass over all the data points/samples in the dataset.
Diagnosing
How do I know if my model is learning?
Monitor Learning curve
Adjust Learning Rate based on observation from learning curve
Evaluation
Scenarios
Solution:
Analyse the Loss vs. Model Complexity curve
The point where the loss is lowest for both the validation set and the training set is the sweet spot.
Note: model complexity here means the number of parameters.
Underfitting
Increase the complexity by adding more parameters
And correspondingly add more features
Overfitting
Get more data
Reduce Complexity
Regularization
L2- Regularization/Ridge Regularization
L1-Regularization
Elastic Net Regularization
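A sketch of attaching regularizers to a Keras layer (penalty strengths are illustrative):
import tensorflow as tf
layer = tf.keras.layers.Dense(
    64, activation='relu',
    kernel_regularizer=tf.keras.regularizers.l2(0.01))  # L2/ridge: adds 0.01 * sum(w^2) to the loss
# L1: tf.keras.regularizers.l1(0.01); Elastic Net: tf.keras.regularizers.l1_l2(l1=0.01, l2=0.01)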
Underfitting
Just Right Fit
Overfitting
Train-Test Split
80% - 20%, usually.
Measures of performance
Regression
Mean Squared Error
Mean Absolute Error
Classification
First, construct the confusion matrix, then calculate Precision, Recall, and F1 Score (see the sketch below).
Accuracy: not a good measure when classes are imbalanced
ROC curve
PR Curve
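A toy confusion-matrix sketch for binary classification (labels are illustrative):
import numpy as np
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
tp = np.sum((y_pred == 1) & (y_true == 1))            # 3
fp = np.sum((y_pred == 1) & (y_true == 0))            # 1
fn = np.sum((y_pred == 0) & (y_true == 1))            # 1
precision = tp / (tp + fp)                            # 0.75
recall = tp / (tp + fn)                               # 0.75
f1 = 2 * precision * recall / (precision + recall)    # 0.75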
Caveats
A model expects the train and test data to come from the same distribution.
Week 2
Variations
Deep Learning Refresher
Neural Networks
Logistic Regression as NN
Linear Regression as Neural Network
Building Blocks
Input Layer
Hidden Layer
Output Layer
Neurons
Linear Combination
Activation Function
Number of Parameters
Number of Edges + Number of Biases
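Worked example: a dense layer with 784 inputs and 128 units has 784*128 edges plus 128 biases = 100,480 parameters. A quick Keras check (layer sizes are illustrative):
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation='relu')])   # 784*128 + 128 = 100480 params
model.summary()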
Concepts
Automatic Feature Learning / Selection
Reason for late development
Availability of Large Datasets
Availability of Powerful hardware: GPUs and TPUs
Development of advanced Algorithms and techniques to make learning Deep Networks possible.
Feed Forward Neural Network
Network
Loss
MSE for Regression
Cross Entropy for Classification
Optimization
RMS Prop
Adam
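Illustrative Keras usage of these optimizers (learning rates are assumptions):
import tensorflow as tf
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mse')
# RMSProp variant: tf.keras.optimizers.RMSprop(learning_rate=0.001)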
ML Pipeline:
Data
Model Building
Training
Model Selection
Prediction
How to understand a new ML Model:
5 Questions
What training data?
What model?
What loss function?
What training algorithm?
What evaluation measure?
Backpropagation
(Practical implementation of Gradient Descent)
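A minimal tf.GradientTape sketch of one backprop/gradient-descent step (toy values):
import tensorflow as tf
w = tf.Variable(2.0)
b = tf.Variable(0.0)
x, y = 3.0, 7.0                          # one toy sample
with tf.GradientTape() as tape:
    y_pred = w * x + b
    loss = (y_pred - y) ** 2             # squared error
grads = tape.gradient(loss, [w, b])      # gradients computed via backpropagation
w.assign_sub(0.01 * grads[0])            # gradient-descent update
b.assign_sub(0.01 * grads[1])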