LSTM
Mini-Course
Foundations
Lesson 1: What are LSTMs
Recurrent Neural Networks
How to Train Recurrent Neural Networks: BPTT
How to Have Stable Gradients During Training: LSTM
LSTM
Forget Gate
Input Gate
Output Gate
Input shape: [samples, time steps, features]
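A minimal sketch of that 3D layout on made-up data: Keras LSTM layers expect input shaped as [samples, time steps, features].

    import numpy as np

    # a flat series of 50 values
    data = np.arange(50)

    # reshape to [samples, time steps, features]: 10 sequences, 5 steps, 1 feature
    data = data.reshape((10, 5, 1))
    print(data.shape)  # (10, 5, 1)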
Lesson 2: How LSTMs are trained
Backpropagation Through Time
Truncated Backpropagation Through Time: TBPTT(k1, k2)
Prepare Sequence Prediction for TBPTT
How to Seed State for LSTMs
Stateful and Stateless LSTM:
resetting/no-resetting and seeding/no-seeding
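A minimal sketch of manual state control on toy data, assuming standalone Keras: stateful=True stops Keras from resetting state after each batch, so resetting (or seeding) becomes an explicit decision.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    X = np.random.rand(20, 5, 1)  # toy data: 20 samples, 5 time steps, 1 feature
    y = np.random.rand(20, 1)

    model = Sequential()
    # stateful=True carries cell state across batches; batch size must be fixed
    model.add(LSTM(10, batch_input_shape=(1, 5, 1), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')

    # reset state between epochs rather than between batches
    for epoch in range(5):
        model.fit(X, y, epochs=1, batch_size=1, shuffle=False, verbose=0)
        model.reset_states()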
Lesson 3: How to prepare data for LSTMs
Scale Data
Normalize Series Data: scikit-learn MinMaxScaler
Standardize Series Data: scikit-learn StandardScaler
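A minimal scaling sketch on a made-up series; keep the fitted scaler so predictions can be inverted back to the original units.

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, StandardScaler

    series = np.array([10.0, 20.0, 30.0, 40.0, 50.0]).reshape(-1, 1)

    # normalize to [0, 1]
    minmax = MinMaxScaler(feature_range=(0, 1))
    normalized = minmax.fit_transform(series)

    # standardize to zero mean, unit variance (suits Gaussian-like series)
    standardized = StandardScaler().fit_transform(series)

    # invert the transform to return predictions to the original scale
    restored = minmax.inverse_transform(normalized)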
One Hot Encode
Manual One Hot Encoding
One Hot Encode with scikit-learn: LabelEncoder and OneHotEncoder
One Hot Encode with Keras: to_categorical()
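A sketch of the three encoding routes on a made-up label series; note the OneHotEncoder argument is named sparse in older scikit-learn releases (as here) and sparse_output in newer ones.

    import numpy as np
    from sklearn.preprocessing import LabelEncoder, OneHotEncoder
    from keras.utils import to_categorical

    values = np.array(['cold', 'warm', 'hot', 'cold'])

    # LabelEncoder: strings -> integers
    integers = LabelEncoder().fit_transform(values)

    # OneHotEncoder: integers -> binary vectors
    onehot = OneHotEncoder(sparse=False).fit_transform(integers.reshape(-1, 1))

    # Keras shortcut for integer labels
    onehot_keras = to_categorical(integers)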
Handle Missing Timesteps
Remove Missing Sequence Data: Pandas dropna()
Replace Missing Sequence Data: Pandas fillna()
Mark Missing Values: Pandas replace()
Mask Missing Values: Keras Masking layer
Impute Missing Values: scikit-learn Imputer()
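A sketch combining two of these steps on toy data: mark missing values with a sentinel via fillna(), then let a Keras Masking layer skip those time steps.

    import numpy as np
    from pandas import DataFrame
    from keras.models import Sequential
    from keras.layers import Masking, LSTM, Dense

    # mark missing values with a sentinel the network can recognize
    df = DataFrame({'value': [1.0, np.nan, 3.0, np.nan, 5.0]})
    df.fillna(-1, inplace=True)
    X = df.values.reshape((1, 5, 1))  # [samples, time steps, features]

    model = Sequential()
    model.add(Masking(mask_value=-1, input_shape=(5, 1)))  # masked steps are skipped
    model.add(LSTM(10))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')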
Convert a Time Series to a Supervised Learning Problem
The series_to_supervised() Function: Pandas shift() function
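A simplified sketch of the idea behind series_to_supervised(): shift() builds lag (input) and lead (output) columns, and dropna() trims the rows made incomplete by shifting.

    from pandas import DataFrame, concat

    def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
        df = DataFrame(data)
        cols = []
        # input columns: t-n_in, ..., t-1
        for i in range(n_in, 0, -1):
            cols.append(df.shift(i))
        # output columns: t, t+1, ..., t+n_out-1
        for i in range(0, n_out):
            cols.append(df.shift(-i))
        agg = concat(cols, axis=1)
        if dropnan:
            agg.dropna(inplace=True)
        return agg

    print(series_to_supervised([10, 20, 30, 40, 50]))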
Handle Very Long Sequences with LSTMs
Use Sequences As-Is: a reasonable limit is 250-500 time steps
Truncate Sequences
Summarize Sequences
Random Sampling
Use Truncated Backpropagation Through Time
Use an Encoder-Decoder Architecture
Data Preparation for Variable Length Input
Sequence Padding: pad_sequences() function in Keras
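A short sketch of pad_sequences() on made-up sequences; the same function handles both padding and truncating variable-length input.

    from keras.preprocessing.sequence import pad_sequences

    sequences = [[1, 2, 3, 4], [1, 2], [1]]

    # pre-pad with zeros to the length of the longest sequence (the default)
    padded = pad_sequences(sequences)

    # or force a fixed length, truncating values from the start of longer sequences
    fixed = pad_sequences(sequences, maxlen=2, truncating='pre')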
Lesson 4: How to develop LSTMs in Keras
5-Step Life-Cycle:
Define Network
Compile Network
Fit Network
Evaluate Network
Make Predictions
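A minimal end-to-end sketch of the five steps, assuming standalone Keras and random toy data.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    X = np.random.rand(100, 10, 1)  # toy data
    y = np.random.rand(100, 1)

    # 1. Define Network
    model = Sequential()
    model.add(LSTM(32, input_shape=(10, 1)))
    model.add(Dense(1))

    # 2. Compile Network
    model.compile(loss='mean_squared_error', optimizer='adam')

    # 3. Fit Network
    model.fit(X, y, epochs=10, batch_size=32, verbose=0)

    # 4. Evaluate Network
    loss = model.evaluate(X, y, verbose=0)

    # 5. Make Predictions
    yhat = model.predict(X[:5])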
Mini-Course
Models
Lesson 5: How to develop Vanilla LSTMs
Sequence Classification
Overfitting: Dropout
Combine CNN with LSTM
Time Series Prediction
LSTM Network for Regression
LSTM for Regression Using the Window Method
LSTM for Regression with Time Steps
LSTM with Memory Between Batches
Stacked LSTMs with Memory Between Batches
Time Series Forecasting
Transform the dataset to make it suitable for the LSTM model, including:
1. Transforming the data to a supervised learning problem.
2. Transforming the data to be stationary.
3. Transforming the data to a scale of -1 to 1.
Fitting a stateful LSTM network model to the training data.
Evaluating the static LSTM model on the test data.
Reporting the performance of the forecasts.
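A sketch of transforms 2 and 3 on a made-up series (the supervised framing of step 1 is the shift() idea shown earlier): first-order differencing for stationarity, then scaling to [-1, 1] to match the LSTM's default tanh output range.

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler

    series = np.array([112.0, 118.0, 132.0, 129.0, 121.0])

    # 2. make the series stationary by first-order differencing
    diff = series[1:] - series[:-1]

    # 3. scale to [-1, 1]
    scaler = MinMaxScaler(feature_range=(-1, 1))
    scaled = scaler.fit_transform(diff.reshape(-1, 1))

    # invert later: scaler.inverse_transform(), then add back the prior observation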
Lesson 6: How to develop Stacked LSTMs
return_sequences=True
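A minimal stacked-LSTM sketch with made-up sizes: return_sequences=True makes a layer emit one output per time step, giving the next LSTM layer the 3D input it requires.

    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    model = Sequential()
    # 3D output (one vector per time step) feeds the second LSTM layer
    model.add(LSTM(50, return_sequences=True, input_shape=(10, 1)))
    model.add(LSTM(50))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')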
Lesson 7: How to develop CNN LSTMs
Lesson 8: How to develop Encoder-Decoder LSTMs
How to Use the TimeDistributed Layer
The input must be (at least) 3D
The output will be 3D
keeping the internal process for each time step separate.
Simplifies the network by requiring far fewer weights
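A sketch of those properties with toy sizes: one Dense layer, with one set of weights, is applied to every time step of the 3D LSTM output.

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, TimeDistributed

    model = Sequential()
    # return_sequences=True keeps the output 3D: one vector per time step
    model.add(LSTM(20, return_sequences=True, input_shape=(5, 1)))
    # the same Dense weights are reused at every step, keeping steps separate
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')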
Add Numbers with an Encoder-Decoder LSTM
Use an Encoder-Decoder LSTM
Sequence Echo Problem
Generate Random Sequence: randint() function
One Hot Encode Random Sequence: one_hot_encode()
one_hot_decode(): NumPy argmax() function
Convert Sequences to Supervised Learning:
Pandas shift() and dropna() functions
Echo Whole Sequence (sequence-to-sequence model)
TimeDistributed wrapper
Echo Partial Sequence (encoder-decoder model)
RepeatVector() layer
TimeDistributed wrapper
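A sketch of the encoder-decoder shape for the partial-echo problem, with made-up sizes: RepeatVector() feeds the fixed-length encoding to the decoder once per output step, and TimeDistributed produces one prediction per step.

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

    n_in, n_out, n_features = 5, 2, 10  # echo 2 of 5 one-hot encoded steps

    model = Sequential()
    # encoder: compress the input sequence into one fixed-length vector
    model.add(LSTM(150, input_shape=(n_in, n_features)))
    # repeat that vector once per output time step
    model.add(RepeatVector(n_out))
    # decoder: unroll the representation back into a sequence
    model.add(LSTM(150, return_sequences=True))
    model.add(TimeDistributed(Dense(n_features, activation='softmax')))
    model.compile(loss='categorical_crossentropy', optimizer='adam')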
Lesson 9: How to develop Bi-directional LSTMs
Compare LSTM to Bidirectional LSTM
LSTM (as-is)
LSTM with reversed input sequences (go_backwards=True)
Bidirectional LSTM
Comparing Bidirectional LSTM Merge Modes
'sum'
'mul'
'concat'
'ave'
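A sketch of the comparison point with a toy shape: merge_mode controls how the forward and backward outputs are combined ('concat' is the default and doubles the output width; 'sum', 'mul', and 'ave' keep it unchanged).

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, TimeDistributed, Bidirectional

    model = Sequential()
    # swap merge_mode between 'sum', 'mul', 'concat', and 'ave' to compare
    model.add(Bidirectional(LSTM(20, return_sequences=True),
                            merge_mode='concat', input_shape=(10, 1)))
    model.add(TimeDistributed(Dense(1, activation='sigmoid')))
    model.compile(loss='binary_crossentropy', optimizer='adam')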
Lesson 10: How to develop LSTMs with Attention
Attention is the idea of freeing the encoder-decoder architecture from the fixed-length internal representation.
Lesson 11: How to develop Generative LSTMs
Mini-Course
Advanced
Lesson 12: How to tune LSTM hyperparameters
Evaluate the Skill of Deep Learning Models
Estimating Model Skill
(Controlling for Model Variance)
Use a Train-Test Split
Use k-Fold Cross Validation
Estimating a Stochastic Model’s Skill
(Controlling for Model Stability)
Fix the Random Seed
Repeat Evaluation Experiments
Tune LSTM Hyperparameters
Tuning the Number of Epochs
Tuning the Batch Size
Tuning the Number of Neurons
Grid Search Hyperparameters
GridSearchCV class in scikit-learn
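A sketch of grid searching epochs, batch size, and neurons in one pass, assuming toy data and the older keras.wrappers.scikit_learn wrapper (removed from recent releases; scikeras is the current equivalent).

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense
    from keras.wrappers.scikit_learn import KerasRegressor
    from sklearn.model_selection import GridSearchCV

    def build_model(neurons=10):
        model = Sequential()
        model.add(LSTM(neurons, input_shape=(5, 1)))
        model.add(Dense(1))
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model

    X = np.random.rand(50, 5, 1)  # toy data
    y = np.random.rand(50)

    params = {'epochs': [10, 20], 'batch_size': [1, 4], 'neurons': [5, 10]}
    search = GridSearchCV(KerasRegressor(build_fn=build_model, verbose=0),
                          param_grid=params, cv=3)
    result = search.fit(X, y)
    print(result.best_params_)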
Improve Deep Learning Performance
1. Improve Performance With Data.
Get More Data.
Invent More Data.
Rescale Your Data.
Transform Your Data.
Feature Selection.
2. Improve Performance With Algorithms.
Spot-Check Algorithms.
Steal From Literature.
Resampling Methods.
3. Improve Performance With Algorithm Tuning.
Diagnostics.
Weight Initialization.
Learning Rate.
Activation Functions.
Network Topology.
Batches and Epochs.
Regularization.
Optimization and Loss.
Early Stopping.
4. Improve Performance With Ensembles.
Combine Models.
Combine Views.
Stacking.