Transformers don't learn
Experiments
Measure how the predictions change
when post-hoc removing the Attention or replacing it with linear layers.
Interpret the resulting changes/errors.
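The replacement experiment above can be sketched roughly as follows. This is a minimal numpy toy, not any specific model: the attention layer, weights, and sizes are all made up for illustration. A token-wise linear layer is fitted (least squares) to mimic the attention output, and the mean prediction change after the swap is measured.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, Wq, Wk, Wv):
    # Standard scaled dot-product self-attention on a (tokens, dim) input.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(16, d))                 # 16 toy tokens (hypothetical)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

y_attn = attention(x, Wq, Wk, Wv)

# Post-hoc replacement: fit a token-wise linear map to the attention
# output, then swap it in and compare predictions.
W_lin, *_ = np.linalg.lstsq(x, y_attn, rcond=None)
y_lin = x @ W_lin

# Mean absolute change in predictions after the swap; a large value
# suggests the attention computes something a linear layer cannot.
delta = np.abs(y_attn - y_lin).mean()
print(delta)
```

In a real model the same idea would apply per attention block, with the downstream segmentation error (rather than raw activations) as the quantity to interpret.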
Questions to be answered:
What did the Attention learn?
Long/Short range?
Are we finding fewer objects, or are the models learning to delineate boundaries better?
--> Extract corresponding patches from models!
Architectures
Position of the Q, K, V calculation and the resulting tensor shapes differ between architectures and look unusual
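As a reference point for checking the unusual shapes, here is the standard multi-head Q, K, V computation in numpy; the batch, token, and dimension values are illustrative, not taken from any of the architectures above.

```python
import numpy as np

# Reference shapes for standard multi-head self-attention.
B, N, D, H = 2, 16, 64, 8      # batch, tokens, embed dim, heads (hypothetical)
Dh = D // H                    # per-head dim

rng = np.random.default_rng(0)
x = rng.normal(size=(B, N, D))
Wqkv = rng.normal(size=(D, 3 * D))

# One fused projection, then split into Q, K, V and reshape into heads.
qkv = x @ Wqkv                                    # (B, N, 3*D)
q, k, v = np.split(qkv, 3, axis=-1)               # each (B, N, D)
q = q.reshape(B, N, H, Dh).transpose(0, 2, 1, 3)  # (B, H, N, Dh)
k = k.reshape(B, N, H, Dh).transpose(0, 2, 1, 3)
v = v.reshape(B, N, H, Dh).transpose(0, 2, 1, 3)

scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(Dh)  # (B, H, N, N)
print(scores.shape)
```

Comparing each architecture's Q, K, V shapes against this baseline should make the deviations concrete.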
Visualizations
Visualizes that the different architectures follow different ways of incorporating Attention
Could help explain different behaviors in Dataset ablation
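One possible metric for such visualizations, tying back to the long/short-range question above, is the mean attention distance per head. This is a generic sketch, not tied to any of the architectures in the notes; the attention maps are synthetic stand-ins.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mean_attention_distance(attn):
    # attn: (tokens, tokens) row-stochastic attention map.
    # Average token distance |i - j| weighted by attention mass:
    # small values indicate local (short-range) heads, large values
    # long-range heads.
    n = attn.shape[0]
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    return float((attn * dist).sum(axis=-1).mean())

n = 32
# A sharply local map vs a uniform (maximally long-range) map.
gaps = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
local = softmax(-5.0 * gaps)
uniform = np.full((n, n), 1.0 / n)
print(mean_attention_distance(local), mean_attention_distance(uniform))
```

Plotting this value per head and per layer for each architecture would give a compact picture of how differently they incorporate attention, and could be compared against the dataset-ablation behavior.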
Open Questions
Datasets
ACDC & Synapse seem to be rather common
maybe it is worth checking whether those would be good evaluation benchmarks?