RL, Multi-task Learning, Meta-learning [C], Model-based [C], Information…
-
Multi-task Learning
-
goal-conditioned
-
skill
Eysenbach, Diversity is All You Need [C]
Sharma & Gu, DADS, 2019 [C]
-
Architecture
MultiHead
soft-parameter sharing
Multi-gate Mixture-of-Experts (MMoE)
-
Meta-learning [C]
black-box approaches [C]
-
Feedforward+average
Garnelo, Conditional Neural Processes, ICML 18
-
-
-
RL[C]
-
Wang, Learning to Reinforcement Learn, CogSci 17
-
Rakelly, PEARL, Efficient Off-policy Meta-reinforcement Learning via Probabilistic Context Variables, ICML 19
optimization-based [C]
Finn, ICML 2017 MAML:\({\rm min}_\theta\sum_{{\rm task \ } i} \mathcal{L}\big(\theta - \alpha\nabla_\theta \mathcal{L}(\theta, \mathcal{D}_i^{\rm tr}), \mathcal{D}_i^{\rm ts} \big)\)
Finn & Levine ICLR 18: for a sufficiently deep network, the function learned by MAML can approximate any function of \(\mathcal{D}_i^{\rm tr}, x^{\rm ts}\)
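The MAML objective above can be sketched as a meta-gradient loop. This is a minimal first-order sketch (second-derivative terms are dropped, as in first-order MAML); `grad_fn` and the `(D_tr, D_ts)` task splits are assumed to be supplied by the caller.

```python
import numpy as np

def maml_step(theta, tasks, grad_fn, alpha=0.01, beta=0.001):
    """One outer-loop step of MAML (first-order approximation):
    min_theta sum_i L(theta - alpha * grad L(theta, D_i^tr), D_i^ts).

    tasks   : list of (D_tr, D_ts) pairs (assumed caller-supplied)
    grad_fn : grad_fn(theta, D) -> gradient of the task loss (assumed)
    """
    meta_grad = np.zeros_like(theta)
    for d_tr, d_ts in tasks:
        # inner adaptation: one gradient step on the task's training split
        phi = theta - alpha * grad_fn(theta, d_tr)
        # outer gradient evaluated at adapted parameters (first-order approx.)
        meta_grad += grad_fn(phi, d_ts)
    return theta - beta * meta_grad / len(tasks)
```

With a quadratic task loss, repeated calls move `theta` toward an initialization that adapts well to the sampled tasks.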
-
Yu, Finn et al. One-shot Imitation from Observing Humans, RSS2018
Meta-Learning GNN Initializations for Low-Resource Molecular Property Prediction, 2020
Few-Shot Human Motion Prediction via Meta-learning, ECCV 2018
Ravi ICLR 17, precedes MAML:\(\phi_i=\theta - \alpha f(\theta, \mathcal{D}_i^{\rm tr}, \nabla_\theta\mathcal{L})\)
-
-
RL [C]
-
MAML+MBRL
Nagabandi, Learning to Adapt in Dynamic Environments through Meta-RL, ICLR 19
-
-
-
Model-based [C]
-
Optimize over actions using model \({\rm max}_{a_{t:t+H}} \sum_t r(s_t, a_t)\)
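One simple way to optimize over actions with a model is random shooting. This is a minimal sketch, not from any specific paper; `dynamics(s, a)` and `reward(s, a)` are assumed to be caller-supplied (learned or known).

```python
import numpy as np

def random_shooting(dynamics, reward, s0, horizon=10,
                    n_candidates=500, action_dim=1, rng=None):
    """Plan by optimizing over action sequences with a model:
    max over a_{t:t+H} of sum_t r(s_t, a_t), via random shooting.
    dynamics(s, a) and reward(s, a) are assumed caller-supplied."""
    rng = rng or np.random.default_rng(0)
    best_return, best_plan = -np.inf, None
    for _ in range(n_candidates):
        plan = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
        s, total = s0, 0.0
        for a in plan:                 # roll the model forward
            total += reward(s, a)
            s = dynamics(s, a)
        if total > best_return:
            best_return, best_plan = total, plan
    return best_plan, best_return
```

In an MPC loop one would execute only the first action of `best_plan` and re-plan at the next step.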
-
-
-
Information theoretic
Mutual information
\(\mathcal{I}(x;y)=D_{\rm KL}(p(x,y) || p(x)p(y))=\mathcal{H}(p(x))-\mathcal{H}(p(x|y))\)
if x and y are independent, the mutual information is zero
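For discrete distributions, the identity above can be computed exactly. A minimal sketch, assuming the joint is given as a 2-D probability array:

```python
import numpy as np

def mutual_information(p_xy):
    """I(x;y) = D_KL(p(x,y) || p(x)p(y)) = H(x) - H(x|y),
    computed exactly from a discrete joint distribution p_xy (2-D array)."""
    p_x = p_xy.sum(axis=1, keepdims=True)     # marginal p(x)
    p_y = p_xy.sum(axis=0, keepdims=True)     # marginal p(y)
    mask = p_xy > 0                           # 0 * log 0 = 0 by convention
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))
```

An independent joint (an outer product of marginals) gives zero; a perfectly correlated binary joint gives log 2 nats.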
-
-
-
-
-
Hierarchical
design choices
-
self-terminating
Bacon, The Option-Critic Architecture, 2016
pretrain
Heess, Learning and Transfer of Modulated Locomotor Controllers, 2016
goal-conditioned
Gupta, Relay Policy Learning, 2019
-
Nachum, Why Does Hierarchy (Sometimes) Work? 2019 [C]
Ideas
Use different agents for different scenes and switch between them, where each agent can live across multiple environments
-
Pareto optimal
From loss perspective
-
\( \theta_a \) dominates \( \theta_b \) if \( \mathcal{L}_i(\theta_a) \leq \mathcal{L}_i(\theta_b) \ \forall i\) and \(\sum_i\mathcal{L}_i(\theta_a)\neq\sum_i\mathcal{L}_i(\theta_b)\)
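The dominance condition can be written as a small predicate over per-task loss vectors (a minimal sketch; note that, given the first condition, the unequal-sum condition is equivalent to a strict inequality on at least one task):

```python
def dominates(losses_a, losses_b):
    """theta_a dominates theta_b iff L_i(a) <= L_i(b) for every task i
    and the inequality is strict for at least one i."""
    return (all(a <= b for a, b in zip(losses_a, losses_b))
            and any(a < b for a, b in zip(losses_a, losses_b)))
```

A parameter vector is Pareto optimal when no other candidate dominates it under this predicate.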
-
-
-
-
-