DL(23) - Variational Autoencoders
Generative Models
- Explicit density estimation: explicitly define and solve p_model(x)
  - approximation of the density via Markov chains (Gibbs sampling)
  - Variational Autoencoders: use an approximation of the density via a variational approach (ELBO)
- Implicit density estimation: learn a model that can sample from p_model(x) without explicitly defining it
  - Generative Adversarial Networks (GANs): use a direct approach based on Game Theory
Evidence Lower Bound
- with an RBM we want to find W, b, c such that L(·) is maximized
- with a VAE we want to maximize the same lower bound, the ELBO
- L(v, θ, q) = log p(v; θ) − D_KL( q(h|v) ‖ p(h|v; θ) )
- (expanding log p(v; θ) inside the KL term via Bayes' rule; see the derivation sketch below)
- L(v, θ, q) = E_{h~q}[ log p(h, v) ] + H(q)
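
A sketch of the omitted calculation, using Bayes' rule p(h|v; θ) = p(h, v; θ) / p(v; θ) inside the KL term:

```latex
\begin{aligned}
\mathcal{L}(v,\theta,q)
  &= \log p(v;\theta) - D_{\mathrm{KL}}\big(q(h\mid v)\,\|\,p(h\mid v;\theta)\big) \\
  &= \log p(v;\theta) - \mathbb{E}_{h\sim q}\big[\log q(h\mid v) - \log p(h,v;\theta) + \log p(v;\theta)\big] \\
  &= \mathbb{E}_{h\sim q}\big[\log p(h,v;\theta)\big] - \mathbb{E}_{h\sim q}\big[\log q(h\mid v)\big] \\
  &= \mathbb{E}_{h\sim q}\big[\log p(h,v)\big] + H(q)
\end{aligned}
```

The two log p(v; θ) terms cancel, and −E_{h~q}[ log q(h|v) ] is by definition the entropy H(q).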
Differentiable Generator Network
- a differentiable function g(z, θ) transforming samples of latent variables z into samples x
- direct sampling of x:
  - draw samples z from a normal distribution with zero mean and identity covariance
  - feed z to g(·): x = g(z) = μ + Lz
  - this is equivalent to drawing samples x from a normal distribution with mean μ and covariance Σ = LLᵀ
  - we want to learn the parameters so that the generated samples x follow the desired distribution
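
A minimal sketch of this direct path (μ and Σ below are made-up illustration values): with L the Cholesky factor of Σ, the transformation x = μ + Lz maps z ~ N(0, I) to x ~ N(μ, Σ).

```python
# Direct sampling sketch: g(z) = mu + L z turns standard normal samples
# into samples from N(mu, Sigma) with Sigma = L @ L.T.
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, -2.0])            # desired mean (illustrative)
Sigma = np.array([[2.0, 0.6],         # desired covariance (illustrative, SPD)
                  [0.6, 1.0]])
L = np.linalg.cholesky(Sigma)         # Sigma = L @ L.T

z = rng.standard_normal((10_000, 2))  # z ~ N(0, I)
x = mu + z @ L.T                      # x = g(z) = mu + L z, row-wise

print(x.mean(axis=0))                 # close to mu
print(np.cov(x, rowvar=False))        # close to Sigma
```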
- distribution over samples x (indirect): when g(·) is used to define p(x|z), a distribution over x is obtained by marginalizing out z:
  - p(x) = Σ_z p(x, z) = Σ_z p(z) p(x|z) = E_{z~p(z)}[ p(x|z) ]
  - use g(·) with sigmoid outputs to provide the mean parameters of Bernoulli distributions: p(x_i = 1 | z) = g(z)_i
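
A minimal sketch of this indirect path (the decoder architecture and dimensions are assumptions, not from the diagram): g has sigmoid outputs, g(z)_i is the Bernoulli mean of pixel x_i, and p(x) = E_{z~p(z)}[ p(x|z) ] is estimated by Monte Carlo over prior samples.

```python
# Indirect path sketch: a decoder with sigmoid outputs defines Bernoulli
# means, and p(x) is estimated by marginalizing z with Monte Carlo.
import torch

latent_dim, data_dim = 2, 784

g = torch.nn.Sequential(              # hypothetical decoder g(z, theta)
    torch.nn.Linear(latent_dim, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, data_dim),
    torch.nn.Sigmoid(),               # g(z)_i in (0, 1): Bernoulli means
)

x = torch.randint(0, 2, (data_dim,)).float()  # one binary observation

S = 1000
z = torch.randn(S, latent_dim)                # S samples z ~ p(z) = N(0, I)
mean = g(z)                                   # Bernoulli means, shape (S, data_dim)
log_px_given_z = (x * mean.clamp_min(1e-7).log()
                  + (1 - x) * (1 - mean).clamp_min(1e-7).log()).sum(dim=1)
# log p(x) ~= logsumexp_s log p(x | z_s) - log S
log_px = torch.logsumexp(log_px_given_z, dim=0) - torch.log(torch.tensor(float(S)))
print(log_px)
```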
Variational Autoencoders
- Goal: estimate the true parameters θ*
- use a generator (decoder) network for the conditional p_θ(x | z^(i))
- ... but we need to maximize the data likelihood p_θ(x) = ∫ p_θ(z) p_θ(x|z) dz, which is intractable
- solution: define an additional encoder network that approximates the intractable posterior p_θ(z | x), and maximize the ELBO instead
- pick a simple distribution for the prior p_θ(z) (Gaussian)
- ELBO: L(q) = E_{z~q(z|x)}[ log p_model(x|z) ] − D_KL( q(z|x) ‖ p_model(z) )
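
A minimal training-loss sketch of this ELBO (encoder and decoder architectures are assumptions): a Gaussian encoder gives q(z|x), the KL term against the N(0, I) prior has a closed form, and the reconstruction expectation is estimated with one reparameterized sample z = μ + σ·ε.

```python
# Negative-ELBO loss sketch: E_{z~q}[log p(x|z)] - D_KL(q(z|x) || p(z)),
# maximized by minimizing its negation.
import torch

latent_dim, data_dim = 2, 784

encoder = torch.nn.Linear(data_dim, 2 * latent_dim)  # hypothetical q(z|x): outputs (mu, log sigma^2)
decoder = torch.nn.Sequential(                       # hypothetical p(x|z): Bernoulli means
    torch.nn.Linear(latent_dim, data_dim),
    torch.nn.Sigmoid(),
)

def negative_elbo(x):
    mu, logvar = encoder(x).chunk(2, dim=-1)
    eps = torch.randn_like(mu)
    z = mu + eps * (0.5 * logvar).exp()              # reparameterization trick
    x_mean = decoder(z)
    # one-sample estimate of E_{z~q}[log p(x|z)] for Bernoulli outputs
    rec = -torch.nn.functional.binary_cross_entropy(x_mean, x, reduction="sum")
    # closed-form D_KL( N(mu, sigma^2) || N(0, I) )
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()
    return -(rec - kl)

x = torch.rand(16, data_dim).round()                 # toy binary batch
loss = negative_elbo(x)
loss.backward()                                      # gradients flow to both networks
```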
Idea
- assume the training data T = { x^(i) } is generated from latent variables z
- sample z^(i) from the true prior p_θ(z)
- sample x^(i) from the true conditional p_θ(x | z^(i))
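
A minimal sketch of this assumed generative process (the decoder is a made-up stand-in for the true conditional): draw z^(i) from the prior, then draw x^(i) from the conditional it parameterizes.

```python
# Assumed data-generating process: z^(i) ~ p(z), then x^(i) ~ p(x | z^(i)).
import torch

latent_dim, data_dim = 2, 784
decoder = torch.nn.Sequential(        # hypothetical conditional p(x|z) with Bernoulli means
    torch.nn.Linear(latent_dim, data_dim),
    torch.nn.Sigmoid(),
)

z_i = torch.randn(1, latent_dim)      # z^(i) ~ p(z) = N(0, I)
x_i = torch.bernoulli(decoder(z_i))   # x^(i) ~ Bernoulli(g(z^(i)))
```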