Deep learning
Deep Neural Networks
Improvement
optimization
algorithms
gradient descent with momentum (sketched below)
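A minimal sketch of one momentum update, assuming a NumPy parameter array `w`, its gradient `dw`, a velocity `v`, learning rate `alpha`, and momentum coefficient `beta` (all names are illustrative):

```python
import numpy as np

def momentum_update(w, dw, v, alpha=0.01, beta=0.9):
    """One gradient-descent step with momentum.

    v keeps an exponentially weighted average of past gradients,
    which damps oscillations and speeds up convergence.
    """
    v = beta * v + (1 - beta) * dw   # update the velocity (moving average of gradients)
    w = w - alpha * v                # step along the averaged gradient
    return w, v

# usage: start with v = np.zeros_like(w) and call once per mini-batch
```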
Batch normalization
\( Z^i_{norm} = \frac{Z^i - \mu}{\sqrt{\sigma^2 + \epsilon}} \)
\( \hat{Z}^i = \gamma*Z^i_{norm} + \beta \)
at test time: estimate \( \mu, \sigma \) (e.g. as exponentially weighted averages over mini-batches)
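A minimal NumPy sketch of the two formulas above, including the test-time path that reuses running estimates of \( \mu \) and \( \sigma^2 \); the function name, the shapes, and the running-average bookkeeping are assumptions for illustration:

```python
import numpy as np

def batch_norm(Z, gamma, beta, running_mu, running_var,
               momentum=0.9, eps=1e-8, training=True):
    """Normalize Z (shape: features x batch), then scale and shift."""
    if training:
        mu = Z.mean(axis=1, keepdims=True)           # batch mean
        var = Z.var(axis=1, keepdims=True)           # batch variance
        # keep exponentially weighted averages for use at test time
        running_mu = momentum * running_mu + (1 - momentum) * mu
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        mu, var = running_mu, running_var            # estimates gathered during training
    Z_norm = (Z - mu) / np.sqrt(var + eps)           # Z_norm = (Z - mu) / sqrt(var + eps)
    Z_hat = gamma * Z_norm + beta                    # learnable scale and shift
    return Z_hat, running_mu, running_var
```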
softmax
loss function
\( L(y,\hat{y}) = -\sum_i{y_i*log(\hat{y_i})} \)
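A small sketch of softmax plus this cross-entropy loss, assuming `z` holds the output-layer scores and `y` is a one-hot label vector (names are illustrative):

```python
import numpy as np

def softmax(z):
    """Turn scores into probabilities that sum to 1."""
    z = z - np.max(z)                # subtract the max for numerical stability
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)

def cross_entropy(y, y_hat, eps=1e-12):
    """L(y, y_hat) = -sum_i y_i * log(y_hat_i) for a one-hot y."""
    return -np.sum(y * np.log(y_hat + eps))

# example: 3-class problem, true class is index 1
y = np.array([0.0, 1.0, 0.0])
y_hat = softmax(np.array([2.0, 1.0, 0.1]))
loss = cross_entropy(y, y_hat)
```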
network 
building block
back propagation
\( \Delta^{(l)} = \Delta^{(l)} + \delta^{(l+1)}(a^{(l)})^T \)
\( D_{i,j}^{(l)} = \frac{1}{m}(\Delta_{i,j}^{(l)} + \lambda\Theta_{i,j}^{(l)})\) if \( j \neq 0 \)
\( D_{i,j}^{(l)} = \frac{1}{m}\Delta_{i,j}^{(l)} \) if \( j = 0 \)
\( \delta^{(L)} = a^{(L)} - y^{(t)} \)
\( \delta^{(l)} = ((\Theta^{(l)})^T*\delta^{(l+1)}).*g'(z^{(l)}) \)
\( g'(z^{(l)}) = a^{(l)} .* (1-a^{(l)}) \)
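A sketch of one backward pass built from the equations above, assuming sigmoid activations, weight matrices `Theta[l]` that include a bias column (index j = 0), and illustrative names throughout:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(Theta, X, Y, lam=0.0):
    """Gradients D[l] of the regularized cost with respect to Theta[l].

    Theta[l] has shape (units in layer l+1, units in layer l + 1 bias);
    X is (m, n_features), Y is (m, n_outputs).
    """
    m = X.shape[0]
    Delta = [np.zeros_like(T) for T in Theta]       # gradient accumulators

    for x, y in zip(X, Y):
        # forward pass, keeping activations a[l] (bias unit prepended on non-output layers)
        a = [np.concatenate(([1.0], x))]
        for l, T in enumerate(Theta):
            act = sigmoid(T @ a[l])
            a.append(act if l == len(Theta) - 1 else np.concatenate(([1.0], act)))

        # backward pass: delta(L) = a(L) - y, then propagate towards the input
        delta = a[-1] - y
        for l in range(len(Theta) - 1, -1, -1):
            Delta[l] += np.outer(delta, a[l])       # Delta(l) += delta(l+1) * a(l)^T
            if l > 0:
                # drop the bias component, then multiply by g'(z) = a * (1 - a)
                delta = (Theta[l].T @ delta)[1:] * (a[l][1:] * (1 - a[l][1:]))

    # D = (1/m) * Delta, regularizing every column except the bias column (j = 0)
    D = []
    for T, Dl in zip(Theta, Delta):
        reg = lam * T
        reg[:, 0] = 0.0                             # do not regularize the bias weights
        D.append((Dl + reg) / m)
    return D
```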
loss function: \( J(\Theta) = \frac{1}{m}\sum_{output \, units} J_i + \frac{\lambda}{2m}\sum_{all \, \theta}\theta_j^2 \)
\( J_i = L(y_i,\hat{y_i}) = -(y_i*log(\hat{y_i}) + (1-y_i)*log(1-\hat{y_i})) \)
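A short sketch of this regularized cost, reusing the same illustrative conventions (labels `Y`, predictions `Y_hat`, weight matrices `Theta` whose bias column is not regularized):

```python
import numpy as np

def cost(Y, Y_hat, Theta, lam=0.0, eps=1e-12):
    """J = (1/m) * sum_i J_i + (lambda / 2m) * sum of squared weights.

    Y and Y_hat have shape (m, n_outputs); the bias column (j = 0)
    of each Theta[l] is excluded from the regularization term.
    """
    m = Y.shape[0]
    # J_i = -(y*log(y_hat) + (1 - y)*log(1 - y_hat)), summed over the output units
    J_i = -(Y * np.log(Y_hat + eps) + (1 - Y) * np.log(1 - Y_hat + eps)).sum(axis=1)
    reg = sum(np.sum(T[:, 1:] ** 2) for T in Theta)  # skip the bias column
    return J_i.mean() + (lam / (2 * m)) * reg
```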