5720_2
Neural Networks
Design Choices for Networks
Basic network model
Input/Output mapping
Generalizations
Additional layers may be added
Skip-layer connections
Sparse connections
Feed-forward networks
Choices
Number of input neurons
Number of output neurons
Number of hidden neurons
Activation functions
In the hidden layer: logistic sigmoid or hyperbolic tangent
In the output layer: logistic sigmoid, hyperbolic tangent, linear, or softmax
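A minimal sketch of the activation choices listed above, assuming NumPy (the function names are illustrative):

```python
import numpy as np

def logistic_sigmoid(a):
    # squashes activations into (0, 1); common hidden or output choice
    return 1.0 / (1.0 + np.exp(-a))

def tanh_act(a):
    # hyperbolic tangent squashes into (-1, 1)
    return np.tanh(a)

def softmax(a):
    # normalized exponentials for multi-class outputs;
    # subtracting the max improves numerical stability
    e = np.exp(a - np.max(a))
    return e / e.sum()
```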
Error function
Sum-of-squares error over the training set
Cross-entropy
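The two error functions above, sketched in NumPy (the 1/2 factor and the clipping constant are conventional choices, not part of the outline):

```python
import numpy as np

def sum_squared_error(y, t):
    # sum-of-squares error over a training set (regression targets)
    return 0.5 * np.sum((y - t) ** 2)

def cross_entropy(y, t, eps=1e-12):
    # binary cross-entropy; y are predicted probabilities in (0, 1)
    y = np.clip(y, eps, 1 - eps)
    return -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))
```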
Network Training
Error backpropagation: a technique for evaluating the gradient of the error function
Evaluating error function derivatives
Using the derivatives to compute a weight update
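A minimal backprop sketch for one tanh hidden layer, linear outputs, and sum-of-squares error; the network shapes and learning rate are illustrative assumptions:

```python
import numpy as np

# Hypothetical two-layer network. Shapes: x (D,), W1 (H, D), W2 (K, H).
def backprop_step(x, t, W1, W2, lr=0.1):
    # forward pass: evaluate hidden and output activations
    z = np.tanh(W1 @ x)            # hidden activations
    y = W2 @ z                     # linear output
    # backward pass: output deltas, then hidden deltas
    delta_out = y - t                              # linear output + SSE
    delta_hid = (1 - z ** 2) * (W2.T @ delta_out)  # tanh'(a) = 1 - z^2
    # use the derivatives to compute a gradient-descent weight update
    W2 -= lr * np.outer(delta_out, z)
    W1 -= lr * np.outer(delta_hid, x)
    return W1, W2
```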
Overfitting
Choosing hidden units
Stopping training early (early stopping)
Regularization
known as weight decay
Consistent Gaussian priors
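Weight decay adds a quadratic penalty (λ/2)‖w‖² to the error, which corresponds to a zero-mean Gaussian prior on the weights; a one-line sketch of the resulting gradient step (λ and the learning rate are illustrative):

```python
def weight_decay_step(w, grad_E, lr=0.1, lam=1e-3):
    # gradient step on E(w) + (lam / 2) * w @ w:
    # the extra lam * w term shrinks ("decays") the weights each step
    return w - lr * (grad_E + lam * w)
```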
Convolutional neural networks
Invariant Pattern Classifiers
Augmenting the training set with transformed examples of the same class
Adding a regularization term to penalize changes in the output when the input is transformed (tangent propagation)
Using translation-invariant features
Building invariance into the structure of the network
local receptive fields
weight sharing
subsampling
Biological inspiration
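A NumPy sketch of the three structural ingredients named above: one shared local kernel (receptive fields plus weight sharing) followed by subsampling; all sizes are illustrative:

```python
import numpy as np

def conv2d_valid(img, kernel):
    # local receptive field + weight sharing: the same small kernel
    # is applied at every position of the image
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def subsample(fmap, s=2):
    # subsampling (here: s x s average pooling) gives the feature map
    # a degree of local translation invariance
    H, W = fmap.shape
    fmap = fmap[:H - H % s, :W - W % s]
    return fmap.reshape(H // s, s, W // s, s).mean(axis=(1, 3))

img = np.random.RandomState(0).rand(8, 8)
fmap = subsample(conv2d_valid(img, np.ones((3, 3)) / 9.0))
print(fmap.shape)   # (3, 3): 8 -> 6 after 3x3 conv, 6 -> 3 after pooling
```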
Support Vector Machines: a sparse kernel machine
Maximum Margin Classifiers
Lagrange multipliers and the Lagrangian function
Inequality constraints
Constraint is active (x_A)
Constraint is inactive (x_B)
Karush-Kuhn-Tucker (KKT) conditions
Extensions
Lagrange multiplier formulation of SVM
Dual representation of the output y
Sparseness
Finding the a_n and the b
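Written out in the usual notation (targets t_n ∈ {−1, +1}, kernel k), the dual problem and the resulting output y referenced above are:

```latex
% Dual (Lagrange multiplier) form of the maximum-margin classifier:
% maximize over the multipliers a_n
\tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n
  - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N}
    a_n a_m t_n t_m \, k(\mathbf{x}_n, \mathbf{x}_m),
\qquad a_n \ge 0, \quad \sum_{n=1}^{N} a_n t_n = 0.
% Sparseness: predictions involve only the support vectors (a_n > 0)
y(\mathbf{x}) = \sum_{n=1}^{N} a_n t_n \, k(\mathbf{x}, \mathbf{x}_n) + b
```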
Extension to nonseparable data
Slack variables
Cost function
Box constraints
Lagrange multiplier formulation
Minimization
Multiple constraints
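A minimal soft-margin sketch assuming scikit-learn; the C parameter realizes the box constraint 0 ≤ a_n ≤ C on the multipliers (the toy data and the C value are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

# toy, non-separable 1-D data
X = np.array([[0.0], [0.5], [1.0], [1.5], [2.0], [2.5]])
t = np.array([-1, -1, 1, -1, 1, 1])

# C controls the box constraint 0 <= a_n <= C: small C tolerates
# more slack, large C approaches the hard (separable) margin
clf = SVC(kernel='linear', C=1.0).fit(X, t)
print(clf.support_)     # indices of the (sparse) support vectors
print(clf.dual_coef_)   # a_n * t_n for the support vectors
```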
Comparing cost functions:
SVM with separable data
SVM with non-separable data
Logistic Regression
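The cost functions being compared, written as functions of the margin y·t; the 1/ln 2 rescaling of the cross-entropy is the usual device for making the two curves pass through the same point (0, 1):

```python
import numpy as np

def hinge_loss(yt):
    # SVM hinge error: zero once the margin y*t reaches 1
    return np.maximum(0.0, 1.0 - yt)

def logistic_loss(yt):
    # logistic-regression cross-entropy as a function of the margin,
    # rescaled by 1/ln(2) so it passes through (0, 1) like the hinge
    return np.log(1.0 + np.exp(-yt)) / np.log(2.0)

margins = np.linspace(-2.0, 2.0, 5)
print(hinge_loss(margins))
print(logistic_loss(margins))
```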
Combining Models
Committees
Bootstrap
Average Error
Error Reduction by the committee
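Under the (strong) assumption that the M committee members make uncorrelated, zero-mean errors, averaging gives the error reduction referenced above:

```latex
% expected committee error vs. average error of the individual models,
% assuming uncorrelated, zero-mean member errors
E_{\mathrm{COM}} = \frac{1}{M}\, E_{\mathrm{AV}}
```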
Boosting
AdaBoost (“adaptive boosting”)
The Error Function
exponential error function
Minimizing exponential error
More sensitive to outliers than:
hinge error function of SVMs
cross-entropy error function of logistic regression
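A minimal AdaBoost sketch assuming scikit-learn ≥ 1.2 (earlier versions name the parameter base_estimator rather than estimator); the toy data and stump depth are illustrative:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X = rng.randn(200, 2)
t = (X[:, 0] + X[:, 1] > 0).astype(int)   # toy, stump-learnable labels

# AdaBoost reweights misclassified points each round and combines the
# weak learners (depth-1 decision stumps here) by a weighted vote,
# which corresponds to sequentially minimizing the exponential error
clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
).fit(X, t)
print(clf.score(X, t))
```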
Decision tree
Kernel Methods
Dual Representation
Finding the coefficients a_n
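In the regularized least-squares setting (one common route to the dual representation), the coefficients above have a closed form; λ is the regularization constant, K the Gram matrix, and k(x) the vector of kernel evaluations against the training inputs:

```latex
\mathbf{a} = (\mathbf{K} + \lambda \mathbf{I}_N)^{-1} \mathbf{t},
\qquad
y(\mathbf{x}) = \mathbf{k}(\mathbf{x})^{\mathsf{T}}
  (\mathbf{K} + \lambda \mathbf{I}_N)^{-1} \mathbf{t}
```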
Obtaining kernels
From features to kernels
Define a kernel directly
Techniques for constructing new kernels
kernel function
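Two standard ways to define a kernel directly, sketched in NumPy, together with the symmetry and positive-semidefiniteness check that any valid kernel's Gram matrix must pass:

```python
import numpy as np

def poly_kernel(x, z, c=1.0, d=2):
    # polynomial kernel k(x, z) = (x . z + c)^d
    return (np.dot(x, z) + c) ** d

def rbf_kernel(x, z, sigma=1.0):
    # Gaussian (RBF) kernel k(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

# a valid kernel yields a symmetric, positive semidefinite Gram matrix
X = np.random.RandomState(0).randn(5, 3)
K = np.array([[rbf_kernel(a, b) for b in X] for a in X])
print(np.allclose(K, K.T))
print(np.all(np.linalg.eigvalsh(K) > -1e-10))
```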