Multilayer perceptron

From Artificial Neural Network for PHP

Revision as of 17:18, 13 January 2008

General

A multilayer perceptron is a feedforward artificial neural network. This means the signal inside the neural network flows from the input layer through the hidden layers to the output layer. During training, the error correction of the neural weights is done in the opposite direction. This is done by the backpropagation algorithm.

Activation

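At first a cumulative input is calculated by the following equation:

<math>s_j = \sum_{i=1}^{n} w_{ij} \cdot x_i</math>

Considering the BIAS value the equation is:

<math>s_j = \sum_{i=1}^{n} w_{ij} \cdot x_i + w_{0j} \cdot BIAS</math>

<math>BIAS</math> = 1

Sigmoid activation function

<math>o_j = \frac{1}{1 + e^{-s_j}}</math>

Hyperbolic tangent activation function

<math>o_j = \tanh(s_j)</math>

using output range between -1 and 1, or

<math>o_j = \frac{\tanh(s_j) + 1}{2}</math>

using output range between 0 and 1.

<math>s_j</math> cumulative input
<math>w_{ij}</math> weight of input <math>i</math> (<math>w_{0j}</math> is the weight of the BIAS value)
<math>x_i</math> value of input <math>i</math>
<math>n</math> number of inputs
<math>j</math> number of neuron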
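
The activation step can be sketched in a few lines. This is a minimal illustration in Python, not the code of the PHP implementation; the function names and the separate `bias_weight` parameter are assumptions made for the example.

```python
import math

def cumulative_input(inputs, weights, bias_weight, bias=1.0):
    """Weighted sum of all inputs plus the weighted BIAS value."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias * bias_weight

def sigmoid(s):
    """Sigmoid activation, output range between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-s))

def tanh_symmetric(s):
    """Hyperbolic tangent activation, output range between -1 and 1."""
    return math.tanh(s)

def tanh_unit(s):
    """Hyperbolic tangent rescaled to an output range between 0 and 1."""
    return (math.tanh(s) + 1.0) / 2.0

# One neuron with two inputs and a bias weight (all values are made up).
s = cumulative_input([1.0, 0.0], [0.4, -0.3], bias_weight=0.1)
print(s, sigmoid(s), tanh_symmetric(s), tanh_unit(s))
```

Keeping the bias weight separate from the input weights is only a design choice of this sketch; it could equally be stored as one more weight whose input is always 1.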

Error of neural network

If the neural network is initialized with random weights, it will of course not produce the expected output. Therefore training is necessary. During supervised training, known inputs and their corresponding output values are presented to the network, so it is possible to compare the real output with the desired output. The error is described by the following equation:

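<math>E = \frac{1}{2} \sum_{p=1}^{P} (t_p - o_p)^2</math>

<math>E</math> network error
<math>P</math> count of input patterns
<math>t_p</math> desired output
<math>o_p</math> calculated output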
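
As a small worked example, the network error over a set of patterns could be computed as below. The sketch assumes the common half sum of squared differences; the function name is hypothetical and the numbers are invented.

```python
def network_error(desired, calculated):
    """Half the sum of squared differences between desired and calculated outputs."""
    return 0.5 * sum((t - o) ** 2 for t, o in zip(desired, calculated))

# Two input patterns with desired outputs 1 and 0.
print(network_error([1.0, 0.0], [0.8, 0.1]))
```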

Backpropagation

The learning algorithm of a single layer perceptron is easy compared to that of a multilayer perceptron. The reason is that only the output layer is directly connected to the output, but not the hidden layers, so no desired value is known for a hidden neuron. Therefore calculating the right weights of the hidden layers is mathematically difficult. The delta value for changing the weights of a hidden neuron is described by the following equation:

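<math>\Delta w_{ij} = -\alpha \cdot \frac{\partial E}{\partial w_{ij}} = \alpha \cdot \delta_j \cdot x_i</math>

with <math>\delta_j = f'(s_j) \cdot (t_j - o_j)</math> for an output neuron and <math>\delta_j = f'(s_j) \cdot \sum_k \delta_k \cdot w_{jk}</math> for a hidden neuron, where <math>f'(s_j)</math> is the derivative of the activation function at the cumulative input of neuron <math>j</math> and <math>k</math> runs over the neurons of the following layer.

<math>E</math> network error
<math>\Delta w_{ij}</math> delta value of neuron connection <math>i</math> to <math>j</math>
<math>\alpha</math> learning rate
<math>\delta_j</math> the error of neuron <math>j</math>
<math>x_i</math> input of neuron <math>i</math>
<math>t_j</math> desired output of output neuron <math>j</math>
<math>o_j</math> real output of output neuron <math>j</math>.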
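
As a rough illustration of the delta rule, the following Python sketch computes the error of an output neuron, propagates it back to a hidden neuron and derives a weight change. It assumes sigmoid neurons, whose derivative can be written as o * (1 - o); all function names are hypothetical and this is not the code of the PHP implementation.

```python
def sigmoid_derivative(output):
    """Derivative of the sigmoid, expressed through the neuron's output."""
    return output * (1.0 - output)

def output_delta(desired, output):
    """Error of an output neuron: it can be compared with its desired value."""
    return sigmoid_derivative(output) * (desired - output)

def hidden_delta(output, downstream_weights, downstream_deltas):
    """Error of a hidden neuron: weighted sum of the errors of the next layer."""
    back_propagated = sum(w * d for w, d in zip(downstream_weights, downstream_deltas))
    return sigmoid_derivative(output) * back_propagated

def weight_change(learning_rate, delta, input_value):
    """Delta rule: learning rate times neuron error times the input on that connection."""
    return learning_rate * delta * input_value

d_out = output_delta(desired=1.0, output=0.5)
d_hid = hidden_delta(output=0.5, downstream_weights=[2.0], downstream_deltas=[d_out])
print(d_out, d_hid, weight_change(0.5, d_out, 1.0))
```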

Programming solution of backpropagation

In this PHP implementation of the multilayer perceptron the following algorithm is used for weight changes in hidden layers:

learning rate
momentum
neuron k
neuron l
weight m
input
output
count of neurons

Momentum

To avoid oscillating weight changes a momentum factor is defined. A fraction of the previous weight change is added to the newly calculated one, so the applied weight change is not determined by the current error alone.
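
A minimal sketch of a weight update with momentum is given below. It assumes the textbook form, in which the previous weight change is scaled by the momentum factor and added to the new one; the exact formula of the PHP implementation may differ, and the function name is hypothetical. The default of 0.95 is the momentum default named on this page.

```python
def momentum_update(weight, gradient_step, previous_change, momentum=0.95):
    """Return the new weight and the applied change.

    gradient_step is the change suggested by the current error
    (for example learning rate * delta * input).
    """
    change = gradient_step + momentum * previous_change
    return weight + change, change

# Two steps pointing in opposite directions: the second one is damped.
w, change = momentum_update(0.2, 0.1, previous_change=0.0)
w, change = momentum_update(w, -0.1, previous_change=change)
print(w, change)
```

In the second step the raw change of -0.1 is almost cancelled by the remembered change of the first step, which is how the momentum term smooths a zigzag in the weights.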

Overfitting

To avoid overfitting, this PHP implementation stops the training procedure as soon as the real output value lies within a fault tolerance of 1 per cent of the desired output value.
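
Such a stopping check could look like the following Python sketch. The text does not say whether the 1 per cent is meant as an absolute or a relative deviation; the sketch assumes an absolute deviation of 0.01 on an output range of 0 to 1, and the function name is hypothetical.

```python
def within_tolerance(desired, outputs, tolerance=0.01):
    """True if every real output deviates by at most `tolerance` from its desired value."""
    return all(abs(t - o) <= tolerance for t, o in zip(desired, outputs))

print(within_tolerance([1.0, 0.0], [0.995, 0.004]))  # close enough, training may stop
print(within_tolerance([1.0, 0.0], [0.98, 0.004]))   # first output is still too far off
```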

Choosing learning rate and momentum

Suitable values for the learning rate (<math>\alpha</math>) and the momentum (<math>\beta</math>) are found by experience. Both values have a range between 0 and 1. This PHP implementation uses a default value of 0.5 for <math>\alpha</math> and 0.95 for <math>\beta</math>. <math>\alpha</math> and <math>\beta</math> cannot be zero. Otherwise no weight change would happen and the network would never reach an errorless level. These factors can be changed at runtime.

Binary and linear input

If binary input is used, the input value is simply 0 for false and 1 for true.

If linear input values are used, normalization is needed:

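<math>x = \frac{r - r_{min}}{r_{max} - r_{min}}</math>

<math>x</math> input value for neural network
<math>r</math> real world value, with <math>r_{min}</math> and <math>r_{max}</math> as the smallest and largest real world value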

This PHP implementation supports input normalization.
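
For illustration, input normalization could be sketched as follows. The example assumes min-max scaling to the range 0 to 1; the function name and the temperature values are invented and do not come from the PHP implementation.

```python
def normalize(value, minimum, maximum):
    """Scale a real world value into the range 0..1 (min-max normalization)."""
    if maximum == minimum:
        raise ValueError("minimum and maximum must differ")
    return (value - minimum) / (maximum - minimum)

# A temperature of 25 on a scale from 0 to 100 becomes the network input 0.25.
print(normalize(25.0, 0.0, 100.0))
```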

Binary and linear output

The interpretation of output values only makes sense for the output layer, and it depends on the use of the neural network. If the network is used for classification, binary output is used. Binary output has two states: true or false. The network always produces linear output values. Therefore these values have to be converted to binary values:

output value
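
One way to convert a linear output value into a binary one is a simple threshold, sketched below. The threshold of 0.5 is an assumption that fits an output range of 0 to 1; it is not taken from the text, and the function name is hypothetical.

```python
def to_binary(output, threshold=0.5):
    """Interpret a linear output value as True or False."""
    return output >= threshold

print(to_binary(0.93), to_binary(0.12))
```

For a network using the output range between -1 and 1, a threshold of 0 would be the corresponding choice.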

If linear output is used, the output values have to be converted back to the real world values the network was trained for:

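<math>r = o \cdot (r_{max} - r_{min}) + r_{min}</math>

<math>r</math> real world value, with <math>r_{min}</math> and <math>r_{max}</math> as the smallest and largest real world value
<math>o</math> real output value of neural network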

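The same normalization equation as for input values is applied to the desired output values while training the network.

<math>t = \frac{r - r_{min}}{r_{max} - r_{min}}</math>

<math>t</math> desired output value for neural network
<math>r</math> real world value, with <math>r_{min}</math> and <math>r_{max}</math> as the smallest and largest real world value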

This PHP implementation supports output normalization.
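
The round trip between real world values and network values can be sketched like this. The example assumes min-max scaling to the range 0 to 1 and a network output in that same range; both function names and the numbers are hypothetical.

```python
def normalize(value, minimum, maximum):
    """Scale a real world value into the range 0..1 (used for desired outputs in training)."""
    return (value - minimum) / (maximum - minimum)

def denormalize(output, minimum, maximum):
    """Map a network output from the range 0..1 back to a real world value."""
    return output * (maximum - minimum) + minimum

desired = normalize(25.0, 0.0, 100.0)  # value presented to the network while training
print(desired, denormalize(desired, 0.0, 100.0))
```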