Overview of Artificial Neural Networks

Artificial Neural Networks (ANN) are computational models inspired by the biological neuron system, capable of learning and problem-solving through training. They consist of interconnected layers, including input, hidden, and output layers, and can be classified into various architectures such as feed forward and recurrent networks. Learning methods in ANNs include supervised, unsupervised, and reinforced learning, each with distinct algorithms and applications.


ARTIFICIAL NEURAL NETWORKS

Neural networks are simplified models of the biological neuron system. A neural network is a massively parallel
distributed processing system made up of highly interconnected neural computing elements that
have the ability to learn and thereby acquire knowledge and make it available for use. Various
learning mechanisms exist to enable a neural network to acquire knowledge, and neural network
architectures are classified into various types based on their learning mechanisms and other
features. The learning process is referred to as training, and the ability to solve a problem using
the acquired knowledge is referred to as inference. Neural networks are simplified imitations of the central
nervous system and have therefore been inspired by the kind of computing performed by the
human brain. The biological neuron and the imitated neuron (artificial neuron) are shown in
Figure 1a and Figure 1b respectively. The structural constituents of the human brain, termed neurons,
are the entities that perform computations such as cognition, logical inference, pattern
recognition and so on. Hence the technology built on a simplified imitation of
computing by the neurons of a brain has been termed Artificial Neural Systems (ANS) technology,
Artificial Neural Networks (ANN), or simply neural networks (NN).

[Figure 1 consists of two panels. (a) Biological neuron, with labels: dendrites receive inputs; the soma processes them; axons turn the processed inputs into outputs; synapses form the electrochemical contacts between neurons. (b) Perceptron model: inputs x_i, weighted by w_i, are summed with a bias b to form net, which passes through f(.) to produce the output y.]

Figure 1: Biological neuron and imitated neuron

Basically, there are two families of neural networks: feed forward networks and recurrent
(feedback) networks. The most commonly used network for pattern classification is the
feed forward network. Different types of neural networks are used depending upon the
requirements of the application. The performance of a neural network improves as the
number of hidden layers increases, up to a certain extent; increasing the number of neurons in a hidden layer
also improves the performance of the system. The number of neurons must be large enough to
adequately represent the problem domain, yet small enough to permit generalization from the
training data, so a trade-off must be maintained between the size of the network and the complexity
that results from it.

Feed Forward Networks
In a feed forward network, only forward connectivity of the neurons is considered. Figure 2
shows a single-layer feed forward network. The inputs to the network are the input vector, x, the
weights of the network are defined by the weight matrix, W, and the biases by the vector b.

Figure 2: Single layer feed forward network


Here x = [x_1, x_2, ..., x_m]^T is the m x 1 input vector, W = [w_{i,j}] is the n x m weight matrix, and b = [b_1, b_2, ..., b_n]^T is the n x 1 bias vector.

The output Y of the network can be written in vector form as

Y = f(net), where net = Wx + b, i.e. net_i = (sum_{j=1}^{m} w_{i,j} x_j) + b_i
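As an illustration, the forward pass above can be computed directly. The following Python/NumPy sketch uses arbitrary example values for x, W and b (the text does not fix them) and tanh as the transfer function f:

```python
import numpy as np

# Forward pass of a single-layer feed forward network: Y = f(Wx + b).
# The values of x, W and b below are arbitrary illustrative choices.
x = np.array([1.0, 0.5, -1.0])              # input vector, m = 3
W = np.array([[0.2, -0.4, 0.1],
              [0.6,  0.3, -0.2]])           # weight matrix, n x m = 2 x 3
b = np.array([0.05, -0.1])                  # bias vector, n = 2
net = W @ x + b                             # net input to each neuron
Y = np.tanh(net)                            # output Y = f(net), with f = tanh
```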

The information-processing ability of a neural network depends on its architecture (topology).


The selection of a network architecture is largely determined by the application; the
number of neurons, the connections and the choice of transfer functions are fixed during the design.
There are seven different types of feed forward neural network architectures as follows:
i. Multilayer perceptron networks
ii. Radial basis function networks
iii. Generalized regression neural networks
iv. Probabilistic neural networks
v. Belief networks
vi. Hamming networks
vii. Stochastic networks.

Properties of Neural Networks
i. The NNs display mapping capabilities, i.e. they can map input patterns to their associated
output patterns.
ii. The NNs learn by example. Thus, NN architectures can be trained with known
examples of a problem before they are tested for their inference capability on unknown
instances of the problem; they can, therefore, identify objects on which they were not previously trained.
iii. The NNs possess the ability to generalize. Thus, they can predict new outcomes from the
past trends.
iv. The NNs are robust systems and are fault tolerant. They can, therefore, recall full patterns
from incomplete, partial or noisy patterns.
v. The NNs can process information in parallel, at high speed, and in a distributed manner.
Neural Network Characteristics
The word network in Neural Network refers to the interconnection between neurons present in
the various layers of a system. Every such system basically consists of three kinds of layers: an input
layer, hidden layers and an output layer.
Input Layer: This layer is responsible for receiving information (data), signals, features, or
measurements from the external environment. These inputs (samples or patterns) are usually
normalized within the limit values produced by the activation functions. The input layer contains input
neurons, which transfer data via synapses to the hidden layer.
Hidden Layers: These layers are composed of neurons which are responsible for extracting
patterns associated with the process or system being analyzed. They perform most of the
internal processing of the network and transfer data from the input layer to the output layer via
further synapses. The synapses store values called weights, which are used to manipulate the
data passed between layers.
Output Layer: This layer is also composed of neurons, and thus is responsible for producing and
presenting the final network outputs, which result from the processing performed by the neurons
in the previous layers.
An ANN can be defined based on the following three characteristics:
i. The Architecture: The number of layers and the number of nodes in each of the layers
ii. The learning mechanism applied for updating the weights of the connections
iii. The activation functions used in various layers.

Learning Methods
Learning in a network is a procedure for modifying the weights and biases of a network, also
referred to as a training algorithm, to force a network to yield a particular response to a specific
input. Learning methods in NNs can be classified into 3 basic types:
Supervised Learning
In supervised learning, every input pattern used to train the network is associated with a target or
desired output pattern. A teacher is assumed to be present during the learning process, and a comparison
is made between the network's computed output and the correct expected output to determine the
error. Tasks that fall under this category are pattern recognition and regression.
The training can consume a lot of time; in prototype systems with inadequate processing
power, learning can take days or even weeks. The rules that belong to supervised learning
are:

• Widrow–Hoff rule
• Gradient descent
• Delta rule
• Backpropagation rule
• Cohen–Grossberg learning rule
• Adaptive conjugate gradient model of Adeli and Hung
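As a minimal sketch of one of the rules above, the Widrow–Hoff (delta) rule for a single linear neuron moves the weights against the gradient of the squared error; the data, learning rate and epoch count here are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Delta rule (Widrow-Hoff / LMS) for one linear neuron learning t = 2x + 1.
X = np.array([0.0, 1.0, 2.0, 3.0])   # training inputs (illustrative)
t = np.array([1.0, 3.0, 5.0, 7.0])   # teacher-provided targets
w, b, eta = 0.0, 0.0, 0.05           # weight, bias, learning rate

for epoch in range(200):
    for x_i, t_i in zip(X, t):
        y = w * x_i + b              # neuron output (linear)
        err = t_i - y                # error supplied by the "teacher"
        w += eta * err * x_i         # delta rule weight update
        b += eta * err               # delta rule bias update
```

After training, w and b approach the underlying slope 2 and intercept 1.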
Unsupervised Learning
Unsupervised learning refers to learning without supervision, i.e., the target response is not known
and no external signals are used to adjust the network weights. The system learns on its own
by discovering and adapting to structural features in the input patterns, as if there were no teacher to
present the desired patterns. Tasks that fall under this category include clustering, compression
and filtering.
Currently, unsupervised learning is largely limited to networks known as self-organizing maps.
This form of learning is not yet well understood and is still the subject of research. The rules that
belong to unsupervised learning are:

• Hebb’s rule
• Kohonen’s rule.
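A minimal sketch of Hebb's rule (the patterns and learning rate below are illustrative assumptions): the weight between two units grows in proportion to their joint activity, with no teacher signal involved:

```python
import numpy as np

# Hebbian learning for autoassociation: dW = eta * outer(x, x), so units
# that are frequently active together develop strong mutual connections.
eta = 0.1
patterns = [np.array([1.0, 0.0, 1.0]),   # illustrative binary patterns
            np.array([1.0, 1.0, 0.0]),
            np.array([1.0, 0.0, 1.0])]
W = np.zeros((3, 3))
for x in patterns:
    W += eta * np.outer(x, x)            # Hebbian update, no error signal
# Units 0 and 2 were co-active twice, units 0 and 1 only once,
# so W[0, 2] ends up larger than W[0, 1].
```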
Reinforced Learning
In this method, the teacher, though available, does not present the expected answer but only
indicates whether the computed output is correct or incorrect. This information helps the
network in its learning process. However, reinforced learning is not one of the more popular forms of learning.

Activation Functions
Activation functions limit the output of a neuron in a neural network to a certain range, typically
[-1, 1] or [0, 1]. An activation function for a back-propagation net should be continuous,
differentiable, and monotonically non-decreasing. The various kinds of activation functions are:
i. Linear
ii. Step or threshold
iii. Ramp
iv. Tansigmoid
v. Hyperbolic tangent
vi. Gaussian
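The listed functions can be sketched in Python as follows (the MATLAB names 'purelin' and 'tansig' used later in the text are noted in comments; MATLAB's tansig is the hyperbolic tangent sigmoid, covering items iv and v together):

```python
import numpy as np

# Simple implementations of the activation functions listed above.
def linear(x):             # i. linear ('purelin'): unbounded
    return x

def step(x):               # ii. step or threshold: output 0 or 1
    return np.where(x >= 0, 1.0, 0.0)

def ramp(x):               # iii. ramp: linear on [0, 1], clipped outside
    return np.clip(x, 0.0, 1.0)

def tansig(x):             # iv/v. tan-sigmoid / hyperbolic tangent,
    return np.tanh(x)      # output in (-1, 1)

def gaussian(x):           # vi. Gaussian: output in (0, 1], peak at x = 0
    return np.exp(-x ** 2)
```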
Training Artificial Neural Network
When sufficient data is available for training, the data is divided into training data and testing
data. The number of training data points should be several times greater than the number of
parameters being estimated (usually the training data should be 60% of the total or more). In addition, a
validation data set may also be used to measure network generalization and to halt training when
generalization stops improving. One can use either the GUI or the command line to train the network.
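As an illustration, a 60/20/20 split into training, validation and test sets can be made as below; the exact proportions are an assumption consistent with the "60% and above" guideline:

```python
import numpy as np

# Shuffle the data indices, then take 60% for training, 20% for validation
# and the remaining 20% for testing.
np.random.seed(0)
data = np.arange(100)                      # placeholder data set
idx = np.random.permutation(len(data))
n_train = int(0.6 * len(data))
n_val = int(0.2 * len(data))
train = data[idx[:n_train]]
val = data[idx[n_train:n_train + n_val]]
test = data[idx[n_train + n_val:]]
```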
Command Line Training
As earlier stated, for pattern recognition, a feed forward network architecture is used. It has a
general syntax as
net = newff([PN],[S1 S2 ... SN],{TF1 TF2 ... TFN},BTF,LF,PF);

where the first input PN is an N × 2 matrix of minimum and maximum values for the N input elements.
S1 S2 ... SN are the sizes (numbers of neurons) of the layers of the network architecture. TFi is
the transfer function of the ith layer and can be any differentiable transfer function such as tansig,
logsig or purelin; the default is 'tansig'. BTF is the backpropagation network training function;
the default is 'trainlm'. LF is the backpropagation weight/bias learning function with gradient
descent, such as 'learngd' or 'learngdm'; the default is 'learngdm'. The function 'learngdm'
calculates the weight change dW for a given neuron from the neuron's input P and error E.
Learning occurs according to learngdm's learning parameters (the weight or bias W, the learning
rate and the momentum constant) using gradient descent with momentum, and it returns the weight
change and new learning states. PF is the performance function, such as mse (mean squared error),
mae (mean absolute error) or msereg (mean squared error with regularization); the default is 'mse'.
To train the architecture, different backpropagation training algorithms are available as functions
in MATLAB's Neural Network Toolbox. Each has its own features and advantages; some of the
most widely used functions are discussed briefly below.

• traingd – basic gradient descent learning algorithm. It has slow response but can be used
in incremental mode training.
• traingdm – gradient descent with momentum. It is generally faster than traingd and can
be used in incremental mode training.
• traingdx – adaptive learning rate. It has faster training time than traingd but can only
be used in batch mode training.
• trainrp – resilient backpropagation. It is a simple batch mode training algorithm with
fast convergence and minimal storage requirements.
• trainlm – Levenberg–Marquardt algorithm. It is a faster training algorithm for networks
of moderate size and has a memory reduction feature for use when the training set is large.
There are several parameters associated with the training algorithms: the learning rate, the error
goal, the number of epochs and the show interval. These parameters are set as follows:

• net.trainParam.lr - specifies the learning rate
• net.trainParam.goal - specifies the error goal
• net.trainParam.epochs - specifies the maximum number of training iterations
• net.trainParam.show - displays the training status every show iterations.

Once the network has been defined and the parameters are set, the network can be trained using
the function train() as:
[net, tr] = train(net, P, T)
where net is the network object, tr contains information about the progress of training, and P and T
are the input and output vectors, respectively. Finally, the network is simulated using the function
sim(), which takes the network input P and the network object net and returns the network output
Y:

Y = sim(net,P);

Example
Given that the network input ranges from 0 to 10, create a two-layer feedforward network where the first
layer has five neurons with the tansig function and the second layer has one neuron with a linear function.
Train the network using the 'traingd' training function.
%Create a feedforward NN and train with data set [P T]
%Training set
P = [0 1 2 3 4 5 6 7 8 9 10];
T = [0 1 2 3 4 3 2 1 2 3 4];
net = newff([0 10],[5 1],{'tansig' 'purelin'},'traingd');
%Set network parameters as follows
net.trainParam.lr = 0.01;         %Learning rate
net.trainParam.goal = 0.1;        %Performance goal
net.trainParam.epochs = 50;       %Maximum number of epochs in a training run
net.trainParam.show = 25;         %Display training status every 25 epochs
net.trainParam.time = inf;        %Maximum time to train in seconds
net.trainParam.min_grad = 1e-10;  %Minimum performance gradient
%Train the network, then simulate it
net = train(net,P,T);
Y = sim(net,P);
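For readers without MATLAB, a rough NumPy sketch of the same example follows; it is a hypothetical re-implementation (not MATLAB's actual traingd code) of a 1-5-1 tansig/purelin network trained by plain batch gradient descent on the data set above:

```python
import numpy as np

# Hypothetical NumPy sketch of the MATLAB example: a 1-5-1 network
# (tanh hidden layer, linear output) trained with plain gradient descent.
np.random.seed(0)
P = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], dtype=float)  # 1 x 11
T = np.array([[0, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4]], dtype=float)   # 1 x 11

W1 = 0.5 * np.random.randn(5, 1); b1 = np.zeros((5, 1))  # hidden layer
W2 = 0.5 * np.random.randn(1, 5); b2 = np.zeros((1, 1))  # output layer
lr, epochs, N = 0.01, 50, P.shape[1]

losses = []
for _ in range(epochs):
    A1 = np.tanh(W1 @ P + b1)              # tansig hidden activations
    Y = W2 @ A1 + b2                       # purelin output
    E = Y - T                              # output error
    losses.append(float(np.mean(E ** 2)))  # mse performance
    # Backpropagate through the purelin and tansig layers
    dW2 = E @ A1.T / N
    db2 = E.mean(axis=1, keepdims=True)
    dZ1 = (W2.T @ E) * (1.0 - A1 ** 2)     # tanh'(z) = 1 - tanh(z)^2
    dW1 = dZ1 @ P.T / N
    db1 = dZ1.mean(axis=1, keepdims=True)
    W1 -= lr * dW1; b1 -= lr * db1         # gradient descent updates
    W2 -= lr * dW2; b2 -= lr * db2
```

The mean squared error recorded in losses decreases over the 50 epochs, mirroring what MATLAB's training display would show.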

GUI Training

