I plan to make a short series of posts about neural networks and how to perform feedforward.
Starting with this post, I’ll cover the model/architecture of a neural network and work my way through the math required to output a prediction.
I covered how to perform backpropagation in an earlier post, which can be found here.
The Neural Network Model:
To understand the neural network model and architecture, we must first understand every component in the network. Then we can begin to look at the network as a whole.
The neural network is built from 3 different components: Layers, Nodes, and Weights.
In a traditional neural network, you’ll find three different types of layers: the input layer, the hidden layers, and the output layer.
These layers follow very few rules. The input layer must be the very first layer, the output layer must be the very last layer, and there can only be one of each. The hidden layers are simply any layers between the input and the output layer, and we can have as many of them as we desire.
One additional rule of thumb is that the input values of the input layer should be normalized in some manner, for example between 0 and 1.
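As a rough illustration of that rule of thumb, here is a minimal sketch of min-max scaling, which is only one of several ways to normalize inputs. The function name `min_max_normalize` and the sample values are hypothetical and chosen just for this example.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale;
        # mapping everything to 0.0 is an arbitrary choice for this sketch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw_inputs = [10.0, 20.0, 40.0]  # hypothetical raw feature values
normalized = min_max_normalize(raw_inputs)
print(normalized)
```

The smallest value maps to 0 and the largest to 1, with everything else placed proportionally in between.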
Each layer contains a certain number of nodes/neurons. You can think of a node as a tiny computational unit, which in essence does all the calculations. A single node has very little computational power, and alone can’t do much.
But by combining the nodes in certain networks, these tiny computational nodes can help each other to perform miracles and become capable of approximating any continuous function, given a large enough network and enough data.
Between any two nodes in two adjacent layers, there is a weight connecting them, serving as a link between the two nodes. This is how adjacent layers are connected to each other.
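One way to picture these links in code is as a table of numbers, with one entry per pair of nodes in two adjacent layers. The sketch below assumes that layout; the helper name `init_weights`, the random starting values, and the range of -1 to 1 are assumptions made for illustration, not something this post prescribes.

```python
import random


def init_weights(n_prev, n_next, seed=0):
    """Return weights[i][j]: the weight linking node i in the previous
    layer to node j in the next layer."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_next)]
            for _ in range(n_prev)]


# A layer of 2 nodes connected to a layer of 3 nodes.
weights = init_weights(n_prev=2, n_next=3)
n_links = sum(len(row) for row in weights)
print(n_links)  # one weight per pair of nodes: 2 * 3 = 6
```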
By adjusting the value of the weights, we can allow the neural network to learn. Whenever we adjust the value of a weight, we alter the “behavioral pattern” of the neural network, and by doing this we can teach our network what is correct, and what is wrong within the given parameters.
Before I begin throwing math at you, I want you to understand the data flow through the network. And since the math is so simple, you’ll most probably know how to do it yourself, even before you see the equations.
Imagine we have a simple two-layer neural network with two input nodes and one output node, and two weights connecting the inputs to the single output node. To perform feedforward, we multiply the value of every input node by its respective weight and sum the results to get the net (input) value of the output node. Next, we apply an activation function to the net value to get the output value of the node.
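To make the data flow concrete, here is a small sketch of that two-input, one-output network. The input values and weights are hypothetical numbers picked for this example, and the sigmoid is used as the activation function because it is a common choice.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


inputs = [1.0, 0.5]   # hypothetical input node values
weights = [0.2, 0.6]  # hypothetical weights, one per input node

# Net value: every input multiplied by its weight, then summed.
net = sum(i * w for i, w in zip(inputs, weights))

# Output value: the activation function applied to the net value.
out = sigmoid(net)
print(net, out)
```

With these numbers the net value is 1.0 * 0.2 + 0.5 * 0.6 = 0.5, and the output is the sigmoid of 0.5, roughly 0.62.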
We apply an activation function to the net value of every node outside the input layer for two reasons: to normalize the information within the network, and to allow the network to approximate non-linear functions.
The same procedure as shown above applies to a neural network of any size. For every node, accumulate the output values of the nodes in the previous layer that connect to it, each multiplied by its respective weight, then apply the activation function, commonly the sigmoid function, to get the output value of the node.
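The layer-by-layer procedure can be sketched as a short loop. This is only a minimal illustration under a few assumptions: every layer is fully connected to the next, there are no bias terms (this post does not use any), and the function name `feedforward`, the weight layout, and the 2-2-1 network with its numbers are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feedforward(inputs, layers):
    """Propagate inputs through the network.

    layers is a list of weight tables; layers[k][i][j] is the weight
    linking node i in layer k to node j in layer k + 1.
    """
    values = inputs
    for weights in layers:
        n_next = len(weights[0])
        # Net value of each node in the next layer.
        nets = [sum(values[i] * weights[i][j] for i in range(len(values)))
                for j in range(n_next)]
        # Output value of each node: activation applied to its net value.
        values = [sigmoid(net) for net in nets]
    return values


# Hypothetical network: 2 input nodes, 2 hidden nodes, 1 output node.
layers = [
    [[0.1, 0.3], [0.2, 0.4]],  # input layer -> hidden layer
    [[0.5], [0.6]],            # hidden layer -> output layer
]
prediction = feedforward([1.0, 0.5], layers)
print(prediction)
```

The output values of one layer simply become the inputs of the next, so adding more hidden layers means adding more weight tables to the list.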
To calculate the net value of a node, we sum the output values of all nodes connecting to it, each multiplied by its respective weight.
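Written as a formula, that sum could look like the following. The symbols are just one possible notation, assumed here for illustration: `net_j` is the net value of node j, `out_i` is the output value of node i in the previous layer, and `w_{ij}` is the weight connecting node i to node j.

```latex
net_j = \sum_{i} out_i \cdot w_{ij}
```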
To calculate the output value of a node, we apply an activation function, commonly the sigmoid function.
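For reference, the standard definition of the sigmoid function is shown below. The names `net_j` and `out_j` for a node's net value and output value are notation assumed for this example.

```latex
out_j = \sigma(net_j) = \frac{1}{1 + e^{-net_j}}
```

Since the result always lies strictly between 0 and 1, large positive net values end up close to 1 and large negative ones close to 0.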
Let’s recalculate the results of the first hidden node:
Out = Sigmoid(0.60) = 0.65
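If you want to check that number yourself, a few lines of Python are enough. This is just a quick sanity check using the standard sigmoid definition.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


out = sigmoid(0.60)
print(round(out, 2))  # 0.65
```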
There you have it, how to perform simple feedforward 😉 Feel free to comment below if anything is misleading or something is hard to understand, and I’ll happily reply and even update the post accordingly.
Part 2 of this series will contain a code example of a simple implementation of a neural network + feedforward and can be found here.