the perceptron
In the last lesson we made a model neuron that could categorise things into two groups (eat or don't eat) depending upon combinations of two features (albedo and sweetness). I had you changing the synapse strengths yourself in order to find appropriate ones for the task. I hope you became sufficiently fed up with this, because it motivates the next, and most important, extension of the model: a neuron that can autonomously adapt its weights (synapses) to solve a problem, often called a perceptron.
Above is a (different) representation of the (same) artificial neuron we made. s1 and s2 are the two inputs (I used s for sensor) which are multiplied by their respective weights, w1 and w2, on the way to the cell body. The capital sigma represents a summation of the weighted inputs in the cell body, and the thing in the box represents the threshold function which outputs zero for activation below the threshold (the dotted line) and one for activation above the threshold.
One thing you'll notice if you read around the neural network literature a bit is that there are many different representations of the same thing: sometimes going down the page, sometimes across, and sometimes up. Sometimes the sum and transfer function are all stuffed inside a cell like circle, and other times they are drawn separately, as we see here. Get used to concentrating on the features rather than the shapes.
Let's start by extending the model from last time. Below is a screenshot of what we need. The colours are arbitrary and just to help us see the different parts. Create an Excel spreadsheet like this, making sure that the neuron (the yellow bit) is connected up as we did in the previous lesson (the sum in C3 and if function in C5).
Let’s take a moment to work out what’s going on here:
- The upper left-hand side is the same as in the previous lesson.
- To the right of this, in the upper rows of columns F to H, we have the different inputs to the sensors, along with the desired output in each situation. This is here so that we can automate the process of different foods being presented to the system instead of having to change the inputs by hand.
- In cell F8, just below the title ‘inp no’, we will put a little formula that will determine which food is being presented and then tell our inputs (cells B1 and D1) to refer to s1 and s2 in either row 2, 3, 4 or 5.
- Cell F11, just under the title ‘lr’, will contain the learning rate. You can choose a value for this; it’s usually between 0.01 and 0.5.
- At the bottom on the left you can see three row titles: ‘desired’, ‘error’ and ‘deltaW’. A value for the desired behaviour of the neuron (from column H) will be entered into the cell next to ‘desired’ according to which ‘Input No’ is chosen. The error will then be calculated by subtracting the actual response from the desired response, and this error will be used to calculate the "delta weight", a number we add to the weight in order to bring it closer to the correct weight. The Greek letter 'delta' is often used in mathematics to mean 'a small change', so a delta weight is a small amount that we add to the weight to make it better.
- The final step in the calculation is to add the deltaWs to the weights to get the "new weights", which we will put in the cells on row 11. These "new weights" then replace the existing weights.
Delta Weight from input X = learning rate * (desired – actual) * input X
delta rule
We can use algebra to write the above formula more compactly:
Δwx = lr * e * x
As far as the meaning of the equation is concerned, Δwx is an adjustment that aims to reduce the error in the weight. Therefore it must include the error! It must also include the input, because we do not want to change the weight from a presynaptic cell that wasn't active. Finally, the learning rate simply slows down the learning so as to avoid overshooting the target value. More on each of these later.
error
The error calculation is an important method. You should remember this.
Imagine I ask you to guess how much money I have in my pocket. You guess £3, but the actual answer is £5. What was your error? Desired (£5) minus actual (£3): 5 − 3 = 2, so you made an error of £2. Easy, isn’t it! This will come up a lot though, so remember: ERROR IS DESIRED MINUS ACTUAL!!! In our model neuron, the actual behaviour is the output of the neuron, while the desired behaviour is whether or not we want it to prescribe "eat" or "don't eat", so: error = desired − output, or e = d − o.
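The error and delta-weight calculations above can be sketched in a few lines of Python. The function name is mine, not the lesson's, and lr = 0.2 is just the learning rate value we'll use later:

```python
# A sketch of the delta rule: delta weight = lr * (desired - actual) * input.

def delta_w(lr, desired, actual, inp):
    """Return the amount to add to a weight after one trial."""
    error = desired - actual      # remember: error is desired minus actual
    return lr * error * inp

# The neuron should have fired (desired 1) but didn't (actual 0), and the
# input was active (1), so the weight gets nudged upwards:
print(delta_w(0.2, 1, 0, 1))     # 0.2
# A silent presynaptic input (0) means its weight is left alone:
print(delta_w(0.2, 1, 0, 0))     # 0.0
```

Notice how multiplying by the input guarantees that a weight from an inactive sensor never changes, exactly as argued above.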
connecting everything
Ok, now we have all the basics in Excel and we understand what the delta rule will do, let's get everything connected...
We are now going to automate the selection of which inputs the neuron receives (which food, if any, it sees).
Go to cell F8, just below ‘Input No’, and put in a 1. This represents input pattern 1, which is in columns F to H of row 2. Now that we’ve done that, we need to make cells B1 and D1 refer to cells F and G in row 2 when the input number is 1, in row 3 when it’s 2, row 4 when it’s 3 and row 5 when it’s 4. 
Click on cell B1, go to the formulas tabbed menu and select insert function:

In the box that pops up, select CHOOSE and click OK. A separate window will pop up. Fill it in as below – note: you don’t need to type the name of the cell you want every time. You can simply click the cell instead.
What this CHOOSE box says is: look at the value in cell F8. If it’s a 1, use cell F2. If it’s a 2, use cell F3. If it’s a 3, use cell F4. If it’s a 4, use cell F5. The finished formula is =CHOOSE(F8, F2, F3, F4, F5).
In order to do the second sensory input, select cell D1 and repeat the process, but this time using the values in column G instead of column F.
While we are at it, let’s make cell C7 (the desired output) reference the appropriate row of column H too. Select C7 and repeat the above process again, but using cells H2, H3, H4 and H5 for the four values.
Now you can check that your neuron references the inputs correctly by changing the value in cell F8 to 1, 2, 3 or 4 and seeing if cells B1 and D1 change appropriately. Have a try.
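If it helps to see the lookup outside of Excel, here is a Python stand-in for the CHOOSE formula. The 0/1 pattern values are my assumption about what sits in columns F to H, rows 2 to 5 (nothing, marmite, salt, ice cream):

```python
# A Python stand-in for =CHOOSE(F8, ...): given an input number 1-4,
# look up (s1, s2, desired) from the pattern table. The 0/1 values are
# an assumption about the spreadsheet's contents.

PATTERNS = {
    1: (0, 0, 0),  # nothing walks past: don't eat
    2: (0, 1, 0),  # marmite: not white, but sweet -> don't eat
    3: (1, 0, 0),  # salt: white, not sweet -> don't eat
    4: (1, 1, 1),  # ice cream: white and sweet -> eat
}

def choose_input(inp_no):
    """Return (s1, s2, desired) for input number 1-4, like the CHOOSE cells."""
    return PATTERNS[inp_no]

print(choose_input(4))  # (1, 1, 1)
```

Changing the argument from 1 to 4 plays the same role as typing a new number into cell F8.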
It’s time to calculate the error. Can you remember how? (if you can’t, slap yourself!) Click on cell C8 and then type an equals sign ‘=’. Typing this sign is another way of telling Excel that you want to insert a function. The function we want is simply desired minus actual, so cell C7 minus cell C5. Click cell C7, type a minus sign, then click cell C5 and hit enter. You can check the network so far by changing the input numbers in F8 again if you like, or using trace precedents from cell C8.
Before we go any further, let’s decide upon a learning rate to use. We’ll start with 0.2. You can play with this value later. Put 0.2 into cell F11.
The learning rate is needed in order to calculate how much we need to update the weights (synapses) by. We’ll do this in cell B9 for weight 1 and D9 for weight 2. Click on cell B9. Remember the formula for deltaW? Lrate*error*input. Type an ‘=’, then click cell F11, type a multiplication sign ‘*’, click cell C8, another multiplication sign, and finally cell B1 and enter. For the second weight, click cell D9 and do the same, but remember to replace B1 with D1, the other input.
So, we have the deltaWs. These two values are the amounts we need to change the two weights (synapses) by in order for the Grak to learn. So, the next step is to add them to the current weights, thus implementing the change.
Select cell B11 and enter the following, then hit enter: =B2 + B9
Select cell D11 and enter the following, then hit enter: =D2 + D9
The very final step is to update the old weights by changing them to these new weights.
Click cell B2 and enter the following then hit enter: =B11
When you hit enter, you should have a problem! Don’t worry. (if you don’t have a problem, do worry!)
A window will flash up telling you that there is a circular reference in your work. Let’s take a moment to think about what this means.
The calculation of cell activity is based upon the weights. The error depends upon the cell activity. New weights are calculated using the error, and the new weights take the place of the old weights. This is one big circle!
Circular references can lead to infinite recursion. The reason why we want a circular reference in a learning neural network is that with each cycle of the network, we hope that the error will decrease (learning!). In the end, there will be no error (or near enough to nothing to lead to useful behaviour), and we can use this lack of error as a stopping point for the recursion.
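The circle can be written out explicitly as a loop. This is a sketch, not the spreadsheet itself: I assume 0/1 inputs, a threshold of 1, a learning rate of 0.2, initial weights of 0.2, and firing when the sum reaches the threshold (the lesson says "above", so the >= at equality is my choice):

```python
# One trip around the circle = one recalculation: activity -> error ->
# delta weights -> new weights, stopping when the error reaches zero.

lr, threshold = 0.2, 1.0
w1, w2 = 0.2, 0.2          # the weights the neuron is born with
s1, s2, desired = 1, 1, 1  # input 4, the ice cream: it should eat this

for presses in range(1, 100):
    out = 1 if w1 * s1 + w2 * s2 >= threshold else 0
    error = desired - out
    if error == 0:
        break              # no error: the recursion can stop here
    w1 += lr * error * s1  # the delta rule, applied to each weight
    w2 += lr * error * s2

print(presses, round(w1, 2), round(w2, 2))  # 3 0.6 0.6
```

With these starting values the error vanishes after three passes, which is exactly the "lack of error as a stopping point" described above.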
However, we have to tell excel that this is what we want...
Go to the Microsoft Office Home Menu in the top left-hand corner.
Find the Excel Options button near the bottom right of the drop-down menu.
In the options, select the Formulas tab, put a tick in the Enable iterative calculations box, and change the Maximum Iterations to 1.
Click ok.
Unfortunately, this isn’t the end of our problems! We can do circular references now, but if we have a circular reference telling us to make the weights the same as the next weights, where can we get our initial weights for the network from?
(if you are having trouble understanding this problem, have a look at the circle picture above and ask yourself where the first step is...?)
Never fear! The answer to this comes by way of another Excel function (and one that is widely used in almost all programming languages), the IF function.
Click on cell B2 then click the ‘Insert Function’ button, select IF and click ok. You’ll see the following window (it won’t be filled in with the values yet though!):
What is this window saying? The uppermost box checks whether whatever is in cell B11 (our new weight1) is a number or not. When the neuron is born, there will be no "new weights" because it won’t yet have performed any behaviours from which it can learn, hence there will not be any number in B11. Using the information about whether B11 is a number or not, the value to use can be decided. If there is no number, Value_if_false will be used, so put in the lowest box the value that you want to use for the initial weight (the synapse strength that this neuron is born with). If there is a number in B11, we want to update the weights with this number, so simply put B11 in the Value_if_true box. The finished formula will look something like =IF(ISNUMBER(B11), B11, 0.2), with 0.2 standing in for whichever initial weight you choose.
You will need to do the same procedure again for the second weight, cell D2. This time it wants to reference cell D11 rather than B11. You can choose any value you like for the initial synapse strength (I used 0.2, but really, anything you like is fine cos this neuron will learn!).
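The same fall-back logic looks like this in Python, using None to stand for the empty cell (an illustrative choice, not anything Excel does):

```python
# Mimics =IF(ISNUMBER(B11), B11, initial): before the first learning step
# there is no "new weight", so fall back to the weight the neuron is born with.

INITIAL_WEIGHT = 0.2  # the synapse strength chosen at "birth"

def weight_cell(new_weight):
    """Return the new weight if one exists, otherwise the initial weight."""
    return new_weight if new_weight is not None else INITIAL_WEIGHT

print(weight_cell(None))   # 0.2  -- freshly born neuron, nothing learnt yet
print(weight_cell(0.45))   # 0.45 -- learning has produced a new weight
```

This is also why breaking the circular reference (which we do later with NA) makes the initial weights reappear: once the new-weight cells hold no number, the fall-back branch takes over again.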
Now, because we limited excel to one iteration when we allowed circular references, we need to refresh the neuron manually each time it has a learning experience. This is easy. Go to the Formulas tabbed menu at the top and look over at the far right. You’ll see a ‘calculate now’ button:
Press it! Each time you want to simulate the neuron experiencing a different food (or not in the case of learning from no food being present – as our [0,0] input accounts for), press this ‘calculate now’ button.
Time to play
- Try each of the different input values {1,2,3,4} in cell F8, clicking the ‘calculate now’ button until the neuron's error is zero.
Input 4, the ice cream, will probably take a few iterations to learn. Once it has learnt this, don’t forget to check the other inputs to make sure it hasn’t unlearnt these! This can happen since the same two weights are being used for all four different input patterns. Learning to accommodate one pattern can interfere with one that already exists.
For a very interesting article on why interference can be tolerated in certain kinds of memory (semantic)/certain brain areas (neocortex) and why it cannot in others (episodic)/(hippocampus), see:
Norman, K. A. & O’Reilly, R. C. (2003). Modeling Hippocampal and Neocortical Contributions to Recognition Memory: A Complementary-Learning-Systems Approach. Psychological Review, 110(4), 611–646.
Before we do the next bit, I want to make a small change which will make things easier. We are going to simulate the fact that different foods may not appear in order, but rather come randomly, or not at all. Click on the ‘Input No’ cell, F8. Go to ‘Insert Function’ and select RANDBETWEEN. In the window that pops up, put 1 in the Bottom box and 4 in the Top box. This means that the ‘Input No’ will now be a randomly generated number between 1 and 4 (inclusive). Click ok.
Doing this also means that you don’t have to bother changing the input number by hand all the time. Simply click ‘calculate now’ repeatedly and watch how things change!
Another important reason for doing this is that we don’t want our neuron to learn anything from the order of presentation. If an ice cream always comes after marmite, how do we know that our neuron is recognising ice cream, and not just saying “always eat whatever comes after marmite”? Randomising the order avoids this problem.
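Randomised training can be sketched in Python too. Same assumptions as before (0/1 inputs, threshold 1, learning rate 0.2, initial weights 0.2, firing when the sum reaches the threshold), with a seeded random generator playing the part of =RANDBETWEEN(1,4):

```python
import random

# Random food presentations until the neuron answers every pattern correctly.

PATTERNS = [            # (s1 whiteness, s2 sweetness, desired output)
    (0, 0, 0),          # nothing walking past
    (0, 1, 0),          # marmite
    (1, 0, 0),          # salt
    (1, 1, 1),          # ice cream: the only thing this neuron should eat
]

def fire(w, s, threshold=1.0):
    return 1 if w[0] * s[0] + w[1] * s[1] >= threshold else 0

def train(patterns, lr=0.2, w=(0.2, 0.2), presses=1000, seed=0):
    rng = random.Random(seed)   # stands in for =RANDBETWEEN(1,4)
    w = list(w)
    for _ in range(presses):    # each press = one click of 'calculate now'
        s1, s2, desired = rng.choice(patterns)
        error = desired - fire(w, (s1, s2))
        w[0] += lr * error * s1 # the delta rule
        w[1] += lr * error * s2
    return w

w = train(PATTERNS)
# Check that learning one pattern hasn't unlearnt the others:
print([fire(w, (s1, s2)) == d for s1, s2, d in PATTERNS])
```

After enough random presentations all four patterns are answered correctly, because the ice-cream task is one this single neuron can represent (as we'll see, that is not true of every task).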
- Try playing with different thresholds. We have always used a threshold of 1 so far. Change it to 2. What happens to the weights?
- Change it to 10. What happens?
- Change it to −5. What happens now? (You might find it useful to remove the random number generator here so that you can train the neuron on one input until it doesn’t make any errors.)
You might notice some strange behaviour occurring when the threshold is a negative number. For example, you may get an error when the first input (zero to both sensors, meaning there’s no animal walking past) is presented. Why? Stranger still, even though there is an error, the deltaWs take on a value of zero: the neuron really should be learning to compensate, but it isn’t. Why?
This is worth exploring a little bit. Let’s do this by comparing our artificial neuron to a natural neuron. Look at the following table:
So, bearing in mind that the threshold is the magnitude of the summed activation needed to cause a cell to fire, what does a negative threshold value mean?
Is it realistic to have a negative threshold?
In exploring the properties of the threshold, we have one final test to perform. We know that a positive threshold determines a minimum input activation that is necessary to cause a cell to fire. We know that a negative threshold causes the cell’s natural state of rest to change to one of activation, which is not so biologically plausible, and that this confuses the cell and leads to oscillations. What about if we have a threshold of 0? Try it and see.
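The negative-threshold puzzle can be pinned down in a few lines, under the usual sketch assumptions (0/1 inputs, firing when the weighted sum reaches the threshold, learning rate 0.2):

```python
# A negative threshold on the "nothing walking past" input: the summed
# activation is 0, and 0 >= -5, so the cell fires when it shouldn't.

def fire(w1, w2, s1, s2, threshold):
    return 1 if w1 * s1 + w2 * s2 >= threshold else 0

lr = 0.2
w1 = w2 = 0.2
s1 = s2 = 0                       # input 1: no food present, desired output 0

out = fire(w1, w2, s1, s2, -5)    # the resting cell fires anyway
error = 0 - out                   # desired 0, actual 1: error = -1
dw1 = lr * error * s1             # but the delta rule multiplies by the
dw2 = lr * error * s2             # (zero) inputs, so both deltaWs are zero
print(out, error, dw1 == 0 and dw2 == 0)
```

So the cell is in error, yet because both inputs are zero the delta rule can never adjust the weights to compensate; no weight change can ever fix a cell whose resting state is above threshold.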
Next let’s explore what happens when we have different starting values of the weights. Before doing this, bearing in mind that our newly built neuron has the ability to learn, what would you expect the result of changing the initial weights to be? Will there be any set of weights you could use that would cause the neuron to be unable to learn the correct behaviour?
In order to play with the initial weights, you need to set the numbers in the equations in cells B2 and D2 to what you want the initial weights to be. Do this now.
However, remember that we had a circular reference, and these numbers will only be used if there is no value in the new weights cells. So, in order to use these initial weights, we have to briefly ‘break’ the neuron. We can do this by setting the threshold (the only thing that isn’t relying on some other value) to something that is incalculable. Click cell C4 and input:
=n
A list of choices should come up.
Click NA and hit enter. This is simply saying that the value in this cell is Not Available, causing an error to appear. Click ‘calculate now’ a few times to make sure the error passes through the network to the new weights (although it should anyway). After that, set it (cell C4) back to a threshold of 1. You should now see the initial weights you selected appear in the two weights cells. Click 'calculate now' a few times until the neuron has learnt (until there is no more error or weight change). Play with different values and say what you find.
P.S. It’s probably a good idea to set the Input No back to a =RANDBETWEEN(1,4) to save time.
Another thing we can investigate is the success of neurons that choose different foods to eat. We can start with the "Fat" neuron. This guy eats anything that we put before him – marmite (low whiteness, but high sweetness), salt (high whiteness, but low sweetness) or ice cream (both white and sweet). Can you draw out a logic table for the Fat neuron?
By this point you should be pretty proficient at applying the changes and performing the tests in Excel, so I will leave you to it. Does the Fat neuron learn the desired behaviour?
The Sweet neuron wants to eat anything that is sweet. Remembering that sensor 2 detects sweet foods, we can see that this means the Sweet neuron fires when presented with marmite or ice cream, but doesn't fire when presented with nothing or with salt.
Modify the program and test this neuron. Does it learn the desired behaviour? 
We can also define a White neuron which wants to eat white food, ie: salt and ice cream. You can work out the logic table for this and test it. All being well, it should learn the desired behaviour just like the others did.
Last of all, let's define the Contrary neuron. This neuron wants to eat food which is either white or sweet, but not white and sweet. This means its desired output is to fire when it sees marmite or salt, but not when it sees nothing or ice cream.
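Rather than wait for the Excel experiments, we can preview all four neurons with the same training sketch as before (0/1 inputs, threshold 1, learning rate 0.2, initial weights 0.2, firing at or above threshold); the desired columns follow the descriptions of each neuron given above:

```python
import random

# Train each target behaviour with the delta rule and test whether the
# single neuron can learn it.

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]   # nothing, marmite, salt, ice cream

NEURONS = {
    "normal":   [0, 0, 0, 1],   # eat only ice cream
    "fat":      [0, 1, 1, 1],   # eat any food at all
    "sweet":    [0, 1, 0, 1],   # eat whatever is sweet
    "contrary": [0, 1, 1, 0],   # white or sweet, but not both
}

def train_and_test(desired, lr=0.2, threshold=1.0, presses=4000, seed=1):
    rng = random.Random(seed)
    w = [0.2, 0.2]
    data = list(zip(INPUTS, desired))
    for _ in range(presses):
        (s1, s2), d = rng.choice(data)
        out = 1 if w[0] * s1 + w[1] * s2 >= threshold else 0
        w[0] += lr * (d - out) * s1
        w[1] += lr * (d - out) * s2
    return all((1 if w[0] * s1 + w[1] * s2 >= threshold else 0) == d
               for (s1, s2), d in data)

for name, desired in NEURONS.items():
    print(name, "learns" if train_and_test(desired) else "fails")
```

Three of the four behaviours are learnt without trouble; one of them never settles, no matter how long you train. The conclusion below says which, and why.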
Conclusion
So, we have seen how a neuron works by summing its inputs and only firing if that sum is greater than its intrinsic threshold.
We've also seen that the inputs are weighted by the synapses. A neuron like the one we modelled with weights: w1 = 0.1 and w2 = 2.5 would be very sensitive to the second input, but very insensitive to the first input.
Finally, of the four neurons we modelled (the normal one that only eats ice cream, the Fat neuron, the Sweet neuron and the Contrary neuron), the Contrary neuron was the only one that was unable to learn the desired behaviour. Why was it unable? The answer is that the Contrary neuron is an example of a famous problem called the exclusive OR (XOR) problem. I'll give details of what this is and how it affects our neuron in the Formalising and Visualising page.
Download: single_neuron_with_learning.xlsx (11 kb)