Natural Machine Intelligence and Machine Consciousness - fantasy or near-future fact? How can we get there, and do we want to undertake the journey?

Friday, May 26, 2006

Technologies

Some of these blog entries will be about specific AI-related technologies. They are there to give a flavour, and in some cases to provide a bit of detail. For starters, let us have a look at a form of ‘artificial neural network’ which I have always considered to be something of a cheat.

n-tuple RAM networks

If you have a pattern recognition problem, one way you can train the computer to recognise your (let us say) images is to use the pixelated information to create addresses, and simply to store a 1 at the address given by the data in the image.  Confused? Well, I was when I first read it described like that, so here are some pictures to help.

0*  1   2
3   4   5*
6   7   8*

(The numbers 0-8 label the pixel positions, reading left to right, top to bottom; a * marks a pixel that is set in this image.)

If we treat the whole 3x3 image as giving us a bit-by-bit address, we can use the numbers in the boxes as bit positions within the address.  Note that this address will be 9 bits long (3x3 = 9), and if we take pixel 0 as the low-order bit, the address given by the above image is 100100001, or 289 in decimal.  The amount of memory required to store the information would, of course, be 2^9 or 512 bits.  For this one image, we would have one bit set, at address 289, and all others unset.
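To make that concrete, here is a minimal sketch of the idea in Python (the names, such as to_address, are my own, purely for illustration):

    # A 3x3 image as a flat list of 9 pixels, numbered 0-8,
    # reading left to right, top to bottom.
    def to_address(pixels):
        """Pack the pixels into one address, pixel 0 as the low-order bit."""
        address = 0
        for position, pixel in enumerate(pixels):
            address |= (pixel & 1) << position
        return address

    memory = [0] * 512           # one bit per possible address: 2^9 = 512

    image = [1, 0, 0,            # pixels 0, 5 and 8 are set
             0, 0, 1,
             0, 0, 1]

    memory[to_address(image)] = 1    # 'train': to_address(image) == 289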


If our image had some noise in it, we would not recognise it, which is not very intelligent.  For example:

0*  1   2
3   4   5*
6*  7   8*

(The same numbering as before; noise has flipped pixel 6 on.)

This gives a look-up address of 101100001, or 353 in decimal.  The bit at address 353 is unset, so this image is not recognised.  To train the network to recognise it, we set that bit to 1 too.
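Continuing the Python sketch from above:

    noisy = [1, 0, 0,            # noise has flipped pixel 6 on
             0, 0, 1,
             1, 0, 1]

    address = to_address(noisy)  # 0b101100001 == 353
    print(memory[address])       # prints 0 -- not recognised
    memory[address] = 1          # train on the noisy version too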


You might see that, at the moment, none of this is going to get us very far.  And, what is worse, as the number of pixels increases, the amount of memory needed grows far too rapidly, doubling with each extra pixel; in fact, a 256x256 image would need more than 2.0e+19728 bits.
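That figure is easy to check:

    import math

    pixels = 256 * 256                 # 65,536 pixels in the image
    print(pixels * math.log10(2))      # ~19728.3 decimal digits in 2^65536,
                                       # hence roughly 2.0e+19728 bits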


Now, the obvious thing to do, given the geometrically progressing memory requirement, is to break the image down into parts.  If, for instance, we split our 3x3 grid into 3 groups of 3 pixels, each group only requires 2^3 = 8 bits to code it, for a total of 24 bits instead of the 512 we were using before.  The downside is that we lose some of the relationships between pixels, but the memory saving is huge.  So huge, in fact, that we can afford to over-sample the pixels.  So let’s have a look at what happens if we split the grid into horizontal rows and vertical columns (yes, I know that those are the definitions of rows and columns, but some people always get confused).
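Here is one way such a recogniser might be sketched in Python; the class and its layout are my own invention for illustration, not any particular library's API:

    class NTupleRecogniser:
        """One small RAM (bit table) per group, or 'tuple', of pixel positions."""

        def __init__(self, tuples):
            self.tuples = tuples
            # Each group of n pixels gets its own table of 2^n bits.
            self.rams = [[0] * (2 ** len(t)) for t in tuples]

        def _sub_address(self, pixels, positions):
            # The first position in the group is the low-order bit.
            address = 0
            for bit, position in enumerate(positions):
                address |= (pixels[position] & 1) << bit
            return address

        def train(self, pixels):
            for ram, positions in zip(self.rams, self.tuples):
                ram[self._sub_address(pixels, positions)] = 1

        def score(self, pixels):
            """Count how many of the sub-recognisers fire for this image."""
            return sum(ram[self._sub_address(pixels, positions)]
                       for ram, positions in zip(self.rams, self.tuples))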


Now our first picture gives h1: 001, h2: 100, h3: 100, v1: 001, v2: 000, v3: 110 (reading each group with its first pixel as the low-order bit, written here most significant bit first).

If we show the second picture to this network, some, but not all, of the individual 3-tuple recognisers will return values of 1.  We will add up the number that do.

h1=1, h2=1, h3=0, v1=0, v2=1, v3=1 -> 4 out of 6 cats say they prefer… wait, 4 out of 6 recognisers think this is the same picture.


And if we want to train this network to recognise the new entrant, we will set the bit at address 5 (binary 101) in recogniser h3, and the bit at address 5 in recogniser v1.
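With the sketch above, the whole rows-and-columns example plays out like this:

    rows_and_cols = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # h1, h2, h3
                     (0, 3, 6), (1, 4, 7), (2, 5, 8)]   # v1, v2, v3

    net = NTupleRecogniser(rows_and_cols)
    net.train(image)           # the first picture

    print(net.score(noisy))    # prints 4 -- four of the six recognisers fire
    net.train(noisy)           # sets address 5 in the h3 and v1 tables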


Later, I will look at saturation and the choice of ‘n’ for the n-tuple, but hopefully this has given a flavour.  It looks like a cheat, and really doesn’t seem all that intelligent, but it is a very good engineering solution to pattern recognition.


1 Comment:

Blogger P@ said...

Just a quick note of thanks to Rory for pointing out that I had 'horizontal rows and horizontal columns' which, given my comment about people getting confused, was a rather embarrassing mistake!! (Cheers R.!)

10:32 pm

