Natural Machine Intelligence and Machine Consciousness - fantasy or near-future fact? How can we get there, and do we want to undertake the journey?

Tuesday, May 30, 2006

Learning Classifier Systems

LCS

 

These are an interesting class of beastie.  They are not connectionist, so don’t earn my full favour (!), but they certainly live up to their name, at least for problem spaces with a low number of discrete dimensions.  They use populations of classifiers, each of which associates a pattern in the observable environment with an action that can be performed (or, for more complex systems, with actions which are re-processed by the system).  Each element of a classifier’s match-expression can be either a specific value or an “I don’t know” character (in effect a “don’t care” wildcard), normally expressed in the literature as #.  I would have used ? personally, but hey.
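To make the matching rule concrete, here is a minimal sketch in Python.  The `Classifier` class, its field names and the starting strength of 10.0 are my own illustrative assumptions, not anything taken from the literature; only the # wildcard is standard.

```python
from dataclasses import dataclass

WILDCARD = "#"  # the "I don't know" character


@dataclass
class Classifier:
    condition: str          # pattern to look for, e.g. "1#0"
    action: str             # message posted if the classifier fires
    strength: float = 10.0  # assumed starting strength (hypothetical value)

    def matches(self, message: str) -> bool:
        """True if every position is either a wildcard or an exact match."""
        return len(self.condition) == len(message) and all(
            c == WILDCARD or c == m for c, m in zip(self.condition, message)
        )


rule = Classifier(condition="1#0", action="01")
print(rule.matches("110"))  # True
print(rule.matches("100"))  # True
print(rule.matches("111"))  # False
```

The more # characters a condition contains, the more general the classifier is: "###" would match any three-character message.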

In each cycle, stimuli from the environment are encoded and put on the Message List.  Those classifiers whose pattern component matches a message are selected and referred to as the Match Set (although I don’t think classifier systems are particularly known for playing tennis…).  The classifiers in the Match Set compete with one another for the right to post their messages to the Message List, and if they succeed, they are placed in the Action Set.  There is a club membership fee to pay, though: to get into this elite club, members of the Action Set provide a pay-off to any classifiers responsible for posting the messages that they responded to.
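The competition and the membership fee can be sketched roughly as follows.  I have not said how the competition is decided, so this sketch assumes the strongest few matching classifiers win and that each winner pays a fixed fraction of its strength; `BID_FRACTION`, `max_winners` and the dictionary layout are all hypothetical choices made for illustration.

```python
WILDCARD = "#"
BID_FRACTION = 0.1  # assumed share of strength paid as the "membership fee"


def matches(condition, message):
    return len(condition) == len(message) and all(
        c == WILDCARD or c == m for c, m in zip(condition, message)
    )


def run_auction(population, message_list, max_winners=2):
    """message_list holds (text, poster) pairs; poster is None for stimuli."""
    # Match Set: classifiers matching at least one message, with the posters
    # of the messages they matched.
    match_set = []
    for clf in population:
        posters = [p for text, p in message_list if matches(clf["condition"], text)]
        if posters:
            match_set.append((clf, posters))

    # Assumed competition rule: the strongest matching classifiers win.
    match_set.sort(key=lambda pair: pair[0]["strength"], reverse=True)

    action_set = []
    for clf, posters in match_set[:max_winners]:
        bid = BID_FRACTION * clf["strength"]
        clf["strength"] -= bid
        suppliers = [p for p in posters if p is not None]
        for supplier in suppliers:
            # Pay the classifiers whose messages let this one in.
            supplier["strength"] += bid / len(suppliers)
        action_set.append(clf)
    return action_set


a = {"condition": "1#", "action": "00", "strength": 20.0}
b = {"condition": "00", "action": "11", "strength": 10.0}
c = {"condition": "0#", "action": "01", "strength": 5.0}

# "00" was posted by a in the previous cycle; "10" is a fresh stimulus.
winners = run_auction([a, b, c], [("00", a), ("10", None)])
print([w["action"] for w in winners])               # ['00', '11']
print(a["strength"], b["strength"], c["strength"])  # 19.0 9.0 5.0
```

Here a pays a fee of 2.0 that goes to nobody (its message came from the environment), while b pays 1.0 to a, which posted the message b responded to.  Classifier c matched too, but lost the competition and so neither pays nor posts.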

At this stage, the existing Message List is cleared, and then the Action Set members do their thing, posting their message, or action, part.  If these messages are valid instructions to the system’s actuators, then they are acted on, and feedback from the environment is elicited (i.e. the system interacts with the world and good stuff, bad stuff, neutral stuff or a mixture of all three happens).  If the environment has rewarded the system, the system then rewards its classifiers, thus making them stronger (and repaying some of that investment they made to get there – obviously only the best will be making a profit).  And then the cycle starts again.
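A rough sketch of this posting-and-reward step is below.  How the reward is divided is an assumption on my part: here it is simply split equally among all members of the Action Set, and `is_valid_action` and `environment` are hypothetical stand-ins for the actuator check and the outside world.

```python
def act_and_reward(action_set, is_valid_action, environment):
    # The old Message List is discarded; only the winners' messages remain.
    message_list = [(clf["action"], clf) for clf in action_set]

    # Valid actuator instructions are carried out and the feedback is summed.
    reward = sum(
        environment(text) for text, _ in message_list if is_valid_action(text)
    )

    # Assumed rule: a positive reward is shared equally by the Action Set.
    if reward > 0 and action_set:
        share = reward / len(action_set)
        for clf in action_set:
            clf["strength"] += share
    return message_list, reward


a = {"condition": "1#", "action": "00", "strength": 18.0}
b = {"condition": "00", "action": "11", "strength": 9.0}
valid = {"11"}          # pretend only "11" drives an actuator
payoff = {"11": 6.0}    # pretend the world pays 6.0 for that action

messages, reward = act_and_reward([a, b], valid.__contains__, payoff.get)
print(reward, a["strength"], b["strength"])  # 6.0 21.0 12.0
```

The returned `messages` list is what the next cycle starts from, before new stimuli are added to it.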

In order to prevent the system from stagnating, the population is occasionally culled and then bred from its fittest members.  Simple crossovers normally seem to suffice in the genetic algorithm, as another technique is used to add variety to the population.  If the Match Set does not have at least one member of the population matching each of the stimulus messages, a process called covering is invoked.  Basically this is the big cheat (OK, so I like systems to be able to fail, and not be brute forced, shoot me) – if none of the population matched a message, a new member is added to the population which does match it.
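Both covering and a simple one-point crossover fit in a few lines.  This is only a sketch: the wildcard probability, the initial strength and the random choice of action for a covering classifier are my assumptions, and the function names are hypothetical.

```python
import random

WILDCARD = "#"


def matches(condition, message):
    return len(condition) == len(message) and all(
        c == WILDCARD or c == m for c, m in zip(condition, message)
    )


def cover(population, stimuli, actions, rng, wildcard_prob=0.3, strength=10.0):
    """Add a matching classifier for every stimulus that nothing matches."""
    for message in stimuli:
        if any(matches(clf["condition"], message) for clf in population):
            continue
        # Copy the message, generalising some positions to wildcards, so the
        # new condition is guaranteed to match it.
        condition = "".join(
            WILDCARD if rng.random() < wildcard_prob else ch for ch in message
        )
        population.append(
            {"condition": condition, "action": rng.choice(actions), "strength": strength}
        )
    return population


def crossover(parent_a, parent_b, rng):
    """One-point crossover of two equal-length strings (length >= 2)."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]


rng = random.Random(42)
population = cover([], ["101", "010"], actions=["0", "1"], rng=rng)
print(population)
print(crossover("1#0#", "0011", rng))
```

Note that covering only runs for messages nobody matched, so calling it again with the same stimuli leaves the population unchanged.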

This basic idea was the brainchild of J. H. Holland, and a number of people have worked in the field since.  (I would cite a paper here, but as I don’t have a copy to hand, and various sources I do have to hand cite papers from between 1976 and 1989 as definitive, I shall leave that task to you, dear reader.  If someone wants to post a comment pointing out which paper is the best one, I would be most grateful).

One thing I particularly like about this variety of learning classifier system is the second-order nature of it.  The messages from the Action Set are left in the Message List and the stimuli from the environment are added to them – voilà, the system has a simple memory.  This makes it a relative of production systems, which I will try to write something on at some point in the not too distant future.  The bucket-brigade reward scheme, where rewards are passed on down the line to classifiers which helped others get to the all-important Action Set, is a useful learning mechanism which I feel should have a wider application.  It is the polite, cooperative part of a functioning economy which is missed by the capitalist-driven mindset these days (oops, showing a bit of political bias there!).  Of course, as described here, the scheme is still elitist, because it is only those who have made it to the Action Set who will have posted messages and will be eligible for the reward scheme.
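The memory effect is easy to see in a stripped-down sketch.  To keep it short I have assumed that every matching rule fires (no competition, no strengths), and the two rules are made up for the purpose.

```python
WILDCARD = "#"


def matches(condition, message):
    return len(condition) == len(message) and all(
        c == WILDCARD or c == m for c, m in zip(condition, message)
    )


def step(rules, carried_over, stimuli):
    """One simplified cycle: last cycle's messages plus the new stimuli."""
    message_list = carried_over + stimuli
    return [
        action
        for condition, action in rules
        if any(matches(condition, m) for m in message_list)
    ]


rules = [
    ("1#", "00"),  # reacts to a stimulus starting with 1
    ("00", "11"),  # reacts only to the internal message "00"
]

first = step(rules, carried_over=[], stimuli=["10"])
second = step(rules, carried_over=first, stimuli=["01"])
print(first)   # ['00']
print(second)  # ['11']
```

The second rule fires in the second cycle even though the stimulus that caused it ("10") has gone: the message "00" left over from the first cycle is the system remembering it.  Under the bucket brigade, the second rule would then pay part of its strength back to the first.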

Enough for now.  I will try to post some pictures in the next issue…

           
