Natural Machine Intelligence and Machine Consciousness - fantasy or near-future fact? How can we get there, and do we want to undertake the journey?
Site feed http://machine-intelligence.blogspot.com/atom.xml

Thursday, November 08, 2007

Operational Consciousness

(Cross-posted at RedGloo)

Operational Consciousness is a book I gladly recommend to anyone in the field of consciousness studies, and especially to anyone working on Machine Consciousness.

On the cover, it says:

"Consciousness is not a state of mind

It is a procedure

It is not an optional extra.

It is an indispensable and integral part of any system, biological or artificial, which has a high level of intelligence"

And I tend to agree. In the book, Noble argues that consciousness is not a property of the brain, or system, nor is it a 'thing', but rather it is what the brain-mechanism does. Thus it is a procedure, or a process, something which I remember having a heated debate about with one of my lecturers a couple of years ago. Some people, it turns out, have a problem with ascribing the label 'consciousness' to the dynamic process rather than to the 'hardware' or 'software' which enable that process to exist. Fortunately, perhaps, I do not share the problem, although it is a very useful mental exercise to try to describe the categorization of the instantiation in a meaningful way, using a language which appears to have had its origins in describing and talking about material objects.

Noble takes pains to explain the developments between the various mechanisms in his model in terms of pseudo-evolution. In other words, each step can be explained as having been possible to evolve, without actually having to show how, or, indeed, whether such evolution actually happened in that way.

The view taken is also Functionalist, in that it breaks down the proposed structures into functional units which could be realised using known computational methods and hardware. This is not to say that the endeavour would be easy, as one of the major hurdles is the bandwidth necessary for the sensory data the system should be able to handle. Additionally, the whole system must, perforce, build on existing experience to develop its internal models, and it is unreasonable to expect early versions to even be able to do this as quickly as, say, a human child would, let alone faster. It seems reasonable that any such system would, certainly during prototyping, take years to build up the body of experience necessary for its qualities to be fully recognised.

Of particular note, Noble uses pragmatism, summed up nicely in his wording "Adopting his [James'] use of the word 'expedient' I call this view of reality 'expedient reality'. Briefly, an object or circumstance can be said to 'exist' in the 'world', if its inclusion in our representation of that world, results in an improved ability to predict future experience". So whilst we can have representations of things which do not 'exist' we do not consider them to be real unless the internal model can use them to make predictions which are more accurate than the ones we make in their absence.
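To make that criterion a little more concrete for myself, here is a toy sketch in Python. It is entirely my own invention rather than anything from the book: a hypothesised object gets counted as 'real' only if adding it to the model improves the predictions of future experience, and the numbers at the bottom are made up purely for illustration.

def mean_squared_error(predictions, observations):
    return sum((p - o) ** 2 for p, o in zip(predictions, observations)) / len(observations)

def is_expediently_real(predictions_without, predictions_with, future_observations):
    # the hypothesised object 'exists' if including it in the model improves prediction
    return (mean_squared_error(predictions_with, future_observations)
            < mean_squared_error(predictions_without, future_observations))

# invented toy numbers: predicting temperatures with and without positing a cold front
without_front = [20.0, 20.0, 20.0]
with_front = [15.0, 14.0, 16.0]
observed = [15.5, 14.2, 15.8]
print(is_expediently_real(without_front, with_front, observed))   # True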

Noble's 5-layer model consists of five semi-autonomous layers, each building on the functionality of the earlier ones. This is similar to the concept of a subsumption architecture; I have put a rough code sketch of the arrangement after the layer descriptions below.

Layer 1 A stimulus-response automaton, providing sensory mechanisms. This also manipulates the data in simple ways, and provides for 'programmable' responses.

Layer 2 A memory layer, which can be used to anticipate a limited range of events, and which can pass data on to subsequent layers. This is similar, I believe, to our short term memory.

Layer 3 An abstraction layer, driven primarily by compression techniques. By recognising repeated patterns in Layer 2, and representing them in a 'concept store', the data volume is reduced, and some level of abstraction is produced. In my view, this is probably a layer which may have multiple self similar layers, each acting on the previous one, possibly still with inputs from Layer 2. This layer is described as constructing a predictive interpretation of events.

Layer 4 A layer which enables the system to hold a Theory of Mind for other animate objects it encounters and produces models of in the lower layers. This layer also provides some level of introspection, and maintains a model of 'selfmind'.

Layer 5 Language layer - built on its predecessors, this layer introduces a mechanism for language, and the idea of a meta-concept, necessary for the development of language in the system.
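To fix the arrangement in my own head, here is a very rough Python doodle of the five layers. The class names, the methods and all of the detail are mine rather than Noble's, and Layers 4 and 5 in particular are little more than stubs, so please treat it as a sketch of the general shape rather than of his model.

# Layer 1: a programmable stimulus-response automaton
class StimulusResponse:
    def __init__(self, rules):
        self.rules = rules                          # stimulus -> response table
    def respond(self, stimulus):
        return self.rules.get(stimulus)

# Layer 2: a short-term memory of recent events, available to the layers above
class Memory:
    def __init__(self, span=5):
        self.events, self.span = [], span
    def remember(self, event):
        self.events = (self.events + [event])[-self.span:]
        return list(self.events)

# Layer 3: abstraction by compression; repeated patterns in memory become concepts
class Abstraction:
    def __init__(self):
        self.concept_store = {}
    def abstract(self, recent_events):
        pattern = tuple(recent_events)
        self.concept_store[pattern] = self.concept_store.get(pattern, 0) + 1
        # a pattern seen more than once is promoted to a concept
        return pattern if self.concept_store[pattern] > 1 else None

# Layer 4: models of other minds, plus a model of 'selfmind' (stub only)
class TheoryOfMind:
    def __init__(self):
        self.models = {'self': {}}                  # one model per animate object
    def update(self, agent_name, concept):
        self.models.setdefault(agent_name, {})[concept] = True

# Layer 5: language, built on the concepts and mind-models beneath it (stub only)
class Language:
    def label(self, concept):
        return 'concept:' + str(concept)

In use, stimuli would flow in at Layer 1 and each layer would feed the one above, much as in a subsumption architecture.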

The rest of the book goes into some detail, explaining the concepts and exploring what they mean, with accounts of how they could have evolved using pseudo-evolution. It provides a detailed analysis, and, frankly, you should buy the book and read it. Whilst it is not always the easiest read, it is certainly easier than many texts and puts its points well.

I particularly like a comment near the end. When discussing how various mechanisms work, and the way that the interpretation procedure has to process large amounts of data to spot concepts in the reality stream provided by the senses, Noble cautions the reader:

"And remember this - you are not sitting there watching your brain doing this from some stance on the side-lines. This operation is you. You are that mechanism"

Noble, H. (2005) Operational Consciousness, Tartan Hen Publications, Argyll.

Saturday, September 22, 2007

Updates

I have been very bad at updating this blog, to the extent that I imagine nobody who read it before will care very much whether I update it now!

I have been posting some thoughts and background on Machine Intelligence and on areas of psychology, social science, philosophy and various other things over at http://parslow.eu - under the Pat's Research links, and similar.

I have been particularly involved in working on folksonomies and thus identity lately, and as a consequence I have been drawn back to the idea of artificial consciousness being, at least potentially, feasible. My latest posts on identity are at http://parslow.eu/identity.aspx

Enjoy - and feel free to post comments on anything over there, over here.


Tuesday, June 06, 2006

Interesting...

To me at least; I just noticed an ad for psychology courses on this page… I thought the preponderance of computer-related postings would probably drown out the consciousness-related stuff so far – but I was wrong!

 

No time to post today really, but this is just to mention (and to remind myself to do it) that I am planning to outline the system map of brain functions which I believe is necessary for the potential of emergent consciousness.  Nobody seems to want to fund any serious research into it, so I may as well make it public domain by posting it up here!  It might get a copyright notice on it though… we will have to see.

Sunday, June 04, 2006

Stochastic Diffusion Search

Searching for things is a time-consuming problem.  It is why, for instance, we file things in an office environment, and keep things in a nice(?) structured order in libraries.  Because we impose a structure on the data held, it is easier to find what we need, when we need it.

 

The problem is, not everything is neatly ordered, and indeed, with some things there are just so many possibilities that ordering them would be infeasible (and, potentially, impossible).  So we need to have ways of searching.

 

Essentially, all machine learning systems are types of search engine; it is just that they are generally searching for optimal solutions which are defined by a set of examples which do not necessarily cover the entire problem space.  It is the features of the examples which normally give the system the chance to spot where solutions lie – and to do so they effectively make a series of refinements to an initial guess.  By working out how far the guess is wrong, they move towards solutions with lower errors.
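As a tiny illustration of that 'refine the guess by reducing its error' loop, here is a one-dimensional toy in Python; the error function is invented purely for the example.

def refine(initial_guess, error, step=0.1, iterations=100):
    guess = initial_guess
    for _ in range(iterations):
        # try a small move either way and keep whichever candidate has the lowest error
        guess = min([guess - step, guess, guess + step], key=error)
    return guess

# e.g. home in on the value 3 under a squared-error measure
print(refine(0.0, lambda x: (x - 3.0) ** 2))   # settles on (roughly) 3.0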

 

But it is not always this way, and it is not always the best way to approach the problem.  Some problem spaces are convoluted affairs, with multiple local minima in the error-space.  The feature space can be relatively disjoint, with solutions only appearing in certain areas.  The only thing which really shows where a solution may be is whether there are features in the problem space which can be associated with a solution.

 

For instance, if our problem space were to be a nice simple search in nice simple text, we could set up a nice simple string to examine:

 

ALLFOODSAREGOODANDTASTEYIFYOULOOKHUNGRYENOUGH

 

I never said it had to make sense.  Now, if we look for a particular word, say “LOOK”, then we can obviously search through from the beginning and find that it is there.  If we use such a naïve method, we would see an A at position 0 (computer scientists tend to number things from 0, sorry) and so we can skip it.

So let’s look at position 1 – ah ha! An L, which matches our first character.  So now we can look at the second one (position 2)… we are looking for an O and we have another L.  Sadness for our search engine.

 

But, we can carry on looking, and we will, eventually, find LOOK right up there at position 29.
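For completeness, the naive left-to-right search looks something like this in Python (the function name is just my own for the example):

def naive_search(text, target):
    for start in range(len(text) - len(target) + 1):
        if text[start:start + len(target)] == target:
            return start          # first match, counting from 0
    return -1                     # no match anywhere

print(naive_search("ALLFOODSAREGOODANDTASTEYIFYOULOOKHUNGRYENOUGH", "LOOK"))   # 29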

 

Stochastic Diffusion Search (first described by Dr. Mark Bishop) takes a different slant on things.  If you imagine that that string was really, really big, starting at the beginning is probably not going to be a good move.  So SDS uses a population of ‘agents’ to look at local features within the search space.  Because this is a small string, we will just use 3 (which is crazy talk in terms of the real technique, but it will suffice for now).

 

We will initialise the search with agents at positions 7, 23, and 36.

Each agent starts out inactive, but knows that it is looking for the feature “LOOK”.  It picks one element of its search feature at random, and looks that far away from where it is to see if there is a match.

Ours will pick 0, 2 and 1 (picked entirely arbitrarily by a wetware pseudo random number generator called my mind)

 

So, Agent A looks at string location 7+0, which is an S, and compares it to the 0th element of the feature array, which is L.  No match.

Agent B looks at string location 23+2, which is an F, and compares it to the 2nd element of the feature array, which is O.  No match.

Agent C looks at string location 36+1, which is an R, and compares it with the 1st element, which is also an O.  Still no match.

 

Now, any agent which has failed to match asks other members of the population for a hint.  As none of them have matched, none of them are feeling particularly helpful, and in this case each of them picks a new spot to squat.  Let us assume they pick 2, 12 and 28, and that they will have offsets into the search string of 2, 1 and 2 (OK, so that time they were a lot less than arbitrary, but we don’t want to be here all day, typing is not as fast as processing)

 

So Agent A now checks position 4, which is an O and matches, well, an O.

Agent B checks position 13, which is an O and matches another O

Agent C checks position 30 and also matches an O.

 

This time, they are all ‘successful’ and all of them become active.

You may be thinking “But two of them are looking in entirely the wrong place, and the third one isn’t much better”, and you would be right.  However, they are not done yet.  Because they are looking through the space in a fairly random way, the system is not going to take the first partial match they acquire as gospel, and they must carry on until enough of them agree for long enough that the features they have found are the right ones.

 

So, turn 3… this time, because all of them are active, they don’t jump around like jumping beans, but sit quietly and take a look around.  They choose to look at offsets 0, 2 and 2.

 

A : Looking for an L, found an L.

B : Looking for O, found a D – becomes inactive.

C : Looking for an O and found one.

 

Now this time, B knows it is in the wrong place, and asks one of its comrades at random for a hint.  I’ll toss a coin… C.  So B now jumps to the same position C is in, to continue its search next time. (Only we know that this is a bit of a dead end, but Shhh!! Don’t tell the agents!).  A will look at pos 3, B at 2 and C at 0.

 

A: looking for a K, found an O – becomes inactive

B: looking for an O, found an O

C: looking for an L, but found a U – becomes inactive.

 

Now the system has a bit of a problem in a silly case like this one, because now all of the agents will jump to B’s location (28).  They only have a 1 in 4 chance of being active, but as long as one of them is, they will stay in place.  Whilst they have not actually found the right spot, they are not far off.

 

If you have many, many agents, there is obviously a better chance that one will land on the right spot, and of course, once there it will always be active.  If others find good but not perfect spots, such as the G of GOOD, they will have a 50% chance of staying active each turn.  It should be fairly obvious that the agents will tend to cluster around the right areas.

 

And it really does work – it is a surprisingly fast search algorithm which also happens to be naturally very well suited to parallel processing.  The more processors you can chuck at this baby, the faster, and more reliably, it will find the solutions.
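For anyone who prefers code to prose, here is a minimal Python sketch of the loop described above.  It is written from my reading of the technique rather than lifted from Bishop's formulation, so the details (the agent count, the recruitment rule, the stopping condition) should be taken as an approximation rather than the canonical algorithm.

import random

def sds_search(text, target, n_agents=20, iterations=200):
    limit = len(text) - len(target) + 1
    positions = [random.randrange(limit) for _ in range(n_agents)]
    active = [False] * n_agents
    for _ in range(iterations):
        # test phase: each agent checks one randomly chosen element of the target
        # at an offset from its current hypothesis
        for i, pos in enumerate(positions):
            k = random.randrange(len(target))
            active[i] = (text[pos + k] == target[k])
        # diffusion phase: inactive agents ask a random agent for a hint; if that
        # agent is active they copy its hypothesis, otherwise they pick a fresh spot
        for i in range(n_agents):
            if not active[i]:
                j = random.randrange(n_agents)
                positions[i] = positions[j] if active[j] else random.randrange(limit)
    # report the hypothesis the largest cluster of agents has settled on
    return max(set(positions), key=positions.count)

print(sds_search("ALLFOODSAREGOODANDTASTEYIFYOULOOKHUNGRYENOUGH", "LOOK"))   # usually 29

Run on the toy string above, it settles, nearly every time, on position 29, which is where LOOK actually starts.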

 

I suspect (but confess I haven’t checked yet) that there are some possible optimisations to this basic framework – these may have already been described in the literature.  The most obvious three, to me, are:

  Allow the local search to work in both directions from the chosen location.  Negative offsets will look backwards through feature space, and a 0 offset will have to check for both the 0th element and the last element.

  When an agent gets its new location from an active agent, allow it to land nearby if there are already a certain number of agents in one spot.  If the search space had a lot of repeated elements (say “aaaaaaaaaah”) and the required string also had a similarly high frequency of a particular element (“hahahahahahahahahah”), then there is an unpleasantly high chance of getting false matches.  Allowing agents to miss by a fraction would, I believe, help reduce the risk of this; there is a small sketch of the idea after this list.

  If you have a certain number of agents in one place, remaining active for a reasonable number of turns, check if they are really on the solution.  Of course, if an exact match doesn’t appear in the search space you don’t want to lose the close match you have found, as it may be the best you can do, but you probably also want to make sure some agents are still out there exploring (actually, I seem to remember something similar to this in the method… I really should check!)
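The second of those might look something like this; it is pure speculation on my part, not something I have checked against the SDS literature, and the parameter names are my own.

import random

def recruit_with_jitter(positions, donor_index, crowd_limit=3, jitter=2, max_position=None):
    # copy the donor agent's hypothesis, nudging it sideways if that spot is already crowded
    hypothesis = positions[donor_index]
    if positions.count(hypothesis) >= crowd_limit:
        hypothesis += random.randint(-jitter, jitter)
    if max_position is not None:
        hypothesis = max(0, min(hypothesis, max_position))
    return hypothesis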

 

Anyway, that is a brief introduction, by means of an example which fails to find the solution in the number of iterations I am prepared to type out, of Stochastic Diffusion Search.  It is certainly something I am exploring further at the moment.

Thursday, June 01, 2006

COGRIC

I am very excited to be participating in the International Workshop on Cognitive Robotics, Intelligence and Control, 2006 in August this year. I will have to submit a one-page summary of my research interests, which means I will have to be rather succinct.


It looks as though some excellent speakers are going to be there. I have done a minimal amount of background research on them (i.e. done a Google search on their names) and here is a summary:


Bernard Baars


Senior Fellow in Theoretical Neurobiology, The Neurosciences Institute, San Diego, California


Interests: The psychology and brain basis of conscious experience; its ethical implications for human and animal welfare; consciousness in animals; consciousness in the history of psychology; the scientific problem of volition; psychodynamics; conscious aspects of emotion; bioethics.


Ahhh yes, consciousness. This sounds like a man I could enjoy long conversations with, given the opportunity.


Kerstin Dautenhahn


Professor of Artificial Intelligence, University of Hertfordshire, UK


Research goals include the "investigation of social intelligence and individual interactions in groups of autonomous agents, including humans and other animals, software agents and robots", and her research portfolio includes e-learning.


Social intelligence and interactions in groups of autonomous agents? Yep, that sounds like my sort of stuff. Although I am more interested in the effects on the individuals of the emergent behaviours the group exhibits, I think. However, also a strong interest in e-learning? Excellent news, as I not only have an interest there but am considering moving more in that direction for the next phase of my ongoing education.


Ferdinando Mussa-Ivaldi


Professor of Physiology, Department of Physiology, Feinberg School of Medicine, Northwestern University, Chicago, USA


Developmental learning used as a paradigm for the development of robotic systems for rehabilitative work with stroke victims and others.


Who could fail to be engaged by this subject area? Not only can it have a direct beneficial effect on people suffering the after-effects of a debilitating ailment, but the underlying theories support, I believe, the core of the development of consciousness.


J. Kevin O'Regan


Directeur de Recherche I, Centre National de la Recherche Scientifique, Institut de Psychologie, Centre Universitaire de Boulogne 71


Interests include such things as change blindness, which implies a certain degree of sparsity in the information held in the brain.


Now, I would have to agree with that. Not only does it make sense - why hold a complete model of what is going on in the big wide world when you can look at it to see what is there - but it also explains to some extent the phenomenon I often observe where people totally fail to notice things they don’t expect to see. They also totally fail to notice things if their attention is directed elsewhere - a trick useful to stage magicians and conmen alike. This, of course, has interesting repercussions in relation to computer systems. Why should they strive to perform image recognition, for example, using every pixel available in a camera shot, when we clearly don’t bother? And we benefit from a huge degree of parallel processing, which is something most run-of-the-mill computers won’t be getting for some time yet.


Rolf Pfeifer


Full Professor of Computer Science; Director, Artificial Intelligence Laboratory, Department of Informatics, Faculty of Mathematics and Science and Faculty of Economics, Business Administration and Information Technology, University of Zurich


Not so sure of what Prof. Pfeifer does, but it looks like it is related to biologically inspired robotics and how the proprioceptors in the body might form part of the mind. Apologies if this isn’t one of your interests Prof., but if it is, I will be very happy to hear about it if I get the chance. It seems natural to me that the whole of the nervous system, and by extension, the system it monitors, make up the mind, even if it is centred in the brain. Of course, I might then be tempted to follow the line of reasoning that says our immediate environment is so tied up with our sensory perceptions, and is so readily manipulated by us, that it should be included in our concept of mind too. And that would have to extend, I might argue, to the other living entities around us.


Andy Schwartz


Professor of Neurobiology, School of Medicine, University of Pittsburgh


Neural processes supporting volitional movements. Relationship of cellular behaviour to limb movement.


Studies showing single cell prediction of arm movement. Impressive stuff, and quite possibly a step on the road to finding the elusive minimal substrate of consciousness, if such a thing exists. This is particularly interesting, in view of the timings involved, and it would be fascinating to find out more about what neuronal activity is required to support the populations of cells in the frontal cortex which have been studied in this work. Practically, it may well lead to systems to overcome paralysis or to drive prostheses, although I don’t like the sound of the work that has been done on monkeys, to be honest.


Olaf Sporns


Associate Professor of Psychology, Department of Psychological and Brain Sciences, Programs in Neuroscience and Cognitive Science, Indiana University


Looking at, amongst other things, mapping the connections in the brain. I was under the impression that the level of difference between individuals was quite large, which could make this a daunting project, but it would certainly be useful. The relationship between emergent behaviour and neural pathways appears to be a recurring theme here, and again I say, who could fail to be fascinated by this subject?


My main regret, at the moment, is that I will have to find some way of shoe-horning in some time to read through whatever papers published by this worthy group I can get my hands on. I feel I would be letting the side down if I hadn’t got a reasonable grasp of their work before August. Shame about the small issue of it not gelling very well with the research I am doing at the moment. Obviously I will have to find creative ways of lengthening the day. That, or waste less of it, I suppose.


Wednesday, May 31, 2006

Peltarion Synapse

This is not an incitement to click on the ads, but I happened to have a look at what was being advertised on my page (seems only sensible, really, and I was careful to do it in such a way as to not break my agreement not to click on them with Google) and one of them was for a product called Synapse from Peltarion.

 

I had never heard of them before, so I took a quick look.  It is a nice piece of software, although annoyingly similar to something I have been meaning to make for some time.  Essentially it provides a toolkit of neural networks and analysis tools, coupled with some wizards to guide you through importing your data and designing the classifier/time series predictor/function modeller.  It can take a little while to load up on my machine – 30 seconds or so just now, with precious little else using disk or processor (but with 760MB of page file in use… which slows things down even with 1GB around… but Synapse only uses another 20MB by the time it has the splash screen up).   This is probably due to its highly modular design, with almost everything being implemented in separate dll files.  In fact, most of the filters/neural network components etc. have 2 dll files – one for the gubbins (I assume) and one for the GUI (judging by the names).

 

The splash screen has an annoying ‘always on top’ness about it, but it is bright and friendly and gives you the option to switch it off.  The main software is probably much easier to use with practice, but I must admit given the 14 day trial period, I would imagine I would stick to using the wizards in order to get anything done.

The only real problem I ran into with it while messing about (that should read ‘testing’) was that if I set the option to use a genetic algorithm to optimise the work flow parameters, it would sometimes de-optimise them, leaving the analysis in a sad and sorry state.

 

Trying to set some of the components up manually (they have a nice drag ‘n’ drop functionality) left me rather fatigued at one point, as every time I tried to change the number of outputs of one layer of an MLP in it, something automatically adjusted it back to fit the next component it was connected to.  I think (it was late at night!) that I discovered that if you work backwards through the work flow, it was more likely to listen to your desires and not fix it for you, than if you work forwards.

 

Lots of good stuff implemented in there though – Kalman filters, Fuzzy logic, Naïve Bayes, SVMs, wavelets, Hebbians, RBFs, Self Organising Maps … and the design of the system means that it will be simple for them to update it.  In fact, the second time I ran it, it automatically updated the Hebbian component and a couple of other things.  That worries me slightly, because it might mean you have non-reproducible results, but on the other hand it should mean you have the latest and greatest facilities at any time, as long as you are connected to the interweb thingy.

 

Downside? The asking price.  Ouch.  Don’t think I will be buying that any time soon, although, maybe, just maybe, I could twist my supervisor’s arm and get it for the project, as it essentially implements a large chunk of the stuff I am half way through implementing.  Not quite as satisfying as a roll your own though, is it?

Tuesday, May 30, 2006

Learning Classifier Systems

LCS

 

These are an interesting class of beastie.  They are not connectionist, so don’t earn my full favour (!), but they certainly live up to their name, for low numbers of discrete dimensions in the problem space, at least.  They use populations of classifiers which associate a pattern in the observable environment with an action that can be performed (or, for more complex systems, actions which are re-processed by the system).  An element of the match-expression for the classifier can be specific, or an “I don’t know” character – normally expressed in the literature as #.  I would have used ? personally, but hey.

In each cycle, stimuli from the environment are encoded and put on the Message List.  Those classifiers whose pattern component matches with messages are selected and referred to as the Match Set (although I don’t think classifier systems are particularly known for playing tennis…).  The classifiers in the Match Set compete with one another for the right to post their messages to the Message List, and if they succeed, they are placed in the Action Set.  There is a club membership fee to pay, though, and members of the Action Set provide a pay-off to any classifiers responsible for posting the messages that they responded to, in order to get into this elite club.

At this stage, the existing message list is cleared, and then the Action Set members do their thing, posting up the message, or action, part.  If these messages are valid instructions to the system’s actuators, then they are acted on, and feedback from the environment is elicited (i.e. the system interacts with the world and good stuff, bad stuff, neutral stuff or a mixture of all three, happens).  If the environment has rewarded the system, the system then rewards its classifiers, thus making them stronger (and repaying some of that investment they made to get there – obviously only the best will be making a profit).  And then the cycle starts again.

In order to prevent the system stagnating, the population occasionally undergoes a culling and breeding from its fittest members.  Simple crossovers normally seem to suffice in the genetic algorithm, as another technique is used to add variety to the population.  If the Match Set does not have at least one member of the population matching each of the stimulus messages, a process called covering is invoked.  Basically this is the big cheat (OK, so I like systems to be able to fail, and not be brute forced, shoot me) – if none of the population matched a message, a new member is added to the population which does match it.

This basic idea was the brainchild of J. H. Holland, and a number of people have worked in the field since.  (I would cite a paper here, but as I don’t have a copy to hand, and various sources I do have to hand cite papers from between 1976 and 1989 as definitive, I shall leave that task to you, dear reader.  If someone wants to post a comment pointing out which paper is the best one, I would be most grateful).

One thing I particularly like about this variety of learning classifier system is its second-order nature.  The messages from the action set are left in the message list and the stimuli from the environment are added to them – voila, the system has a simple memory.  This makes it a relative of production systems, which I will try and write something on at some point in the not too distant future.  The bucket-brigade reward scheme – where rewards are passed on down the line to classifiers which helped others get to the all important Action Set – is a useful learning mechanism which I feel should have a wider application.  It is the polite, cooperative part of a functioning economy which is missed by the capitalist-driven mindset these days (oops, showing a bit of political bias there!).  Of course, as described here, the scheme is still elitist, because it is only those who have made it to the Action Set who will have posted messages and will be eligible for the reward scheme.
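For the sake of making that cycle concrete, here is a condensed Python sketch of a single pass: match set, covering, a winning bidder posting its action, and a reward from the environment.  The bucket brigade and the genetic algorithm are left out to keep it short, and the names and structure are mine rather than Holland's, so take it as an illustration rather than a faithful LCS.

import random

class Classifier:
    def __init__(self, condition, action, strength=10.0):
        self.condition = condition        # e.g. '1#0', where '#' matches anything
        self.action = action
        self.strength = strength
    def matches(self, message):
        return all(c == '#' or c == m for c, m in zip(self.condition, message))

def step(population, message, reward_fn, bid_fraction=0.1):
    # Match Set: classifiers whose condition matches the incoming message
    match_set = [cl for cl in population if cl.matches(message)]
    if not match_set:
        # covering: invent a matching classifier if nothing in the population matched
        new = Classifier(message, random.choice(['0', '1']))
        population.append(new)
        match_set = [new]
    # Action Set (of one, here): the strongest bidder pays its fee and posts its action
    winner = max(match_set, key=lambda cl: cl.strength)
    winner.strength -= bid_fraction * winner.strength
    # the environment pays off, and the reward strengthens the classifier that acted
    winner.strength += reward_fn(winner.action)
    return winner.action

# toy environment: action '1' is rewarded whenever the message starts with '1'
population = [Classifier('1##', '1'), Classifier('0##', '0'), Classifier('###', '0')]
for _ in range(20):
    message = random.choice(['101', '011', '110'])
    step(population, message, lambda a, m=message: 5.0 if a == '1' and m[0] == '1' else 0.0)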

Enough for now.  I will try and post some pictures in the next issue…