Olivier Georgeon's research blog—also known as the story of little Ernest, the developmental agent.

Keywords: situated cognition, constructivist learning, intrinsic motivation, bottom-up self-programming, individuation, theory of enaction, developmental learning, artificial sense-making, biologically inspired cognitive architectures, agnostic agents (without ontological assumptions about the environment).

Saturday, October 25, 2008

Poor Ernest 3.0

So far, Ernest has had environments with no own state, thus the right action for Ernest to choose did not depend on previous actions. But what if Ernest has to do a previous action in order to get a situation where he has the possibility to get a Y?

In this new environment, called "aA..bB..aA..aA", Ernest has to do two consecutive A or two consecutive B to get a Y. Further same consecutive actions will not lead to Y anymore. So, Ernest will only get a Y if he does A when he has previously done B then A, or if he does B when he has previously done A then B.

This trace shows Ernest 3.0 in this new environment. For better understanding, new labels have been used: "Learn new Schema" means Ernest memorizes a new schema (previously called "Mermorize schema"). "Recall Schema" means Ernest recalls a previously-learned schema that match the current context (previously called "Propose Schema" or "Avoid Schema"). Recalled schema having a X expectation have their weight displayed with the "-" sign (this was previously indicated by the "Avoid" term). This experiment begins with a "AA" initial environment's state. That causes Ernest to begin by learning a AXAX schema, which is OK.

Now, the environment does not always expects a specific action to return Y, there are some situations where any action from Ernest will lead to a X. Instead, the environment has a State made up of the two last Ernest's actions. Ernest will only get Y if he does A when the state is BA or if he does B when the state is AB.

As we can see, poor Ernest has to struggle hard to get Ys. He indefinitely gets one every once in a while, but he is unable to find a simple regularity aAbBaAbB that would give him a Y every second action.

Generally speaking, Enest is lost when the environment's regularities have a longer span than his context memory. I could easily make Ernest able to deal with this specific environment by increasing his context memory span, but I am looking for a more general solution. The basic idea is that Ernest should be able to construct "schemas of schemas".

No comments: