Olivier Georgeon's research blog—also known as the story of little Ernest, the developmental agent.

Keywords: situated cognition, constructivist learning, intrinsic motivation, bottom-up self-programming, individuation, theory of enaction, developmental learning, artificial sense-making, biologically inspired cognitive architectures, agnostic agents (without ontological assumptions about the environment).

Tuesday, May 19, 2009

Ernest 6.0

To prepare Ernest to recursively learn schemas on top of one another, I have almost entirely rewritten it.

Now, schemas are no longer triples of subschemas but pairs of subschemas. The first subschema of a schema is its context, and the second is its intention. For instance, S3=(S1,S2) means that the schema S3 intends to enact S2 in a context where S1 has been enacted. In addition, each subschema is associated with a success or failure status: S=Succeed or F=Fail. So, for instance, S3=(S1 S, S2 S) means that S3 expects S2 to succeed in a context where S1 has succeeded. Conversely, S4=(S1 S, S2 F) expects S2 to fail in a context where S1 has succeeded.

As before, schemas also have satisfaction values and weights. So, S3=(S1 S, S2 S, 2, 1) means that schema S3 has a satisfaction of 2 and a weight of 1. The satisfaction of a schema is the sum of the satisfactions of its subschemas for their specific statuses. For instance, S1 and S2 may both have a satisfaction of 1 when they succeed, so S3 has a satisfaction of 1+1 = 2. If S2 has a satisfaction of -1 when it fails, then S4 has a satisfaction of 1-1 = 0. The weight of a schema is the number of times the schema has been enacted. For instance, if S2 has failed 3 times in a context where S1 has succeeded, then we have S4=(S1 S, S2 F, 0, 3).
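The notation above can be sketched as a small data structure. This is only an illustrative Python sketch, not Ernest's actual code: the names Schema, Enaction and SATISFACTION are assumptions made for this example, and the satisfaction values are the ones used in the text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# An enaction pairs a subschema name with its status: "S" (succeed) or "F" (fail).
Enaction = Tuple[str, str]

# Hypothetical satisfaction table for the subschemas, using the values from the text.
SATISFACTION: Dict[Enaction, int] = {
    ("S1", "S"): 1,
    ("S2", "S"): 1,
    ("S2", "F"): -1,
}


@dataclass
class Schema:
    name: str
    context: Enaction    # first subschema and its status
    intention: Enaction  # second subschema and its expected status
    weight: int = 0      # number of times this schema has been enacted

    @property
    def satisfaction(self) -> int:
        # Sum of the subschemas' satisfactions for their specific statuses.
        return SATISFACTION[self.context] + SATISFACTION[self.intention]


s3 = Schema("S3", ("S1", "S"), ("S2", "S"), weight=1)
s4 = Schema("S4", ("S1", "S"), ("S2", "F"), weight=3)
print(s3.satisfaction, s4.satisfaction)  # 2 0
```

Here satisfaction is computed from the subschemas rather than stored, which keeps S3=(S1 S, S2 S, 2, 1) and S4=(S1 S, S2 F, 0, 3) consistent with the table by construction.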

At each cycle, all the schemas whose context matches the current context propose their intention. The proposition weight is equal to the satisfaction of the intended subschema multiplied by the weight of the proposing schema. This can be understood as the benefit of doing it multiplied by the confidence in that outcome. For instance, in a context where S1 has succeeded, S3 proposes S2 with a proposition weight equal to satisfaction(S2 S)*weight(S3) = 1*1 = 1. In the same context, S4 proposes S2 with a proposition weight equal to satisfaction(S2 F)*weight(S4) = -1*3 = -3.

Then, the proposition weights received by each proposed schema are summed, and the schema with the highest sum is selected and enacted. In our example, S2 has a total proposition weight equal to 1-3 = -2. Negative propositions can be understood as "fear" of dissatisfaction. In this case, Ernest is "afraid" of doing S2 because he has had more bad experiences than good experiences of doing it in this context. He will only do it if he has no more appealing choice.
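The proposal and selection steps can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not Ernest's implementation: the function names propose and select, the tuple layout, and the tie-breaking rule (first proposed wins) are all hypothetical.

```python
from collections import defaultdict

# Satisfaction of each subschema for each status, as in the text.
SATISFACTION = {("S1", "S"): 1, ("S2", "S"): 1, ("S2", "F"): -1}

# Each schema is (context, intention, weight); context and intention are (name, status).
SCHEMAS = [
    (("S1", "S"), ("S2", "S"), 1),  # S3
    (("S1", "S"), ("S2", "F"), 3),  # S4
]


def propose(schemas, current_context):
    """Sum the proposition weights received by each intended subschema."""
    totals = defaultdict(int)
    for context, (intended, status), weight in schemas:
        if context == current_context:
            # proposition weight = satisfaction of the intention * weight of the proposer
            totals[intended] += SATISFACTION[(intended, status)] * weight
    return dict(totals)


def select(totals):
    """Pick the subschema with the highest total (assumed tie-break: first proposed)."""
    return max(totals, key=totals.get)


totals = propose(SCHEMAS, ("S1", "S"))
print(totals)          # {'S2': -2}
print(select(totals))  # S2
```

With only S2 proposed, it is still selected despite its negative total, which matches the case where Ernest has no more appealing choice.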

When the selected schema has been enacted, the environment returns its success or failure status. Based on this status, the new context is assessed and new schemas are learned or reinforced.
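One possible reading of this learning step, sketched in Python: a schema is created the first time a (context, intention) pair is observed, and its weight is incremented each time the pair is observed again. The function learn_or_reinforce, the dictionary layout, and the rule that the new context is simply the enaction that just occurred are assumptions made for this sketch.

```python
def learn_or_reinforce(memory, context, enacted):
    """Record that `enacted` happened in `context`; return the assumed new context.

    memory maps (context, enacted) pairs to weights, i.e. enaction counts.
    """
    key = (context, enacted)
    # A missing key means a new schema is learned; otherwise it is reinforced.
    memory[key] = memory.get(key, 0) + 1
    # Assumption: the new context is what has just been enacted.
    return enacted


memory = {}
context = ("S1", "S")

# S2 fails three times and succeeds once in a context where S1 has succeeded.
for _ in range(3):
    learn_or_reinforce(memory, context, ("S2", "F"))
learn_or_reinforce(memory, context, ("S2", "S"))

print(memory[(("S1", "S"), ("S2", "F"))])  # 3
print(memory[(("S1", "S"), ("S2", "S"))])  # 1
```

These counts reproduce the weights of S4 and S3 used in the examples above.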
