Olivier Georgeon's research blog—also known as the story of little Ernest, the developmental agent.

Keywords: situated cognition, constructivist learning, intrinsic motivation, bottom-up self-programming, individuation, theory of enaction, developmental learning, artificial sense-making, biologically inspired cognitive architectures, agnostic agents (without ontological assumptions about the environment).

Saturday, February 21, 2009

About local optimum

Ernest 4.3 can sometimes get stuck in a non-optimal solution. In this video, he finds a solution made of primary schema S14 and secondary schema S30. He gets a Yahoo! each time he enacts either of these two schemas. The solution is not optimal, however, because Ernest systematically bumps on the first step of secondary schema S30, which could be avoided in this environment.

This relates to the so-called "local optimum" problem: once Ernest has found a satisfying solution, he does not explore others.

Many authors suggest a stochastic response to this problem, consisting of putting some randomness or "noise" in the agent's behavior. Soar even implements this response by default when reinforcement learning is activated, through the so-called "epsilon-greedy" exploration policy (Laird & Congdon, 2008). This randomness will sometimes cause the agent to perform an action that he does not prefer, in order to make him explore other solutions.
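To make the idea concrete, here is a minimal sketch of an epsilon-greedy choice in Python. It is not Soar's implementation; the function name, the schema names and the payoff values are all made up for illustration.

```python
import random

def epsilon_greedy(payoffs, epsilon, rng=random):
    """Choose a key of `payoffs` (hypothetical mapping: schema name -> payoff).

    With probability `epsilon`, pick uniformly at random among all schemas
    (exploration); otherwise pick the schema with the highest payoff.
    """
    if rng.random() < epsilon:
        return rng.choice(sorted(payoffs))
    return max(payoffs, key=payoffs.get)

# Hypothetical schemas and payoffs, for illustration only.
payoffs = {"schema_a": 1.0, "schema_b": 0.5, "schema_c": 0.2}

# epsilon = 0 never explores: the preferred schema is always chosen.
print(epsilon_greedy(payoffs, epsilon=0.0))
# epsilon = 0.1 picks at random about one time in ten, so the agent
# sometimes performs an action it does not prefer.
print(epsilon_greedy(payoffs, epsilon=0.1))
```

Note that the random branch ignores the payoffs entirely, which is exactly the point I object to below.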

I do not agree with this response because I don't see any sense in having the agent not choose his preferred action. That will only impede the learning process. Besides, it has been widely accepted, since Simon (1955), that a cognitive agent's goal is not to find the optimum solution but only a satisfying one.

For Ernest, I would rather implement some mechanism that makes him choose to explore other solutions when he gets "bored". Before that, I could at least reduce the risk of settling on a non-optimal solution by computing a better payoff value for each schema.
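As a rough illustration only, here is one way "boredom" could be sketched in Python. This is not Ernest's mechanism, which does not exist yet. I simply assume that the agent gets bored after enacting his preferred schema a fixed number of times in a row; the function name, the threshold, the schema names and the payoffs are all hypothetical.

```python
def choose_schema(payoffs, history, boredom_threshold=3):
    """Return the preferred schema, unless the agent is "bored" with it.

    Assumption for this sketch: the agent is bored when the last
    `boredom_threshold` enactments were all the preferred schema. He then
    picks the best of the remaining schemas instead.
    """
    preferred = max(payoffs, key=payoffs.get)
    recent = history[-boredom_threshold:]
    bored = len(recent) == boredom_threshold and all(s == preferred for s in recent)
    if bored and len(payoffs) > 1:
        others = {s: p for s, p in payoffs.items() if s != preferred}
        return max(others, key=others.get)
    return preferred

# Hypothetical schemas and payoffs, for illustration only.
payoffs = {"schema_a": 1.0, "schema_b": 0.5, "schema_c": 0.2}

history = []
for _ in range(8):
    history.append(choose_schema(payoffs, history))
print(history)
```

Unlike epsilon-greedy, the choice here is deterministic: the agent leaves his preferred schema only for a reason internal to him, and he still chooses the alternative he likes best.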

References

Laird, J. E., & Congdon, C. B. (2008). The Soar User's Manual Version 9.0. University of Michigan.

Simon, H. (1955). A behavioral model of rational choice. Quarterly Journal of Economics, 69, 99-118.
