Helen C. de Mattos Senefonte, Reinaldo A. C. Bianchi, Carlos H. C. Ribeiro.
We present a new variation of an action-selection approach for Multi-Objective Reinforcement Learning problems, intended to speed up the learning process. We propose applying the action-selection method of the Heuristically Accelerated Q-learning algorithm (HAQL), in which a heuristic function H influences the choice of actions during learning, to traditional multi-objective RL algorithms. The proposal was evaluated on a traditional research task, the predator-prey problem, and the results indicate that the use of heuristics provides the desired learning acceleration.
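As a rough illustration of how a heuristic function H can influence action choice, the following is a minimal sketch of heuristic-biased epsilon-greedy selection, in which the greedy step maximizes Q(s, a) + xi * H(s, a) rather than Q(s, a) alone. The function name, the dictionary representation of Q and H for a single state, and the parameters `epsilon` and `xi` are assumptions made for this example and are not taken from the paper; how the paper combines multiple objectives is not shown here.

```python
import random


def heuristic_epsilon_greedy(q_values, heuristic, epsilon=0.1, xi=1.0, rng=random):
    """Pick an action for one state using heuristic-biased epsilon-greedy.

    q_values:  dict mapping action -> estimated Q(s, a) for the current state
    heuristic: dict mapping action -> H(s, a); missing actions count as 0
    epsilon:   exploration probability (hypothetical default)
    xi:        weight of the heuristic term (hypothetical default)
    """
    actions = list(q_values)
    # Explore: with probability epsilon, choose uniformly at random.
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Exploit: choose the action maximizing Q plus the weighted heuristic.
    return max(actions, key=lambda a: q_values[a] + xi * heuristic.get(a, 0.0))


# Usage: the heuristic tips the greedy choice toward "down".
q = {"up": 0.5, "down": 0.4}
h = {"down": 0.2}
chosen = heuristic_epsilon_greedy(q, h, epsilon=0.0)
```

With `epsilon=0.0` the choice is purely greedy, so the heuristic bonus alone decides between two actions whose Q estimates are close. As the Q estimates improve with learning, a small fixed heuristic has proportionally less influence on the choice.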
http://www.lbd.dcc.ufmg.br/colecoes/enia/2011/0033.pdf