Improving Reinforcement Learning Control Via Online Bilinear Action Interpolation

Carlos H. C. Ribeiro, Elder M. Hemerly

Reinforcement Learning has been used with reasonable success for model-free learning of action policies in some control problems. However, it is usually assumed that the process to be controlled is either open-loop stable or has slow dynamics, so that neither the frequency of failures before acceptable performance is reached nor the input-output processing time is of primary importance. We consider the problem of model-free regulation of an unstable plant. Because in many cases state quantization is required by algorithmic storage constraints rather than by sensor limitations, we propose a modification of a standard Reinforcement Learning method that uses the distance between the sampled state and the represented states as additional information, embedded in actions produced by a distance-wise local interpolation scheme. We obtained faster learning with minimal disturbance of the original learning scheme, and the modification is computationally modest enough to allow real-time implementation.
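The abstract does not spell out the interpolation scheme, so the following is only a minimal sketch of what a bilinear action interpolation over a quantized state grid might look like. It assumes a two-dimensional state, a rectilinear grid of represented states, a tabular action-value array, and a scalar control signal. The names `bilinear_action`, `q_table`, `actions`, `grid_x` and `grid_y` are hypothetical and are not taken from the paper.

```python
import numpy as np


def bilinear_action(q_table, actions, grid_x, grid_y, state):
    """Blend the greedy actions of the four grid nodes around `state`.

    Assumed layout (hypothetical, not from the paper):
      q_table : array of shape (len(grid_x), len(grid_y), len(actions))
      actions : 1-D array of scalar control values
      grid_x, grid_y : sorted 1-D arrays of represented state coordinates
      state : (x, y) sampled, unquantized state
    """
    x, y = state
    # Clamp the sample to the grid so the four neighbours always exist.
    x = min(max(x, grid_x[0]), grid_x[-1])
    y = min(max(y, grid_y[0]), grid_y[-1])
    # Index of the lower node of the enclosing cell along each axis.
    i = min(np.searchsorted(grid_x, x, side="right") - 1, len(grid_x) - 2)
    j = min(np.searchsorted(grid_y, y, side="right") - 1, len(grid_y) - 2)
    # Normalised distances from the lower node, each in [0, 1].
    tx = (x - grid_x[i]) / (grid_x[i + 1] - grid_x[i])
    ty = (y - grid_y[j]) / (grid_y[j + 1] - grid_y[j])
    # Greedy action value stored at each of the four corner nodes.
    greedy = actions[np.argmax(q_table[i:i + 2, j:j + 2], axis=-1)]
    # Bilinear weights: nearer nodes contribute more to the applied action.
    weights = np.array([[(1 - tx) * (1 - ty), (1 - tx) * ty],
                        [tx * (1 - ty), tx * ty]])
    return float(np.sum(weights * greedy))


if __name__ == "__main__":
    actions = np.array([-1.0, 1.0])
    grid = np.array([0.0, 1.0])
    q = np.zeros((2, 2, 2))
    q[0, :, 0] = 1.0  # nodes with x = 0 prefer action -1
    q[1, :, 1] = 1.0  # nodes with x = 1 prefer action +1
    print(bilinear_action(q, actions, grid, grid, (0.25, 0.5)))
```

In this sketch the learning rule itself is untouched. Only the action sent to the plant changes, from the greedy action of the nearest represented state to a distance-weighted blend of the neighbouring greedy actions. That is one possible reading of "minimal disturbance of the original learning scheme".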

