Robust learning for autonomous agents in stochastic environments

Lecerf, Ugo
Thesis

In this work we explore data-driven deep reinforcement learning (RL) approaches that make an autonomous agent robust in a navigation task, acting correctly in the face of risk and uncertainty. We investigate the effects that sudden changes in environment conditions have on autonomous agents, and explore methods that allow an agent to generalize to unforeseen, sudden modifications of its environment that it was not explicitly trained to handle. Inspired by the human dopamine circuit, the performance of an RL agent is measured and optimized in terms of the rewards and penalties it receives for desirable or undesirable behavior. Our initial approach is to learn an estimate of the agent's distribution of expected rewards, and to use information about the modes of this distribution to gain a nuanced view of how the agent can act in high-risk situations. We then show that the same robustness objective with respect to environment uncertainty can be achieved by learning the most effective contingency policies in a 'divide and conquer' approach, where the computational complexity of the learning task is shared across multiple policy models. We combine this approach with a hierarchical planning module that schedules the different policy models so that the collection of contingency plans is highly robust to unanticipated environment changes. This combination of learning and planning lets us exploit both the adaptability of deep learning models and the stricter, more explicit constraints that can be implemented and measured by a hierarchical planner.
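The idea of using the distribution of expected rewards, rather than only its mean, can be illustrated with a minimal sketch. The code below is a hypothetical example (not taken from the thesis): it represents each action's return distribution categorically, as in distributional RL, and shows how two actions with identical expected return can differ sharply in risk, which a mode- or quantile-aware agent can exploit.

```python
import numpy as np

def expected_return(probs, atoms):
    """Mean of a categorical return distribution."""
    return float(np.dot(probs, atoms))

def quantile(probs, atoms, q):
    """Value at quantile q of the categorical distribution
    (a simple risk measure: low quantiles capture worst cases)."""
    cdf = np.cumsum(probs)
    return float(atoms[np.searchsorted(cdf, q)])

# Support of the return distribution (illustrative values).
atoms = np.array([-10.0, 0.0, 10.0])

# Two hypothetical actions with the same expected return:
safe = np.array([0.0, 1.0, 0.0])    # always returns 0
risky = np.array([0.5, 0.0, 0.5])   # 50/50 between -10 and +10

# Both have mean 0, but their 10%-quantiles differ:
# a risk-sensitive agent preferring the higher low quantile
# would choose the safe action in a high-risk situation.
print(expected_return(safe, atoms), expected_return(risky, atoms))
print(quantile(safe, atoms, 0.1), quantile(risky, atoms, 0.1))
```

A mean-maximizing agent is indifferent between these two actions; comparing low quantiles (or modes) of the learned distribution is what allows the nuanced, risk-aware behavior described above.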


HAL
Type:
Thesis
Date:
2022-10-03
Department:
Data Science
Eurecom Ref:
7017
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this work was published as a thesis and is available at:

PERMALINK : https://www.eurecom.fr/publication/7017