Active measure reinforcement learning for observation cost minimization

By National Research Council Canada

Download	View final version: Active measure reinforcement learning for observation cost minimization (PDF, 2.4 MiB)
DOI	https://doi.org/10.21428/594757db.72846d04
Author	Bellinger, Colin¹; Coles, Rory; Crowley, Mark; Tamblyn, Isaac²
Affiliation	¹ National Research Council Canada. Digital Technologies; ² National Research Council Canada. Security and Disruptive Technologies
Format	Text, Article
Conference	34th Canadian Conference on Artificial Intelligence, May 25-28, 2021, Vancouver, British Columbia [Virtual Event]
Physical description	12 p.
Subject	reinforcement learning; active learning; partial observability; sample efficiency
Abstract	Markov Decision Processes (MDP) with explicit measurement cost are a class of environments in which the agent learns to maximize the costed return. Here, we define the costed return as the discounted sum of rewards minus the sum of the explicit cost of measuring the next state. The RL agent can freely explore the relationship between actions and rewards but is charged each time it measures the next state. Thus, an optimal agent must learn a policy without making a large number of measurements. We propose the active measure RL framework (Amrl) as a solution to this novel class of problems, and contrast it with standard reinforcement learning under full observability and planning under partial observability. We demonstrate that Amrl-Q agents learn to shift from a reliance on costly measurements to exploiting a learned transition model in order to reduce the number of real-world measurements and achieve a higher costed return. Our results demonstrate the superiority of Amrl-Q over standard RL methods, Q-learning and Dyna-Q, and POMCP for planning under a POMDP in environments with explicit measurement costs.
Publication date	2021-06-08
Publisher	Canadian Artificial Intelligence Association
License	Creative Commons Attribution 4.0 International (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/deed.fr
In	Proceedings of the Canadian Conference on Artificial Intelligence (Canadian AI 2021), 2021L10 (June 8, 2021).
Language	English
Peer reviewed	Yes
Record identifier	0a738f55-7c86-4259-9a0a-a1a1e882e8c8
Record created	2021-12-13
Record modified	2021-12-14
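
The abstract above defines the costed return as the discounted sum of rewards minus the sum of the explicit measurement costs. A minimal Python sketch of that definition follows. The function name `costed_return`, its parameters, the fixed per-measurement cost, and the choice to leave measurement costs undiscounted are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence


def costed_return(
    rewards: Sequence[float],
    measured: Sequence[bool],
    gamma: float = 0.99,
    measure_cost: float = 0.05,
) -> float:
    """Discounted sum of rewards minus the total cost of measurements.

    rewards[t]  -- reward received at step t
    measured[t] -- True if the agent paid to measure the next state at step t
    gamma, measure_cost -- hypothetical values chosen only for this sketch
    """
    if len(rewards) != len(measured):
        raise ValueError("rewards and measured must have the same length")

    # Discounted sum of rewards: sum over t of gamma^t * r_t.
    discounted_rewards = sum(gamma**t * r for t, r in enumerate(rewards))

    # Assumption: each measurement costs the same fixed amount and the
    # costs are summed without discounting.
    total_measure_cost = measure_cost * sum(1 for m in measured if m)

    return discounted_rewards - total_measure_cost
```

For the same reward sequence, an episode with fewer measurements yields a higher costed return under this sketch, which is the trade-off the abstract describes.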

Date modified: 2024-04-19