Link | https://proceedings.mlr.press/v232/letourneau23a.html |
---|
Author | Létourneau, Vincent; Bellinger, Colin¹ (ORCID: https://orcid.org/0000-0002-3567-7834); Tamblyn, Isaac; Fraser, Maia |
---|
Affiliation | ¹ National Research Council of Canada. Digital Technologies |
---|
Format | Text, Article |
---|
Conference | 2nd Conference on Lifelong Learning Agents, CoLLA 2023, August 22-25, 2023, Montreal, QC, Canada |
---|
Abstract | This conceptual paper provides theoretical results linking notions in semi-supervised learning (SSL) and hierarchical reinforcement learning (HRL) in the context of lifelong learning. Specifically, our construction sets up a direct analogy between intermediate representations in SSL and temporal abstraction in RL, highlighting the important role of factorization in both types of hierarchy and the relevance of partial labeling in SSL and partial observation in RL. The construction centres around a simple class of Partially Observed Markov Decision Processes (POMDPs), for which we show that tools and results from SSL imply lower bounds on regret that hold for any RL algorithm without access to temporal abstraction. While our lower bound is for a restricted class of RL problems, it applies to arbitrary RL algorithms in this setting. The setting moreover features so-called “active measuring”, an aspect of widespread relevance in industrial control that, possibly due to its lifelong learning flavour, is not yet well-studied in RL. Our formalization makes it possible to think about tradeoffs that apply for such control problems. |
---|
Publication date | 2023-08-22 |
---|
Publisher | ML Research Press |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
Record identifier | 5514a9ff-e0f7-4b57-9111-e1add2a5e819 |
---|
Record created | 2024-12-06 |
---|
Record modified | 2024-12-06 |
---|