Time and temporal abstraction in continual learning: tradeoffs, analogies and regret in an active measuring setting

Link	https://proceedings.mlr.press/v232/letourneau23a.html
Author	Létourneau, Vincent; Bellinger, Colin¹ (ORCID: https://orcid.org/0000-0002-3567-7834); Tamblyn, Isaac; Fraser, Maia
Affiliation	National Research Council Canada. Digital Technologies
Format	Text, Article
Conference	2nd Conference on Lifelong Learning Agents, CoLLA 2023, August 22-25, 2023, Montreal, QC, Canada
Abstract	This conceptual paper provides theoretical results linking notions in semi-supervised learning (SSL) and hierarchical reinforcement learning (HRL) in the context of lifelong learning. Specifically, our construction sets up a direct analogy between intermediate representations in SSL and temporal abstraction in RL, highlighting the important role of factorization in both types of hierarchy and the relevance of partial labeling and partial observation, respectively. The construction centres around a simple class of Partially Observed Markov Decision Processes (POMDPs), for which we show that tools and results from SSL imply lower bounds on regret that hold for any RL algorithm without access to temporal abstraction. While our lower bound is for a restricted class of RL problems, it applies to arbitrary RL algorithms in this setting. The setting moreover features so-called “active measuring”, an aspect of widespread relevance in industrial control but, possibly due to its lifelong learning flavour, not yet well studied in RL. Our formalization makes it possible to think about the tradeoffs that apply to such control problems.
Date published	2023-08-22
Publisher	ML Research Press
In	Proceedings of Machine Learning Research 232 (22 August 2023): 470–480.
Language	English
Peer reviewed	Yes
Record identifier	5514a9ff-e0f7-4b57-9111-e1add2a5e819
Record created	2024-12-06
Record modified	2024-12-06

Date modified: 2026-06-11