Bridging the Gap: Task Design and Data Efficiency for Broader RL Adoption
Speaker: Pierre-Luc Bacon, University of Montreal
Time: Tuesday, Feb 20, 2024, 10:00 AM - 11:00 AM, Eastern Time
Zoom Link: contact ymao@uottawa.ca
Abstract:
While traditional RL excels at solving pre-defined problems, it often neglects a crucial first step: specifying real-world tasks that involve
complex environments and human preferences. This challenge is further amplified by real-world scenarios where extensive online data collection is
costly or infeasible, but rich historical observations (e.g., sensor readings) exist. Current RL methods, however, struggle to effectively
utilize and generalize from such data, especially in unseen situations.
This talk presents my group's recent advances on these two fronts. First, I will introduce the "Motif" methodology, which facilitates task
specification by distilling knowledge from Large Language Models (LLMs) into intrinsic rewards. Second, I will discuss how these advances connect
to the important question of representation learning in RL. This includes our work on disentangling memory and credit assignment in Transformer
architectures, as well as developing sample-efficient methods for learning approximate information states for Partially Observable Markov
Decision Processes (POMDPs) using sequence models.
Through these advancements in task specification and data efficiency, we pave the way for broader real-world applications of RL, moving beyond
the confines of academic benchmarks.
Speaker's Bio:
Pierre-Luc Bacon is an Assistant Professor at the University of Montreal in the Department of Computer Science and Operations Research. He holds
a CIFAR AI Chair at Mila, the Quebec Artificial Intelligence Institute. His research focuses on deep reinforcement learning and how to scale these methods up,
particularly in the face of the curse of dimensionality. Driven by the goal of making RL more practical, he explores applications in areas like
HVAC control and drug discovery.