Learning and planning in partially observable Markov decision processes
with weighted automata and tensor networks
Speaker: Guillaume Rabusseau, University of Montreal
Time: Tuesday, May 4, 2021, 10:00AM - 11:00AM, EST
Zoom Link: contact
tml.online.seminars@gmail.com
Abstract:
In this talk, I will present fundamental connections between partially
observable Markov decision processes (POMDPs), weighted automata and tensor
networks.
I will first present the now classical predictive state representations
for POMDPs and their natural connection with weighted automata. I will then
present the spectral learning algorithm for weighted automata introduced
in [Bailly et al., 2009] and [Hsu et al., 2009], which gives rise to a
consistent learning algorithm for POMDPs.
One caveat of spectral learning in the context of reinforcement
learning is that it is a two-stage paradigm: first learn the environment
dynamics and then plan accordingly, which can be both sample inefficient
and time consuming. I will then show how the spectral learning algorithm
can be extended into a one-stage approach that combines planning and
learning. I will conclude by presenting interesting connections
between weighted automata and tensor networks, opening the door to future
research combining the parameter efficiency of tensor network methods with
the principled spectral learning approach for reinforcement learning in
partially observable domains.
This talk is based on the following work led by PhD student Tianyu Li:
- Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau. Efficient
Planning under Partial Observability with Unnormalized Q Functions and
Spectral Learning, AISTATS 2020
- Tianyu Li, Doina Precup, Guillaume Rabusseau. Connecting Weighted
Automata, Tensor Networks and Recurrent Neural Networks through Spectral
Learning, AISTATS 2019 (extended version)
Speaker's Bio:
Guillaume Rabusseau is an assistant professor at the University of Montreal
and holds a Canada CIFAR AI chair at the Mila research institute. Prior to
joining Mila, he was an IVADO postdoctoral research fellow in the
Reasoning and Learning Lab at McGill University, where he worked with
Prakash Panangaden, Joelle Pineau and Doina Precup. He obtained his PhD in
computer science in 2016 at Aix-Marseille University under the supervision
of François Denis and Hachem Kadri. His research interests lie at the
intersection of theoretical computer science and machine learning, and his
work revolves around exploring inter-connections between tensors and
machine learning to develop efficient learning methods for structured data
relying on linear and multilinear algebra.