Carnegie Mellon University
Title: Offline Reinforcement Learning: Towards Optimal Sample Complexity and Distributional Robustness
Abstract: Offline or batch reinforcement learning seeks to learn a near-optimal policy using historical data without active exploration of the environment. To counter the insufficient coverage and sample scarcity of many offline datasets, the principle of pessimism has recently been introduced to mitigate the high bias of the estimated values. However, prior algorithms or analyses either suffer from suboptimal sample complexities or incur a high burn-in cost to reach sample optimality, thus posing an impediment to efficient offline RL in sample-starved applications. In this talk, we demonstrate that the model-based (or “plug-in”) approach achieves minimax-optimal sample complexity without any burn-in cost for tabular Markov decision processes (MDPs). Our algorithms are “pessimistic” variants of value iteration with Bernstein-style penalties, and do not require sophisticated variance reduction. We further consider a distributionally robust formulation of offline RL, focusing on tabular robust MDPs with an uncertainty set specified by the Kullback-Leibler divergence. Again, a model-based algorithm that combines distributionally robust value iteration with the principle of pessimism achieves a near-optimal sample complexity up to a polynomial factor of the effective horizon length.
Dr. Yuejie Chi is a Professor in the Department of Electrical and Computer Engineering, and a faculty affiliate with the Machine Learning Department and CyLab at Carnegie Mellon University. She received her Ph.D. and M.A. from Princeton University, and B. Eng. (Hon.) from Tsinghua University, all in Electrical Engineering. Her research interests lie in the theoretical and algorithmic foundations of data science, signal processing, machine learning, and inverse problems, with applications in sensing, imaging, decision making, and societal systems, broadly defined. Among other honors, Dr. Chi received the Presidential Early Career Award for Scientists and Engineers (PECASE) and the inaugural IEEE Signal Processing Society Early Career Technical Achievement Award for contributions to high-dimensional structured signal processing. She is an IEEE Fellow (Class of 2023) for contributions to statistical signal processing with low-dimensional structures.
Host: Dr. Cong Shen
Organizer: Dr. Cong Shen