Department of Mathematics,
University of California San Diego

****************************

PhD Defense

Gaojin He

UC San Diego

Complexity Bounds for Approximately Solving Markov Decision Processes and Properties of Turnpike Functions.

Abstract:

Markov decision processes (MDPs) are the major model of controlled stochastic processes in discrete time, and value iteration (VI) is one of the major methods for finding optimal policies. For each discount factor there is a finite number of iterations, called the turnpike integer, starting from which value iteration always generates decision rules that are deterministic optimal policies for the infinite-horizon problem. This fact justifies the rolling horizon approach of computing infinite-horizon optimal policies by conducting a finite number of value iterations. In this talk, we will first discuss the complexity of using VI to approximately solve MDPs, and then introduce properties of turnpike integers and provide upper bounds for them.
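
As an illustration only, and not the speaker's method or results, the turnpike idea can be sketched on a toy discounted MDP. Everything here is an assumption made for the example: the two-state model `P`, the discount factor `GAMMA`, the helper names, and the convention that the turnpike integer is the smallest horizon n from which every maximizing first decision of the n-step problem is also optimal for the infinite-horizon problem. The check is empirical (up to `max_horizon`), not a proof.

```python
# Toy MDP (hypothetical): P[s][a] is a list of (probability, next_state, reward).
P = {
    0: {"stay": [(1.0, 0, 1.0)], "go": [(1.0, 1, 0.0)]},
    1: {"loop": [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_values(V, s):
    """One-step lookahead value of each action in state s, given values V."""
    return {
        a: sum(p * (r + GAMMA * V[s2]) for p, s2, r in outcomes)
        for a, outcomes in P[s].items()
    }


def bellman_step(V):
    """One value iteration: V_{n+1}(s) = max_a Q(V_n, s, a)."""
    return {s: max(q_values(V, s).values()) for s in P}


def greedy_actions(V, tol=1e-9):
    """Set of maximizing actions in each state (ties kept, within tol)."""
    result = {}
    for s in P:
        q = q_values(V, s)
        best = max(q.values())
        result[s] = {a for a, v in q.items() if v >= best - tol}
    return result


def empirical_turnpike(max_horizon=200, converge_iters=2000):
    """Smallest n such that, for every tested horizon >= n, all maximizing
    first decisions of the n-step problem are infinite-horizon optimal."""
    # Approximate the infinite-horizon values by running VI for a long time.
    V_star = {s: 0.0 for s in P}
    for _ in range(converge_iters):
        V_star = bellman_step(V_star)
    optimal = greedy_actions(V_star)

    V = {s: 0.0 for s in P}  # V_0 = 0 (zero terminal values)
    last_bad = 0
    for n in range(1, max_horizon + 1):
        # First decision of the n-step problem is greedy w.r.t. V_{n-1}.
        first_step = greedy_actions(V)
        if any(not first_step[s] <= optimal[s] for s in P):
            last_bad = n
        V = bellman_step(V)
    return last_bad + 1


print(empirical_turnpike())
```

In this toy instance the myopic action "stay" is chosen for horizons 1 and 2, and the infinite-horizon optimal action "go" is chosen from horizon 3 onward, so the sketch returns 3 under the convention stated above.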

June 9, 2025

4:00 PM

APM 6402

****************************