Bridging Discrete-Time and Continuous-Time Modeling for Stochastic First-Order Optimization
Abstract: We now live in a world surrounded by data. For example, when we want to buy or sell a house, we browse real estate websites and go through related listings. By comparing the "data", we subconsciously "generate a price quote" for the house we want to buy or sell. This can be viewed as an optimization problem which, in theory, can be solved using the gradient descent (GD) method. However, in real-world scenarios, the datasets are so large that vanilla GD is typically neither efficient nor computationally feasible. Stochastic gradient descent (SGD) is an improved version of vanilla GD. In the first half of this talk, we will go through the background of the GD and SGD algorithms. From there, we will introduce how a discrete-time SGD algorithm can be modeled by a continuous-time stochastic process based on Brownian motion. Using probabilistic tools, one can reveal many interesting properties of the continuous-time versions of SGD. Some of these properties can be translated back to discrete-time SGD algorithms.
Wednesday, September 19, 2018 at 4:00 PM in 636 SEO
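
For readers new to the two algorithms named in the abstract, the following is a minimal sketch contrasting GD and SGD on a toy least-squares problem, loosely in the spirit of the house-price example. It is not taken from the talk: the synthetic data, the two made-up features, the step sizes, and all function names are assumptions chosen for illustration. GD uses the gradient over the full dataset at every step, while SGD uses the gradient at a single randomly chosen sample, which is much cheaper per step but noisy. The continuous-time modeling discussed in the talk is not attempted here.

```python
import numpy as np


def make_data(n=200, seed=0):
    """Synthetic 'listings': two hypothetical features and a noisy linear price."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    w_true = np.array([3.0, -1.5])  # assumed ground-truth weights
    y = X @ w_true + 0.1 * rng.normal(size=n)
    return X, y, w_true


def loss(w, X, y):
    """Mean squared error (with a factor 1/2) over the whole dataset."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)


def gd(X, y, lr=0.1, steps=500):
    """Vanilla gradient descent: every step touches all n samples."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # full gradient
        w = w - lr * grad
    return w


def sgd(X, y, lr=0.05, steps=5000, seed=1):
    """Stochastic gradient descent: every step touches one random sample."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(y))          # pick one sample uniformly at random
        grad_i = (X[i] @ w - y[i]) * X[i]  # unbiased estimate of the full gradient
        w = w - lr * grad_i
    return w


X, y, w_true = make_data()
w_gd = gd(X, y)
w_sgd = sgd(X, y)
print("GD  estimate:", w_gd, "loss:", loss(w_gd, X, y))
print("SGD estimate:", w_sgd, "loss:", loss(w_sgd, X, y))
```

With a constant step size, the SGD iterate does not settle at a single point but keeps fluctuating around the minimizer; this random fluctuation is the kind of behavior that a continuous-time stochastic model aims to describe.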