Combinatorics and Probability Seminar
Youngtak Sohn
MIT
The generalization error of max-margin classifiers in the overparametrized regime
Abstract: Many modern learning methods, such as deep neural networks, are so complex that they perfectly fit the training data. Despite this, they generalize well to unseen data. This talk will discuss an example of this phenomenon, namely max-margin estimators for binary classification tasks, which achieve zero training error on separable data.
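As a rough illustration of the interpolation phenomenon (my own sketch, not code from the talk; the dimensions, the scikit-learn hard-margin approximation via a large C, and the data-generating model are all assumptions):

```python
# Minimal sketch: a max-margin (hard-margin SVM) classifier on linearly separable,
# overparametrized synthetic data interpolates, i.e., attains zero training error.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n, d = 200, 500                           # more features than samples (assumed sizes)
X = rng.standard_normal((n, d))           # Gaussian covariates
theta = rng.standard_normal(d) / np.sqrt(d)
y = np.sign(X @ theta)                    # linearly separable labels

# A very large C approximates the hard-margin (max-margin) solution.
clf = LinearSVC(C=1e6, loss="hinge", max_iter=100_000).fit(X, y)
print("training error:", np.mean(clf.predict(X) != y))   # expected: 0.0
```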
In the first part of the talk, I will discuss the Gaussian model, in which the covariates are normally distributed and the labels are drawn from a generalized linear model. In particular, we determine the sharp threshold for the separability of the data, which coincides with the critical threshold of the spherical perceptron model. Moreover, we characterize the sharp asymptotics of the generalization error of max-margin estimators.
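One way to see such a separability threshold empirically (again a hedged sketch of my own, not the authors' code; the logistic link, the dimension d = 100, and the sample-size ratios are illustrative assumptions):

```python
# Minimal sketch: estimate how often data from the Gaussian model with a logistic
# (GLM) link is linearly separable, as the sample-size-to-dimension ratio n/d grows.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
d = 100
theta = rng.standard_normal(d) / np.sqrt(d)

def separable(X, y):
    # Feasibility LP: does some w satisfy y_i <x_i, w> >= 1 for all i?
    A_ub = -(y[:, None] * X)
    b_ub = -np.ones(len(y))
    res = linprog(c=np.zeros(X.shape[1]), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * X.shape[1])
    return res.status == 0                # 0 = feasible, 2 = infeasible

for ratio in [1.0, 2.0, 3.0, 4.0]:        # n/d ratios chosen only for illustration
    n = int(ratio * d)
    hits = 0
    for _ in range(20):
        X = rng.standard_normal((n, d))
        p = 1.0 / (1.0 + np.exp(-X @ theta))      # logistic link probabilities
        y = np.where(rng.random(n) < p, 1.0, -1.0)
        hits += separable(X, y)
    print(f"n/d = {ratio}: empirical separability frequency = {hits / 20:.2f}")
```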
In the second part of the talk, I will show how this result can be used to study the nonlinear random features model, i.e., two-layer neural networks with random first-layer weights. I will also discuss several statistical insights that can be drawn from this mathematical analysis.
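A random features model can be sketched as follows (my own illustration under assumed choices: ReLU nonlinearity, Gaussian first-layer weights, a hard-margin SVM fit of the second layer, and arbitrary problem sizes):

```python
# Minimal sketch: a two-layer network whose first-layer weights W are random and
# fixed; only the second layer is trained, here as a max-margin (hard-margin SVM) fit.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
n, d, N = 300, 20, 2000                   # N random features >> n: overparametrized
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0] + 0.5 * rng.standard_normal(n))   # noisy labels (assumed model)

W = rng.standard_normal((N, d)) / np.sqrt(d)           # random, untrained first layer
Phi = np.maximum(X @ W.T, 0.0)                         # ReLU features (one possible choice)

clf = LinearSVC(C=1e6, loss="hinge", max_iter=100_000).fit(Phi, y)  # second layer only
print("training error:", np.mean(clf.predict(Phi) != y))            # typically 0.0
```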
Joint work with Andrea Montanari, Feng Ruan, and Jun Yan
The talk will be given over Zoom; we will watch it together in SEO 636.
Monday, April 4, 2022, at 3:00 PM on Zoom