We recommend that non-UIC participants attend online. Any non-UIC participants who would like to attend in-person events need to register with the organizer at least three days before the seminar. If you have any questions, please contact Eloy Reyes.

MSCS Seminars Today

Calendar for Wednesday October 20, 2021

Statistics and Data Science Seminar
Information-based Optimal Subdata Selection for Clusterwise Linear Regression Model
Yanxi Liu (University of Illinois, Chicago)
4:00 PM in Zoom
As data sizes increase rapidly, the relationship between input and output variables may no longer be homogeneous. Conventional statistical models such as generalized linear models (GLMs) may not be well suited to heterogeneous relationships. Mixture of Experts models offer a good solution: they combine different statistical models to detect heterogeneous patterns while retaining the benefits of conventional statistical modeling techniques. However, fitting them requires considerable computing resources, particularly with big data. To address this issue, an attractive idea is to analyze a subsample of the data that retains the rich information of the full data. The Information-Based Optimal Subdata Selection (IBOSS) strategy, proposed by Wang et al. (2019), is one such approach. IBOSS captures most of the relevant information in the full data through a judicious selection of the subdata by "maximizing" the Fisher information matrix. This project aims to develop an algorithm that selects subdata based on the IBOSS strategy for the Clusterwise Linear Regression model, a type of Mixture of Experts. A major challenge is that the Fisher information matrix of this model has no explicit form. To overcome this challenge, we propose a surrogate matrix, which is proved to be asymptotically equivalent to the Fisher information matrix, and use it to construct the IBOSS subdata. Further, the proposed subdata selection is proved to be asymptotically optimal, i.e., no other method is statistically more efficient than the proposed one when the full data size is large.

Graduate Number Theory Seminar
Abelian Varieties VIII
Tian Wang (UIC)
5:00 PM in Zoom