Computer Science Theory Seminar

Gabor Berend
University of Szeged
Determining sparse word representations in monolingual and multilingual settings
Abstract: Symbolic representations have been superseded by continuous representations in practically all natural language processing (NLP) applications. Although the popular continuous representations can solve various NLP tasks at close to human performance, the representations employed in most recent NLP frameworks bear little resemblance to human cognition. In this talk, we will review algorithms for obtaining continuous meaning representations of natural language, then propose an approach to distill symbolic features from them in a way that also conveys human-interpretable, commonsense knowledge. We additionally present experimental results suggesting that the symbolic features distilled from continuous representations via sparse coding can be used to train standard statistical models that perform comparably to more expensive and less interpretable neural models. Finally, we introduce an efficient algorithm for constructing multilingual sparse word representations, opening up the possibility of zero-shot learning across languages.
Tuesday February 18, 2020 at 3:00 PM in 1325 SEO
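For readers unfamiliar with the technique mentioned in the abstract, the sketch below illustrates one common way to obtain sparse codes from dense word embeddings via dictionary learning. It is a minimal illustration, not the speaker's implementation: the random "embeddings", the solver, and all hyperparameters (number of atoms, sparsity penalty, non-negativity) are assumptions chosen for readability.

```python
# Minimal sketch: sparse coding of dense word embeddings with scikit-learn.
# All data and hyperparameters here are illustrative assumptions, not the
# speaker's actual setup.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
# Stand-in for pretrained embeddings: 500 "words", each a 100-dim dense vector.
X = rng.standard_normal((500, 100))

dico = DictionaryLearning(
    n_components=300,              # number of dictionary atoms (assumed)
    alpha=1.0,                     # L1 penalty controlling sparsity (assumed)
    transform_algorithm="lasso_cd",
    positive_code=True,            # non-negative codes read more easily as features
    max_iter=20,
    random_state=0,
)
codes = dico.fit_transform(X)      # one mostly-zero coefficient row per word

# The indices of the non-zero coefficients can serve as discrete, symbolic
# features for downstream statistical models.
print("average non-zeros per word:", np.count_nonzero(codes, axis=1).mean())
```

In practice one would replace the random matrix with real pretrained embeddings and tune the dictionary size and sparsity penalty; the point of the sketch is only that each word ends up described by a small set of active dictionary atoms.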