Cyrus Jeffrey DiCiccio
Improving Efficiency of Hypothesis Tests via Data Splitting
Abstract: Data splitting, a tool that is well studied and commonly used for estimation problems such as assessing prediction error, can also be useful in testing problems, where a portion of the data can be allocated to make the testing problem easier in some sense, say by estimating or even eliminating nuisance parameters, reducing dimension, etc. In single or multiple testing problems that involve a large number of parameters, reducing the number of parameters tested can increase power dramatically, particularly when the non-null parameters are relatively sparse. While there is some loss of power associated with testing on only a fraction of the available data, carefully selecting a test statistic may in turn improve power, though it remains unclear whether the reduction in the number of parameters under consideration can outweigh the loss of power from splitting the data. To combat the inherent loss of power seen with data splitting, methods of combining inference across several splits of the data are developed. The power of these methods is compared with the power of full-data tests, as well as tests using only a single split of the data.
Tea after the talk, 4:00 - 4:30 PM, in SEO 300.
Friday January 12, 2018 at 3:00 PM in SEO 636