Prof. Runze LI
Eberly Family Chair Professor in Statistics,
The Pennsylvania State University
Date: 11 June, Tuesday
Time: 15:00 – 16:30
Venue: E22-G008
Host: Prof. Wenyang ZHANG, Chair Professor of Business Intelligence and Analytics
Abstract
This paper is concerned with high-dimensional two-sample mean problems, which have received considerable attention in the recent literature. To exploit the correlation among variables and thereby enhance the power of two-sample mean tests, we consider the setting in which the precision matrix of the high-dimensional data possesses a linear structure. We first propose a new precision matrix estimation procedure that accounts for this linear structure, and further develop regularization methods to select the true basis matrices and remove irrelevant ones. Using the estimated precision matrix, we propose a new test statistic for the two-sample mean problem by replacing the inverse of the sample covariance matrix in the Hotelling test with the estimated precision matrix. The proposed test is applicable in both low-dimensional and high-dimensional settings, even when the dimension of the data exceeds the sample size. The limiting distributions of the proposed test statistic under both the null and alternative hypotheses are derived. We further derive the asymptotic power function of the proposed test and compare its asymptotic power with that of some existing tests. We find that the estimation error of the precision matrix has no impact on the asymptotic power function. Moreover, the asymptotic relative efficiency of the proposed test relative to the classical Hotelling test tends to infinity when the ratio of the dimension of the data to the sample size tends to 1. We conduct a Monte Carlo simulation study to assess the finite-sample performance of the proposed precision matrix estimation procedure and the proposed high-dimensional two-sample mean test. Our numerical results indicate that the proposed regularization method effectively removes irrelevant basis matrices. The proposed test performs well compared with existing methods, especially when the elements of the vector have unequal variances. We also illustrate the proposed methodology through an empirical analysis of a real-world data set.
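As a rough illustration of the idea in the abstract, the sketch below builds a Hotelling-type statistic in which the inverse sample covariance is replaced by a precision matrix of the form sum_k theta_k B_k. It is not the speaker's procedure: the function names are hypothetical, the coefficients are fitted by a plain least-squares stand-in (making the structured matrix times the pooled sample covariance close to the identity), no regularization or basis selection is performed, and the centering and scaling needed to calibrate the statistic under the null are not given in the abstract and are therefore omitted.

```python
import numpy as np


def fit_linear_precision(S, bases):
    """Least-squares fit of theta so that (sum_k theta_k * B_k) @ S is close to I.

    Illustrative stand-in only; not the estimator proposed in the talk.
    """
    p = S.shape[0]
    # Column k of the design matrix is the flattened matrix B_k @ S.
    A = np.column_stack([(B @ S).ravel() for B in bases])
    theta, *_ = np.linalg.lstsq(A, np.eye(p).ravel(), rcond=None)
    return theta


def hotelling_type_statistic(X, Y, bases):
    """Hotelling-type statistic using a linearly structured precision matrix."""
    n1, n2 = X.shape[0], Y.shape[0]
    diff = X.mean(axis=0) - Y.mean(axis=0)
    # Pooled sample covariance; it may be singular when p exceeds n1 + n2 - 2.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S = (Xc.T @ Xc + Yc.T @ Yc) / (n1 + n2 - 2)
    theta = fit_linear_precision(S, bases)
    omega = sum(t * B for t, B in zip(theta, bases))
    stat = n1 * n2 / (n1 + n2) * (diff @ omega @ diff)
    return stat, theta


# Assumed toy setting: dimension larger than the total sample size, with a
# tridiagonal true precision matrix 1.0 * I + 0.4 * B1.
rng = np.random.default_rng(0)
p, n1, n2 = 50, 20, 20
identity = np.eye(p)
off_diag = np.eye(p, k=1) + np.eye(p, k=-1)
sigma = np.linalg.inv(identity + 0.4 * off_diag)
X = rng.multivariate_normal(np.zeros(p), sigma, size=n1)
Y = rng.multivariate_normal(np.full(p, 0.3), sigma, size=n2)
stat, theta = hotelling_type_statistic(X, Y, [identity, off_diag])
print("statistic:", stat, "fitted coefficients:", theta)
```

Because only a handful of coefficients are estimated rather than a full p-by-p inverse, the statistic remains computable even when the pooled sample covariance is singular.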
Speaker
Prof. Li’s research interests include variable selection and feature screening for high-dimensional data, nonparametric and semiparametric modeling, and their application to social behavior science research. He is also interested in longitudinal data analysis and survival data analysis and their application to biomedical data analysis. He has been the Eberly Family Chair Professor of Statistics since 2018. He received an NSF CAREER Award in 2004. He is a fellow of the IMS, ASA and AAAS. He was a co-editor of the Annals of Statistics, and has served as an associate editor of the Annals of Statistics and Statistica Sinica. He currently serves as an associate editor of JASA and the Journal of Multivariate Analysis.