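Mingyao Ai (艾明要) is a Level-2 Professor of Statistics and doctoral supervisor at the School of Mathematical Sciences, Peking University. He also serves as a member of the National Steering Committee for Graduate Education in the Professional Degree of Applied Statistics and head of its training group, Vice President of the Chinese Association for Applied Statistics, Secretary-General of the 11th Council of the Chinese Society of Probability and Statistics of the Chinese Mathematical Society, and Executive Council Member of the Chinese Statistical Society. He is on the editorial boards of four major international SCI journals (Stat Sinica, JSPI, SPL, and Stat), of the Chinese core journals Journal of Systems Science and Mathematical Sciences (《系統科學與數學》), Journal of Applied Statistics and Management (《數理統計與管理》), and Advances in Mathematics (《數學進展》), and of the Statistics and Data Science Series (《統計與數據科學叢書》) published by Science Press. His teaching and research focus on sampling theory and algorithms for big data, design and analysis of experiments, computer simulation experiments and modeling, and applied statistics. He has published more than eighty papers in leading Chinese and international journals, including AOS, JASA, Biometrika, and Science China (《中國科學》). He has led one Key Program project, one Key Program sub-project, and five General Program projects funded by the National Natural Science Foundation of China, and has taken part in completing two projects under the National Key R&D Program of the Ministry of Science and Technology. He is a lead instructor of a Peking University general education core course, has twice been named an outstanding doctoral dissertation supervisor at Peking University, and received the second prize of the Beijing Higher Education Outstanding Teaching Achievement Award.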
Abstract: Subsampling methods are effective techniques for reducing the computational burden of big data while maintaining the efficiency of statistical inference. In this talk, we will review subsampling techniques for efficiently handling different types of big data, covering a range of inferential models, from linear models to generalized linear models and estimating equations, as well as different types of data, from static data to data streams. To handle the case where the full data are stored in different blocks or at multiple locations, a distributed subsampling framework is developed, in which statistics are computed simultaneously on smaller partitions of the full data. Finally, the proposed strategies are illustrated and evaluated through numerical experiments on both simulated and real data sets.
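As a rough illustration of the distributed idea mentioned in the abstract, the sketch below fits a simple linear regression by drawing a uniform subsample from each data block, computing sufficient statistics per block, and combining them. This is only a minimal sketch under stated assumptions, not the speaker's method: uniform sampling stands in for whatever sampling probabilities the talk proposes, the one-predictor model is chosen for brevity, and all function names (`block_stats`, `distributed_subsample_fit`, `make_blocks`) are hypothetical.

```python
import random


def block_stats(block, n_sub, rng):
    """Uniformly subsample one block and return its sufficient statistics.

    Assumption: uniform sampling without replacement; the talk's methods
    may use non-uniform (e.g. optimality-motivated) probabilities instead.
    """
    sample = rng.sample(block, min(n_sub, len(block)))
    n = len(sample)
    sx = sum(x for x, _ in sample)
    sy = sum(y for _, y in sample)
    sxx = sum(x * x for x, _ in sample)
    sxy = sum(x * y for x, y in sample)
    return n, sx, sy, sxx, sxy


def distributed_subsample_fit(blocks, n_sub, seed=0):
    """Combine per-block subsample statistics into one least-squares fit.

    Each call to block_stats depends only on its own block, so in a real
    deployment the calls could run in parallel on separate machines.
    """
    rng = random.Random(seed)
    totals = [0.0] * 5
    for block in blocks:
        for i, s in enumerate(block_stats(block, n_sub, rng)):
            totals[i] += s
    n, sx, sy, sxx, sxy = totals
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def make_blocks(n_blocks=4, n_per_block=5000, intercept=1.0, slope=-2.0, seed=1):
    """Simulate (x, y) pairs split across blocks (hypothetical test data)."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        block = []
        for _ in range(n_per_block):
            x = rng.gauss(0.0, 1.0)
            y = intercept + slope * x + rng.gauss(0.0, 0.1)
            block.append((x, y))
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    blocks = make_blocks()
    print(distributed_subsample_fit(blocks, n_sub=500))
```

Only the five summary numbers per block need to be communicated, so the combined estimate uses 2,000 of the 20,000 simulated points without ever gathering the full data in one place.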