SCALABLE ALGORITHMS FOR STATISTICAL INFERENCE ON LARGE NETWORK DATA
Large-scale networks are increasingly common in disciplines such as epidemiology, social science, and digital health. Their size strains the storage and computational limits of existing statistical inference methods, which are essential tools for data-driven scientific discovery from network data. There is therefore a great need for new statistical inference methods that are (a) theoretically sound, (b) methodologically versatile, i.e., applicable to a wide spectrum of models and inference goals, and (c) computationally feasible even for large-scale networks. To meet this need and address the inherent inter-dependency and complexity of networks, Professor Chen and his lab propose new approaches for subsampling-based inference on large-scale networks: Subsampling with Common Overlap, Predictive Inference, and a hybrid of these two algorithms. The proposed methods will be integrated into inference tasks such as community detection, model fitting, model selection, and hypothesis testing.
Professor Chen and his lab will collaborate with domain experts from various disciplines to apply the new methods to large-scale scientific problems in these fields. This work will also have broader impacts in education through curriculum development and student mentoring. Research results will be disseminated to the scientific community through journal publications, conference presentations, and free open-source software.