Collaborative Statistical Learning: Algorithms and Guarantees

Advances in computation, communication, and data storage in recent decades have significantly reduced the cost of data acquisition, leading to an explosion of data generated across interconnected platforms. Apart from the computational difficulties that arise from nonconvex formulations, the sheer volume and spatial dispersion of the data also pose challenges to traditional learning procedures, which typically require centralized training sets. Reaping the dividends of this data deluge therefore calls for collaborative learning methods capable of making inferences from data distributed over a network.
In this talk, we present a novel algorithmic framework, SONATA, and its guarantees for in-network statistical learning, formulated as an empirical risk minimization (ERM) problem. By leveraging local successive convexification and network communication, our algorithm is, for the first time in the literature, able to solve fairly general nonconvex ERM problems over (time-varying directed) networks. It matches the performance of a centralized learning algorithm, in the sense that it converges linearly for strongly convex ERM problems and sublinearly for (non)convex ERM instances. Furthermore, for regularized high-dimensional ERM problems (i.e., models where the parameter dimension is larger than the sample size), SONATA enjoys linear convergence up to the statistical precision of the model, even in the absence of strong convexity. Generalizations of the algorithm to large-scale problems and to the asynchronous setting will also be discussed.
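To give a feel for what solving an ERM problem over a network involves, here is a minimal sketch of plain gradient tracking on a toy problem. It is not SONATA itself: the abstract does not spell out the update rule, and this sketch omits successive convexification entirely. Everything in it is an assumption made for illustration: a static undirected ring with averaging weights of 1/3, a scalar squared-loss risk at each agent, equal local sample sizes, and the names `gradient_tracking`, `local_samples`, `step`, and `iters`.

```python
def gradient_tracking(local_samples, step=0.1, iters=300):
    """Toy decentralized ERM: each agent i minimizes, jointly with the others,
    the average of local risks f_i(x) = (1/(2*m_i)) * sum_j (x - s_ij)^2.
    Assumes at least 3 agents arranged on an undirected ring (illustrative only).
    """
    n = len(local_samples)
    local_means = [sum(s) / len(s) for s in local_samples]

    def grad(i, x):
        # Gradient of agent i's local empirical risk at x.
        return x - local_means[i]

    def mix(v):
        # One round of communication: average with the two ring neighbors.
        return [(v[(i - 1) % n] + v[i] + v[(i + 1) % n]) / 3.0 for i in range(n)]

    x = [0.0] * n                            # local estimates
    y = [grad(i, x[i]) for i in range(n)]    # local trackers of the average gradient
    for _ in range(iters):
        # Consensus step on the estimates, then move along the tracked gradient.
        x_new = [xi - step * yi for xi, yi in zip(mix(x), y)]
        # Update the tracker with the change in the local gradient.
        y = [yi + grad(i, x_new[i]) - grad(i, x[i]) for i, yi in enumerate(mix(y))]
        x = x_new
    return x


# Five agents, two samples each (made-up data).
samples = [[1.0, 2.0], [3.0, 5.0], [-1.0, 0.0], [4.0, 4.0], [2.0, 6.0]]
estimates = gradient_tracking(samples)
pooled_mean = sum(sum(s) for s in samples) / sum(len(s) for s in samples)
```

Because every agent holds the same number of samples, the centralized ERM solution here is the pooled sample mean, so each entry of `estimates` can be compared against `pooled_mean` to check that the agents agree with what a centralized solver would return.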
Ying Sun is a postdoctoral researcher with the School of Industrial Engineering, Purdue University. She received her Ph.D. degree in Electronic and Computer Engineering from the Hong Kong University of Science and Technology in 2016. Her research focuses on computational optimization, statistical learning, and the interplay between them, with an emphasis on decentralized and collaborative inference methods. She is a coauthor of a student best paper at the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP) 2017 and served as a Technical Program Committee member for the IEEE Global Conference on Signal and Information Processing (GlobalSIP) in 2018 and 2019. Her overview article on majorization-minimization algorithms is among the Web of Science highly cited papers in 2018 and 2019.