Nonconvex High-dimensional Statistical Learning over Networks: Algorithms, Guarantees, and Trade-offs

Interdisciplinary Areas: Data and Engineering Applications, Others

Project Description

There is growing interest in large-scale machine learning and optimization over decentralized networks, e.g., in the context of federated learning. This project targets distributed learning and optimization over meshed networks, where no centralized coordination is present; such computational architectures are better suited to future edge-computing systems. Motivated by big-data applications, we consider so-called high-dimensional statistical models, in which the number of decision variables is of the same order as, or exceeds, the total sample size over the network. The goal is to exploit latent low-dimensional structures in the network data (e.g., sparsity, low rank) to achieve statistical consistency in a resource-efficient manner.

While the statistical-computational guarantees of centralized high-dimensional learning are well studied, our understanding of the network setting is unsatisfactory, even for simple learning tasks: (i) distributed schemes that are designed for, and perform well in, the low-dimensional regime can break down dramatically in the high-dimensional case; and (ii) existing convergence studies fail to provide useful predictions and are in fact contradicted by experiments. This project aims to fill this gap by designing distributed algorithms for high-dimensional (nonconvex) M-estimation problems over meshed networks while revealing trade-offs among statistical error, computational complexity, and communication cost.
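
To make the problem class concrete, the Python sketch below solves a decentralized sparse (LASSO) regression instance over a ring network with a proximal gradient-tracking scheme, in the spirit of the sparse-regression and gradient-tracking references in the bibliography below. It is an illustration only, not the project's proposed method: the synthetic data, the ring topology with lazy Metropolis weights, and all parameter values (step size, regularization, iteration count) are hypothetical choices made for demonstration.

import numpy as np

# Illustrative sketch (not the project's proposed method): decentralized sparse
# regression (LASSO) over a ring of agents via proximal gradient tracking.
# Global problem: min_x (1/n) * sum_i f_i(x) + lam * ||x||_1, with local losses
# f_i(x) = ||A_i x - b_i||^2 / (2 m). All sizes, the network, and all parameter
# values below are hypothetical choices for demonstration.

rng = np.random.default_rng(0)
n, m, p, s = 10, 10, 200, 5   # 10 agents, 10 samples each (N = 100), p = 200 > N, 5 nonzeros
x_true = np.zeros(p)
x_true[rng.choice(p, s, replace=False)] = rng.standard_normal(s)
A = [rng.standard_normal((m, p)) for _ in range(n)]           # local design matrices
b = [Ai @ x_true + 0.1 * rng.standard_normal(m) for Ai in A]  # local noisy measurements

# Doubly stochastic mixing matrix for a ring graph (lazy Metropolis weights).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

def grad(i, x):
    # Gradient of the local least-squares loss at agent i.
    return A[i].T @ (A[i] @ x - b[i]) / m

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

lam, step, iters = 0.05, 0.01, 3000
X = np.zeros((n, p))                             # local copies of the decision variable
Y = np.array([grad(i, X[i]) for i in range(n)])  # local gradient trackers

for _ in range(iters):
    # Consensus + proximal gradient step using the tracked gradient direction.
    X_new = np.array([soft_threshold(W[i] @ X - step * Y[i], step * lam) for i in range(n)])
    # Gradient-tracking update: mix trackers, then correct with local gradient differences.
    Y = W @ Y + np.array([grad(i, X_new[i]) - grad(i, X[i]) for i in range(n)])
    X = X_new

x_bar = X.mean(axis=0)
print("relative estimation error :", np.linalg.norm(x_bar - x_true) / np.linalg.norm(x_true))
print("max consensus disagreement:", np.abs(X - x_bar).max())

In this high-dimensional regime (p = 200 features versus N = 100 total samples across the network), the two printed quantities, statistical estimation error and consensus disagreement, are representative of the metrics whose trade-offs with communication and computation the project seeks to characterize.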


Start Date

March 2024


Postdoc Qualifications

We are seeking candidates who have a record of scholarship under the rubric of “computational optimization” and (possibly) statistics. Candidates may come from different backgrounds, such as operations research, statistics, electrical engineering, computer science, or a related field. All applicants must have strong mathematical and computing skills. Prior knowledge of stochastic programming is a plus. Applicants should hold a Ph.D. in Mathematics, Operations Research, Statistics, Electrical Engineering, or another relevant field. Applicants are expected to be outstanding, intellectually curious researchers at an early stage of their scholarly careers.


Co-Advisors

Gesualdo Scutari, School of Industrial Engineering, gscutari@purdue.edu

Alex L. Wang, Daniels School of Business, wang5984@purdue.edu


Short Bibliography

- Y. Ji, G. Scutari, Y. Sun, and H. Honnappa, "Distributed (ATC) Gradient Descent for High Dimension Sparse Regression," IEEE Transactions on Information Theory, 69(8), 5253–5276, 2023.

- L. Ding and A. Wang, "Sharpness and well-conditioning of nonsmooth convex formulations in statistical signal recovery," preprint, https://arxiv.org/abs/2307.06873, 2023.

- Y. Sun, A. Daneshmand, and G. Scutari, “Distributed Optimization Based on Gradient-tracking Revisited: Enhancing Convergence Rate via Surrogation,” SIAM J. Optim., 32(4), 354–385, 2022.

- A. Wang and F. Kilinc-Karzan, "Accelerated first-order methods for a class of semidefinite programs," preprint, https://arxiv.org/abs/2206.00224, 2022.

- A. Daneshmand, G. Scutari, and V. Kungurtsev, "Second-Order Guarantees of Distributed Gradient Algorithms," SIAM J. Optim., 30(4), 3029–3068, 2020.