Resource Efficient Large Scale ML: Plan Before You Run

As ML on structured data becomes prevalent across enterprises, improving resource efficiency is crucial to lowering costs and energy consumption. Designing systems for learning on structured data is challenging because of the large number of models, parameters, and data access patterns involved. We identify that current systems are bottlenecked by data movement, which results in poor resource utilization and inefficient training.


Fair and Optimal Prediction via Post-Processing

To mitigate the bias exhibited by machine learning models, fairness criteria can be integrated into the training process to ensure fair treatment across all demographics, but this often comes at the expense of model performance. Understanding such tradeoffs, therefore, underlies the design of optimal and fair algorithms. In this talk, I will first discuss our recent work on characterizing the inherent tradeoff between fairness and accuracy in both classification and regression problems, where we show that the cost of fairness can be characterized by the optimal value of a Wasserstein-barycenter problem. Then I will show that the complexity of learning the optimal fair predictor is the same as that of learning the Bayes predictor, and present a post-processing algorithm, based on the solution to the Wasserstein-barycenter problem, that derives the optimal fair predictors from Bayes score functions.
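
The post-processing idea mentioned in the abstract can be illustrated with a small sketch. In one dimension, the Wasserstein-2 barycenter of the group-conditional score distributions has a quantile function equal to the weighted average of the groups' quantile functions, so each group's scores can be transported onto the barycenter by composing the group's empirical CDF with that averaged quantile function. The snippet below is a minimal sketch assuming demographic parity as the fairness criterion and scalar scores standing in for the Bayes score functions; the function name fair_post_process, the quantile grid, and the synthetic data are illustrative and not the talk's implementation.

```python
import numpy as np

def fair_post_process(scores, groups, grid_size=101):
    """Illustrative sketch: map per-group scores onto the 1-D
    Wasserstein-2 barycenter of the group-conditional distributions
    via quantile averaging (hypothetical helper, not the talk's code)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()

    # Common grid of quantile levels.
    qs = np.linspace(0.0, 1.0, grid_size)

    # Per-group quantile functions (inverse CDFs) on that grid.
    group_quantiles = {g: np.quantile(scores[groups == g], qs) for g in labels}

    # In 1-D, the W2 barycenter's quantile function is the weighted
    # average of the groups' quantile functions.
    barycenter_q = sum(w * group_quantiles[g] for g, w in zip(labels, weights))

    # Transport each score: compute its rank within its group, then
    # read off the barycenter quantile at that rank.
    adjusted = np.empty_like(scores)
    for g in labels:
        mask = groups == g
        s = scores[mask]
        ranks = np.searchsorted(np.sort(s), s, side="right") / len(s)
        adjusted[mask] = np.interp(ranks, qs, barycenter_q)
    return adjusted

# Example with two groups whose score distributions are shifted.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.beta(2, 5, 500), rng.beta(5, 2, 500)])
groups = np.array([0] * 500 + [1] * 500)
fair_scores = fair_post_process(scores, groups)
# Thresholding fair_scores at any cutoff now yields (approximately)
# equal acceptance rates across the two groups.
```

Because both groups are mapped to the same barycenter distribution, any threshold on the adjusted scores produces (approximately) equal selection rates, which is the sense in which the barycenter construction quantifies what fairness costs in accuracy.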
