When research comes full circle: A missed opportunity and what to learn from it

In this reprise of my keynote address at ACM CCS 2023, I will discuss user-authentication practice on the Internet and the development of the research community’s apathy toward it in the 2000s. While we were focusing on replacing passwords (versus improving their use), industry leaders by the late 2010s were decrying password reuse across accounts as the “No. 1 cause of harm on the Internet” and the cause of “99% of compromised accounts”.

Resource Efficient Large Scale ML: Plan Before You Run

As ML on structured data becomes prevalent across enterprises, improving resource efficiency is crucial to lowering costs and energy consumption. Designing systems for learning on structured data is challenging because of the large number of models, parameters, and data access patterns involved. We identify that current systems are bottlenecked by data movement, which results in poor resource utilization and inefficient training.
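
To make the data-movement point concrete, here is a minimal, hypothetical Python sketch (not from the talk) contrasting two training plans: reading the dataset once per model versus sharing a single pass over the data across many models. The function and variable names (load_batches, train_step) are illustrative assumptions, not part of any described system.

# Hypothetical illustration: when many models are trained on the same
# structured dataset, scheduling them to share one pass over the data
# avoids moving the data once per model.

def load_batches(dataset):
    """Stands in for the expensive part: moving data from storage to compute."""
    for batch in dataset:      # each iteration represents disk/network I/O
        yield batch

def train_step(model, batch):
    """Placeholder for one optimization step of one model on one batch."""
    model["steps"] = model.get("steps", 0) + 1

def one_pass_per_model(models, dataset):
    # Data is moved len(models) times: one full pass per model.
    for model in models:
        for batch in load_batches(dataset):
            train_step(model, batch)

def shared_pass(models, dataset):
    # Data is moved once: every batch is reused by all models before the
    # next batch is fetched, so the cost of moving it is amortized.
    for batch in load_batches(dataset):
        for model in models:
            train_step(model, batch)

if __name__ == "__main__":
    dataset = [list(range(10)) for _ in range(100)]   # toy "structured" data
    models = [{} for _ in range(8)]
    shared_pass(models, dataset)
    print(models[0]["steps"])   # 100 steps per model from a single data pass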

Fair and Optimal Prediction via Post-Processing

To mitigate the bias exhibited by machine learning models, fairness criteria can be integrated into the training process to ensure fair treatment across all demographics, but this often comes at the expense of model performance. Understanding such tradeoffs, therefore, underlies the design of optimal and fair algorithms. In this talk, I will first discuss our recent work on characterizing the inherent tradeoff between fairness and accuracy in both classification and regression problems, where we show that the cost of fairness can be characterized by the optimal value of a Wasserstein-barycenter problem. Then I will show that the complexity of learning the optimal fair predictor is the same as that of learning the Bayes predictor, and present a post-processing algorithm, based on the solution to the Wasserstein-barycenter problem, that derives the optimal fair predictor from Bayes score functions.
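
For readers unfamiliar with the object named above, the display below is one standard way a Wasserstein-barycenter problem is written; the notation (groups $\mathcal{A}$, group weights $w_a$, per-group score distributions $\mu_a$) is shorthand introduced here to gloss the abstract, not the speaker's exact formulation.

\[
\nu^\star \in \arg\min_{\nu} \; \sum_{a \in \mathcal{A}} w_a \, W_p^p(\mu_a, \nu)
\]

Read this way, the optimal value of the problem measures the unavoidable loss in accuracy from enforcing the fairness criterion, and mapping each group's score distribution $\mu_a$ onto the barycenter $\nu^\star$ is the post-processing step that turns Bayes score functions into the fair predictor.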

The Past, Present, and Future of High-Performance Computing

Emphasis on next-generation computing is driven at the national level because it affects scientific discovery, engineering, healthcare, security, and economic competitiveness. Moreover, high-performance computing is playing a greater role in our daily activities, requiring more performance from each system. However, pushing the performance envelope is becoming increasingly challenging – designing a machine that is 5ox faster isn’t as simple as making today’s machines sox larger. The future of high-performance computing will incorporate novel architectural concepts and heterogeneity at both the node and the system level to achieve power and performance goals. These future systems present new challenges and opportunities in how we approach computing.
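
One classical piece of arithmetic behind the "bigger is not proportionally faster" point is Amdahl's law; it is used here only as an illustration and is not taken from the abstract. If a fraction $p$ of the work parallelizes perfectly and the rest is serial, then $N$ times more hardware yields a speedup of

\[
S(N) = \frac{1}{(1 - p) + p/N}
\]

so even with $p = 0.99$, fifty times the resources gives only $S(50) = 1/(0.01 + 0.99/50) \approx 33.6\times$, and in practice power and data-movement limits push the achievable gain lower still, which is part of why heterogeneous designs are being pursued.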

Gradual Verification: Assuring Programs Incrementally

As software becomes more ubiquitous in our everyday lives, so do unintended bugs. In response, static verification techniques were introduced to prove or disprove the absence of bugs in code. Unfortunately, current techniques burden users by requiring them to write inductively complete specifications involving many extraneous details. To overcome this limitation, I introduce the idea of gradual verification, which handles complete, partial, or missing specifications by soundly combining static and dynamic checking. As a result, gradual verification allows users to specify and verify only the properties and components of their system that they care about, and to increase the scope of verification gradually, a workflow that existing tools support poorly.
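
The combination of static and dynamic checking can be pictured with a small, hypothetical Python sketch; gradual verification tools actually target statically verified languages, and nothing below is the speaker's implementation. The idea modeled here is that specification clauses a static verifier has already discharged cost nothing at runtime, while clauses it could not prove, because the specification is partial, fall back to dynamic checks.

# Hypothetical sketch of the gradual-verification idea: statically proven
# obligations incur no runtime cost; unproven obligations from a partial
# specification are checked dynamically so soundness is preserved.

def contract(preconditions):
    """preconditions: list of (predicate, statically_proven) pairs.

    `statically_proven` stands in for the verdict of a static verifier,
    which this sketch does not actually implement.
    """
    def decorate(fn):
        def wrapper(*args, **kwargs):
            for predicate, statically_proven in preconditions:
                if not statically_proven:   # residual obligation
                    assert predicate(*args, **kwargs), \
                        f"dynamic check failed for {fn.__name__}"
            return fn(*args, **kwargs)
        return wrapper
    return decorate

@contract([
    (lambda xs: isinstance(xs, list), True),   # imagine the verifier proved this
    (lambda xs: len(xs) > 0, False),           # not provable from a partial spec
])
def first(xs):
    return xs[0]

print(first([3, 1, 2]))   # the dynamic check passes; prints 3
# first([])               # would fail the unproven precondition at runtime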

Bootstrapping Library-Based Synthesis

Constraint-based program synthesis techniques have been widely used in numerous settings. However, synthesizing programs that use libraries remains a major challenge. To handle complex or black-box libraries, the state of the art is to provide carefully crafted mocks or models to the synthesizer, which requires extra manual work. We address this challenge with Toshokan, a new synthesis framework in which library-using programs can be generated without any user-provided artifacts, at the cost of a moderate performance overhead.
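
The abstract does not spell out how Toshokan avoids hand-written mocks, so the hypothetical Python sketch below shows only the general shape of an enumerate-and-test loop that treats the library as an executable black box and records the input/output behavior it observes, in place of a user-provided model; the names, candidate space, and strategy are assumptions for illustration, not Toshokan's actual algorithm.

# Hypothetical sketch: search for a program that uses a black-box library
# function by running candidates against the real library and logging the
# observed input/output pairs, rather than asking the user for a mock.

def library_fn(x):                 # treated as a black box by the search
    return x * x

CANDIDATES = [                     # a tiny candidate space for illustration
    ("lib(x)", lambda x: library_fn(x)),
    ("lib(x) + 1", lambda x: library_fn(x) + 1),
    ("lib(x + 1)", lambda x: library_fn(x + 1)),
]

def synthesize(spec_examples):
    observed_behavior = {}         # grows as candidates exercise the library
    for name, prog in CANDIDATES:
        if all(prog(x) == y for x, y in spec_examples):
            return name, observed_behavior
        for x, _ in spec_examples: # log what the library did on these inputs
            observed_behavior[x] = library_fn(x)
    return None, observed_behavior

# Goal: a program p with p(2) == 5 and p(3) == 10, i.e. lib(x) + 1.
print(synthesize([(2, 5), (3, 10)]))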

Root Cause Analysis of Failures in Microservices through Causal Discovery

Most cloud applications use a large number of smaller sub-components (called microservices) that interact with each other in the form of a complex graph to provide the overall functionality to the user. While the modularity of the microservice architecture is beneficial for rapid software development, quickly maintaining and debugging such a system when failures occur is challenging.
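
The abstract stops before describing the method, so the hypothetical Python sketch below shows only a generic graph-based root-cause heuristic, not the speakers' causal-discovery algorithm: given a service dependency graph (assumed known here) and a set of services flagged as anomalous, it reports the most upstream anomalous services as root-cause candidates. In the setting of the talk, the graph itself would be learned from observational data via causal discovery rather than assumed.

# Hypothetical sketch of graph-based root-cause localization: among services
# flagged as anomalous, report those with no anomalous upstream dependency,
# since downstream anomalies are likely symptoms rather than causes.

# dependencies[s] = the set of services that s calls (and therefore depends on)
dependencies = {
    "frontend": {"checkout", "catalog"},
    "checkout": {"payments", "database"},
    "catalog": {"database"},
    "payments": set(),
    "database": set(),
}

anomalous = {"frontend", "checkout", "database"}   # e.g. services with latency spikes

def root_cause_candidates(dependencies, anomalous):
    candidates = set()
    for service in anomalous:
        upstream = dependencies.get(service, set())
        # a candidate root cause is an anomalous service none of whose
        # dependencies is itself anomalous
        if not (upstream & anomalous):
            candidates.add(service)
    return candidates

print(root_cause_candidates(dependencies, anomalous))   # {'database'}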
