We will examine (1) new frameworks for evaluating and aligning model behavior with human intent, (2) the security and reliability of watermarking techniques in foundation models, including their role in provenance tracking and their vulnerabilities to adversarial removal and evasion, and (3) novel approaches for detecting and mitigating high-risk model outputs before deployment. By synthesizing these findings, we will discuss the broader implications of foundation model security, trade-offs between robustness and control, and future directions for improving AI safety at scale.
Bootstrapping Library-Based Synthesis
Constraint-based program synthesis techniques have been widely used in numerous settings. However, synthesizing programs that use libraries remains a major challenge. To handle complex or black-box libraries, the state of the art is to provide carefully crafted mocks or models of the libraries to the synthesizer, which requires extra manual work. We address this challenge by proposing Toshokan, a new synthesis framework in which library-using programs can be generated without any user-provided artifacts, at the cost of moderate performance overhead.