
Algorithm-Hardware Co-design of Robotic Vision-Language-Action Models

Project Description

Recent waves of advances in Artificial Intelligence (AI), from discriminative to generative and agentic AI, are expected to be followed by a wave of physical AI, in which intelligence is embodied within systems that interact intimately with the physical world. In particular, the emergence of vision-language-action (VLA) foundation models for robotics promises to enable a wide range of capabilities in robotic agents while benefiting from the world knowledge embedded within these models. However, the computational demands of these models (with billions to hundreds of billions of parameters) far outstrip the capabilities of power-constrained robotic processing platforms. Further, the design of current AI hardware is largely driven by data centers, where the optimization metric (the ratio of throughput to total cost of ownership) differs markedly from the demands of real-time operation under energy constraints. This project explores a first-of-its-kind algorithm-hardware co-design approach to energy-efficient, real-time, end-to-end robotic control. We will characterize the computational demands of robotic foundation models, project their scaling trends, and identify algorithmic knobs to modulate computational effort, culminating in the creation of efficient, hardware-friendly models. We will also address the unique challenges posed by the need for real-time operation through latency-driven model compression and AI accelerator design. The research will evaluate these methods on robotic tasks such as closed-loop manipulation, real-time navigation, and multi-agent coordination, where latency and control stability are critical.
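To make the idea of latency-driven model compression concrete, the following is a minimal, hypothetical sketch (the function names, the per-MAC cost constant, and the latency model are illustrative assumptions, not part of the proposal): weights are pruned by magnitude, layer by layer, raising the pruning ratio until a crude latency proxy falls under a target budget. A real co-design flow would replace the proxy with measured profiles of the target accelerator.

```python
import numpy as np

def estimate_latency_ms(weights, ms_per_mac=1e-6):
    """Crude latency proxy: number of nonzero multiply-accumulates
    times an assumed per-MAC cost. Hardware profiles would replace this."""
    macs = sum(int(np.count_nonzero(w)) for w in weights)
    return macs * ms_per_mac

def prune_to_latency(weights, budget_ms, step=0.05):
    """Zero out the smallest-magnitude weights in each layer, increasing
    the global pruning ratio until the latency proxy meets the budget."""
    ratio = 0.0
    pruned = [w.copy() for w in weights]
    while estimate_latency_ms(pruned) > budget_ms and ratio < 1.0:
        ratio = min(1.0, ratio + step)
        for i, w in enumerate(weights):
            thresh = np.quantile(np.abs(w), ratio)
            pruned[i] = np.where(np.abs(w) >= thresh, w, 0.0)
    return pruned, ratio

# Toy example: two random "layers" pruned to fit a 0.4 ms budget.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((512, 512)), rng.standard_normal((512, 1024))]
pruned, ratio = prune_to_latency(layers, budget_ms=0.4)
```

The same loop structure applies to other algorithmic knobs the project mentions (e.g., quantization bit-width or token reduction in a VLA backbone): pick a knob, tighten it stepwise, and stop once the latency model says the real-time budget is met.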

Start Date

Fall 2026

Postdoc Qualifications

PhD in ECE, CSE, or CS, with a strong background and research track record in one of the following: (i) design of AI models for robotics or other physical AI applications, (ii) optimization of AI models for resource-constrained hardware, (iii) hardware architectures for AI. Publications in top-tier venues, strong hands-on skills with machine learning frameworks, and a broad understanding of computer systems.

Co-advisors

Anand Raghunathan, Silicon Valley Professor of Electrical and Computer Engineering
raghunathan@purdue.edu
https://engineering.purdue.edu/ISL

Aniket Bera, Associate Professor of Computer Science
aniketbera@purdue.edu
https://ideas.cs.purdue.edu/

Bibliography

  1. A. Brohan et al., RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control, https://arxiv.org/abs/2307.15818
  2. K. Black et al., π_0: A Vision-Language-Action Flow Model for General Robot Control, https://arxiv.org/abs/2410.24164
  3. S. Jain et al., Neural Network Accelerator Design with Resistive Crossbars: Opportunities and Challenges, IBM J. Res. Dev. 63(6): 10:1-10:13 (2019)
  4. Cheng et al., EfficientEQA: An Efficient Approach to Open Vocabulary Embodied Question Answering for Robotic Assistants, IROS 2025, https://arxiv.org/pdf/2410.20263