The Dependable Computing Systems Laboratory (DCSL) at Purdue University investigates how to build dependable, heterogeneous, large-scale distributed systems.
“Dependability meets Data Analytics, and at Large Scales”
The above sums up our current research direction. We build software systems that continue to function in the face of natural and malicious failures. We apply and adapt data analytic techniques so that they cope with the noise of computer systems and work at large system scales. Current application domains include distributed software systems, embedded systems, cellular systems, and bioinformatics.
Since many business-critical and life-critical functions are performed by distributed systems, these systems need to be dependable while meeting their performance goals. Thus, there is a need for smart error detection, diagnosis, and recovery protocols. Since many of these systems operate on vast amounts of data, and the patterns of errors or normal operation are approximate and noisy, we have to adapt leading-edge machine learning tools to these systems problems. There is also a need for architectures that combine dependability and security without significantly degrading performance, and that do so adaptively, adjusting to different user requirements and different runtime environments. This is our mission at DCSL.
Our application contexts come from various domains, many from our industrial colleagues. These include: security-critical enterprise systems (with the Missile Defense Agency, Northrop Grumman, and Lockheed Martin), mobile and cloud platforms (in collaboration with AT&T and IBM), large-scale scientific clusters and applications (in collaboration with Lawrence Livermore National Lab and Argonne National Lab), and cyber-physical systems (in collaboration with GE Global Research Center and Sandia).
DCSL is the founding lab within the Purdue College of Engineering Center for Resilient Infrastructures, Systems, and Processes (CRISP). DCSL is the co-lead in the WHIN consortium, leading the thrust on “IoT Systems and Networking”.
- Mar 2020: Fresco, the largest repository of computer system usage and failure data, with logs from Purdue, UIUC, and UT Austin, now has a paper at DSN 2020. [ WWW ]
- Feb 2020: Our work on streaming apps (yes, you read that right, not streaming media) to mobile devices to mitigate the storage crunch gets noticed in the popular press. This is the subject of our upcoming EWSN paper. [ Paper ] [ Popular Press ] [ Purdue News Story ]
- Dec 2019: Two papers from our embedded systems security project are accepted, at USENIX Security and at NDSS (pending minor revision). Congratulations to Abe and Naif for leading the charge on these two papers. [ WWW ]
- Dec 2019: DCSL awards are given out at the end-of-the-semester function. Award winners are:
  - Group Champ: Heng Zhang. Citation: “For making mobile magic”
  - Best Fresher: Atul Sharma. Citation: “For making ML meaningful”
- Aug 2019: Three new projects start at DCSL.
  - [Sandia National Lab] Emulation and security testing of embedded firmware, 2019-2020.
  - [Northrop Grumman Corporation] Secure, Real-Time Decision-Making for the Autonomous Battlefield, 2019-2020. Joint with David Inouye (Purdue ECE).
  - [Northrop Grumman Corporation] A Privacy-Preserving Predictive Modeling Architecture for Edge Computing, 2019-2020. Joint with Christopher Brinton (Purdue ECE).
- May-Aug 2019: Our work with the Department of Energy/Lawrence Livermore National Lab (DOE/LLNL) on approximating scientific computation by reducing the precision of floating-point variables gets accepted into ICS (International Conference on Supercomputing). The acceptance rate was 45/193 = 23.3%. The work shows how to quickly find which variables can have their precision reduced, and by how much, while bounding the accuracy loss. [ PDF ] [ WWW ]
A related piece of work won the best paper award at the International Supercomputing Conference (ISC), held in Frankfurt, Germany, June 16-20, 2019. [ HPCWire news story ]