Archives of News Items
[August 2017] I am teaching the graduate-level class on "Fault-tolerant Computer System Design" as a cross-listed class between ECE and CS. [ Details ]
[June 2017] IFIP Working Group 10.4 on Dependable Computing and Fault Tolerance had a workshop on “Dependable News” in Longmont, Colorado. Here is all our material from that workshop, including an interactive survey on what you find check-worthy in political statements. [ Presentation (Powerpoint) ] [ Quiz ] [ Survey link ] [ Answers for survey ] [ Paper ]
[February 2017] Our paper on security in bare-metal embedded systems is accepted to Oakland '17.
"Protecting Bare-metal Embedded Systems with Privilege Overlays" Abraham A Clements, Naif Saleh Almakhdhub, Khaled Saab, Prashast Srivastava, Jinkyu Koo, Saurabh Bagchi, and Mathias Payer. In Proceedings of the IEEE International Symposium on Security and Privacy (Oakland), 2017. (Acceptance rate: 60/450 = 13.3%) [ Paper ] [ Slides ] [ Video ]
[October 2016] I am honored to have been elected to the IEEE Computer Society Board of Governors, for a 3-year term starting 2017. Here is the Purdue news story. [ html ]
[October 2016] We received the VURI award from AT&T Labs Research, for our collaborative work on diagnosis and repair of QoS issues in their multi-tenant cloud environments. Here is the Purdue news story. [ html ]
[Summer 2016] A paper in Eurosys (our work with AT&T Labs on data recovery in data centers) [ pdf ], one in ICS (our work on a domain specific language for computational genomics) [ pdf ], and two papers in SRDS - one on our joint work with Georgia Tech and AT&T Labs on predictive analytics in cellular networks [ pdf ] and the other on machine learning for detecting silent data corruptions in parallel programs [ pdf ].
[August 2016] I am teaching the graduate-level course titled "Fault-tolerant Computer System Design" which is cross-listed in CS (CS 590FTC) and ECE (ECE 69500). This course is suitable for any starting graduate student in CS or Computer Engineering. [ Course web page ]
[November 18, 2015] We organized a Birds-of-a-Feather session at the Supercomputing conference in Austin, TX on the topic of our recent NSF project on the open data repository for system usage and failure data. The other organizers were Carol Song (Purdue), Ravi Iyer and Zbigniew Kalbarczyk (UIUC), and Nathan DeBardeleben (Los Alamos National Lab). I have now posted a report for this BoF session. [ html ]
[October 2015] My keynote talk at the 34th IEEE Symposium on Reliable Distributed Systems (SRDS) on September 28, 2015 was titled “Dependability in a Connected World: Research in Action in the Operational Trenches”. The slides and the audio recording (in Webex format) are now available. [ Slides ] [ Audio recording ]
[October 8, 2015] I gave a talk to the incoming ECE undergraduate sophomores about why they should all become Computer Engineers. Here is the presentation and the recording for it. [ Powerpoint ] [ MPEG recording ]
[September 10, 2015] Our paper wins the best paper award at ACM BCB, being held at Georgia Tech September 10-12. 48 papers were accepted out of 141 submissions. The paper is on developing a distributed classifier for micro RNAs affecting gene expression. Here is the paper and here is the picture. [ pdf ] [ pic ]
[August, 2015] In Fall, I am teaching the graduate course on fault-tolerant computer system design - ECE 695B/CS 590FTC. If you are a graduate student in ECE or CS looking for a hands-on project-based course that also satisfies your POS requirement, consider taking this course. Here is the web page for the course. [ Web page for class ]
[June 2, 2015] We are looking for a research scientist with a background in simulation and optimization to start on our recent Purdue-GE Center. [ Announcement (pdf) ]
[April 13, 2015] We are looking for a post-doctoral researcher in practical distributed systems to start on a recently funded NSF project, from July 15, 2015. [ Announcement ]
[February 2015] Our paper on software-only exact record and replay for embedded wireless nodes is accepted in IPSN. This shows how one can use a software-only approach to record all sources of non-determinism, compress them in a domain-specific manner, and create a trace that allows for perfect record and replay. [ pdf ]
[February 2015] We get a Google Faculty Award for our work on detection of spurious messages on social networking sites. This award is joint with Prof. Alex Quinn, also of ECE. Here are the details. [ html ]
[January 2015] Our paper is accepted to appear in Communications of the ACM (CACM). It provides an overview and open research problems in debugging in the very large. It is co-authored with colleagues from LLNL, Ohio State, and of course Purdue. [ pdf ]
[March 11, 2015] Here is our presentation to Duke Energy given at the Energy Center at Purdue. [ pptx ]
[Aug 15, 2014] We are looking for two graduate research assistants, starting this Fall, for an NSF-funded and a Department of Energy-funded project. [ Announcement ] Update: These positions have been filled.
[Aug 2014] I am teaching an exciting, practical computer systems course titled "Fault-tolerant Computer System Design" this Fall. Come and check it out. [ Course web page ]
[June 2014] We have two proposals to the National Science Foundation accepted. The first is from the Computational Research Infrastructure (CRI) program and is titled "Computer System Failure Data Repository to Enable Data-Driven Dependability Research" (jointly with Carol Song of Purdue). The second is from the NeTS program and is titled "Tango: Performance and Fault Management in Cellular Networks through Device-Network Cooperation" (jointly with Alan Qi at Purdue, Mostafa Ammar at Georgia Tech, and Kaustubh Joshi and Rajesh Panta at AT&T Labs).
[June 2014] Our paper to Supercomputing on parallelizing genome alignment is accepted. This is work done under our NSF XPS award from September 2013. The paper is here. [ pdf ] And the open source release is here. [ html ]
[Feb 2014] We have papers accepted at PLDI (debugging hangs and performance slowdowns in parallel programs), DSN (zero-day SQL injection attack detection), and TPDS (performance debugging in large-scale programs, an extension of our PACT '12 paper). All these papers are available from our group's publications page. [ html ]
Update: We have released an open source implementation of our PLDI paper system. It also includes the implementation of our earlier PACT '12 paper, integrated into one system, which we call AutomaDeD. The github repository is at: https://github.com/scalability-llnl/AutomaDeD
[November 20, 2013] I am honored to be named an ACM Distinguished Scientist. ACM named 40 Distinguished Members this year. My citation reads "for laying the software basis for designing distributed systems that can tolerate faults under a variety of operating conditions." The ACM news story can be found at the following URL. [ html ]
[September 6, 2013] We are looking for a post-doctoral research associate with systems building skills in distributed systems. [ html ] This position has been filled.
[August 2013] DCSL gets two new research awards - one from the National Science Foundation for making computational genomics applications scalable and the second from Northrop Grumman for making their distributed systems that are used for military mission planning secure. [ NSF grant ] [ Northrop Grumman grant ]
[April 8, 2013] The powers-that-be (aka Board of Trustees) entrust me with the awesome responsibility of being a full professor. [ Purdue news story ] [ ECE news story ]
[Dec 6, 2012] I have been selected as an IMPACT Faculty Fellow at Purdue for 2013-14. [ html ]
[July 15, 2013] My "Writings" page gets a facelift with tales of two recent travels. [ html ]
[June 2012] Our work on reliability with scientific applications from Lawrence Livermore National Lab gets some press time. "Nuclear weapon simulations" is the headline-grabbing moniker for the work, though we work with unclassified benchmark applications. [Purdue news story] [Campus newspaper news story]
[February 6, 2012] The non-linear version of my bio gets an update. [html]
[Mar 2012] Three papers of ours get into DSN this year.
[February 6, 2012] We are hiring another post-doc for the Missile Defense Agency project with immediate availability. The skills needed are computer and communications modeling and distributed algorithms (systems-focused, not theory-focused). Due to the restriction from the sponsor, the position is only open to US citizens or permanent residents. See the posting for the details. [pdf] [html]
[November 2, 2011] Our paper in Sensys wins the best paper award. The paper is: Aveksha: A Hardware-Software Approach for Non-intrusive Tracing and Profiling of Wireless Embedded Systems by Matthew Tancreti, Mohammad Sajjad Hossain, Saurabh Bagchi, and Vijay Raghunathan. Here is the final version of the paper. [pdf] And here is the final presentation. [pdf]
[September 2011] I am spending my sabbatical at IBM Research in Austin working in the Future Systems Department on an exciting new project on the intersection of mobile computing and IBM's cloud offerings. Time to make an honest living! [IBM Austin Research Lab]
[June 2011] I will be headed to DSN in Hong Kong June 26-30. There I am organizing a panel with the ambitious title of “10 grand challenges in dependability for the next decade”. It will be on June 29 with an exciting set of 5 speakers, drawn from both industry and academia. See the links to the position statements in the DSN program. [DSN program]
[May 2011] I visited IBM Research in Austin and gave a presentation on our work on debugging software bugs in large-scale systems. The presentation is available from the presentations page of our research group. [html]
[April 2011] New post-doctoral position opening in DCSL for a DoD project. If you are a PhD graduate in Computer Science or Computer Engineering, have a strong research record in computer modeling and cyber security, are a US citizen or a permanent resident, take a look. [Announcement in pdf] [Announcement in html]
[November 2010] I have been appointed an Assistant Director of CERIAS, the university’s security research and education center. I will be one of 4 Assistant Directors. You can find information about CERIAS at the following URL. [html]
[August 2010] We have been awarded a 3-year project by the Missile Defense Agency (MDA) of the Department of Defense titled “Agent-based Enhanced C2BMC Architecture”. C2BMC stands for Command and Control, Battle Management, and Communications. The project is aimed at developing architectures for command and control of missile defense systems as part of the Enhanced C2BMC Program of MDA. It involves Dan DeLaurentis of Aeronautics and Astronautics Engineering at Purdue as the PI and me as the co-PI. Here is a news story that Purdue has just done on our grant. [Story from Purdue] [Story in Purdue Exponent, the campus newspaper]
3. [November 2010] I will be teaching an experimental graduate-level course in Spring 2011, titled “Fault-Tolerant Computer System Design”. This is a project-based course in which students do a semester-long project, which in the past has often led to publications. You can find details of the course from the following URL, which is from my last offering of the course. The class will meet M W F 4.30-5.20. [html]
4. [June 2010] Our work on debugging of large-scale parallel programs for nuclear weapon simulation, done with colleagues at Lawrence Livermore National Lab (LLNL), is in the news. [Story 1 (from Purdue)] [Story 2 (ACM Tech News)] [Story 3 (HPC Wire)] …
5. [April 2010] I have been awarded the Eta Kappa Nu (HKN) “Outstanding Professor Award” for 2010. The award was presented at the Spring Banquet and is given to one faculty member from ECE.
6. [February 28, 2010] Our paper on automatic error detection in large-scale parallel programs, done with colleagues at Lawrence Livermore National Lab (LLNL), is accepted to appear at DSN (Dependable Systems and Networks), to be held June 28-July 1, 2010.
Greg Bronevetsky, Ignacio Laguna, Saurabh Bagchi, Bronis R. de Supinski, Dong H. Ahn, and Martin Schulz, “AutomaDeD: Automata-Based Debugging for Dissimilar Parallel Tasks”.
7. [October 31, 2009] I am participating in the NSF NEES project as a Purdue co-PI. This is the largest grant ever won at Purdue. The National Science Foundation created the George E. Brown, Jr. Network for Earthquake Engineering Simulation (NEES) to improve our understanding of earthquakes and their effects. NEES is a shared national network of 14 experimental facilities, collaborative tools, a centralized data repository, and earthquake simulation software. Together, these resources provide the means for collaboration and discovery in the form of more advanced research based on experimentation and computational simulations of the ways buildings, bridges, utility systems, coastal regions, and geomaterials perform during seismic events.
Purdue will be running the operations of NEES for 2009-2014. My role in this is that of the cybersecurity officer responsible for the security of NEES assets at NEEScomm (the headquarters, here at Purdue) as well as at the 14 sites throughout the US. You can find the press release upon the announcement of the NEES award and the announcement of the NEES team at the following URLs:
8. [August 11, 2009] We are looking for two PhD students to join our lab as Research Assistants. [html]
[Oct 1, 09] These positions have been filled.
2, 2009] I have been selected to be the Program Committee Chair for DSN, our premier conference, for its 2011 edition.
10. [July 2009] Our papers have been accepted in Sensys, Supercomputing (nominated as a candidate for best student paper), Middleware, and MASS. Hey, this feels pretty good for all of us here.
[May 19, 2009] I have been awarded the Teaching for Tomorrow award.
12. [May 1, 2009] I have updated the picture album. [html]
13. [April 30, 2009] I have become a Senior Member of ACM and a Full Member of the Sigma Xi research society.
14. December 22, 2008: Two of our papers have been accepted for Infocom ’09. The acceptance rate was 282/1435 = 19.7%.
17. December 2008: I will be teaching a new graduate level course in Spring 09 - ECE 695B titled “Design of Fault Tolerant Computer Systems”. This course updates a previous course ECE 572 that I last taught in Spring 07. ECE 695B meets Tue and Thu 1.30-2.45 in EE 115. You can find the syllabus at the course web page here [html].
18. November 26, 2008: I will be organizing a workshop on intrusion-tolerant systems called WRAITS, to be held with DSN 09 in Lisbon, Portugal on Jun 29-Jul 2, 2009. My co-organizers are Miguel Correia (U. of Lisbon) and Partha Pal (BBN Technologies). The workshop is called “Recent Advances on Intrusion-Tolerant Systems”. Here is the web site of the previous conferences. [html]
19. November 11, 2008: Our patent filed with Avaya colleagues titled “Stateful and cross-protocol intrusion detection for voice over IP” has been approved. It was filed on September 30, 2004. Here is the patent on the USPTO site. [html]
20. November 9, 2008: Our long-delayed paper detailing in a comprehensive way our work on intrusion detection in Voice-over-IP environments has been accepted for publication in the Elsevier International Journal of Information Security.
This paper tackles the problem of picking optimal responses when an intrusion is detected in a distributed system. We show that picking the optimal responses is NP-hard and present an approximate algorithm, based on a genetic algorithm, to pick the best responses.
This paper presents a technique to determine the choice and placement of intrusion detectors among different services in a distributed system. It is not cost-effective to deploy them everywhere (maintenance and performance costs, plus the tedium of responding to alerts), and here we provide a Bayesian network-based technique that will (approximately) optimize the overall quality of detection in the system.
31. This paper presents a class of attacks called stealthy packet dropping in wireless multi-hop networks that no existing technique can detect. This class has the added property that local overhearing-based techniques will cause a legitimate node to be accused. We show how, by maintaining a little additional information during route setup, this class of attacks can be detected through local overhearing.
32. April 14, 2008: My promotion to the position of an Associate Professor with tenure is approved by the highest powers (aka the Board of Trustees) and will take effect from Fall 08. The process has a few different rungs starting with the departmental meeting in early Fall (07), followed within a couple of weeks by the meeting at the College of Engineering level, and then at the University level in mid-Spring (08).
35. June 28, 2007: I am invited to give a talk at the summer meeting of the IFIP Working Group 10.4 on Dependable Computing and Fault Tolerance. It was held near Edinburgh, Scotland. I gave a talk on the security implications of covert timing channels in networked systems (slides).
36. June 25, 2007: I present our work on security in sleep-wake aware wireless networks at DSN (Dependable Systems and Networks) held in the Edinburgh International Conference Center (slides). I also gave a short talk on data modeling in sensor networks to suppress communication (slides).
37. June 11, 2007: The second Ph.D. student from DCSL, Gunjan Khanna, defended his thesis. His thesis is titled “Non Intrusive Detection and Diagnosis of Failures in High Throughput Distributed Systems”. Here are the documents – Abstract, Thesis, and Presentation. Gunjan has gone to work for McKinsey in Pittsburgh.
38. Mar 15, 2007: Our papers are accepted for DSN and HPDC. The DSN paper is on providing security in multi-hop wireless networks that use sleep-wake scheduling and it was one of 24 papers accepted out of a total of 94 submitted to the PDS track. The HPDC paper is on failure-aware checkpointing in non-dedicated storage systems. The acceptance rate at HPDC was 20%. The conferences will be in June in Edinburgh and Monterey Bay, CA respectively.
41. Dec 14, 2006: The first PhD student from our research group, Issa Khalil, graduated. His thesis topic was “Mitigation of Control and Data Traffic Attacks in Wireless Ad-Hoc and Sensor Networks”. You can read his thesis here.
42. Nov 14, 2006: Our paper on sensor network reprogramming gets accepted into Infocom to be held in Anchorage, Alaska in May 2007. The authors are Rajesh Panta and Issa Khalil. 252 out of about 1400 papers were accepted giving an acceptance rate of about 18%. Another paper from our group on diagnosis in distributed systems was accepted as a short paper for Infocom.