September 2015

DCSL keynote at SRDS

Saurabh’s keynote talk at the 34th IEEE Symposium on Reliable Distributed Systems (SRDS) on September 28, 2015, was titled “Dependability in a Connected World: Research in Action in the Operational Trenches”. The slides and the audio recording (in Webex format) are now available. [ Slides ] [ Audio recording ]

Open source repository for system usage and failure data

We have released the first version of Fresco, the open source repository for system usage and failure data. This is part of the NSF project “Computer System Failure Data Repository to Enable Data-Driven Dependability” (CNS-1513197). It contains data on about 500K jobs submitted to the Purdue Condor cluster over a 6-month period (Oct 2014 to Mar 2015). Here is the link [ html ].

 

Best Paper in ACM BCB

Our paper has won the best paper award at ACM BCB, held at Georgia Tech on September 10-12. Of 141 submissions, 48 papers were accepted. The paper develops a distributed classifier for microRNAs affecting gene expression; the co-authors are Asish Ghoshal, Ananth Grama, Saurabh Bagchi, and Somali Chaterji. Here is the paper and here is the picture. [ pdf ] [ pic ]

 

DCSL in the news

Our work on analyzing system logs and failure logs of large supercomputing systems is featured in a news story at Purdue. The story covers the part of our work that deals with Purdue’s large centralized computing clusters. Subrata and Suhas are the graduate researchers on the project. Our co-PI at Purdue is Carol Song of ITaP, and the co-PIs at UIUC are Ravi Iyer and Zbigniew Kalbarczyk.

News story URL:

http://www.itap.purdue.edu/newsroom/news/150813_communityclusters_usefailresearch.html