{"id":2093,"date":"2020-08-12T21:34:02","date_gmt":"2020-08-13T01:34:02","guid":{"rendered":"https:\/\/engineering.purdue.edu\/dcsl\/?page_id=2093"},"modified":"2026-02-21T15:30:57","modified_gmt":"2026-02-21T19:30:57","slug":"fault-tolerance-for-distributed-applications","status":"publish","type":"page","link":"https:\/\/engineering.purdue.edu\/dcsl\/publications-by-year\/fault-tolerance-for-distributed-applications\/","title":{"rendered":"Dependable Distributed Applications"},"content":{"rendered":"<p style=\"text-align: left;\">To view publications by project, click the buttons down below:<\/p>\n<div class=\"btn-group-vertical\" style=\"text-align: left;\" role=\"group\" aria-label=\"Publication by Project\"><a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications-by-year\/\"><button class=\"btn btn-primary\" type=\"button\">All Publications<\/button><\/a><br \/>\n<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications-by-year\/fault-tolerance-for-distributed-applications\/\"><button class=\"btn btn-primary\" type=\"button\">Dependable Distributed Applications<\/button><\/a><br \/>\n<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/resilient-wireless-networks\/\"><button class=\"btn btn-primary\" type=\"button\">Resilient Wireless Networks<\/button><\/a><br \/>\n<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications-by-year\/distributed-secure-systems\/\"><button class=\"btn btn-primary\" type=\"button\">Distributed Secure Systems<\/button><\/a><br \/>\n<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications-by-year\/resilient-ai\/\"><button class=\"btn btn-primary\" type=\"button\">Resilient AI<\/button><\/a><br \/>\n<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications-by-year\/phd-theses\/\"><button class=\"btn btn-primary\" type=\"button\">PhD Theses by DCSL Members<\/button><\/a><br \/>\n<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications-by-year\/miscellaneous\/\"><button class=\"btn btn-primary\" 
type=\"button\">Miscellaneous<\/button><\/a><\/div>\n<div class=\"publications\">\n<h2>2025<\/h2>\n<ol>\n<li><strong>PEARC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2025\/fresco-pearc25.pdf\">FRESCO: A Public Multi-Institutional Dataset for Understanding HPC System Behavior and Dependability<\/a>,&#8221; <span class=\"badge\">Dependable Distributed Applications<\/span><br \/>\nJoshua McKerracher, Preeti Mukherjee; Rajesh Kalyanam (Oak Ridge National Lab); and Saurabh Bagchi. In ACM Practice and Experience in Advanced Research Computing <strong>(PEARC)<\/strong>, pp. 1-6, July 21-24, 2025. (Acceptance rate: 88\/267 = 33.0%)<\/li>\n<li><strong>ACM TKDD<br \/>\n<\/strong><span style=\"font-weight: 400;\">&#8220;<\/span><a href=\"https:\/\/engineering.purdue.edu\/dcsl\/wp-content\/uploads\/2025\/02\/AutoForecast_ACM_TKDD.pdf\"><span style=\"font-weight: 400;\">Evaluation-Free Time-Series Forecasting Model Selection via Meta-Learning<\/span><\/a><span style=\"font-weight: 400;\">.&#8221; <span class=\" authors\"><span class=\"badge badge-success\">Dependable Distributed Applications<\/span><\/span><\/span><br \/>\n<span style=\"font-weight: 400;\">Abdallah, Mustafa, Ryan A. Rossi, Kanak Mahadik, Sungchul Kim, Handong Zhao, and Saurabh Bagchi. ACM Transactions on Knowledge Discovery from Data (TKDD), pp. 1-40, 2025.<\/span><\/li>\n<\/ol>\n<h2>2024<\/h2>\n<ol>\n<li><strong>ECCV<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/www.ecva.net\/papers\/eccv_2024\/papers_ECCV\/papers\/07666.pdf\">RECON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories<\/a>.&#8221; <span class=\" authors\"><span class=\"badge badge-success\">Dependable Distributed Applications<\/span><\/span><br \/>\nLu, Chen-Yi, Shubham Agarwal, Md Mehrab Tanjim, Kanak Mahadik, Anup Rao, Subrata Mitra, Shiv Kumar Saini, Saurabh Bagchi, and Somali Chaterji. 
In European Conference on Computer Vision <b>(ECCV)<\/b>, pp. 288-306, 2024.<\/li>\n<li><strong>ICLR<\/strong><br \/>\n&#8220;<a href=\"https:\/\/openreview.net\/pdf?id=wprSv7ichW\">Benchmarking Algorithms for Federated Domain Generalization<\/a>,&#8221; <span class=\" authors\"><span class=\"badge badge-success\">Dependable Distributed Applications<\/span><\/span><br \/>\nRuqi Bai, Saurabh Bagchi, and David I. Inouye. Accepted to appear at the 12th International Conference on Learning Representations (ICLR), pp. 1-24, Vienna, Austria, May 2024. (Spotlight) (Acceptance rate: Spotlight = 366\/7304 = 5.0%)<\/li>\n<\/ol>\n<h2>2022<\/h2>\n<ol>\n<li><strong>NeurIPS<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2022\/rca-neurips22.pdf\">Root Cause Analysis of Failures in Microservices through Causal Discovery<\/a>,&#8221; <span class=\" authors\"><span class=\"badge badge-success\">Dependable Distributed Applications<\/span><\/span><br \/>\nAzam Ikram; Sarthak Chakraborty, Subrata Mitra, Shiv Saini (Adobe Research); Saurabh Bagchi, and Murat Kocaoglu. At the 36<sup>th<\/sup> Conference on Neural Information Processing Systems (NeurIPS), pp. 31158-31170, November-December 2022. (Acceptance rate: 2,665\/10,411 = 25.6%)<\/li>\n<li><b>CIKM<\/b><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2022\/autoforecast-cikm22.pdf\">AutoForecast: Automatic Time-Series Forecasting Model Selection<\/a>,\u201d <span class=\" authors\"><span class=\"badge badge-success\">Dependable Distributed Applications<\/span><\/span><br \/>\nMustafa Abdallah (Purdue); Ryan Rossi, Kanak Mahadik, Sungchul Kim, Handong Zhao, Haoliang Wang (Adobe Research); Saurabh Bagchi (Purdue). At the 31st ACM International Conference on Information and Knowledge Management (CIKM), pp. 1-10, October 2022. 
(Acceptance rate: 274\/1175 = 23.3%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/software#cikm22\">Dataset<\/a> ]<\/li>\n<li><b>OSDI<\/b><br \/>\n<span class=\" authors\">&#8220;ORION: Optimized Execution Latency for Serverless DAGs,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAshraf Youssef Mahgoub, Edgardo Barsallo Yi; Karthick Shankar (Carnegie Mellon University); Somali Chaterji; Sameh Elnikety (Microsoft Research); Saurabh Bagchi. Accepted to appear at the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201922), pp. 1\u201315, July 2022. (Acceptance rate: 49\/253 = 19.4%) <\/span><\/li>\n<li><b>Sigmetrics<\/b><br \/>\n<span class=\" authors\">&#8220;WISEFUSE: Workload Characterization and Optimized Execution Plans for Serverless DAG Workflows,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAshraf Mahgoub, Edgardo Barsallo Yi; Karthick Shankar (Carnegie Mellon University); Eshaan Minocha, Somali Chaterji; Sameh Elnikety (Microsoft Research); Saurabh Bagchi. Accepted to appear at the 2022 ACM SIGMETRICS conference, pp. 1\u201324, June 2022. (Acceptance rate: 17\/126 = 13.5% (Winter submission cycle)) <\/span><\/li>\n<\/ol>\n<h2>2021<\/h2>\n<ol>\n<li><b>Usenix ATC<\/b><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2021\/sonic_atc21.pdf\">SONIC: Application-aware Data Passing for Chained Serverless Applications<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\n<span class=\" authors\">Ashraf Mahgoub, Karthick Shankar (CMU), Subrata Mitra (Adobe Research), Ana Klimovic (ETH Zurich), Somali Chaterji, and Saurabh Bagchi. At the Usenix Annual Technical Conference (Usenix ATC), pp. 1-15, July 2021. 
(Acceptance rate: 64\/341 = 18.8%)<\/span><\/li>\n<\/ol>\n<h2>2020<\/h2>\n<ol>\n<li><b>ISM<\/b><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2020\/closingtheloop_ism20.pdf\">Closing-the-Loop: A Data-Driven Framework for Effective Video Summarization<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\n<span class=\" authors\">Ran Xu, Haoliang Wang (Adobe Research), Stefano Petrangeli (Adobe Research), Viswanathan Swaminathan (Adobe Research), and Saurabh Bagchi. At the 22nd IEEE International Symposium on Multimedia (ISM), pp. 1&#8211;8, Dec 2020. (Acceptance rate: 16\/55 = 29.1%)<br \/>\n<\/span><\/li>\n<li><span class=\" authors\"><strong>Usenix ATC<\/strong><\/span><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2020\/optimuscloud_atc20_cameraready.pdf\">OptimusCloud: Heterogeneous Configuration Optimization for Distributed Databases in the Cloud<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\n<span class=\" authors\">Ashraf Mahgoub, Alexander Michaelson Medoff, Rakesh Kumar <span class=\"affl\">(Microsoft)<\/span>, Subrata Mitra <span class=\"affl\">(Adobe Research)<\/span>, Ana Klimovic <span class=\"affl\">(Google Research)<\/span>, Somali Chaterji, and Saurabh Bagchi. At the Usenix Annual Technical Conference (Usenix ATC), pp. 189-204, July 2020. 
(Acceptance rate: 65\/348 = 18.7%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2020\/optimuscloud_atc20.pdf\">Presentation<\/a> ] [ <a href=\"https:\/\/youtu.be\/UV0d5rlN8w0\">Video<\/a> ]<\/span><\/li>\n<li><strong>DSN<\/strong><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2020\/fresco_dsn20_cameraready.pdf\">The Mystery of the Failing Jobs: Insights from Operational Data from Two University-Wide Computing Systems<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nRakesh Kumar, Saurabh Jha (University of Illinois at Urbana-Champaign), Ashraf Mahgoub, Rajesh Kalyanam, Stephen L Harrell, Xiaohui Carol Song, Zbigniew Kalbarczyk (University of Illinois at Urbana-Champaign), William T Kramer (University of Illinois at Urbana-Champaign), Ravishankar K. Iyer (University of Illinois at Urbana-Champaign), and Saurabh Bagchi. At the 50th IEEE\/IFIP International Conference on Dependable Systems and Networks (DSN) , pp. 158\u2013171, June-July 2020. (Acceptance rate: 48\/291 = 16.5%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2020\/dsn20_mystery_of_failing_jobs.pdf\">Presentation<\/a> ] [ <a href=\"https:\/\/drive.google.com\/file\/d\/1ElUFC_IKY8UqqW7bqCT8XhmUJvKOcv0a\/view?usp=sharing\">Video<\/a> ]<\/li>\n<li><strong>OJCS<\/strong><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2020\/vision-paper-cyber-resilience_ojcs_2020.pdf\">Vision Paper: Grand Challenges in Resilience: Autonomous System Resilience through Design and Runtime Measures<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nSaurabh Bagchi, Vaneet Aggarwal, Somali Chaterji, Fred Douglis, Aly El Gamal, Jiawei Han, Brian J. Henz, Hank Hoffmann, Suman Jana, Milind Kulkarni, Felix Xiaozhu Lin, Karen Marais, Prateek Mittal, Shaoshuai Mou, Xiaokang Qiu, and Gesualdo Scutari. 
In IEEE Open Journal of the Computer Society (OJCS), pp. 1-15, 2020, doi: 10.1109\/OJCS.2020.3006807.<\/li>\n<\/ol>\n<h2>2019<\/h2>\n<ol>\n<li><strong>CoNLL<\/strong><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2019\/similarity_nlp_conll19_cameraready.pdf\">SIMVECS: Similarity-based Vectors for Utterance Representation in Conversational AI Systems<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAshraf Mahgoub, Youssef Shahin (Microsoft), Riham Mansour (Microsoft), and Saurabh Bagchi. At the SIGNLL Conference on Computational Natural Language Learning (CoNLL), pp. 1-10, Nov 3-4, 2019, Hong Kong. (Acceptance rate: 97\/428 = 22.7%)<\/li>\n<li><strong>Usenix ATC<\/strong><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2019\/online-nosql-tuning_usenixatc19_cameraready.pdf\">SOPHIA: Online Reconfiguration of Clustered NoSQL Databases for Time-Varying Workloads<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAshraf Mahgoub, Paul Wood, Alexander Medoff, Subrata Mitra (Adobe Research), Folker Meyer (Argonne National Lab), Somali Chaterji, and Saurabh Bagchi. At the 2019 USENIX Annual Technical Conference (Usenix ATC), pp. 223-240, Jul 10-12, 2019, Renton, WA. 
(Acceptance rate: 71\/356 = 19.9%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2019\/sophia_usenixatc19.ppsx\">Presentation<\/a> ] [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2019\/sophia_usenixatc19_lightning_talk.ppsx\">Lightning talk<\/a> ] [ <a href=\"https:\/\/www.youtube.com\/watch?v=z0u8WWJ5XWs\">YouTube video<\/a> ]<\/li>\n<li><strong>ICS<\/strong><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2019\/gpu-fp-tuning_ics19_submitted.pdf\">AMPT-GA: Automatic Mixed Precision Floating Point Tuning for GPU Applications<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nPradeep Kotipalli, Ranvijay Singh, Paul Wood, Ignacio Laguna (Lawrence Livermore National Lab), and Saurabh Bagchi. At the 33rd ACM International Conference on Supercomputing (ICS), pp. 160-170, Jun 26-28, 2019, Phoenix, AZ. (Acceptance rate: 45\/193 = 23.3%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2019\/ampt-ga_ics19.pdf\">Presentation<\/a> ] [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2019\/ampt-ga_ics19.ppsx\">Slide show<\/a> ]<\/li>\n<li><strong>ISC<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2019\/gpumixer_isc19_cameraready.pdf\">GPUMixer: Performance-Driven Floating-Point Tuning for GPU Scientific Applications<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nIgnacio Laguna, Paul C. Wood, Ranvijay Singh, and Saurabh Bagchi. Accepted to appear at the International Supercomputing Conference (ISC), pp. 227-246, Jun 17-19, Frankfurt, Germany. 
(Acceptance rate: 17\/72 = 23.6%) [ <span style=\"color: #ff0000;\">Hans Meuer Award winner (best paper)<\/span> ] [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2019\/gpumixer_isc19.pdf\">Presentation<\/a> ]<\/li>\n<li><strong>CACM<\/strong><br \/>\n\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2019\/dependability_in_edge_computing.pdf\">Dependability in Edge Computing<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nPaul Wood, Heng Zhang, Muhammad-Bilal Siddiqui, Saurabh Bagchi. To appear in Communications of the ACM (CACM) as Contributed Article, pp. 1-16.<\/li>\n<li>\u201c<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2019\/smoothing_path_to_computing.pdf\">Smoothing the path to computing: pondering uses for big data<\/a>,\u201d <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nM Hall, R Ladner, D Levitt, MAP Qui\u00f1ones, S Bagchi. Communications of the ACM 62 (3), 8-9.<\/li>\n<li>&#8220;<a href=\"https:\/\/www.rcac.purdue.edu\/fresco\/index.html\">FRESCO: Open Source Data Repository for Computational Usage and Failures<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nS Bagchi, R Kumar, R Kalyanam, S Harrell, CA Ellis, C Song. <em>Repository documentation found <a href=\"https:\/\/diagrid.org\/resources\/1099\/download\/FRESCO_Repository_Description.pdf\">here<\/a>.<\/em><\/li>\n<\/ol>\n<h2>2018<\/h2>\n<ol>\n<li><strong>ICST<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2018\/final_xstressor_icst19.pdf\">XSTRESSOR: Automatic Generation of Large-Scale Test Inputs by Inferring Path Conditions<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nCharitha Saumya, Jinkyu Koo, Milind Kulkarni, and Saurabh Bagchi. 
Accepted to appear at the 12th IEEE International Conference on Software Testing, Verification, and Validation (ICST), pp. 1-11, Apr 22-27, 2019, Xi&#8217;an, China. (Acceptance rate: 31\/110 = 28.2%) [ <span style=\"color: #ff0000;\">Distinguished Paper Award<\/span> (one of 3) ]<\/li>\n<li><strong>ICST<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2018\/final_pyse_icst19.pdf\">PySE: Automatic Worst-Case Test Generation by Reinforcement Learning<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nJinkyu Koo, Charitha Saumya, Milind Kulkarni, and Saurabh Bagchi. Accepted to appear at the 12th IEEE International Conference on Software Testing, Verification, and Validation (ICST), pp. 1-11, Apr 22-27, 2019, Xi&#8217;an, China. (Acceptance rate: 31\/110 = 28.2%)<\/li>\n<li><strong>Middleware<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2018\/final_pythia_middleware18_cameraready.pdf\">Pythia: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nRan Xu (Purdue University); Subrata Mitra (Adobe Research); Jason Rahman (Facebook); Peter Bai (Purdue University); Bowen Zhou (LinkedIn); Greg Bronevetsky (Google); Saurabh Bagchi (Purdue University). At the 19th ACM\/IFIP International Middleware Conference, pp. 146-160, December 10-14, 2018, Rennes, France. 
(Acceptance rate: 22\/95 = 23.2%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2018\/pythia_middleware18.pdf\">Presentation<\/a> ]<\/li>\n<li><strong>USENIX ATC<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2018\/final_videochef_usenixatc18_cameraready.pdf\">VideoChef: Efficient Approximation for Streaming Video Processing Pipelines<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nRan Xu, Jinkyu Koo, Rakesh Kumar, Peter Bai; Subrata Mitra (Adobe Research); Sasa Misailovic (University of Illinois Urbana-Champaign); Saurabh Bagchi. At the 2018 USENIX Annual Technical Conference (USENIX ATC), pp. 43-56, July 11-13, 2018, Boston, MA. (Acceptance rate: 76\/378 = 20.1%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2018\/videochef_atc18.pdf\">Presentation<\/a> ] [ <a href=\"http:\/\/0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com\/atc18\/xu_ran.mp3\">Audio<\/a> ]<\/li>\n<\/ol>\n<h2>2017<\/h2>\n<ol>\n<li><strong>ScalA<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/final_snowpack_scala17_cameraready.pdf\">Snowpack: Efficient Parameter Choice for GPU Kernels via Static Analysis and Statistical Prediction<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nRanvijay Singh, Paul Wood, Ravi Gupta (Intel), Saurabh Bagchi, and Ignacio Laguna (LLNL). At the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), co-located with the IEEE\/ACM Supercomputing conference, pp. 1-8, November 13, 2017, Denver, Colorado. 
[ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2017\/snowpack_scala.pdf\">Presentation<\/a> ]<\/li>\n<li><strong>Middleware<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/final_rafiki_middleware17_cameraready.pdf\">Rafiki: A Middleware for Parameter Tuning of NoSQL Datastores for Dynamic Metagenomics Workloads<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAshraf Mahgoub, Paul Wood, Sachandhan Ganesh, Subrata Mitra (Adobe Research), Wolfgang Gerlach (Argonne National Laboratory), Travis Harrison (Argonne National Laboratory), Folker Meyer (Argonne National Laboratory), Ananth Grama, Saurabh Bagchi, and Somali Chaterji. At the ACM\/IFIP\/USENIX Middleware Conference, pp. 28-40, Dec 11-15, 2017, Las Vegas, Nevada. (Acceptance rate: 20\/85 = 23.5%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2017\/final_middleware17_rafiki_ashraf.pdf\">Presentation<\/a> ] [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2017\/middleware17_rafiki_poster.pdf\">Poster<\/a> ]<\/li>\n<li><strong>Briefings in Bioinformatics<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/final_federation_bib17_cameraready.pdf\">Federation in Genomics Pipelines: Techniques and Challenges<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nSomali Chaterji, Jinkyu Koo, Ninghui Li, Folker Meyer, Ananth Grama, and Saurabh Bagchi. In Oxford Briefings in Bioinformatics, pp. 1-11, Published: 29 August 2017. 
[ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/genomics-federation_bib17_abstract.txt\">Abstract<\/a> ]<\/li>\n<li><strong>Briefings in Bioinformatics<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/final_mgrast_bib17_cameraready.pdf\">MG-RAST Version 4\u2014Lessons learned from a decade of low-budget ultra-high throughput metagenome analysis<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nFolker Meyer, Saurabh Bagchi, Somali Chaterji, Wolfgang Gerlach, Ananth Grama, Travis Harrison, Tobias Paczian, Will Trimble, Andreas Wilke. In Oxford Briefings in Bioinformatics, bbx105, pp. 1-12, September 2017. [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/mgrast_bib17_abstract.txt\">Abstract<\/a> ]<\/li>\n<li><strong>ACM BCB<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/final_scaladbg_bcb17_cameraready.pdf\">Scalable Genomic Assembly through Parallel de Bruijn Graph Construction for Multiple K-mers<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nKanak Mahadik, Christopher Wright, Milind Kulkarni, Saurabh Bagchi, Somali Chaterji. In Proceedings of the 8th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), pp. 425-431, Aug 20-23, 2017, Boston, MA. 
[ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2017\/final_bcb17_scaladbg_kanak.pdf\">Presentation<\/a> ]<\/li>\n<li><strong>FTXS<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2017\/final_ftxs_2017_camera_ready.pdf\">Understanding the Spatial Characteristics of DRAM Errors in HPC Clusters<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAyush Patwari, Ignacio Laguna, Martin Schulz, and Saurabh Bagchi. At the 7th Fault Tolerance for HPC at eXtreme Scales (FTXS) Workshop (co-located with HPDC), pp. 1-6, Jun 26, 2017, Washington DC. [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2017\/slides-ilaguna-FTXS-2017-final.pdf\">Presentation<\/a> ]<\/li>\n<\/ol>\n<h2>2016<\/h2>\n<ol>\n<li><strong>CGO<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2016\/final_opprox_cgo17_cameraready.pdf\">Phase-Aware Optimization in Approximate Computing<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nSubrata Mitra, Manish Gupta, Sasa Misailovic (U of Illinois at Urbana-Champaign), Saurabh Bagchi. At the 2017 IEEE\/ACM International Symposium on Code Generation and Optimization (CGO), pp. 1-12, Feb 4-8, 2017, Austin, TX. (Acceptance rate: 26\/114 = 22.8%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2016\/cgo_2017_presentation_noaudio.pptx\">Presentation<\/a> ]<\/li>\n<li>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2016\/final_clusterfailureanalysis_iwpd16_cameraready.pdf\">A Study of Failures in Community Clusters: The Case of Conte<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nSubrata Mitra, Suhas Raveesh Javagal, Amiya K. Maji (ITaP), Todd Gamblin (LLNL), Adam Moody (LLNL), Stephen Harrell (ITaP), and Saurabh Bagchi. 
At the 7th IEEE International Workshop on Program Debugging, co-located with ISSRE, pp. 1-8, Oct 23-27, 2016, Ottawa, Canada. [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2016\/fresco_issre16_102416.pptx\">Presentation<\/a> ]<\/li>\n<li><strong>SRDS<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2016\/final_silentdatacorruption_srds16_submitted.pdf\">Sirius: Probabilistic data assertions for detecting silent data corruptions in parallel programs<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nTara Thomas, Anmol Bhattad, Subrata Mitra, and Saurabh Bagchi. At the IEEE 35th Symposium on Reliable Distributed Systems (SRDS), pp. 1-10, September 26-29, 2016, Budapest, Hungary. (Acceptance rate: 27\/83 = 32.5%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2016\/Sirius_SRDS_2016.pptx\">Presentation<\/a> ]<\/li>\n<li><strong>ICS<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2016\/final_sarvavid_ics2016_submitted.pdf\">SARVAVID: A Domain Specific Language for Developing Scalable Computational Genomics Applications<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nKanak Mahadik, Christopher Wright, Jinyi Zhang, Milind Kulkarni, Saurabh Bagchi, and Somali Chaterji. At the International Conference on Supercomputing (ICS), pp. 
1-13, June 1-3, 2016, Istanbul, Turkey (Acceptance rate: 43\/178 = 24.2%).<\/li>\n<li><strong>EuroSys<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2016\/final_erasurecodedrepair_eurosys16_cameraready.pdf\">Partial-Parallel-Repair (PPR): A Distributed Technique for Repairing Erasure Coded Storage<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nSubrata Mitra, Rajesh Krishna Panta (AT&amp;T Labs), Moo-Ryong Ra (AT&amp;T Labs), Saurabh Bagchi. At the European Conference on Computer Systems (EuroSys), pp. 1-14, April 18-21, 2016, London, UK (Acceptance rate: 38\/180 = 21.1%). [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2016\/Eurosys2016_ppr.pptx\">Presentation<\/a> ]<\/li>\n<\/ol>\n<h2>2015<\/h2>\n<ol>\n<li><strong>PACT<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2015\/guardian_pact15.pdf\">Dealing with the Unknown: Resilience to Prediction Errors<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nSubrata Mitra, Greg Bronevetsky, Suhas Javagal and Saurabh Bagchi. At the 24th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 1-10, October 18-21, 2015, San Francisco, CA. (Acceptance rate: 38\/179 = 21.2%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2015\/pact15_datadependentprediction.pptx\">Presentation<\/a> ]<\/li>\n<li><strong>BCB<\/strong><br \/>\n&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2015\/mirnatargeting_bcb15.pdf\">An Ensemble SVM Model for the Accurate Prediction of Non-Canonical MicroRNA Targets<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAsish Ghoshal, Ananth Grama, Saurabh Bagchi and Somali Chaterji. 
At the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics (BCB), pp. 403-412, September 9-12, 2015, Atlanta, GA. (Acceptance rate: 48\/141 = 34%) <span style=\"color: #ff0000;\">(Winner of the best paper award)<\/span><\/li>\n<\/ol>\n<h2>2014<\/h2>\n<ol>\n<li>\n<div><strong>Middleware<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2014\/cloudinterferencemitigation_amiya_middleware14.pdf\">Mitigating Interference in Cloud Services by Middleware Reconfiguration<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAmiya Maji, Subrata Mitra, Bowen Zhou, Saurabh Bagchi and Akshat Verma (IBM Research). At the 15th ACM\/IFIP\/USENIX Middleware conference, pp. 1-12, Nov 16-21, 2014. (Acceptance rate: 27\/144 = 18.8%) [ Presentation ]<\/div>\n<\/li>\n<li>\n<div><strong>Supercomputing<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2014\/orion_kanak_sc14.pdf\">Orion: Scaling Genomic Sequence Matching with Fine-Grained Parallelization<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nKanak Mahadik, Somali Chaterji, Bowen Zhou, Milind Kulkarni, and Saurabh Bagchi. At the International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing), pp. 1-11, Nov 16-21, 2014. (Acceptance rate: 82\/394 = 20.8%) [ Presentation ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/div>\n<\/li>\n<li>\n<div><strong>ICAC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2014\/duplicaterequests_fahad_icac14.pdf\">Is Your Web Server Suffering from Undue Stress due to Duplicate Requests?<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nFahad A. Arshad, Amiya K. Maji, Sidharth Mudgal, and Saurabh Bagchi. 
As a Short Paper, At the 11th International Conference on Autonomic Computing (ICAC), pp. 105-111, June 18-20, 2014, Philadelphia, PA. (Acceptance rate: 12 (full papers) + 10 (short papers)\/53 = 41.5%) [<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2014\/griffin_icac14_pres.pptx\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/div>\n<\/li>\n<li>\n<div><strong>TPDS<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2014\/progressdependence_tpds14_proofing.pdf\">Diagnosis of Performance Faults in Large Scale MPI Applications via Probabilistic Progress-Dependence Inference<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nIgnacio Laguna (LLNL), Dong Ahn (LLNL), Bronis de Supinski (LLNL), Saurabh Bagchi, and Todd Gamblin (LLNL), Accepted to appear in IEEE Transactions on Parallel and Distributed Systems (TPDS), pp. 1-15, notification of acceptance: March 2014. [ Presentation ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/div>\n<\/li>\n<li>\n<div><strong>PLDI<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2014\/parallelhangdetection_subrata_pldi14.pdf\">Accurate Application Progress Analysis for Large-Scale Parallel Debugging<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nSubrata Mitra, Ignacio Laguna, Dong H. Ahn, Saurabh Bagchi, Martin Schulz, and Todd Gamblin. At the ACM International Symposium on Programming Language Design and Implementation (PLDI), pp. 193-203, Edinburgh, UK, June 9-11, 2014. 
(Acceptance rate: 52\/287 = 18.1%) [ <a class=\"ShowAsLink\">Abstract<\/a> ] [<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2014\/prodometer_pldi14_pres.pdf\">Presentation<\/a> ]<\/div>\n<\/li>\n<\/ol>\n<h2>2013<\/h2>\n<ol>\n<li>\n<div><strong>ISSRE<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2013\/conferror_fahad_issre13.pdf\">Characterizing Configuration Problems in Java EE Application Servers: An Empirical Study with GlassFish and JBoss<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nFahad A. Arshad, Rebecca J. Krause, and Saurabh Bagchi, At the 24th IEEE International Symposium on Software Reliability Engineering (ISSRE), pp. 1-10, Pasadena, CA, November 4-7, 2013. (Acceptance rate: 46\/131 = 35.1%)[ <a class=\"ShowAsLink\">Abstract <\/a>] [<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2013\/confgauge_issre13_pres.pptx\">Presentation<\/a> ]<\/div>\n<\/li>\n<li>\n<div><strong>SRDS<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2013\/orion_subrata_srds2013.pdf\">Automatic Problem Localization in Distributed Applications via Multi-dimensional Metric Profiling<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nIgnacio Laguna, Subrata Mitra, Fahad A. Arshad, Nawanol Theera-Ampornpunt, Zongyang Zhu, Saurabh Bagchi, Samuel P. Midkiff, Mike Kistler (IBM Research), and Ahmed Gheith (IBM Research), At the 32nd International Symposium on Reliable Distributed Systems (SRDS), pp. 121-132, Braga, Portugal, September 30-October 3, 2013. 
(Acceptance rate: 22\/67 = 32.8%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2013\/orion_srds13_pres.pptx\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">Abstract <\/a>]<\/div>\n<\/li>\n<li>\n<div><strong>HPDC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2013\/wukong_bowen_hpdc13.pdf\">WuKong: Automatically Detecting and Localizing Bugs that Manifest at Large System Scales<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nBowen Zhou, Jonathan Too, Milind Kulkarni, and Saurabh Bagchi. At the 22nd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), pp. 131-142, New York City, NY, June 17-21, 2013. (Acceptance rate: 20\/131 = 15.3%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2013\/wukong_hpdc13_pres.pptx\">Presentation<\/a> ] [<a class=\"ShowAsLink\">Abstract<\/a> ]<\/div>\n<\/li>\n<\/ol>\n<h2>2012<\/h2>\n<ol>\n<li>\n<div><strong>HotDep<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2012\/abhranta_scalingbuglocalization_hotdep12.pdf\">ABHRANTA: Locating Bugs that Manifest at Large System Scales<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nBowen Zhou, Milind Kulkarni, and Saurabh Bagchi. At the 8th Workshop on Hot Topics in System Dependability (HotDep) (co-located with OSDI &#8217;12), pp. 1-6, Hollywood, CA, October 7, 2012. 
(Acceptance rate: 10\/24 = 41.7%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2012\/zhou_hotdep12_slides.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/div>\n<\/li>\n<li>\n<div><strong>Supercomputing<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2012\/mcrengine_sc12.pdf\">mcrEngine: A Scalable Checkpointing System using Data-Aware Aggregation and Compression<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nTanzima Zerin Islam, Kathryn Mohror, Saurabh Bagchi, Adam Moody, Bronis R. de Supinski, and Rudolf Eigenmann. At the IEEE\/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing), pp. 1-10, Salt Lake City, Utah, November 10-16, 2012. (Acceptance rate: 100\/472 = 21.2%) (One of 8 finalists for the best student paper) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2012\/mcrEngine_sc12_pres.pptx\">Presentation<\/a> ][ <a class=\"ShowAsLink\">Abstract <\/a>]<\/div>\n<\/li>\n<li>\n<div><strong>PACT<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2012\/faultlocalization_pac12.pdf\">Probabilistic Diagnosis of Performance Faults in Large Scale Parallel Applications<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nIgnacio Laguna, Dong H. Ahn, Bronis R. de Supinski, Saurabh Bagchi, and Todd Gamblin. At the 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 1-10, September 19-23, 2012, Minneapolis, MN. 
(Acceptance rate: 39\/207 = 18.8%) [<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2012\/automaded_PACT12_pres.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/div>\n<\/li>\n<li><strong>DSN<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2012\/automaticfaultcharacterizaiton_DSN2012.pdf\">Automatic Fault Characterization via Abnormality-Enhanced Classification<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGreg Bronevetsky (LLNL), Ignacio Laguna, Saurabh Bagchi and Bronis R. de Supinski (LLNL). In the 42nd Annual IEEE\/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 1-12, Boston, MA, June 25-28, 2012 (Acceptance rate: 51\/236 = 21.6%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2012\/automaticfaultcharacterization_dsn2012_pres.pdf\">Presentation<\/a> ] [<a class=\"ShowAsLink\">Abstract <\/a>]<\/li>\n<li>\n<div><strong>DSN<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2012\/softerrorconsequences_DSN2012.pdf\">A Study of Soft Error Consequences in Hard Disk Drives<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nTimothy Tsai (Hitachi GST), Nawanol Theera-Ampornpunt and Saurabh Bagchi. In the 42nd Annual IEEE\/IFIP International Conference on Dependable Systems and Networks (DSN) (Practical Experience Report), pp. 
1-8, Boston, MA, June 25-28, 2012 (Acceptance rate: 51\/236 = 21.6%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2012\/softerrorconsequences_DSN2012_pres.pdf\">Presentation<\/a> ] [<a class=\"ShowAsLink\">Abstract <\/a>]<\/div>\n<\/li>\n<\/ol>\n<h2>2011<\/h2>\n<ol>\n<li>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2011\/NeesHub-EarthquakeEngg.pdf\">The NEEShub Cyberinfrastructure for Earthquake Engineering<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nThomas J. Hacker, Rudi Eigenmann, Saurabh Bagchi, Ayhan Irfanoglu, Santiago Pujol, Ann Catlin, Ellen Rathje. IEEE Computing in Science and Engineering, vol. 13, issue 4, pp. 67-78, July-August 2011.<\/li>\n<li><strong>Supercomputing<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2011\/debugging_ded_supercom11.pdf\">Large Scale Debugging of Parallel Tasks with AutomaDeD,<\/a>&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nIgnacio Laguna, Todd Gamblin, Bronis R. de Supinski, Saurabh Bagchi, Greg Bronevetsky, Dong H. Ahn, Martin Schulz, and Barry Rountree, At the Supercomputing Conference, 12 pages, Seattle, WA, Nov 12-18, 2011. 
(Acceptance rate: 74\/352 = 21.0%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2011\/automaded_sc11.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/li>\n<li><strong>HPDC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2011\/vrisha_hpdc11_submit.pdf\">Vrisha: Using Scaling Properties of Parallel Programs for Bug Detection and Localization<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nBowen Zhou, Milind Kulkarni, and Saurabh Bagchi, At the 20th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 12 pages, San Jose, California, June 8-11, 2011. (Acceptance rate: 22\/170 = 12.9%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2011\/vrisha_hpdc11.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/li>\n<\/ol>\n<h2>2010<\/h2>\n<ol>\n<li><strong>ISSRE<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2010\/android_issre10_submit.pdf\" target=\"_self\" rel=\"noopener noreferrer\">Characterizing Failures in Mobile OSes: A Case Study with Android and Symbian<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nAmiya Kumar Maji, Kangli Hao, Salmin Sultana, and Saurabh Bagchi. At the 21st annual International Symposium on Software Reliability Engineering (ISSRE 2010), 10 pages, Nov 1-4, 2010, San Jose, California. 
(Acceptance rate: 40\/130 = 30.8%) [ <a class=\"ShowAsLink\">Abstract <\/a>]<\/li>\n<li><strong>DSN<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2010\/paralleldebugging_dsn10_cameraready.pdf\" target=\"_self\" rel=\"noopener noreferrer\">AutomaDeD: Automata-Based Debugging for Dissimilar Parallel Tasks<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGreg Bronevetsky, Ignacio Laguna, Saurabh Bagchi, Bronis R. de Supinski, Dong H. Ahn, and Martin Schulz. In the 40th Annual IEEE\/IFIP International Conference on Dependable Systems and Networks (DSN), 10 pages, June 28-July 1, 2010, Chicago, IL. (Acceptance rate (DCCS track): 40\/174 = 23%) [<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2010\/automaded_dsn10.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">Abstract<\/a> ]<\/li>\n<\/ol>\n<h2>2009<\/h2>\n<ol class=\"style10\">\n<li>\n<div id=\"wrapper\"><strong>Middleware<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2009\/final_middleware_intelligentsampling_submit.pdf\">How To Keep Your Head Above Water While Detecting Errors<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nIgnacio Laguna, Fahad A. Arshad, David M. Grothe, and Saurabh Bagchi. In: ACM\/IFIP\/USENIX 10th International Middleware Conference, November 30-December 4, 2009, Urbana-Champaign, Illinois. 
(Acceptance rate: 21\/110 = 19.1%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2009\/ilaguna-head_above.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">abstract<\/a> ]<\/div>\n<\/li>\n<li>\n<div id=\"wrapper\"><strong>Supercomputing<br \/>\n<\/strong>&#8220;<a href=\"http:\/\/portal.acm.org\/citation.cfm?id=1654059.1654110&amp;dl=ACM&amp;coll=ACM\">FALCON: A System for Reliable Checkpoint Recovery in Shared Grid Environments<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nTanzima Zerin, Saurabh Bagchi, and Rudolf Eigenmann. In: the ACM\/IEEE Supercomputing Conference, November 14-20, 2009, Portland, Oregon. (Acceptance rate: 59\/261 = 22.6%) (Nominated as one of 4 best student papers) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2009\/tislam-falcon.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">abstract<\/a> ]<\/div>\n<\/li>\n<\/ol>\n<h2>2008<\/h2>\n<h2>2007<\/h2>\n<ol>\n<li><strong>SRDS<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2007\/bagchis-monitor.pdf\">Stateful Detection in High Throughput Distributed Systems<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Ignacio Laguna, Fahad A. Arshad, and Saurabh Bagchi. In: 26th IEEE International Symposium on Reliable Distributed Systems (SRDS-2007), pp. 275-287, Beijing, China, October 10-12, 2007. 
(Acceptance rate: 29\/185 ~ 15.7%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2007\/sampling_srds07_100707.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">abstract<\/a> ]<\/li>\n<li><strong>SRDS<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2007\/arshad-Diagnosis.pdf\">Distributed Diagnosis of Failures in a Three Tier E-Commerce System<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Ignacio Laguna, Fahad A. Arshad, and Saurabh Bagchi. In: 26th IEEE International Symposium on Reliable Distributed Systems (SRDS-2007), pp. 185-198, Beijing, China, October 10-12, 2007. (Acceptance rate: 29\/185 ~ 15.7%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2007\/petstore_srds07_101007.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">abstract<\/a> ]<\/li>\n<li><strong>HPDC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2007\/final_checkpointing_HPDC07_submit.pdf\">Failure-Aware Checkpointing in Fine-Grained Cycle Sharing Systems<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nXiaojuan Ren, Rudolf Eigenmann, and Saurabh Bagchi. In: 16th IEEE International Symposium on High Performance Distributed Computing (HPDC-16), Monterey Bay, California, June 27-29, 2007. (Acceptance rate: 20%). 
[ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2007\/gridcheckpoint_hpdc07.pdf\">Presentation<\/a> ] [ <a class=\"ShowAsLink\">abstract<\/a> ]<\/li>\n<li><strong>TDSC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2007\/final_tdsc_minorrevision_submit.pdf\">Automated Rule-Based Diagnosis through a Distributed Monitor System<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Mike Yu Cheng, Padma Varadharajan, Saurabh Bagchi, Miguel P. Correia, and Paulo J. Verissimo. In: IEEE Transactions on Dependable and Secure Computing (TDSC), notification of acceptance: May 2007. [ <a class=\"ShowAsLink\">abstract<\/a> ]<\/li>\n<li><strong>JOGC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2007\/PredictionFGCS_JOGC07_revision.pdf\">Prediction of Resource Availability in Fine-Grained Cycle Sharing Systems and Empirical Evaluation<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nXiaojuan Ren, Seyong Lee, Rudolf Eigenmann, and Saurabh Bagchi. In Springer&#8217;s Journal of Grid Computing (JOGC), vol. 5, no. 2, pp. 173-195, 2007. [ <a class=\"ShowAsLink\">abstract<\/a> ]<\/li>\n<\/ol>\n<h2>2006<\/h2>\n<ol>\n<li>\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\"><strong>ICCD<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2006\/pesticide_iccd06.pdf\">Pesticide: Using SMT Processors to Improve Performance of Pointer Bug Detection<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nJin-Yi Wang, Yen-Shiang Shue, T N Vijaykumar, and Saurabh Bagchi. 
24th International Conference on Computer Design (ICCD), Oct 1-4, 2006, San Jose, California, USA.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/li>\n<li>\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\"><strong>DSN<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2006\/WASR06_virtualized_submit.pdf\">Providing Automated Detection of Problems in Virtualized Servers using Monitor framework<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Saurabh Bagchi, Kirk Beaty, Andrzej Kochut, and Gautam Kar. Workshop on Applied Software Reliability (WASR) at the International Conference on Dependable Systems and Networks (DSN), June 25-28, 2006, Philadelphia, Pennsylvania, USA. [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2006\/Monitor_Virtualized_WASR06.pdf\">Presentation<\/a>]<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/li>\n<li>\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\">\n<div align=\"justify\"><strong>HPDC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2006\/hpdc06_finegrainedcyclesharing_submit.pdf\">Resource Failure Prediction in Fine-Grained Cycle Sharing Systems<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nXiaojuan Ren, Seyong Lee, Rudolf Eigenmann, and Saurabh Bagchi. 15th IEEE International Symposium on High Performance Distributed Computing (HPDC-15), 19-23 June 2006, Paris, France. (Acceptance rate: 24\/157 ~ 15%). 
[ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2006\/final_presentation_FGCS_HPDC.pdf\">Presentation<\/a> ]<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/li>\n<li><strong>TDSC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2006\/monitor_TDSC_accepted.pdf\">Automated Online Monitoring of Distributed Applications through External Monitors<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Padma Varadharajan, and Saurabh Bagchi. IEEE Transactions on Dependable and Secure Computing (TDSC), vol. 3, no. 2, pp. 115-129, Apr-Jun, 2006.<\/li>\n<\/ol>\n<h2>2005<\/h2>\n<ol>\n<li>\n<div align=\"justify\">&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2005\/monitor_ecetr05_19.pdf\">Probabilistic Diagnosis through Non-Intrusive Monitoring in Distributed Applications<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Yu Cheng, Saurabh Bagchi, Miguel Correia, and Paulo Verissimo. Purdue ECE Technical Report 05-19, December 2005.<\/div>\n<\/li>\n<li><strong>SRDS<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2005\/lrrm_srds05_submit.pdf\">LRRM: A Randomized Reliable Multicast Protocol for Optimizing Recovery Latency and Buffer Utilization<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nNipoon Malhotra, Shrish Ranjan, and Saurabh Bagchi. 
24th IEEE Symposium on Reliable Distributed Systems (SRDS 2005), October 26-28, 2005, Orlando, Florida, USA. (Acceptance rate: 20\/67 ~ 29.9%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2005\/final_lrrm_srds05_cameraready.pdf\">Camera ready<\/a> ].<\/li>\n<li>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2005\/monitor_ecetr05_13.pdf\">Automated Monitor Based Diagnosis in Distributed Systems<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Padma Varadharajan, Mike Cheng, and Saurabh Bagchi, Purdue ECE Technical Report 05-13, August 2005.<\/li>\n<\/ol>\n<h2>2004<\/h2>\n<ol>\n<li><strong>SRDS<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2004\/bagchi_s_monitor.pdf\">Self Checking Network Protocols: A Monitor Based Approach<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, Padma Varadharajan, and Saurabh Bagchi. 23rd International Symposium on Reliable Distributed Systems (SRDS 2004), October 2004. (Acceptance rate: 27\/117 ~ 23.1%)<br \/>\n[ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2004\/srds04_final_camera.pdf\">Camera Ready<\/a> ] [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/presentations1\/2004\/final_monitor_srds04_presentation.pdf\">Presentation<\/a> ]<\/li>\n<li><strong>PRDC<br \/>\n<\/strong>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2004\/tram++_prdc04_submit.pdf\">Failure Handling in a Reliable Multicast Protocol for Improving Buffer Utilization and Accommodating Heterogeneous Receivers<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, John Rogers, and Saurabh Bagchi. In Proceedings of the 10th IEEE Pacific Rim Dependable Computing Conference (PRDC&#8217; 04), March 2004. 
(Acceptance rate: 34\/102 ~ 33.3%) [ <a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2004\/tram++_prdc04_cameraready.pdf\">Camera ready<\/a> ]<\/li>\n<\/ol>\n<h2>2003<\/h2>\n<ol>\n<li>\n<div align=\"justify\">&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2003\/gunjan_msthesis_1203.pdf\">Self-Checking Network Protocols: A Monitor Based Approach<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nGunjan Khanna, MS Thesis. December 2003.<\/div>\n<\/li>\n<li>&#8220;<a href=\"https:\/\/engineering.purdue.edu\/dcsl\/publications\/papers\/2003\/lrrm_fastabsdsn03.pdf\">Light-Weight Randomized Reliable Multicasting Protocol<\/a>,&#8221; <span class=\"badge badge-success\">Dependable Distributed Applications<\/span><br \/>\nNipoon Malhotra, Shrish Ranjan, and Saurabh Bagchi. Appeared in Fast Abstracts, DSN2003.<\/li>\n<\/ol>\n<div style=\"text-align: left;\" align=\"center\"><strong>Copyright notice:<\/strong> Personal use of this material is permitted. 
However, permission to reprint\/republish this material for advertising or promotional<br \/>\npurposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work<br \/>\nin other works must be obtained from the appropriate publisher (IEEE, ACM, Elsevier, etc.)<\/div>\n<p><!--more--><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>To view publications by project, click the buttons down below: All Publications Dependable Distributed Applications Resilient Wireless Networks Distributed Secure Systems Resilient AI PhD Theses by DCSL Members Miscellaneous 2025 PEARC &#8220;FRESCO: A Public Multi-Institutional Dataset for Understanding HPC System Behavior and Dependability,&#8221; Dependable Distributed Applications Joshua McKerracher, Preeti Mukherjee; Rajesh Kalyanam (Oak Ridge National [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":73,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"tags":[],"_links":{"self":[{"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/pages\/2093"}],"collection":[{"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/comments?post=2093"}],"version-history":[{"count":18,"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/pages\/2093\/revisions"}],"predecessor-version":[{"id":3593,"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/pages\/2093\/revisions\/3593"}],"up":[{"embeddable":true,"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/pages\/73"}],"wp:attachment":[{"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp
\/v2\/media?parent=2093"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/engineering.purdue.edu\/dcsl\/wp-json\/wp\/v2\/tags?post=2093"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}