Research Interests

My research interests are in the areas of dependable distributed systems, network security, reliable high-performance computing, and embedded wireless networks.

I am interested in the question of how to build heterogeneous large-scale distributed systems that are reliable. Since many business- and life-critical functions are performed by distributed systems, these systems need to be reliable while meeting their performance goals. Thus, there is a need for smart error detection, diagnosis, and recovery protocols. There is also a need for architectures that combine fault tolerance with performance in an adaptive manner, adjusting to different user requirements and different runtime environments. I consider intrusions to be an increasingly important class of faults, and we are looking at the design of intrusion-tolerant systems. Our application contexts currently come from web services, high-performance computing applications in the sciences, core infrastructure for cellular networks, and computational biology.

Wireless networks of embedded nodes that cooperate among themselves for information gathering and analysis are becoming an important platform in several domains, leading toward the vision of the “Internet of Things”. The nodes are placed in situ in the environment to be monitored and have the capacity for sensing, communication, computation, and sometimes mobility and actuation. Since the nodes have limited power resources, all tasks must be performed under power constraints. The reliability challenges come from the unpredictability of the environment in which the networks are based. The security challenges come from the fact that the networks are often deployed in open environments, where both network-based attacks and attacks based on physical compromise are possible. I am investigating the issues in building embedded networks to meet high-level reliability requirements in the face of these challenges.

For details of my research projects, take a look at the home page of the Dependable Computing Systems Lab (DCSL) and the research overview document. If you are interested in working in the research group, please take a look at the process outlined here.

Project thrusts at DCSL

Project: Summary
Fault tolerance for distributed applications: Error detection, prediction, and localization for a variety of distributed applications, currently scientific clusters and applications, web services, and storage systems
Intelligent wireless networks: Reliability of an emerging class of wireless platforms, namely resource-constrained embedded wireless networks and smartphones
Distributed intrusion-tolerant systems: Detection of intrusion attempts against distributed infrastructure, and intrusion prevention and response
Last modified: April 21, 2015