Managing Complexity of Large/Ultra Large Software System (SLIM)

	Infosys sponsored research


Metrics for Analyzing Module Interactions in Large Software Systems
Santonu Sarkar, Avinash C. Kak, N. S. Nagaraja
Published in IEEE APSEC 2005

Abstract: We present a new set of metrics for analyzing the interaction between the modules of a large software system. We believe that these metrics will be important to any automatic or semi-automatic code modularization algorithm. The metrics are based on the rationale that code partitioning should be based on the principle of similarity of service provided by the different functions encapsulated in a module. Although module interaction metrics are necessary for code modularization, in practice they must be accompanied by metrics that measure other important attributes of how the code is partitioned into modules. These other metrics, dealing with code properties such as the approximate uniformity of module sizes, conformance to any size constraints on the modules, etc., are also included in the work presented here. To give the reader some insight into the workings of our metrics, this paper also includes some results obtained by applying the metrics to the body of code that constitutes the open-source Apache HTTP server. We apply our metrics to this code as packaged by the developers of the software and to partially and fully randomized versions of the code.

A Method for Detecting and Measuring Architectural Layering Violations in Source Code
Santonu Sarkar, Girish Maskeri Rama, Shubha R
Published in IEEE APSEC 2006

Abstract: The layered architecture pattern has been widely adopted by the developer community in order to build large software systems. The layered organization of software modules offers a number of benefits such as reusability, changeability and portability to those who are involved in the development and maintenance of such software systems. But in reality, as the system evolves over time, the actual source code of the system rarely conforms to the conceptual horizontal layering of modules. This in turn results in a significant degradation of system maintainability. In order to refactor such a system to improve its maintainability, it is very important to discover, analyze and measure violations of the layered architecture pattern. In this paper we propose a technique to discover such violations in the source code and quantitatively measure the amount of non-conformance to the conceptual layering. The proposed approach evaluates the extent to which the module dependencies across layers violate the layered architecture pattern. In order to evaluate the accuracy of our approach, we have applied this technique to a set of open source applications and a proprietary business application, drawing on the help of domain experts wherever possible.
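
As a rough illustration of what "quantitatively measuring non-conformance" could look like, the sketch below computes the fraction of cross-layer dependencies that break a strict layering. It is not the metric defined in the paper: the function name, the integer layer numbering, and the rule that a module may only depend on the layer directly beneath it are all assumptions made for this example.

```python
def layering_violation_ratio(layer_of, dependencies):
    """Fraction of cross-layer dependencies that violate strict layering.

    layer_of: dict mapping module name -> layer index (0 is the lowest layer).
    dependencies: iterable of (source, target) pairs, "source depends on target".
    Assumed rule (hypothetical): a module may depend only on the layer
    directly below its own; skip-calls and back-calls count as violations.
    """
    cross_layer = 0
    violations = 0
    for source, target in dependencies:
        source_layer = layer_of[source]
        target_layer = layer_of[target]
        if source_layer == target_layer:
            continue  # intra-layer dependencies are ignored in this sketch
        cross_layer += 1
        if source_layer - target_layer != 1:
            violations += 1
    return violations / cross_layer if cross_layer else 0.0


# Hypothetical three-layer system.
layer_of = {"ui": 2, "service": 1, "db": 0}
dependencies = [
    ("ui", "service"),   # allowed: one layer down
    ("service", "db"),   # allowed: one layer down
    ("ui", "db"),        # skip-call: bypasses the service layer
    ("db", "service"),   # back-call: lower layer depends on upper layer
]
print(layering_violation_ratio(layer_of, dependencies))  # 0.5
```

A ratio of 0 would mean every cross-layer dependency respects the assumed rule, and a ratio of 1 would mean none does.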

API-Based and Information-Theoretic Metrics for Measuring the Quality of Software Modularization
Santonu Sarkar, Girish Maskeri Rama, and Avinash C. Kak
Published in IEEE Transactions on Software Engineering, Jan 2007

Abstract: We present in this paper a new set of metrics that measure the quality of modularization of a non-object-oriented software system. We have proposed a set of design principles to capture the notion of modularity and defined metrics centered around these principles. These metrics characterize the software from a variety of perspectives: structural, architectural, and notions such as the similarity of purpose and commonality of goals. (By structural, we are referring to notions based on inter-module coupling; and by architectural, we mean the horizontal layering of modules in large software systems.) We employ the notion of API (Application Programming Interface) as the basis for our structural metrics. The rest of the metrics we present are in support of those that are based on API. Some of the important support metrics include those that characterize each module on the basis of the similarity of purpose of the services offered by the module. These metrics are based on information-theoretic principles. We tested our metrics on some popular open source systems and some large legacy-code business applications. To validate the metrics, we compared the results obtained on human-modularized versions of the software (as created by the developers of the software) with those obtained on randomized versions of the code. For randomized versions, the assignment of the individual functions to modules was randomized.
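
The validation idea in this abstract, scoring the developers' modularization and then scoring a version in which functions are randomly reassigned to modules, can be sketched as follows. The score used here (the fraction of calls that stay inside a module) is a simple stand-in chosen for illustration, not one of the paper's API-based or information-theoretic metrics. The function names, the toy data, and the choice to preserve module sizes when shuffling are likewise assumptions.

```python
import random


def intra_module_call_fraction(module_of, calls):
    """Stand-in quality score: share of calls whose caller and callee
    live in the same module (hypothetical, not a metric from the paper)."""
    if not calls:
        return 0.0
    same = sum(1 for caller, callee in calls
               if module_of[caller] == module_of[callee])
    return same / len(calls)


def randomize_assignment(module_of, rng):
    """Randomly reassign functions to modules by shuffling module labels.
    Shuffling keeps each module's size unchanged (an assumption here)."""
    functions = list(module_of)
    labels = [module_of[f] for f in functions]
    rng.shuffle(labels)
    return dict(zip(functions, labels))


# Hypothetical "human-modularized" system: two modules, calls stay inside.
original = {
    "parse_header": "http", "parse_body": "http", "send_reply": "http",
    "open_log": "log", "write_log": "log", "rotate_log": "log",
}
calls = [
    ("parse_header", "parse_body"), ("parse_body", "send_reply"),
    ("open_log", "write_log"), ("write_log", "rotate_log"),
]

rng = random.Random(0)  # fixed seed so the run is repeatable
randomized = randomize_assignment(original, rng)

print("original score:  ", intra_module_call_fraction(original, calls))
print("randomized score:", intra_module_call_fraction(randomized, calls))
```

If a metric is meaningful, the developers' version should tend to score better than randomized versions; in practice one would average over many random assignments rather than rely on a single shuffle.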

Modularization of a Large Scale Business Application - A Case Study
Santonu Sarkar et al.
Accepted for publication in IEEE Software

Abstract: Large software systems, developed over several years, are the backbone of industries like banking, retail, transportation and telecommunications. With multiple bug fixes and feature enhancements, these systems gradually deviate from the intended architecture and deteriorate into unmanageable monoliths. Knowledge about the internal working of these systems is hard to come by since many of the original developers have moved on. These large systems are a maintainer's nightmare. This paper presents a case study of a banking application beset with similar problems and the modularization approach adopted as a solution. We also briefly dwell on certain other benefits that were unearthed as a result of this re-engineering exercise.

Metrics for Measuring the Quality of Modularization of Large-Scale Object-Oriented Software
Santonu Sarkar, Avinash C. Kak, and Girish Maskeri Rama
Accepted for publication in IEEE Transactions on Software Engineering

Abstract: The metrics formulated to date for characterizing the modularization quality of object-oriented software have considered module and class to be synonymous concepts. But a typical class in object-oriented programming exists at too low a level of granularity in large object-oriented software consisting of millions of lines of code. A module (sometimes referred to as a superpackage) in a large object-oriented software system will typically consist of a large number of classes. Even when the access discipline encoded in each class makes for "clean" class-level partitioning of the code, the inter-module dependencies created by associations, inheritance, and method invocations may still make it difficult to maintain and extend the software. The goal of this paper is to provide a set of metrics that characterize large object-oriented software systems with regard to such dependencies. Our metrics characterize the quality of modularization with respect to the APIs of the modules, on the one hand, and, on the other, with respect to such object-oriented inter-module dependencies as caused by inheritance, associational relationships, state access violations, fragile base-class design, etc. Using a two-pronged approach, we validate the metrics by applying them to popular open-source software systems.