The Previous Projects Of ParaMount



   Lean Distributed Shared Memory (LDSM) System



The Pegasus Project

The Pegasus project assesses the suitability of the OpenMP parallel programming paradigm beyond shared-memory machines.


ATune: Compiler-Driven Adaptive Execution


IShare: Global Information Sharing and Computing

PEAK: Program Evolution by Adaptive Compilation


The PEAK Project

PEAK searches the compiler optimization space to find the best set of optimization flags for the important code sections of an application. It uses a fast iterative elimination algorithm to traverse the search space. Efficient rating algorithms, which evaluate the performance of the optimized versions based on partial execution of the program, were developed to achieve both rating accuracy and rating speed. Using this feedback-directed approach, PEAK tunes program performance automatically.
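As a rough illustration of the idea of eliminating flags iteratively, the following Python sketch starts with every flag enabled and repeatedly switches off the single flag whose removal lowers the rated time the most. It is a minimal sketch under stated assumptions, not PEAK's actual algorithm or code: the function names, the flag names, and the toy cost model in `toy_rate` are all hypothetical, and `rate` stands in for PEAK's rating of a version by partial execution.

```python
from typing import Callable, FrozenSet, Iterable


def iterative_elimination(
    flags: Iterable[str],
    rate: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy flag elimination (illustrative only).

    `rate` returns a time-like score for a set of enabled flags;
    lower is better. In a real tuner this would come from measurement.
    """
    enabled = frozenset(flags)
    best_time = rate(enabled)
    while enabled:
        # Rate every candidate that differs by exactly one disabled flag.
        trials = {f: rate(enabled - {f}) for f in sorted(enabled)}
        flag, time = min(trials.items(), key=lambda kv: kv[1])
        if time >= best_time:
            break  # no single removal improves the rating any further
        enabled, best_time = enabled - {flag}, time
    return enabled


def toy_rate(enabled: FrozenSet[str]) -> float:
    """Hypothetical cost model standing in for a measured rating."""
    time = 10.0
    if "flag_unroll" in enabled:
        time -= 2.0  # assumed helpful
    if "flag_vectorize" in enabled:
        time -= 3.0  # assumed helpful
    if "flag_harmful" in enabled:
        time += 1.5  # assumed to slow this code section down
    return time


tuned = iterative_elimination(
    ["flag_unroll", "flag_vectorize", "flag_harmful"], toy_rate
)
print(sorted(tuned))
```

With this toy model the search drops only the flag that hurts performance and keeps the other two. Each round costs one rating per remaining flag, which is why a fast rating method matters for this kind of search.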

ADAPT: Automated De-Coupled Adaptive Program Optimization



ADAPT is a compiler-supported infrastructure for high-level adaptive program optimization. It allows developers to leverage existing compilers and optimization tools by describing a runtime heuristic for applying techniques in a domain-specific language, the ADAPT Language (AL). The ADAPT compiler takes this description and a target application and generates a complete runtime system for tuning the application dynamically. ADAPT has been ported to both Solaris- and Linux-based systems. ADAPT supports remote dynamic compilation, parameterization, and runtime sampling, giving developers considerable flexibility in heuristic development.

Moerae: Portable Interface



Shared-memory multiprocessor (SMP) machines have become increasingly popular. As such systems have become available, parallel compilers have grown in importance because they significantly affect the performance achieved on these machines. To use a multiprocessor, a user or a compiler must transform sequential code into a parallel form suitable for the target machine. In general there are two methods: one is to use a native parallel compiler provided by the vendor of the target machine, and the other is to insert parallel directives into the sequential code, either by hand or by using a parallel pre-compiler. The annotated code is then compiled by a native parallel compiler to generate machine code and run. Unfortunately, code generated in this way is usually not portable across SMP machines, because each machine has its own parallel directives.

A new method, a portable interface called MOERAE, is proposed. This interface allows loop-level parallelization of FORTRAN codes using a portable thread library. MOERAE provides both an implementation of this portable library and a means of automatically transforming sequential code into this parallel form. MOERAE uses a modified version of the Polaris parallelizing compiler to transform parallel loops into subroutines suitable for running on multiple threads, and inserts calls to a portable runtime library to control these threads. Because MOERAE uses a portable thread library and requires only a sequential back-end compiler, the generated code is portable. Its performance is comparable to that of a native parallel compiler, a parallel directive method, and OpenMP, and even better in some cases. Several issues in implementing MOERAE, including different methods to handle reduction operations, are studied.
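The loop-to-subroutine transformation described above can be sketched as follows. This is an illustrative Python analogue only, not MOERAE's generated FORTRAN or its runtime library: the names `loop_body` and `run_parallel` are hypothetical, and Python's standard `threading` module stands in for the portable thread library. The reduction is handled with per-thread private partial sums combined after the join, which is one common approach; the source does not say which methods MOERAE uses.

```python
import threading


def loop_body(lo, hi, a, partial, tid):
    """Outlined loop: the original loop body, restricted to iterations lo..hi-1.

    Each thread accumulates into its own slot of `partial`, so the
    reduction needs no locking inside the loop.
    """
    s = 0.0
    for i in range(lo, hi):
        s += a[i] * a[i]
    partial[tid] = s


def run_parallel(a, num_threads=4):
    """Stand-in for the runtime calls that start and join worker threads."""
    n = len(a)
    partial = [0.0] * num_threads
    threads = []
    for tid in range(num_threads):
        # Block distribution of the iteration space across threads.
        lo = tid * n // num_threads
        hi = (tid + 1) * n // num_threads
        t = threading.Thread(target=loop_body, args=(lo, hi, a, partial, tid))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    # Combine the private partial results into the final reduction value.
    return sum(partial)


data = [float(i) for i in range(1000)]
total = run_parallel(data)
print(total)
```

The sketch shows the structure rather than a speedup: in CPython the global interpreter lock prevents these threads from running the loop body concurrently. The point is that once the loop is outlined into an ordinary subroutine, only a sequential compiler and a thread library are needed.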

Ursa Minor / Ursa Major:


Ursa Minor

Ursa Major

The Ursa Minor tool is designed to help understand the structure of a program and the information gathered by a compiler in an interactive way. It facilitates the comparison of performance results under different environments and the identification of potential parallelism. Ursa Minor consists of 8,000 lines of Java code.

The Ursa Major tool is an Internet-based tool designed to help advanced parallel programmers and researchers by providing our results and experience with various parallel programs and their runs under many different environments. It is based on the Ursa Minor tool and can be run in web browsers that support Java 1.1. Try this and let me know what you think.


Cepheus:


Cepheus seeks to expand the power of Polaris by allowing non-Fortran languages to be parallelized. This is accomplished by converting codes written in languages other than Fortran into a Fortran-like dialect. The translated code is then understandable to the Polaris infrastructure and can be optimized by the existing passes.

Performance Forecaster (PerFore):



Many different methods are used to approach performance. Program developers present results at a high level, except to the few small groups directly involved in programming for their scientific field, which limits understanding of the results to a specific audience. Simulators are built to model the results obtained from black-box architectures, but why the resulting performance occurs with respect to the code is rarely clear. Presentations of the performance of large codes involving many nests of loops and subroutine calls are usually reduced to total execution time studies or a breakdown into a few measurable components of execution time. The scalability of such codes is often predicted by measuring the trends in execution time as more processors are added. Such predictions provide little understanding of why the trends occur.

The PerFore project addresses the need to understand the performance of large applications that cannot reasonably be hand-analyzed, and to approach this task without expertise in the scientific field the application is designed for. Currently, a methodology is being developed to model an application's performance more precisely than a black-box view allows, without losing relevance to the code. This methodology is intended to aid in understanding measured results and forecasting trends in a large application's scalability.

Comprehensive Characterization of Real-World Applications:
A Research Infrastructure for Performance Evaluation

Our research effort aims at making it easy for academic groups to use industrially realistic codes for evaluating new hardware and software ideas/prototypes. To this end we perform, collect, and distribute characterization data and documentation of these codes.

  • Benchmark Characterization: A research infrastructure for comprehensively characterizing complex codes. Here we present characterization data that directly answer performance-related questions.
  • Publications

Benchmarking with Large-Scope Applications:




  • TAP: The new supercomputer rank list based on Top Application Performance.

This is a collaborative project with the High-Performance group of the Standard Performance Evaluation Corporation (SPEC/HPG). The goal is to make available large, industrial applications for both benchmarking (primarily by industry) and performance evaluation (primarily by research groups). The benchmarks are distributed and maintained by SPEC/HPG, which also publishes SPEC-approved performance numbers on these programs.

Compiling for Speculative Architectures:



We are developing new compiler techniques that exploit the capabilities of advanced speculative computer architectures. In a running program these architectures can exploit parallelism that remained undetected by the compiler. However, hardware speculation incurs overhead. The compiler can reduce this overhead by instructing the architecture to execute compiler-detected parallel program sections and certain data structures without speculation. These ideas are currently being implemented in a compiler for the Purdue Multiplex architecture.