The Previous Projects of ParaMount
Lean Distributed Shared Memory (LDSM) System
The Pegasus Project
The Pegasus project assesses the suitability of the OpenMP parallel programming paradigm beyond shared-memory machines.
Compiler-Driven Adaptive Execution
Global Information Sharing
PEAK: Program Evolution by Adaptive Compilation
The PEAK Project
PEAK searches the compiler optimization space to find the best set of optimization flags for the important code sections of an application. It traverses the search space with a fast iterative elimination algorithm. Efficient rating algorithms, which evaluate the performance of the optimized versions based on partial execution of the program, provide both rating accuracy and rating speed. Using this feedback-directed approach, PEAK automatically tunes program performance.
ADAPT is a compiler-supported infrastructure for high-level adaptive program optimization. It allows developers to leverage existing compilers and optimization tools by describing a runtime heuristic for applying techniques in a domain-specific language, the ADAPT Language (AL). The ADAPT compiler takes this description and a target application and generates a complete runtime system for tuning the application dynamically. ADAPT has currently been ported to both Solaris- and Linux-based systems. It supports remote dynamic compilation, parameterization, and runtime sampling, giving developers complete flexibility in heuristic development.
Moerae: Portable Interface
Shared-memory multiprocessor (SMP) machines have become increasingly popular. As a result, parallel compilers have become increasingly important, since their impact on the performance of such machines is significant. To use a multiprocessor, a user or a compiler must transform sequential code into a parallel form suitable for the target machine. In general, there are two methods: one is to use a native parallel compiler provided by the vendor of the target machine; the other is to insert parallel directives into the sequential code, either by hand or by using a parallel pre-compiler, and then compile this code with a native parallel compiler to generate machine code. Unfortunately, code generated in this way is usually not portable across SMP machines, because each machine has its own parallel directives.
To address this problem, a portable interface called MOERAE is proposed. This interface allows loop-level parallelization of FORTRAN codes using a portable thread library. MOERAE provides both an implementation of this portable library and a means of automatically transforming a sequential code into the parallel form. It uses a modified version of the Polaris parallelizing compiler to transform parallel loops into subroutines suitable for running on multiple threads, and inserts calls to a portable runtime library to control these threads. Since MOERAE uses a portable thread library and requires only a sequential backend compiler, portability is guaranteed. Its performance is comparable to that of a native parallel compiler, a parallel-directive method, and OpenMP, and even better in some cases. Several issues in implementing MOERAE, including different methods of handling reduction operations, are studied.
Ursa Minor / Ursa Major:
The Ursa Minor tool is designed to help understand the structure of a program and the information gathered by a compiler in an interactive way. It facilitates the comparison of performance results under different environments and the identification of potential parallelism. Ursa Minor consists of 8,000 lines of Java code.
The Ursa Major tool is an Internet-based tool designed to help advanced parallel programmers and researchers by providing our results and experience with various parallel programs and their runs under many different environments. It is based on the Ursa Minor tool and runs in web browsers that support Java 1.1.
Cepheus:
Cepheus seeks to expand the power of Polaris by allowing non-Fortran languages to be parallelized. This is accomplished by converting codes written in languages other than Fortran into a Fortran-like dialect. The translated code is then understandable to the Polaris infrastructure and can be optimized by the existing passes.
Performance Forecaster (PerFore):
Many different methods are used to study performance. Program developers often present results in high-level ways that are meaningful only to the few, small groups directly involved in programming for their scientific field, limiting the understanding of the results to a specific audience. Simulators are built to model the results obtained from black-box architectures, but why the resulting performance occurs with respect to the code is rarely clear. Presentations of the performance of large codes involving many nests of loops and subroutine calls are usually reduced to total-execution-time studies or a breakdown into a few measurable components of execution time. Performance prediction of the scalability of such codes is often done by measuring the trend in execution time as more processors are added. Such predictions provide little understanding of why the trends occur.
The PerFore project addresses the need to understand performance of large applications that cannot reasonably be hand-analyzed and to approach this task without expertise in the scientific field the application is designed for. Currently, a methodology is being developed that can be used to model an application's performance in a manner that is more precise than looking at performance as a black-box, and yet, does not lose relevance to the code. It is intended that this methodology will aid in understanding measured results and forecasting trends in a large application's scalability.
Comprehensive Characterization of Real-World Applications:
A Research Infrastructure for Performance Evaluation
Our research effort aims at making it easy for academic groups to use industrially realistic codes for evaluating new hardware and software ideas/prototypes. To this end we perform, collect, and distribute characterization data and documentation of these codes.
Benchmarking with Large-Scope Applications:
This is a collaborative project with the High-Performance group of the Standard Performance Evaluation Corporation (SPEC/HPG). The goal is to make available large, industrial applications for both benchmarking (primarily by industry) and performance evaluation (primarily by research groups). The benchmarks are distributed and maintained by SPEC/HPG, which also publishes SPEC-approved performance numbers on these programs.
Compiling for Speculative Architectures:
We are developing new compiler techniques that exploit the capabilities of advanced speculative computer architectures. At run time, these architectures can exploit parallelism that remained undetected by the compiler. However, hardware speculation incurs overhead. The compiler can reduce this overhead by instructing the architecture to execute compiler-detected parallel program sections, and to access certain data structures, without speculation. These ideas are currently being implemented in a compiler for the Purdue Multiplex architecture.