RELEASE -------- Cetus 1.0 (September 21, 2008) Cetus is a source-to-source compiler infrastructure for C written in Java. FEATURES -------- Cetus 1.0 has the following updated features: * New Symbol/Symbol Table Interface The previous symbol table interface attached a lookup-table for each IR region that defines a scope, and each symbol table entry points to the declaration statement that defines the corresponding variable (table key). The disadvantages of this method are 1) the provided search method is not sufficient to support access to members of a variable such as struct members 2) it allows undeclared identifiers in the IR 3) on-demand search can be expensive depending on the behavior of the compiler passes. A new symbol interface is introduced to provide an easy way to access the attributes of variables. The new method defines each declarator as an object that implements "cetus.hir.Symbol" interface and creates a link from each identifier to its declarator while reporting warnings for undeclared variables. Once this is done, access to symbol attributes can be performed in one step by other compiler passes. The "Symbol" object is also useful when used as a key in any data structures, as it is unique to each variable in the program. The range analysis and the array privatization passes described below were implemented using this new symbol interface, and many utility methods in "cetus.hir.Tools" use this new interface. * Symbolic Expression Manipulation This Cetus release provides enhanced expression manipulation tools that are useful in several ways. Those tools can convert a given expression to a simplified and normalized form and perform symbolic algebraic operations. The following example shows the feature of the simplification tools: 1+4-a+2*a -> 5+a (folding) a*(b+c) -> a*b+a*c (distribution) a*b+a*c -> a*(b+c) (factorization) (8*a)/(2*b) -> (4*a)/b (division) By default, the simplifier provided in "cetus.hir.NormalExpression" performs all the simplification methods except for factorization but pass writers can also customize their manipulation processes by calling each individual transformation in sequence. Look for more details in "cetus.hir.NormalExpression". * Symbolic Range Analysis The symbolic range analysis performs symbolic execution of the program and computes a possible value range for each integer-typed variable at each program point. The resulting set of value ranges provide a method to compare the values of two expressions at compile time. This framework is based on the counterpart in the Polaris parallelizing compiler (predecessor of Cetus) and has been verified with SPEC CPU2006. The following example shows the set of value ranges computed at each program point after the range analysis. Range Domain for Procedure foo [] { [] int foo(int k) int i, j; { [] int i, j; double a; double a; [] for ( i=0; i<10; ++ i ) for ( i=0; i<10; ++i ) [] { { a = 0.5*i; [0<=i<=9] } a=(0.5*i); j = i+k; } return j; [i=10] } j=(i+k); [i=10, j=(i+k)] return j; } See more details in "cetus.analysis.RangeAnalysis" and "cetus.analysis.RangeDomain" for examples of how to use the result of the range analysis in a compiler pass. * Array Privatization The array privatization pass detects privatizable scalars and arrays in each loop. The current approach considers variables that are written first and read later in every iteration, and access at the same memory location as private candidates. The privatizer internally incorporates live variable analysis to mark any "last" private variables after detecting privatizable variables. The current limitations of this pass are: - Loops containing function calls are handled conservatively; any variables that appear in actual parameters and global variables are not listed as private. - Variables with a user-defined type are not listed as private. * Reduction Recognition The reduction pass recognizes additive and multiplicative reduction variables in the loop. Essentially, for additive reductions, the algorithm recognizes statements of the form x = x + expr, where expr is typically a real-valued, loop-variant expression. The reduction variable x can be a scalar or an array expression. One or several reduction statements may appear in a loop, however x must not appear in any non-reduction statement. * Data Dependence Analysis The data dependence analyzer determines whether dependences exist between two subscripted references to the same/aliased arrays in a given loop nest. Array accesses within *eligible* loop nests are tested for dependences, and a data dependence graph (DDG) is constructed using the resulting direction vectors for each loop nest. Loop nests considered *eligible* are: - Canonical loops (Conventional FORTRAN DO loop format with positive stride) - Perfect Loop Nests - Loops without function calls - Loops without control flow modifiers such as break or goto statements Cetus uses the Banerjee-Wolfe Inequalities for dependence testing. The framework handles single and multiple index variable subscripts, multi- dimensional array subscripts as well as coupled subscripts (conservatively). The dependence analyzer also interfaces with array privatization, reduction and alias analyses to filter/include dependences. More details can be found in "cetus.analysis.DDTDriver". * Automatic Loop Parallelization The Loop Parallelization Pass uses data dependence information obtained from the data dependence analysis described above. This information is available via the loop dependence graph (cetus.analysis.DDGraph) for a given loop nest. For every loop within a nest, the dependence graph is checked for loop-carried dependences at that nest-level. If none exist, the loop is annotated as parallel. Limitations: - Serialization of outer loops to enable parallelization of additional inner loops (e.g. when a dependence with a [<,>] direction vector is found) is not yet supported. * Internal Cetus Annotation Statements Results of the various analyses such as privatization, reduction and loop parallelization are now stored internally within the IR as Cetus pragmas. Output source code might contain, for example, the following directives: #pragma cetus reduction(*: s) #pragma cetus private(i) #pragma cetus parallel for (i=0; i $ gzip -d cetus.tar.gz | tar xvf - * Build The organization of the directories has changed since the last release and there are several options for building Cetus: - For Apache Ant users The provided build.xml defines the build targets for Cetus. The available targets are "compile", "jar", "clean" and "javadoc". Users need to edit the location of the Antlr tool. - For Linux/Unix command line users. Run the script build.sh after defining system-dependent variables in the script. - For SDK (Eclipse, Netbeans, etc) users Follow the instructions of each SDK. RUNNING CETUS ------------- Users can run Cetus in the following ways: $ java -classpath= cetus.exec.Driver The "user_class_path" should include the class paths of Antlr and Cetus. "build.sh" and "build.xml" provides a target that generates a wrapper script for Cetus users. TESTING ------- We have tested Cetus successfully using the following benchmark suites: * SPECCPU2006 More information about this suite available at www.spec.org * SPECOMP2001 More information about this suite available at www.spec.org * NPB3.0 Only tested with one existing C benchmark: IS (Integer Sort) More information about NAS Parallel Benchmarks at www.nas.nasa.gov/Software/NPB/ LIMITATIONS ----------- In addition to the limited scope of the features mentioned above, Cetus 1.0 currently does not handle the following cases. * Does not support the simultaneous usage of ANSI C and K&R C function declaration formats within the same source file. e.g. void temp_func(int a, int b); .... .... void temp_func() int a; int b; { .... } Affected benchmarks: 456.hmmer (hsregex.c) * Does not preserve line number information and hence fails during SPECCPU validation. Affected benchmarks: 482.sphinx3 * Parallelization passes were tested with the C codes in SPECOMP2001 and NPB3. September 21, 2008 The Cetus Team URL: http://cetus.ecn.purdue.edu EMAIL: cetus@ecn.purdue.edu