RELEASE
--------
Cetus 1.1 (July 10, 2009)

Cetus is a source-to-source compiler infrastructure for C written in Java.
http://cetus.ecn.purdue.edu

FEATURES/UPDATES
----------------
* Bug fixes
  - Illegal identifiers from parsing unnamed structs
    The previous version used the file name as the prefix for unnamed structs
    allowing struct names that start with numeric characters. The current
    version adds a fixed head "named_" before the original name and handles
    such cases safely.
  - Fixed bugs in the following transformation passes
    -tsingle-declarator
    -tsingle-return
    -normalize

* New flags
  -argument-noalias
      Specifies that arguments (parameters) don't alias each other but may
      alias global storage
  -argument-noalias-global
      Specifies that arguments (parameters) don't alias each other and don't
      alias global storage
  -loop-interchange
      Interchange loop to improve locality
  -macro
      Sets macros for the specified names with comma-separated list (no space
      is allowed). e.g., -macro=ARCH=i686,OS=linux
  -no-side-effect
      Assume there is no side-effect in the function calls (for range
      analysis)
  -profile-loops
      Inserts loop profiling calls (1=every, 2=outer, 3=cetus parallel,
      4=outer cetus parallel, 5=openmp 6=outer openmp)

* Removed flags
  -antlr                : default behavior
  -parse-only           : default behavior
  -usage                : default behavior
  -cfg                  : no support
  -inline=N             : no support
  -openmp               : no support
  -procs=N              : no support
  -tloops-to-subroutine : no support

* Experimental flags
  -tsingle-call
      It is not stable for now. Contact the cetus developers for support.

* Improved symbolic tools
  The accuracy of the symbolic range analysis was improved substantially and
  it provides tighter bounds of integer variables used as index variables than
  the previous version does. We have also improved the symbolic expression
  comparator so that it handles complex expressions such as division
  expressions.
  We also provide a new set of simplification utilities in
  "cetus.hir.Symbolic", which is cleaner and more efficient than the existing
  one in "cetus.analysis.NormalExpression". It is backward-compatible with
  "NormalExpression", so users can safely switch to "Symbolic". The following
  methods are the most commonly used:
  - Symbolic.simplify(e)     : simplifies the given expression e
  - Symbolic.add(e1, e2)     : performs e1+e2 followed by simplification
  - Symbolic.multiply(e1, e2): performs e1*e2 followed by simplification
  - Symbolic.subtract(e1, e2): performs e1-e2 followed by simplification
  - Symbolic.divide(e1, e2)  : performs e1/e2 followed by simplification

* Annotations
  The new version of Cetus extends our initial implementation of Annotations
  to a completely new IR structure and API. Comments, pragmas and other
  annotations were initially parsed in as Annotations and enclosed inside
  DeclarationStatements (in most cases). In the new implementation, parsed-in
  annotations are converted to a new internal representation through the
  AnnotationParser. The AnnotationParser stores annotations under the
  corresponding subclasses:
  - PragmaAnnotation
    - CetusAnnotation (#pragma cetus ...)
    - OmpAnnotation (#pragma omp ...)
  - CommentAnnotation (e.g. /* ... */)
  - CodeAnnotation (Raw printing)

  Annotations are stored as member objects of "Annotatable" IR (Statements and
  Declarations), so the new API for manipulating them is available through
  Annotatable. Standalone annotations are enclosed in special IR such as
  AnnotationDeclaration or AnnotationStatement (note that AnnotationStatement
  is different from the previous release). The API in Annotatable *COMPLETELY
  REPLACES* the previous functionality provided through Tools.java. We
  understand this can require substantial updates for Cetus users, but we
  believe the new representation is much cleaner and warrants the small amount
  of time required to port over to the new release.

* Automatic Parallelization updates:
  1.
     Improved symbolic handling by Banerjee test
     The DDTDriver framework now uses symbolic information through range
     analysis in order to simplify symbolic subscripts and symbolic values
     associated with loop bounds and increments. This information enhances the
     data dependence results provided by the Banerjee-Wolfe inequalities,
     leading to better parallelization results.

  2. Nonperfect loop nest handling
     The DDT and parallelization frameworks now support testing for parallel
     loops in non-perfect nests by using common enclosing loop information and
     accommodating variable length direction vectors during parallelization.

  3. Improved function call handling inside of loops
     The DDT framework can now handle system calls inside of loops that are
     known to have no side effects. This is currently a fixed list of
     functions that includes log(), sqrt(), fabs() and possibly other calls
     encountered in parallelizable loops in our benchmarks. Future releases
     will allow this list to be extended. If you are aware of a parallelizable
     function call within your application, it can be easily added to this
     list. Look for details in LoopTools.java.
     The framework also uses simple interprocedural side-effect analysis in
     order to determine eligible loops for parallelization.

  4. Interface to Data Dependence Graph
     A more efficient and convenient implementation of the Data Dependence
     graph is used in this version of Cetus. The dependence graph is created
     once and attached to the Program IR object for use by other passes. A
     reference to this DDGraph object gives access to the dependence
     information API provided by DDGraph. The DDGraph implementation and API
     provided in Cetus v1.0 have been deprecated.

  5. Induction variable substitution pass
     We provide a fully working induction variable substitution pass, which
     was experimental before. It was designed to support "Generalized
     Induction Variables" that may increase multiple times in multiply nested
     loops.
     We use aggressive symbolic manipulators for this transformation pass to
     maximize the coverage of the transformation and the subsequent loop
     parallelizer. The capability of the IV substitution pass is illustrated
     in the example code for the range test below.

  6. Non-linear and symbolic data dependence test (Range Test)
     We ported the powerful non-linear and symbolic DD test, "Range Test",
     which was implemented in the Polaris parallelizing compiler for Fortran
     programs. The test disproves dependences based on "overlap" checking,
     which compares extreme values of subscripts for the whole loop or between
     two consecutive iterations in conjunction with the monotonicity
     properties of the subscripts. It works seamlessly with the data
     dependence framework in Cetus, and users can turn on the range test by
     specifying the flag "-ddt=3". The following example highlights the
     capability of the range test when parallelizing the given loop in
     conjunction with the induction variable substitution pass:

       k = 0;                          k = 0;
       for (i=0; i<10; i++) {          #pragma omp parallel for private(i)
         k = k+i;                -->   for (i=0; i

  $ gzip -dc cetus.tar.gz | tar xvf -

* Build
  There are several options for building Cetus:
  - For Apache Ant users
    The provided build.xml defines the build targets for Cetus. The available
    targets are "compile", "jar", "clean" and "javadoc". Users need to edit
    the location of the Antlr tool.
  - For Linux/Unix command line users
    Run the script build.sh after defining system-dependent variables in the
    script.
  - For IDE (Eclipse, Netbeans, etc) users
    Follow the instructions of each IDE.

RUNNING CETUS
-------------
Users can run Cetus in the following ways:

  $ java -classpath=<user_class_path> cetus.exec.Driver

The "user_class_path" should include the class paths of Antlr and Cetus.
Both "build.sh" and "build.xml" provide a target that generates a wrapper
script for Cetus users.
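
For readers who want a small input to try with the "-ddt=3" flag, the
following is a minimal sketch in C of the kind of loop that the induction
variable substitution pass and the range test are described as handling. The
function names, the array size, and the closed form (i*i+i)/2 are assumptions
worked out by hand for this illustration; they are not copied from Cetus
output. The pragma mirrors the one shown in the range test example.

```c
#include <assert.h>

#define N    10   /* trip count used in the range test example */
#define SIZE 46   /* largest subscript written is 0+1+...+9 = 45 */

/* Original form: k accumulates i, so each subscript a[k] depends on the
   value of k left behind by the previous iteration. */
void fill_sequential(int *a)
{
    int i, k = 0;
    for (i = 0; i < N; i++) {
        k = k + i;
        a[k] = i;
    }
}

/* Hand-substituted form (assumption): k after iteration i equals
   0+1+...+i = (i*i+i)/2. The subscript is non-linear but strictly
   increasing in i, so no two iterations touch the same element.
   The pragma is ignored when compiling without OpenMP support. */
void fill_substituted(int *a)
{
    int i;
    #pragma omp parallel for private(i)
    for (i = 0; i < N; i++) {
        a[(i * i + i) / 2] = i;
    }
}

/* Checks that both forms write identical values to identical elements. */
void check_versions_agree(void)
{
    int a[SIZE] = {0};
    int b[SIZE] = {0};
    int j;

    fill_sequential(a);
    fill_substituted(b);
    for (j = 0; j < SIZE; j++) {
        assert(a[j] == b[j]);
    }
}
```

The difference between the subscripts of consecutive iterations is i+1, which
is positive for every i in the loop. That monotonicity is the property the
range test description relies on. Whether Cetus emits exactly the second form
for the first one depends on the passes that are enabled.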

TESTING
-------
We have tested Cetus successfully using the following benchmark suites:

* SPECCPU2006
  More information about this suite is available at www.spec.org

* SPECOMP2001
  More information about this suite is available at www.spec.org

* NPB3.0
  More information about the NAS Parallel Benchmarks is available at
  www.nas.nasa.gov/Software/NPB/

KNOWN BUGS
----------
In addition to the limited scope of the features mentioned above, Cetus 1.1
currently does not handle the following cases.

* Does not support the simultaneous usage of ANSI C and K&R C function
  declaration formats within the same source file.
  e.g. void temp_func(int a, int b);
       ....
       ....
       void temp_func()
       int a;
       int b;
       {
         ....
       }
  Affected benchmarks: 456.hmmer (hsregex.c)

* Does not preserve line number information and hence fails during SPECCPU
  validation.
  Affected benchmarks: 482.sphinx3

* Does not handle parsing and IR creation for GNU GCC __asm__ extensions.
  Will be addressed in the next release.

July 10, 2009
The Cetus Team

URL: http://cetus.ecn.purdue.edu
EMAIL: cetus@ecn.purdue.edu