RELEASE
--------
Cetus 1.1 (July 10, 2009)

Cetus is a source-to-source compiler infrastructure for C written in Java.
http://cetus.ecn.purdue.edu

FEATURES/UPDATES
----------------
* Bug fixes
  - Illegal identifiers from parsing unnamed structs
    The previous version used the file name as the prefix for unnamed structs,
    which allowed struct names that start with numeric characters. The current
    version prepends the fixed prefix "named_" to the original name and so
    handles such cases safely.
  - Fixed bugs in the following transformation passes
    -tsingle-declarator
    -tsingle-return
    -normalize

* New flags
  -argument-noalias
    Specifies that arguments (parameters) don't alias each other but may alias
    global storage
  -argument-noalias-global
    Specifies that arguments (parameters) don't alias each other and don't
    alias global storage
  -loop-interchange
    Interchanges loops to improve locality
  -macro
    Defines the specified macros, given as a comma-separated list with no
    spaces, e.g., -macro=ARCH=i686,OS=linux
  -no-side-effect
    Assumes there is no side-effect in the function calls (for range analysis)
  -profile-loops
    Inserts loop profiling calls (1=every, 2=outer, 3=cetus parallel, 4=outer
    cetus parallel, 5=openmp, 6=outer openmp)
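
  As an illustration of what the two no-alias flags assert, the sketch below
  shows C functions whose results change when pointer parameters alias each
  other or alias global storage. All function and variable names here are
  hypothetical and not part of Cetus; only the aliasing assumptions come from
  the flag descriptions above.

```c
#include <assert.h>

/* Hypothetical global storage that a parameter may or may not point to. */
int g_total = 0;

/* Adds *src to *dst twice and returns *dst.
 * With distinct pointers, *src keeps its value between the two reads.
 * If dst and src alias, the first update changes what the second one reads.
 * Both -argument-noalias flags assert that such a call never happens. */
int add_twice(int *dst, int *src)
{
    *dst += *src;
    *dst += *src;
    return *dst;
}

/* Stores through p and then reads the global.
 * If p points to g_total, the store changes the value that is read. This is
 * still permitted under -argument-noalias, but it is ruled out under
 * -argument-noalias-global. */
int store_then_read_global(int *p)
{
    *p = 7;
    return g_total + 1;
}

/* Small self-check that exercises the aliased and non-aliased cases. */
void aliasing_self_check(void)
{
    int a = 1, b = 2, c = 1, x = 0;

    assert(add_twice(&a, &b) == 5);   /* distinct: 1 + 2 + 2      */
    assert(add_twice(&c, &c) == 4);   /* aliased:  (1 + 1) * 2    */

    g_total = 0;
    assert(store_then_read_global(&x) == 1);        /* p is local storage */
    assert(store_then_read_global(&g_total) == 8);  /* p is the global    */
}
```

  Calling aliasing_self_check() from a test driver runs all four cases. An
  analysis that is told the arguments do not alias may treat the two reads of
  *src in add_twice() as the same value.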

* Removed flags
  -antlr     : default behavior
  -parse-only: default behavior
  -usage     : default behavior
  -cfg       : no support
  -inline=N  : no support
  -openmp    : no support
  -procs=N   : no support
  -tloops-to-subroutine: no support

* Experimental flags
  -tsingle-call
    This flag is not yet stable. Contact the Cetus developers for support.

* Improved symbolic tools
  The accuracy of the symbolic range analysis was improved substantially and
  it provides tighter bounds of integer variables used as index variables than
  the previous version does. We have also improved the symbolic expression
  comparator so that it handles complex expressions such as division
  expressions.
  We also provide a new set of simplification utilities in
  "cetus.hir.Symbolic", which is cleaner and more efficient than the
  existing one in "cetus.analysis.NormalExpression". It is backward-compatible
  with "NormalExpression", so users can safely switch to "Symbolic". The
  following methods are the most commonly used:
  - Symbolic.simplify(e)     : simplifies the given expression e
  - Symbolic.add(e1, e2)     : performs e1+e2 followed by simplification
  - Symbolic.multiply(e1, e2): performs e1*e2 followed by simplification
  - Symbolic.subtract(e1, e2): performs e1-e2 followed by simplification
  - Symbolic.divide(e1, e2)  : performs e1/e2 followed by simplification

* Annotations
  The new version of Cetus extends our initial implementation of Annotations
  to a completely new IR structure and API. In the initial implementation,
  comments, pragmas and other annotations were parsed in as Annotations and
  enclosed inside DeclarationStatements (in most cases).
  In the new implementation, parsed-in annotations are converted to a new
  internal representation through the AnnotationParser. The AnnotationParser
  stores annotations under the corresponding subclasses:
  - PragmaAnnotation
    - CetusAnnotation (#pragma cetus ...)
    - OmpAnnotation (#pragma omp ...)
  - CommentAnnotation (e.g. /* ... */)
  - CodeAnnotation (Raw printing)
  Annotations are stored as member objects of "Annotatable" IR (Statements
  and Declarations), so the new API for manipulating them is available
  through Annotatable. Standalone annotations are enclosed in special IR
  such as AnnotationDeclaration or AnnotationStatement (note that
  AnnotationStatement differs from the one in the previous release).
  The API in Annotatable *COMPLETELY REPLACES* previous functionality
  provided through Tools.java. We understand this can require substantial
  updates for Cetus users, but we believe the new representation is much
  cleaner and warrants a small amount of time required to port over to the
  new release.

* Automatic Parallelization updates:
1. Improved symbolic handling by Banerjee test
  The DDTDriver framework now uses symbolic information through range analysis
  in order to simplify symbolic subscripts and symbolic values associated with
  loop bounds and increments. This information enhances the data dependence
  results provided by the Banerjee-Wolfe inequalities leading to better
  parallelization results.
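
  The bounds check behind the Banerjee inequalities can be sketched for a
  single pair of linear subscripts as follows. This is a simplified
  illustration and not the DDTDriver implementation: it assumes one loop
  index per reference, known integer bounds (which range analysis would
  supply for symbolic terms), and no direction-vector refinement. The
  function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static long min2(long a, long b) { return a < b ? a : b; }
static long max2(long a, long b) { return a > b ? a : b; }

/* Tests whether a[c1*i + c0] and a[d1*j + d0] can touch the same element
 * for some i and j in [lo, hi]. The accesses collide exactly when
 * c1*i - d1*j == d0 - c0, so dependence is disproved if d0 - c0 lies
 * outside the extreme values of c1*i - d1*j over the iteration range.
 * Returns false when dependence is disproved and true when this test
 * cannot rule it out (the test is conservative). */
bool banerjee_may_depend(long c1, long c0, long d1, long d0, long lo, long hi)
{
    long rhs = d0 - c0;

    /* Extreme values of each term, valid for either sign of c1 and d1. */
    long min_ci = min2(c1 * lo, c1 * hi);
    long max_ci = max2(c1 * lo, c1 * hi);
    long min_dj = min2(d1 * lo, d1 * hi);
    long max_dj = max2(d1 * lo, d1 * hi);

    long lower = min_ci - max_dj;
    long upper = max_ci - min_dj;
    return lower <= rhs && rhs <= upper;
}

/* Self-check with 0 <= i, j <= 9. */
void banerjee_self_check(void)
{
    /* a[i] vs a[j+100]: the offset 100 exceeds the range [-9, 9]. */
    assert(!banerjee_may_depend(1, 0, 1, 100, 0, 9));
    /* a[i] vs a[j+1]: the offset 1 is within [-9, 9]. */
    assert(banerjee_may_depend(1, 0, 1, 1, 0, 9));
    /* a[2*i] vs a[2*j+1]: never equal, but the bounds alone cannot show it. */
    assert(banerjee_may_depend(2, 0, 2, 1, 0, 9));
}
```

  The last case shows why tighter symbolic ranges matter: the bounds test only
  disproves a dependence when the offset falls outside the computed interval,
  so narrower bounds on loop limits and subscripts disprove more dependences.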

2. Non-perfect loop nest handling
  The DDT and parallelization frameworks now support testing for parallel
  loops in non-perfect nests by using common enclosing loop information
  and accommodating variable length direction vectors during
  parallelization.
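
  For readers unfamiliar with the term, the sketch below shows a small
  non-perfect nest. It is an illustrative example written for this note, with
  hypothetical names, and is not taken from the Cetus test suite.

```c
#include <assert.h>

#define N 8

/* The statement that clears row_sum[i] sits between the two loop headers,
 * so the nest is not perfect. A dependence between that statement and the
 * inner-loop body is described relative to the i loop only (their single
 * common enclosing loop), while a dependence between two inner statements
 * is described relative to both i and j. The direction vectors therefore
 * have different lengths within the same nest. */
void nonperfect_nest(double a[N][N], double row_sum[N])
{
    for (int i = 0; i < N; i++) {
        row_sum[i] = 0.0;                 /* enclosed by the i loop only */
        for (int j = 0; j < N; j++) {
            a[i][j] = (double)(i + j);    /* enclosed by i and j */
            row_sum[i] += a[i][j];        /* carried by j, not by i */
        }
    }
}

/* Self-check on a few known values. */
void nonperfect_self_check(void)
{
    double a[N][N];
    double row_sum[N];

    nonperfect_nest(a, row_sum);
    assert(a[2][5] == 7.0);
    assert(row_sum[0] == 28.0);   /* 0 + 1 + ... + 7 */
    assert(row_sum[3] == 52.0);   /* 8*3 + 28 */
}
```

  In this example every iteration of the i loop writes its own row of a and
  its own element of row_sum, so the outer loop carries no dependence even
  though the inner loop accumulates into row_sum[i].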

3. Improved function call handling inside of loops
  The DDT framework can now handle calls inside of loops to library functions
  that are known to have no side effects. This is currently a fixed list of
  functions that includes log(), sqrt(), fabs() and possibly other calls
  encountered in parallelizable loops in our benchmarks. Future releases
  will allow this list to be extended. If your application calls a function
  that is safe to parallelize, you can add it to this list; see
  LoopTools.java for details.
  The framework also uses simple interprocedural side-effect analysis in
  order to determine eligible loops for parallelization.

4. Interface to Data Dependence Graph
  A more efficient and convenient implementation of the Data Dependence graph
  is used in this version of Cetus. The dependence graph is created once and
  attached to the Program IR object for use by other passes. A reference to this
  DDGraph object gives access to the dependence information API provided by
  DDGraph. The DDGraph implementation and API provided in Cetus v1.0 have been
  deprecated.

5. Induction variable substitution pass
  We provide a fully working induction variable substitution pass which was
  experimental before. It was designed to support "Generalized Induction
  Variables" that may increase multiple times in multiply nested loops. We use
  aggressive symbolic manipulators for this transformation pass to maximize
  the coverage of the transformation and the subsequent loop parallelizer.
  The capability of the IV substitution pass is illustrated in the example code
  for the range test below.

6. Non-linear and symbolic data dependence test (Range Test)
  We ported the powerful non-linear and symbolic DD test, "Range Test", which
  was implemented in the Polaris parallelizing compiler for Fortran programs.
  The test disproves dependences through "overlap" checking, which compares
  the extreme values of subscripts over the whole loop or between two
  consecutive iterations, in conjunction with the monotonicity properties of
  the subscripts.
  It works seamlessly with the data dependence framework in Cetus, and users
  can turn on the range test by specifying the flag "-ddt=3".
  The following example highlights the capability of the range test when
  parallelizing the given loop in conjunction with the induction variable
  substitution pass:

    k = 0;                           k = 0;
    for (i=0; i<10; i++) {           #pragma omp parallel for private(i)
      k = k+i;                -->    for (i=0; i<10; i++) {
      a[k] = ...;                      a[(i+i*i)/2] = ...;
    }                                }
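
  The transformation above can be checked with a small, self-contained C
  sketch. The function names are hypothetical, the right-hand side "..." of
  the original assignment is replaced by the loop index purely for
  illustration, and the monotonicity check is a brute-force numeric stand-in
  for the symbolic reasoning that the range test performs.

```c
#include <assert.h>

/* Original form: k is an induction variable incremented by a loop-variant
 * amount, so iteration i needs the value of k left by iteration i-1. */
void fill_with_induction(int *a, int n)
{
    int k = 0;
    for (int i = 0; i < n; i++) {
        k = k + i;
        a[k] = i;
    }
}

/* After substitution: k is replaced by its closed form (i + i*i)/2, so no
 * scalar value flows between iterations. The pragma is ignored unless the
 * compiler is run with OpenMP enabled. */
void fill_with_closed_form(int *a, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        a[(i + i * i) / 2] = i;
    }
}

/* Subscript written by iteration i after substitution. */
int subscript(int i)
{
    return (i + i * i) / 2;
}

/* Returns 1 if the subscript strictly increases from each iteration to the
 * next for 0 <= i < n, which means no two iterations write the same
 * element of a. */
int consecutive_iterations_disjoint(int n)
{
    for (int i = 0; i + 1 < n; i++) {
        if (subscript(i) >= subscript(i + 1)) {
            return 0;
        }
    }
    return 1;
}

/* Self-check for n = 10; the largest subscript is (9 + 81)/2 = 45. */
void range_test_self_check(void)
{
    int a[46] = {0};
    int b[46] = {0};

    fill_with_induction(a, 10);
    fill_with_closed_form(b, 10);
    for (int j = 0; j < 46; j++) {
        assert(a[j] == b[j]);
    }
    assert(consecutive_iterations_disjoint(10));
}
```

  Because subscript(i+1) - subscript(i) equals i+1, the subscript is strictly
  increasing for non-negative i. This is the kind of monotonicity property
  that lets the overlap check between consecutive iterations succeed.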

7. Improved parallelization results
  The above updates have led to significant improvement in automatic 
  parallelization results using Cetus. You can refer to our latest
  Cetus tutorial at http://cetus.ecn.purdue.edu for detailed
  information.

CONTENTS
--------
This Cetus release has the following contents.

  src       - Source code of Cetus
  lib       - Archived classes (jar)
  api       - Javadoc API documentation
  build.sh  - Command line build script
  build.xml - Build configuration for Apache Ant
  readme    - This file
  license   - Cetus license

REQUIREMENTS
------------
* JAVA 2 SDK, SE 1.5.x (or later)
* ANTLRv2 
* GCC
 
INSTALLATION
------------
* Obtain Cetus distribution
  The latest version of Cetus can be obtained at:
  http://cetus.ecn.purdue.edu/

* Unpack
  Users need to unpack the distribution before installing Cetus.
  $ cd <directory_where_cetus.tar.gz_exists>
  $ gzip -dc cetus.tar.gz | tar xvf -

* Build
  There are several options for building Cetus:
  - For Apache Ant users
    The provided build.xml defines the build targets for Cetus. The available
    targets are "compile", "jar", "clean" and "javadoc". Users need to set
    the location of the ANTLR tool in build.xml.
  - For Linux/Unix command line users
    Run the script build.sh after defining system-dependent variables in the
    script.
  - For IDE (Eclipse, NetBeans, etc.) users
    Follow the instructions of each IDE.

RUNNING CETUS
-------------
Users can run Cetus as follows:

  $ java -classpath <user_class_path> cetus.exec.Driver <options> <C files>

The "user_class_path" should include the class paths of ANTLR and Cetus.
"build.sh" and "build.xml" provide a target that generates a wrapper script
for Cetus users.

TESTING
-------
We have tested Cetus successfully using the following benchmark suites:

* SPECCPU2006
  More information about this suite is available at www.spec.org

* SPECOMP2001
  More information about this suite is available at www.spec.org

* NPB3.0
  More information about the NAS Parallel Benchmarks is available at
  www.nas.nasa.gov/Software/NPB/

KNOWN BUGS
----------
In addition to the limited scope of the features mentioned above, Cetus 1.1
currently does not handle the following cases.

* Does not support the simultaneous usage of ANSI C and K&R C
  function declaration formats within the same source file.
  e.g.
  void temp_func(int a, int b);
  ....
  ....
  void temp_func(a, b)
    int a;
    int b;
  {
  ....
  }

  Affected benchmarks: 456.hmmer (hsregex.c)

* Does not preserve line number information and hence fails during SPECCPU
  validation.

  Affected benchmarks: 482.sphinx3

* Does not handle parsing and IR creation for GNU GCC __asm__ extensions. This
  will be addressed in the next release.

July 10, 2009
The Cetus Team

URL: http://cetus.ecn.purdue.edu
EMAIL: cetus@ecn.purdue.edu