
ECE 462 Fall 2010

Individual Programming Assignment III:

(C++) Parallel Programming


Summary

This assignment asks you to improve the performance of a program (matrix multiplication) by using multiple threads.

Requirements

A single-thread program has been written for you and you can download it using

svn checkout http://purdue-wl-ece462.googlecode.com/svn/trunk/2010/IPA purdue-wl-ece462-read-only

Your responsibility is to make it as fast as possible, while still producing correct results, on computers with multiple cores. You can use your class account (not your own Purdue account) on these machines: (algolxx, qstructxx, and qutro01).ecn.purdue.edu. Your program will be graded on quatro02.ecn.purdue.edu, which has 32 cores and 128GB of memory. You can use any technique to improve the program's performance, including but not limited to blocking. If you are taking or have taken ECE 437, you should know how the cache may affect performance.
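
As an illustration of the two ideas mentioned above (multiple threads and blocking), here is a minimal sketch. It is not the sample program's code: the flat row-major `Mat` type, the function names `multiplyRows` and `blockedMultiply`, and the default block size of 64 are all assumptions made for this example, and the sample program's own matrix class and `multiply(m2, numThread)` interface may look different. Each thread computes a disjoint band of rows of the result, so no locking is needed, and the inner loops work on small tiles so that the data being reused is more likely to stay in cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical storage: an n x n matrix kept row-major in a flat vector.
using Mat = std::vector<double>;

// Accumulate rows [rowBegin, rowEnd) of c += a * b, one tile at a time.
// c must be zero-initialized by the caller.
void multiplyRows(const Mat& a, const Mat& b, Mat& c, std::size_t n,
                  std::size_t rowBegin, std::size_t rowEnd, std::size_t block) {
    for (std::size_t ii = rowBegin; ii < rowEnd; ii += block) {
        const std::size_t iMax = std::min(ii + block, rowEnd);
        for (std::size_t kk = 0; kk < n; kk += block) {
            const std::size_t kMax = std::min(kk + block, n);
            for (std::size_t jj = 0; jj < n; jj += block) {
                const std::size_t jMax = std::min(jj + block, n);
                // Multiply one tile of a by one tile of b.
                for (std::size_t i = ii; i < iMax; ++i) {
                    for (std::size_t k = kk; k < kMax; ++k) {
                        const double aik = a[i * n + k];
                        for (std::size_t j = jj; j < jMax; ++j) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}

// Split the rows of the result evenly among numThread threads.
Mat blockedMultiply(const Mat& a, const Mat& b, std::size_t n,
                    unsigned numThread, std::size_t block = 64) {
    assert(numThread > 0 && block > 0);
    assert(a.size() == n * n && b.size() == n * n);
    Mat c(n * n, 0.0);
    std::vector<std::thread> workers;
    const std::size_t rowsPerThread = (n + numThread - 1) / numThread;
    for (unsigned t = 0; t < numThread; ++t) {
        const std::size_t begin = std::min<std::size_t>(n, t * rowsPerThread);
        const std::size_t end = std::min(n, begin + rowsPerThread);
        if (begin == end) break;  // more threads than rows: nothing left to do
        // a, b, c, n, and block outlive the threads because we join below.
        workers.emplace_back([&, begin, end] {
            multiplyRows(a, b, c, n, begin, end, block);
        });
    }
    for (auto& w : workers) w.join();
    return c;
}
```

This sketch uses `std::thread`, which needs a compiler with C++11 support and usually the `-pthread` flag with g++. A good block size depends on the cache sizes of the machine, so it is worth measuring a few values rather than trusting the default used here.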

Please remember that your program is evaluated by correctness first. If your program produces wrong results, it does not matter how fast the program is.

Grading

Grading is divided into three parts. You must first receive the full 2 points for correctness. If your program fails the correctness tests, it will not be graded on scalability or performance.

  • Correctness (2 points): Your program must produce correct results. We will test matrices of sizes 64, 256, 1024, and 4096. Your program must be "reasonably fast". If your program takes more than minutes, we assume your program is not functional and will stop it. Our tests show that it takes no more than 20 min on quatro02.
  • Scalability (1 point):

The performance considers only matrix multiplication and nothing else.  This is reflected by the code

      gettimeofday(&t1, 0);
      m3 = m1 -> multiply(m2, numThread);
      gettimeofday(&t2, 0);
      msec = 1.0e+3 * (t2.tv_sec - t1.tv_sec) +
          1.0e-3 * (t2.tv_usec - t1.tv_usec);

Using a single thread as the basis, how fast does your program run when using more threads? If the sample program takes x seconds running with a single thread and your program takes y seconds using t threads, your score for this case is

x / (t * y)

If the sample program takes 3 seconds using a single thread and your program takes 2 seconds using 4 threads, your score is 3 / (2 * 4) = 0.375. If your program takes 1 second using 4 threads, the score is 3 / (1 * 4) = 0.75.

We will test matrix sizes 64, 256, 1024, and 4096 using 1, 2, 4, 8, 16, and 32 threads. There are 4 x 6 = 24 cases. Your score in this part is the sum of all tests divided by 24. It is possible to obtain a score higher than one in some cases.

  • Performance (2 points): In this part, your program's performance is compared with MATLAB's performance. If your program takes x seconds and MATLAB takes y seconds, you will receive

(4 * y) / x

If your program takes 8 seconds and MATLAB takes 1.2 seconds, you will receive (4 * 1.2) / 8 = 0.6. If your program is as fast as MATLAB (x equals y), you will receive 4. We will test matrix sizes 1024, 2048, and 4096 using 16 and 32 threads. There are 3 * 2 = 6 cases. Your score is the sum of all tests divided by 3 (6 cases / 2 points = 3). If your program is as fast as MATLAB in every case, you will receive 8 points.
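
The two scoring formulas above can be checked with a few lines of code. The sketch below is only a restatement of the arithmetic, not the actual grading script: the struct and function names are made up for this example, and the timings would come from your own measurements.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record of one scalability test.
struct ScalabilityCase {
    double sampleSeconds;  // x: sample program, single thread
    double yourSeconds;    // y: your program, t threads
    int threads;           // t
};

// Hypothetical record of one performance test.
struct PerformanceCase {
    double yourSeconds;    // x: your program
    double matlabSeconds;  // y: MATLAB
};

// Score for one scalability case: x / (t * y).
double scalabilityScore(const ScalabilityCase& c) {
    assert(c.yourSeconds > 0.0 && c.threads > 0);
    return c.sampleSeconds / (c.threads * c.yourSeconds);
}

// Scalability part: sum of all cases divided by the number of cases
// (24 cases in the grading described above).
double scalabilityTotal(const std::vector<ScalabilityCase>& cases) {
    assert(!cases.empty());
    double sum = 0.0;
    for (const auto& c : cases) sum += scalabilityScore(c);
    return sum / static_cast<double>(cases.size());
}

// Score for one performance case: (4 * y) / x.
double performanceScore(const PerformanceCase& c) {
    assert(c.yourSeconds > 0.0);
    return 4.0 * c.matlabSeconds / c.yourSeconds;
}

// Performance part: sum of the six cases divided by 3.
double performanceTotal(const std::vector<PerformanceCase>& cases) {
    double sum = 0.0;
    for (const auto& c : cases) sum += performanceScore(c);
    return sum / 3.0;
}
```

For instance, `scalabilityScore({3.0, 2.0, 4})` reproduces the 0.375 from the scalability example, and six performance cases in which your time equals MATLAB's give a total of 8.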

To make your program faster, use "-O" in g++ to enable optimization. Please read the manual page of g++ for details.

As a way to encourage better performance, the maximum score of this assignment is 10 points; in other words, you may receive 100% bonus points.

Warning

Your program must have exactly the same output format as the sample program. Your program will be automatically graded. If the format is different, you will receive no points.

Acknowledgments: The computers (algol, qstruct, and qutro) were donated by Intel. This course is partially supported by the National Science Foundation.