
Exercise 3

Debugger

Due September 18, 2012 @ 6:00 pm

 

IMPORTANT NOTE: YOU NEED TO DO THIS EXERCISE IN EITHER EE206 OR EE207.

In computing, a bug means an error or a mistake. The word is often traced to a moth found stuck in a relay. Since then, the word has been widely used for any problem related to computers, hardware or software. Debugging means finding and solving problems. For many students taking programming courses, debugging is a long process filled with frustration. It doesn't have to be so! This exercise gives you several tips for preventing bugs. If there are bugs, you can use a debugger to find and fix them more quickly.

Prevent Bugs

Some students write programs quickly and then spend many hours debugging. This is the wrong approach. You should first prevent bugs in your programs. The following guidelines can help you reduce the number of bugs.

  • Think before coding. You have to develop a strategy before writing code. This strategy should be clearly explained in README. For a complex program, it is usually true that "The earlier you start coding, the longer it will take."
  • Write comments before writing code. Writing comments can help you organize your thoughts.
  • Add one feature each time. You need to have a plan before writing a program. What feature are you going to add first? Why? How do you know the feature is correctly implemented? Don't implement several features simultaneously and get nothing done.
  • Have a testing plan. How do you know the program is correct? Think about testing before writing code.
  • Create functions to handle repeated tasks. If you need to do something multiple times, create a function to do it so that you do not have to copy-paste code. If you need to do something similar with slight differences, use the function's arguments to handle the differences. Copy-pasted code is a common source of bugs because you have to keep the multiple copies consistent. If you change one place and forget to change another place, here come your bugs.
  • Give meaningful variable names. Many students like to use i, j, k for variables. There are many problems. First, you won't remember what they mean a few days later. Second, i, j, and k are close together on a keyboard. If you want to use j and type k, the program will run but generate wrong results. This error is hard to find.
  • Initialize variables and pointers. If you do not initialize them, their values can be anything.
  • Use indentation. You should use an editor that can automatically format code for you so that you can easily discover unmatched brackets. Eclipse, emacs, and xemacs are good choices. They also provide syntax highlighting.
  • Read your code after typing.  After typing a section of code, say 25 lines, read the code before proceeding. You should be able to run the code in your head before running the code in a computer. If you do not know what the code is doing, running it on a computer will not help.
  • Turn on compiler warnings. Add "-Wall" and "-Wshadow" after "gcc". In many cases, a warning points to an actual error. You should eliminate all warning messages.
  • Avoid global variables. They can be changed anywhere in a program and are very hard to track.
  • Write small experimental code before integrating into the whole program. If you are uncertain how to do something, isolate the functionality and write a small experimental program first. Make sure you fully understand the code before integrating it into the larger program.
  • Check the return values. When your program allocates memory, is malloc successful?  You should read the manual to understand the meanings of the return values. If your program reads from a file, does the file exist?
  • Check array indexes. C does not check whether the index is valid. You have to check. If you have an array of 10 elements, the valid range is 0 to 9 (inclusive). Please remember that 10 is invalid. You cannot use a negative number, either.
  • Use valgrind to check for invalid memory accesses and memory leaks.
  • Understand the limitation of testing. Testing can only tell you that your program has problems. Testing does not tell you that your program is correct. If your program produces correct results for 10 test cases, there is no guarantee that your program can produce the correct result for the 11th test case. You should spend more time designing, writing, understanding, and tracing your program. Do not rely on testing to make your program correct.
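
Several of the guidelines above (initialize variables and pointers, check the return values, check array indexes) can be sketched together in a few lines of C. This is only an illustration: the function names isValidIndex and readIntegers are hypothetical and are not part of the exercise files.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if index is valid for an array of
   length elements (0 to length - 1), 0 otherwise. */
int isValidIndex(int index, int length)
{
  return (index >= 0) && (index < length);
}

/* Hypothetical helper: reads at most maxElem integers from fileName
   into a newly allocated array. Returns the number of integers read,
   or -1 if the file cannot be opened or memory cannot be allocated.
   On success the caller must free(*arrayPtr). */
int readIntegers(const char * fileName, int maxElem, int * * arrayPtr)
{
  FILE * fhd = NULL;      /* initialize pointers */
  int * intArray = NULL;
  int numElem = 0;        /* initialize variables */
  *arrayPtr = NULL;
  if (maxElem <= 0)
    {
      return -1;
    }
  fhd = fopen(fileName, "r");
  if (fhd == NULL)        /* does the file exist? */
    {
      return -1;
    }
  intArray = malloc(sizeof(int) * maxElem);
  if (intArray == NULL)   /* is malloc successful? */
    {
      fclose(fhd);
      return -1;
    }
  /* stop before the index becomes invalid or when a read fails */
  while ((numElem < maxElem) &&
         (fscanf(fhd, "%d", & intArray[numElem]) == 1))
    {
      numElem ++;
    }
  fclose(fhd);
  *arrayPtr = intArray;
  return numElem;
}
```

A caller would test the returned count against -1 before using the array, and call isValidIndex before reading any element whose index comes from outside the program.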

Do not use assertions: Some books tell readers to use assertions to ensure that the program terminates when something is wrong.

                assert(something must be true)

for example

                assert(x >= 0);

If x is smaller than zero, the program terminates. However, using assertions is a terrible way of writing programs. Consider the situation when you ask a user to enter a file name. The user enters a wrong file name and this file does not exist. The program should ask the user to enter the right name. It is unacceptable to terminate the program. This is very frustrating to a user. The user has no idea what is wrong. The program terminates with a message "assertion failed". What can the user do? Nothing. Worse, the user may have done some work before reaching the assertion. After the program terminates, the user has to restart from scratch. Your program should handle the problem, not simply terminate. Never use assertions in your programs. Some books tell you to use assertions; unfortunately, they are wrong. Using assertions is an irresponsible way of writing programs. Handle the problem. Do not terminate the program.
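
As a minimal sketch of handling the problem instead of terminating, the code below checks that an input is a non-negative number and asks again when it is not. The function names parseNonNegative and askNonNegative, the prompt text, and the 64-character line buffer are assumptions made for this illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts text to a non-negative integer.
   Returns 1 and stores the number in *value on success.
   Returns 0 and leaves *value unchanged if text is not a
   non-negative integer. The program keeps running either way. */
int parseNonNegative(const char * text, long * value)
{
  char * end = NULL;
  long number = 0;
  if ((text == NULL) || (value == NULL))
    {
      return 0;
    }
  errno = 0;
  number = strtol(text, & end, 10);
  /* nothing converted, extra characters, or out of range */
  if ((end == text) || (* end != '\0') || (errno == ERANGE))
    {
      return 0;
    }
  if (number < 0)
    {
      return 0;
    }
  * value = number;
  return 1;
}

/* Hypothetical helper: keeps asking until the user enters a valid
   number. Returns 0 only when there is no more input to read. */
int askNonNegative(long * value)
{
  char line[64];
  printf("Enter a number (zero or larger): ");
  while (fgets(line, sizeof(line), stdin) != NULL)
    {
      line[strcspn(line, "\n")] = '\0'; /* remove the newline */
      if (parseNonNegative(line, value))
        {
          return 1;
        }
      printf("Invalid input. Enter a number (zero or larger): ");
    }
  return 0;
}
```

The caller decides what to do when askNonNegative returns 0; nothing the user did earlier is lost because of one wrong input.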

Choose Right Tools

As an engineer, you must recognize the importance of good tools. In a power outage, you need a flashlight, not a thermometer. To drink water, you need a cup, not a knife. To call your friend, a phone works but a pen does not. Debugging is not a random guessing process. You need a strategy to find and remove problems. You have to understand that printf is a bad "tool" for debugging. To debug a program, use a debugger, not printf. If you can use a debugger efficiently, you can save many hours debugging your program, compared with using printf only.

Video Tutorial

DDD (Data Display Debugger) is a source-level debugger, meaning that it can show which line of the C code is being executed. Several screenshots are available:

DDD

You can find the manual of ddd. After this exercise, you should glance through the manual to learn many more features that are not covered in this exercise.

In EE 206 or 207, type "ddd &" in Terminal to start ddd. 

You will not be able to use ddd remotely from a Windows computer using SecureCRT unless you set up an X server on the Windows computer. (Setting up an X server is beyond the scope of ECE 264.) You can use gdb through SecureCRT. Please watch this video.

Sample Program

  1. Download debugger.zip.
  2. Unzip the file.
  3. Type gcc -Wall -g debug.c -o debug (or make).
  4. Type ./debug.

You will see

                Segmentation fault.

"Segmentation fault" means the program tries to access invalid memory. The question is: where in your program does that happen? This is usually difficult to answer if you do not know how to use a debugger. If you use a debugger, it is much easier.

Data Display Debugger

DDD is a graphical user interface for gdb.  To use DDD (or gdb), you have to include debugging information in the executable program, by adding "-g" after gcc.  The Makefile already has -g. Type

          ddd debug

to start the debugger.

To execute this program, select Program - Run. This program does not need any input argument. You can see the bottom shows Segmentation fault.

How can we find which line causes the problem? Select Status - Backtrace and you can see the following window.

It shows the flow of the program. Line 22 of main calls fscanf; fscanf calls vfscanf and it calls _IO_vfscanf. The numbers mean the position in the call stack, 0 for the most recent call. Numbers 0 - 2 are in the C library (libc) and they are most likely correct. (Millions of people use C libraries so bugs are usually fixed quickly.) The most likely problem comes from our code, line 22 in debug.c. Click #3 and you will see line 22 highlighted in the main ddd window.

Is the file open correctly? Type

             print fhd

in the lower window (with the "gdb" prompt) and the value is 0.

This means that the call to fopen has failed. The program crashes because it does not check the return value of fopen. The program uses the file name "data.txt" but the actual file name is "data". We could change the name in the program. However, this is not a good solution: if the file name changes, we have to modify the program and compile it again. A better solution is to make the file name an input of the program.

      char fileName[1024];
      fhd = NULL;
      do 
      {
          printf("please enter the file name : ");
          /* read at most 1023 characters; scanf adds the ending '\0' */
          scanf("%1023s", fileName);
          fhd = fopen(fileName, "r");
      } while (fhd == NULL);

This code asks the user to enter a file name. If fopen fails, the program asks for the file name again. The file name is stored in a character array with 1024 elements. It is rare to have a file name that long, so this should be sufficient. In some systems, this assumption is false and programs have to handle extremely long file names (together with the directory path). Please notice the width 1023 in "%1023s": scanf stores at most 1023 characters and then appends '\0', so the input always fits in the array. A plain "%s" has no such limit; a malicious user could enter a very long file name (more than 1023 characters) and scanf would write past the end of the array before any later statement could repair it. This type of malicious behavior is called a buffer overflow attack.
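
Another way to make the file name an input is to take it from the command line, in the style of ./ex3 infile used in the submission hints of this exercise. The sketch below is an assumption about how that could look, not the code in debugger.zip; the function name openInputFile and the messages are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: opens the file named by the first command-line
   argument for reading. Prints a message and returns NULL if the
   argument is missing or the file cannot be opened. */
FILE * openInputFile(int argc, char * * argv)
{
  FILE * fhd = NULL;
  if (argc < 2)
    {
      fprintf(stderr, "usage: %s <input file>\n", argv[0]);
      return NULL;
    }
  fhd = fopen(argv[1], "r");
  if (fhd == NULL) /* always check the return value of fopen */
    {
      fprintf(stderr, "cannot open %s\n", argv[1]);
    }
  return fhd;
}
```

main would call openInputFile(argc, argv) once and stop reading if the result is NULL. Because the name comes from argv, no fixed-size buffer is needed for it.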

The next error is trickier to find because the program may not crash on all machines. In C, array indexes start from zero. Therefore, we have to change the next line to

       for (arrayIndex = 0; arrayIndex < numElem; arrayIndex ++)

After the changes, type

               gcc -g -Wall debug.c -o debug

to rebuild the program and

               file debug

to load the new program into ddd.
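
The zero-based loop can be tried in isolation with a small function. This is a sketch only; the function name sumArray is hypothetical, while arrayIndex, intArray, and numElem follow the names used in this exercise.

```c
/* Hypothetical function: adds the elements of intArray.
   The valid indexes are 0 to numElem - 1, so the loop starts at 0
   and stops before arrayIndex reaches numElem. */
int sumArray(const int * intArray, int numElem)
{
  int arrayIndex = 0;
  int sum = 0;
  for (arrayIndex = 0; arrayIndex < numElem; arrayIndex ++)
    {
      sum += intArray[arrayIndex];
    }
  return sum;
}
```

A loop that starts at 1 or continues while arrayIndex <= numElem would skip element 0 or read one element past the end. Such a read may not crash, which is why this kind of error is hard to notice.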

Breakpoint

After fixing this problem, the program shows another problem. The printArray function prints lots of zeros even though the input data file has no zero. Is this the problem of the function, or the caller of the function? We are going to set a breakpoint at the function. A breakpoint means the program will stop at that location. Move your mouse cursor to the top of the function, click the right button, and select "Break at printArray." Run the program again and give "data" for the file name. The program stops at the first line of printArray.

There are only 29 elements but the size given to printArray is 2900.  The problem is at the caller. Change the caller to

         printArray(intArray, numElem);

After the changes, type

               gcc -g -Wall debug.c -o debug

to rebuild the program and

               file debug

to load the new program into ddd. Move your mouse cursor to the top of the function, click the right button, and select "Clear at printArray" to remove the breakpoint.

Single Step

The value of sum is not correct. It always prints the last element in the input file. Set a breakpoint at the line

        fhd = NULL;

by typing

             b 22

after the (gdb) prompt; here 22 means the line number.

Execute the program and it stops at line 22. Use the mouse to double-click sum. The data window appears. Its value is not initialized yet. Please remember that C does not initialize variables. Your program has to initialize all variables.

Set another breakpoint at line 38. Select Program - Continue. The program asks for the file name. Enter data.

Select Program - Next twice.  The value of sum is initialized to zero.

Select Program - Next twice.  The value of sum is 37.

Select Program - Next twice.  The value of sum is zero again. This is the problem. This variable should be initialized but the initialization should be outside the while block.

      sum = 0;
      while (! feof(fhd)) // not the end of the file yet
        {
          fscanf(fhd, "%d", & elemValue);
          sum += elemValue;

Now, the program should print the sum of the elements in the data file.
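
One caveat about loops controlled by feof: feof becomes true only after a read has already failed. If the file ends with a newline, the loop can run one extra time, fscanf stores nothing, and the previous elemValue is added again. A sketch that follows the "check the return values" guideline is shown below; the function name sumFile is hypothetical and the code is not taken from debug.c.

```c
#include <stdio.h>

/* Hypothetical function: adds all integers that can be read from fhd.
   fscanf returns the number of items it stored, so the loop ends at
   the first failed read and never adds a stale elemValue. */
int sumFile(FILE * fhd)
{
  int elemValue = 0;
  int sum = 0;            /* initialized once, outside the loop */
  while (fscanf(fhd, "%d", & elemValue) == 1)
    {
      sum += elemValue;
    }
  return sum;
}
```

Stepping through both versions in DDD with a short input file is a good way to see the difference.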

This exercise has no linked list, but DDD can also help you visualize a linked list. If you prefer visual feedback to working with numbers only, you will find this feature a great help.

Why Use a Debugger?

Many students have only one way to debug: printf. After this exercise, you may feel that a debugger is not much better. That is because this exercise is meant to introduce the debugger and there are many more features.  First, you can make a conditional breakpoint, for example

              b 40 if (sum < 0)

stops the program at line 40 only if the value of sum is negative. This is very helpful because you do not have to insert printing code and compile the program again. Another problem with printf is that you have to remove all the messages after debugging. Printing unnecessary messages can dramatically slow down your program.

You can also watch a variable. The program stops when this variable is changed. This does not need any change in your program. Imagine how many printf calls you would have to insert to watch the variable. You will probably save many hours on ECE 264 assignments if you spend an hour reading the DDD manual.

 

What to Submit?

Your task is to debug a piece of code similar to the above sample.

Download the skeleton files here. (Note: this is different from the debugger.zip above) It contains several files:

  • ex3.c           - buggy code
  • print.c
  • print.h
  • Makefile
  • data             - input file
  • solution        - correct output for your reference

You may use this web site to check the correctness of this assignment. The website will send back a pdf explaining the correctness of the assignment based on the various test cases. The password required to unzip the report is the password you created for the testing server.

Your final submission for this exercise must be submitted through Blackboard. Your submission must contain the following files:

  • ex3.c
  • print.c
  • print.h
  • Makefile
  • README

In README, explain what you fixed and answer the following question:

  1. What is the command used to set a breakpoint at line 35 that stops only if a is less than or equal to b?

The Makefile must create an executable called "ex3" after running "make"; otherwise you will receive a score of 0.

Failure in the valgrind check will result in a loss of 50% of the score for each test case.

 

Submission Hints

To make sure your output matches that of the correct code, it makes sense to redirect the output into files and compare them. To redirect the output to a file, type ./ex3 infile > outfile where infile is the name of the input file and outfile is the name of your output file. After running both your version and the correct version, compare the files by typing diff file1 file2 where file1 and file2 are the two files you are comparing. If there is no output, then the files match. You can learn more about diff by typing man diff at the Linux prompt, which opens the manual page.

 

FAQ

Q: From this exercise, a debugger doesn't seem very useful.

A: A debugger can do much more. You should spend an hour glancing through the DDD manual. You will discover many features that can help you. A typical problem among many students is that they are unwilling to spend an hour learning a debugger. They would rather spend many more hours typing, modifying, and removing printf. If you want to save time, learn a debugger.

Q: My program has a segmentation fault. DDD tells me where the program crashes but it is not obvious why that particular line is problematic. What can I do?

A: Sometimes, a problem (in particular, accessing invalid memory addresses) occurs earlier but the program crashes later. In some cases, valgrind may help. Valgrind may detect memory errors even though these errors do not make the program crash immediately.

Q: A debugger is helpful if there are bugs. What can I do to prevent bugs?

A: Carefully think about your program before writing it. As your program becomes more complex, you should have a plan before typing. Also, write your program in a simple, easy-to-understand style. You can improve it later. If your program is hard to read, it is hard to debug. You may want to develop a test plan before writing your program. In many cases, you need to write programs to generate test cases.

Q: I want to improve the performance. Simple and easy-to-read programs are too slow.

A: Most programs spend most of their time in small portions of the code. Some people suggest a 90-10 rule: 90% of the time is spent running only 10% of the code. Skipping one statement here and there (such as checking return values) won't make any difference in performance. To make a program faster, the best strategy is to adopt faster algorithms, not to write mysterious, hard-to-understand, hard-to-debug code.

Q: All right. I write simple code. What else can I do to prevent bugs?

A: When you develop large, complex programs, you should use libraries. There are libraries for linked lists and they handle memory management. Using the libraries can substantially improve the chance of correctness.

A: Another important way to prevent bugs is to follow certain programming styles. For example, memory allocation and release are usually symmetric.
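
A minimal sketch of this symmetry: every allocation made in a "create" function has a matching free in a "destroy" function, so the two are written and reviewed together. The type IntArray and the function names IntArray_create and IntArray_destroy are hypothetical and are used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical type: an integer array that remembers its length. */
typedef struct
{
  int * data;
  int length;
} IntArray;

/* Allocates the structure and its data (two allocations).
   Returns NULL if length is invalid or memory is not available. */
IntArray * IntArray_create(int length)
{
  IntArray * arr = NULL;
  if (length <= 0)
    {
      return NULL;
    }
  arr = malloc(sizeof(IntArray));
  if (arr == NULL)
    {
      return NULL;
    }
  arr -> data = calloc(length, sizeof(int)); /* elements start at 0 */
  if (arr -> data == NULL)
    {
      free(arr); /* undo the first allocation */
      return NULL;
    }
  arr -> length = length;
  return arr;
}

/* Releases everything IntArray_create allocated (two frees),
   in the reverse order of allocation. */
void IntArray_destroy(IntArray * arr)
{
  if (arr == NULL)
    {
      return;
    }
  free(arr -> data);
  free(arr);
}
```

If every call to IntArray_create is paired with exactly one call to IntArray_destroy, valgrind should report no leak from this code.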