
ECE 264 Exercise 4

Debugging

 

In computing, a bug is an error or a mistake. The term is often traced to a moth found stuck in a relay of an early computer. Since then, the word has been widely used for any computer-related problem, in hardware or software. Debugging means finding and fixing these problems. For many students taking programming courses, debugging is a long process filled with frustration. It doesn't have to be so! This exercise gives you several tips for preventing bugs. If bugs do occur, you can use a debugger to find and fix them more quickly.

Prevent Bugs

Some students write programs quickly and then spend many hours debugging. This is the wrong approach. You should first prevent bugs in your programs. The following guidelines can help you reduce the number of bugs.

  • Create functions to handle repeated tasks. If you need to do something multiple times, create a function to do it so that you do not have to copy-paste code. If you need to do something similar with slight differences, use the function's parameters to handle the differences. Copy-pasted code is a major source of bugs. You are asking for trouble because you have to keep the multiple copies consistent. If you change one place and forget to change another, here come your bugs.
  • Give meaningful variable names. Many students like to use i, j, k for variables. This causes several problems. First, you won't remember what they mean a few days later. Second, i, j, and k are close together on a keyboard. If you mean j and type k, the program will run but produce wrong results. This error is hard to find.
  • Use indentation. You should use an editor that can automatically format code for you so that you can easily discover unmatched brackets. Eclipse, emacs, and xemacs are good choices. They also provide syntax highlighting.
  • Read your code after typing.  After typing a section of code, say 25 lines, read the code before proceeding. You should be able to run the code in your head before running the code in a computer. If you do not know what the code is doing, running it on a computer will not help.
  • Turn on compiler warnings. Add "-Wall" and "-Wshadow" after "gcc".  In many cases, a warning is actually an error. You should eliminate all warning messages.
  • Avoid global variables. They can be changed anywhere in a program and are very hard to track.
  • Write small experimental code before integrating into the whole program. If you are uncertain how to do something, isolate the functionality and write a small experimental program first. Make sure you fully understand the code before integrating it into the larger program.
  • Add one feature each time. You need to have a plan before writing a program. What feature are you going to add first? Why? How do you know the feature is correctly implemented? Don't implement several features simultaneously and get nothing done.
  • Have a testing plan. How do you know the program is correct? Think about testing before writing code.
  • Check the return values. When your program allocates memory, is malloc successful?  You should read the manual to understand the meanings of the return values. If your program reads from a file, does the file exist?
  • Check array indexes. C does not check whether an index is valid. You have to check. If you have an array of 10 elements, the valid range is 0 to 9 (inclusive).  Please remember that 10 is invalid.  You cannot use a negative number, either.
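
The last two guidelines can be sketched in C as follows. This is only a minimal illustration and is not part of the exercise files; the helper names makeArray and setElement are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an array of numElem integers.
   Returns NULL if numElem is not positive or if malloc fails. */
int * makeArray(int numElem)
{
  int * arr;
  if (numElem <= 0)
    {
      return NULL;
    }
  arr = malloc(sizeof(int) * numElem);
  if (arr == NULL) /* always check the return value of malloc */
    {
      fprintf(stderr, "malloc failed\n");
    }
  return arr;
}

/* Hypothetical helper: store value at arr[index] only if index is valid.
   Returns 1 on success and 0 if the index is out of range. */
int setElement(int * arr, int numElem, int index, int value)
{
  if ((arr == NULL) || (index < 0) || (index >= numElem))
    {
      return 0; /* C would not stop us here, so we check ourselves */
    }
  arr[index] = value;
  return 1;
}
```

With an array of 10 elements, setElement accepts indexes 0 to 9 and rejects both 10 and -1. The caller should check both return values and should call free on the array when it is no longer needed.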

Do not use assertions: Some books tell readers to use assertions to ensure that the program terminates when something is wrong.

                assert(something must be true)

for example

                assert(x >= 0);

If x is smaller than zero, the program terminates. However, using assertions is a terrible way of writing programs. Consider the situation when you ask a user to enter a file name. The user enters a wrong name and the file does not exist. The program should ask the user to enter the right name. It is unacceptable to terminate the program. This is very frustrating to a user. The program terminates with a message like "assertion failed" and the user has no idea what is wrong. What can the user do? Nothing. Worse, the user may have done some work before reaching the assertion. After the program terminates, the user has to restart from scratch. Your program should handle the problem, not simply terminate.  Never use assertions in your programs.  Using assertions is an irresponsible way of writing programs.  Handle the problem. Do not terminate the program.
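
One way to handle a problem instead of terminating is to report it to the caller through the return value. The sketch below is a hypothetical example (the function average is not part of the exercise). Where a book might write assert(numElem > 0), this function returns 0 and lets the caller decide what to do, for example asking the user again.

```c
#include <stddef.h>

/* Hypothetical example: compute the average of numElem integers.
   Returns 1 on success and stores the average in * result.
   Returns 0 if the input is invalid; * result is left unchanged. */
int average(const int * arr, int numElem, double * result)
{
  int ind;
  long sum = 0;
  if ((arr == NULL) || (result == NULL) || (numElem <= 0))
    {
      return 0; /* report the problem; do not terminate the program */
    }
  for (ind = 0; ind < numElem; ind ++)
    {
      sum += arr[ind];
    }
  * result = (double) sum / numElem;
  return 1;
}
```

The caller checks the return value in the same way it checks the return value of malloc or fopen.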

Choose Right Tools

As an engineer, you must recognize the importance of good tools. In a power outage, you need a flashlight, not a thermometer. To drink water, you need a cup, not a knife. To call your friend, a phone works but a pen does not. Debugging is not a random guessing process. You need a strategy to find and remove problems. You have to understand that printf is a bad "tool" for debugging.  To debug a program, use a debugger, not printf. If you can use a debugger efficiently, you can save many hours compared with using printf only.

DDD (Data Display Debugger) is a source-level debugger, meaning that it can show which line of the C code is being executed. Several screenshots are available:

DDD

You can find the manual of DDD. After this exercise, you should glance through the manual to learn about the many features that are not covered in this exercise.

Sample Program

  1. Download Makefile, exercise4.c, and data. Save them in the same directory.
  2. Type make.
  3. Type ./exercise4.

You will see

                Segmentation fault.

"Segmentation fault" means the program tries to access invalid memory. The question is: where in your program does that happen? This is usually difficult to answer if you do not know how to use a debugger.  With a debugger, it is much easier.

Data Display Debugger

DDD is a graphical user interface for gdb.  To use DDD (or gdb), you have to include debugging information in the executable program, by adding "-g" after gcc.  The Makefile already has -g. Type

          ddd exercise4

to start the debugger.

 

To execute this program, select Program - Run. This program does not need any input argument. The bottom window shows Segmentation fault.

How can we find which line causes the problem? Select Status - Backtrace and you can see the following window.

 

It shows the flow of the program. Line 22 of main calls fscanf; fscanf calls vfscanf, which calls _IO_vfscanf. The numbers give the position in the call stack, with 0 for the most recent call. Numbers 0 - 2 are in the C library (libc) and they are most likely correct. (Millions of people use C libraries so bugs are usually fixed quickly.)  The most likely problem comes from our code, line 22 in exercise4.c. Click #3 and you will see line 22 highlighted in the main ddd window.

 

Is the file open correctly? Type

             print fhd

in the lower window (with the "gdb" prompt) and the value is 0.

This means the call to fopen has failed. The program crashes because it does not check the return value of fopen. The problem is that the program uses the file name "data.txt" but the actual file name is "data". We could change the name in the program. However, this is not a good solution: if the file name changes, we have to modify the program and compile it again. A better solution is to make the file name an input of the program.

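      char fileName[1024];
      fhd = NULL;
      do 
      {
          printf("please enter the file name : ");
          if (scanf("%1023s", fileName) != 1) { return 1; } /* no more input */
          fileName[1023] = '\0'; /* redundant with the width limit, but harmless */
          fhd = fopen(fileName, "r");
      } while (fhd == NULL);

This code asks the user to enter a file name. If fopen fails, the program asks for the file name again.  The file name is stored in a character array with 1024 elements. It is rare to have a file name that long so this should be sufficient. Please notice the width 1023 in "%1023s". It tells scanf to store at most 1023 characters followed by the terminating '\0', so the input always fits in the array. Without the width, a malicious user could enter a very long file name (more than 1023 characters), make scanf write past the end of the array, and cause the program to crash. Assigning '\0' to the last element afterwards does not prevent this because the damage is already done when scanf returns.  This type of malicious behavior is called a buffer overflow attack and it is very common. The code also checks the return value of scanf and leaves the function when there is no more input (this assumes the code is inside main); otherwise the loop would never end.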
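
Another common way to read a bounded line of input is fgets, which takes the size of the array as a parameter and also accepts file names that contain spaces. The following is a minimal sketch under that assumption; the helper name readFileName is hypothetical and is not part of exercise4.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from in into buf and remove the
   trailing newline. fgets stores at most size - 1 characters plus '\0',
   so the array cannot overflow.
   Returns 1 on success, 0 at end of input or if an argument is invalid. */
int readFileName(FILE * in, char * buf, int size)
{
  if ((in == NULL) || (buf == NULL) || (size <= 0))
    {
      return 0;
    }
  if (fgets(buf, size, in) == NULL)
    {
      return 0; /* end of input or read error */
    }
  buf[strcspn(buf, "\n")] = '\0'; /* cut the string at the newline, if any */
  return 1;
}
```

A caller could use readFileName(stdin, fileName, 1024) inside the do-while loop and stop asking when the function returns 0. If a line is longer than the array, the rest of the line stays in the input and is read by the next call.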

The next error is trickier to find because the program may not crash on all machines.  In C, array indexes start from zero. Therefore, we have to change the next line to

       for (arrayIndex = 0; arrayIndex < numElem; arrayIndex ++)

After the changes, type

               make

to rebuild the program and

               file exercise4

to load the new program into ddd.

Breakpoint

After fixing this problem, the program shows another problem. The printArray function prints lots of zeros even though the input data file has no zero. Is the problem in the function or in the caller of the function? We are going to set a breakpoint at the function.  A breakpoint means the program will stop at that location. Move your mouse cursor to the top of the function, click the right button, and select "Break at printArray." Run the program again and give "data" for the file name. The program stops at the first line of printArray.

 

There are only 29 elements but the size given to printArray is 2900.  The problem is in the caller. Change the caller to

         printArray(intArray, numElem);

After the changes, type

               make

to rebuild the program and

               file exercise4

to load the new program into ddd.  Move your mouse cursor to the top of the function, click the right button, and select "Clear at printArray" to remove the breakpoint.

Single Step

The value of sum is not correct. It always prints the last element in the input file. Set a breakpoint at the line

        fhd = NULL;

by typing

             b 22

after the (gdb) prompt; here 22 means the line number.

Execute the program and it stops at line 22. Use the mouse to double-click sum. The data window appears. Its value is not initialized yet.  Please remember that C does not initialize local variables automatically. Your program has to initialize all variables.

 

Set another breakpoint at line 38. Select Program - Continue. The program asks for the file name. Enter data.

Select Program - Next twice.  The value of sum is initialized to zero.

 

Select Program - Next twice.  The value of sum is 37.

Select Program - Next twice.  The value of sum is zero again. This is the problem. This variable should be initialized but the initialization should be outside the while block.

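      sum = 0;
      while (! feof(fhd)) // not the end of the file yet
        {
          if (fscanf(fhd, "%d", & elemValue) != 1) { break; } // nothing was read
          sum += elemValue;
        }

The check on the return value of fscanf stops the loop when no integer can be read. Without it, the last value may be added twice when the file ends with a newline, because feof becomes true only after a read has already failed. Now, the program should print the sum of the elements in the data file.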
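
The summation can also be isolated in a small function and tested separately, as suggested in the guidelines. The following is a minimal sketch; the function name sumFile is hypothetical and the exercise file may be organized differently. The loop uses the return value of fscanf as its condition, so it stops as soon as no integer can be read.

```c
#include <stdio.h>

/* Hypothetical helper: add all integers in an already opened file.
   Stores the total in * sum and returns the number of integers read.
   Returns 0 (and does not touch * sum) if an argument is NULL. */
int sumFile(FILE * fhd, int * sum)
{
  int elemValue;
  int numElem = 0;
  if ((fhd == NULL) || (sum == NULL))
    {
      return 0;
    }
  * sum = 0; /* initialize once, outside the loop */
  while (fscanf(fhd, "%d", & elemValue) == 1)
    {
      * sum += elemValue;
      numElem ++;
    }
  return numElem;
}
```

The caller opens the file, checks that fopen succeeded, calls sumFile, and then closes the file with fclose.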

DDD can help you visualize a linked list.

Why Use a Debugger?

Many students have only one way to debug: printf. After this exercise, you may feel that a debugger is not much better. That is because this exercise only introduces the debugger; it has many more features.  First, you can make a conditional breakpoint, for example

              b 40 if (sum < 0)

stops the program at line 40 only if the value of sum is negative. This is very helpful because you do not have to insert printing code and compile the program again. Another problem with printf is that you have to remove all the messages after debugging. Printing unnecessary messages can dramatically slow down your program.

You can also watch a variable. The program stops when this variable is changed.  This does not need any change in your program. Imagine how many printf calls you would have to insert to watch the variable.  You will probably save many hours on ECE 264 assignments if you spend an hour reading the DDD manual.

If you want to be a serious programmer, you need serious tools. You need to be familiar with a debugger.  Suppose you work in a software company and your manager asks you to debug a program. What would the manager think if you say, "Sure, I will insert printf everywhere."?

What to Submit?

Submit your final, corrected program. It should be something like this file. Slight differences are possible.

FAQ

Q: From this exercise, a debugger doesn't seem very useful.

A: A debugger can do much more. You should spend an hour glancing through the DDD manual. You will discover many features that can help you. A typical problem among many students is that they are unwilling to spend an hour or two learning a debugger. They would rather spend many more hours typing, modifying, and removing printf. If you want to save time, learn a debugger.

Q: A debugger is helpful if there are bugs. What can I do to prevent bugs?

A: Carefully think about your program before writing it. As your program becomes more complex, you should have a plan before typing. Also, write your program in a simple, easy-to-understand style. You can improve it later. If your program is hard to read, it is hard to debug.

Q: I want to improve the performance. Simple and easy-to-read programs are too slow.

A: Most programs spend most of their time in small portions of the code. Some people suggest a 90-10 rule: 90% of the time is spent running only 10% of the code. Skipping one statement here and there (such as initialization or checking return values) won't make any difference in performance.  To make a program faster, the best strategy is to adopt faster algorithms, not to write mysterious, hard-to-understand, hard-to-debug code.

Q: All right. I write simple code. What else can I do to prevent bugs?

A: Use libraries whenever possible. In this class, you learn the implementation of some basic data structures. When you develop large complex programs, you can usually use libraries. There are libraries for linked lists and they handle memory management. Using such libraries can substantially improve the chance of correctness.  Do not fall into the NIH (not invented here) problem.

A: Another important way to prevent bugs is to follow certain programming styles. For example, memory allocation and release are usually symmetric.
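
The symmetry between allocation and release can be sketched as a pair of functions. This is a hypothetical example (the type IntArray and the two function names are not from the course files): every malloc or calloc in the create function has a matching free in the destroy function, and the releases happen in the reverse order of the allocations.

```c
#include <stdlib.h>

typedef struct
{
  int * data;
  int numElem;
} IntArray;

/* Hypothetical example: allocate the structure first, then its data.
   Returns NULL if numElem is not positive or if an allocation fails. */
IntArray * IntArray_create(int numElem)
{
  IntArray * arr;
  if (numElem <= 0)
    {
      return NULL;
    }
  arr = malloc(sizeof(IntArray));
  if (arr == NULL)
    {
      return NULL;
    }
  arr -> data = calloc(numElem, sizeof(int)); /* elements start at zero */
  if (arr -> data == NULL)
    {
      free(arr); /* release what was already allocated */
      return NULL;
    }
  arr -> numElem = numElem;
  return arr;
}

/* Release in the reverse order: the data first, then the structure. */
void IntArray_destroy(IntArray * arr)
{
  if (arr == NULL)
    {
      return;
    }
  free(arr -> data);
  free(arr);
}
```

If every call to IntArray_create is paired with one call to IntArray_destroy, it is much easier to see that no memory is leaked and that nothing is released twice.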