Advanced C Programming

Autumn 2016 :: ECE 264 :: Purdue University

Due 9/12

Debugging with GDB

Goals

  1. Learn strategies for debugging code
  2. Practice using the gdb debugging tool
  3. Reinforce your understanding of memory

Overview

In this assignment, you will learn how to debug C programs using gdb, a very widely used command-line debugger. You will start by reading a tutorial. Then, you will use gdb to diagnose some problems in a small program. You will not turn in any C code. Instead, you will turn in your gdb sessions in the form of log files that capture your commands and gdb's output.

1. Setup

We will be using a more current version of gdb for this assignment than the default on ecegrid. To make sure you get it, you will need to update your .bashrc configuration file. Please enter the following from bash:

cp ~ece264s0/16au/.bashrc ~/.bashrc

If you have been using your own .bashrc, the main change that you need is this:

export PATH="/opt/gcc/6.1.0/bin:/home/shay/a/ece264s0/16au/bin:$PATH"

Next, restart your session.

To test that it worked, type the following:

gdb --version

You should see this:

$ gdb --version
GNU gdb (GDB) 7.12.50.20160803-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".

2. Read

Before you do anything else, READ the following sections of Richard M. Stallman's excellent gdb tutorial. Do not skim. We think you will find this tutorial easy to understand. If not, post questions to Blackboard. The rest of this assignment (and exam #1) will assume you have read and understood every word of this (except the parts that it says to skip).

  1. gdb Frequently Asked Questions (FAQ) (only #1, #2, #3, and #4; skip the rest)
  2. How do I use gdb? (all)
  3. How do I watch the execution of my program? (all)
  4. How do I use the call stack? (all)
  5. How do I use breakpoints? (all except 4.3)
  6. How do I use watchpoints? (all)
  7. Advanced gdb Features (only 6.1 and 6.3; skip the rest)
  8. Example Debugging Session: Infinite Loop Example (all)
  9. Example Debugging Session: Segmentation Fault Example (all)

When asking course staff a question, refer to the relevant section of the reading so that we can clarify the issue in that context. Everyone needs to come away from this assignment with a big-picture understanding of how to use gdb to solve future problems, not just the specific commands needed for this homework.

3. Try the code

Get the code for hw05 using 264get. You will find four files: calc.c, calc.h, test_calc.c, and test_calc.txt.

This code has bugs. You will learn how to find them. Finding the bugs is not the purpose of the assignment, so you are welcome to ask course staff or classmates for help finding them. Your main purpose here is to understand the functionality of gdb, and demonstrate that you are able to use it.

Compile calc.c and test_calc.c to create an executable called calc. (By now, you should know these commands.)

Try running ./calc . Notice that it hangs. There must be an infinite loop. Press Ctrl-C to stop it.

4. Start gdb and turn on logging

Run the calc program using the debugger (gdb). The command to start gdb was in the required reading for this assignment. (Hint: See section 1.1 “How do I run programs with the debugger?”.) If you have not read all 9 sections yet, please stop and do so now.

For this assignment, you will save the commands you type and gdb's output in files, which you will turn in. You will have two debugging sessions in gdb. For simplicity, you will save the log files for the two sessions separately. Logging must be turned on manually when you start gdb. For the first gdb session, enter the following four commands:

(gdb) set logging file gdb.1.log
(gdb) set logging on
(gdb) set history filename gdb.1.history
(gdb) set history save
(gdb)

Not sure what command to type to call printf(…)? All of the commands needed for this assignment are in the tutorial reading. We are not giving the exact command to type because we want to help you learn this tool, together with some basic debugging techniques. You will be using these for the rest of the semester. If we gave you the exact command to type, this would be a typing exercise, which we're pretty sure you don't need.

Here's a hint to help you get started: printf(…) is one of the C standard library functions linked into your program.

You must use the commands found in the tutorial reading (unless otherwise noted in these instructions). Our scoring will look for the specific command types found in the tutorial. If you look on the web—normally encouraged—you may find other ways to do the same thing that our tester won't recognize. It's totally fine to play with the commands and enter other stuff while you're at it. Our tester will ignore other intervening commands.

5. Diagnose the infinite loop

Start calc from within gdb using the run command. It will go into an infinite loop. Press Ctrl-C (like before) to stop your program. You will now be in gdb, ready to diagnose the problem.

SAMPLE OUTPUT
(gdb) run
Starting program: …/calc

View the backtrace to see what function you are in, and what function called it. You should be in _parse_integer(…).

SAMPLE OUTPUT
(gdb) ████████████
#0  0x00000000004005c5 in _parse_integer (s=0x400862 "1") at calc.c:39
#1  0x00000000004004e5 in calculate (lhs_str=0x400862 "1", operator=43 '+',
    rhs_str=0x400860 "1") at calc.c:10
#2  0x00000000004006eb in main (argc=1, argv=0x7fffffffe078) at test_calc.c:5
(gdb)

View the code near where you are (your output may not be exactly the same as below).

SAMPLE OUTPUT
(gdb) ████████████
35              _find_base(&start, &base);
36
37              int value = 0;   // This will be the return value from this function
38              int i = 0;
39              while(start[i] != '\0') {
40                      value *= base;
41                      value += start[i] - (start[i] <= '9' ? '0' : 'a' - 10);
42              }
43
(gdb)

Have gdb print the arguments and local variables to the current function.

SAMPLE OUTPUT
(gdb) ████████████
s = 0x400862 "1"
(gdb) ████████████
start = 0x400a38 "1"
sign = 1
base = 10
value = -954437178
i = 0
(gdb)

Try stepping through the code for a while. There are two commands in gdb for stepping through code. One will step over function calls, while the other will step into them. For now, use the one that steps over function calls, even though it doesn't really matter for this piece of code. Once you have entered that command once, you can just hit Enter to repeat it. (That works for many gdb commands.)

SAMPLE OUTPUT
(gdb) ████████████
41                      value += start[i] - (start[i] <= '9' ? '0' : 'a' - 10);
(gdb)
39              while(start[i] != '\0') {
(gdb)
40                      value *= base;
(gdb)
41                      value += start[i] - (start[i] <= '9' ? '0' : 'a' - 10);
(gdb)
39              while(start[i] != '\0') {
(gdb)
40                      value *= base;
(gdb)
41                      value += start[i] - (start[i] <= '9' ? '0' : 'a' - 10);
(gdb)

If you do that for a while, you will see the pattern. Your program is going through a loop. But how do you know if it's an infinite loop? Well, the loop is controlled by start[i] so first, let's print its value once.

SAMPLE OUTPUT
(gdb) ████████████
$3 = 49 '1'
(gdb)

Let's keep an eye on it while we step through. The display command (not in the tutorial reading) prints the value of a variable after every gdb command you type. You can use it to watch start[i] while you step through the code.

SAMPLE OUTPUT
(gdb) display start[i]
1: start[i] = 49 '1'
(gdb) next
40                      value *= base;
1: start[i] = 49 '1'
(gdb)
41                      value += start[i] - (start[i] <= '9' ? '0' : 'a' - 10);
1: start[i] = 49 '1'
(gdb)
39              while(start[i] != '\0') {
1: start[i] = 49 '1'
(gdb)
40                      value *= base;
1: start[i] = 49 '1'
(gdb)
41                      value += start[i] - (start[i] <= '9' ? '0' : 'a' - 10);
1: start[i] = 49 '1'
(gdb)
39              while(start[i] != '\0') {
1: start[i] = 49 '1'
(gdb)

Now, turn it off. You can use undisplay with the "display number" of the expression you don't want to display anymore (shown just to the left of the expression each time it is displayed). The undisplay command is not in the tutorial reading, and has no output.

(gdb) ████████████
(gdb)

Quit gdb. When it warns that your debugging session is active and asks if you really want to quit, answer "y".

SAMPLE OUTPUT
(gdb) ████████████
A debugging session is active.

        Inferior 1 [process 18729] will be killed.

Quit anyway? (y or n) y

aq@ecegrid-thin1 ~/264/hw05
$

6. Fix the infinite loop in the code

Edit calc.c (using vim or your preferred code editor) to fix the infinite loop. The fix is just to increment i at the end of the while loop in _parse_integer(…). When you're done, it should look like this:

39     while(start[i] != '\0') {
40         value *= base;
41         value += start[i] - (start[i] <= '9' ? '0' : 'a' - 10);
42         i++;
43     }

You don't need to turn this in, but you will need to fix it in order to proceed with this assignment.

7. Diagnose a memory problem

If you run the program again, you will notice a new problem. It might have a segmentation fault, or it might just print the wrong value. (In our testing, we observed both at different times. Such inconsistent behavior is often a sign of a memory problem.)

When debugging a mysterious program, it is best to have a hypothesis. However, at this point, we don't have much information with which to form a hypothesis, so one option is to step through the code. We will start with that.

Launch gdb and turn on logging again using the given commands. Note that the filenames have '2' instead of '1' this time.

(gdb) set logging file gdb.2.log
(gdb) set logging on
(gdb) set history filename gdb.2.history
(gdb) set history save
(gdb)

Set a breakpoint at the beginning of the main(…) function. (Do not use the line number to set this breakpoint.)

SAMPLE OUTPUT
(gdb) ████████████
Breakpoint 1 at 0x4006db: file test_calc.c, line 5.
(gdb)

Run the program. It will stop at the breakpoint you set in the previous step.

SAMPLE OUTPUT
(gdb) ████████████
Starting program: …/calc

Breakpoint 1, main (argc=1, argv=0x7fffffffe0e8) at test_calc.c:5
5               int result = calculate("1", '+', "1");
(gdb)

Print the basic frame information using the frame command (not in the tutorial).

SAMPLE OUTPUT
(gdb) frame
#0  main (argc=1, argv=0x7fffffffe0e8) at test_calc.c:5
5               int result = calculate("1", '+', "1");
(gdb)

That is the same as what gdb printed when it hit the breakpoint, but this is a good command to remember. It's easy to forget where you are.

List the nearby code so you can see some context.

SAMPLE OUTPUT
(gdb) ████████████
1       #include <stdio.h>
2       #include "calc.h"
3
4       int main(int argc, char *argv[]) {
5               int result = calculate("1", '+', "1");
6               printf("1 + 1 = %d\n", result);
7
8               result = calculate("0xa", '+', "3");
9               printf("0xa + 3 = %d\n", result);
10
(gdb)

Let it run until line 8, since it is the example with the hexadecimal argument that is causing problems. Although you could just step through (using the one that steps over function calls), let's try a more direct way, using the until command. (The until command is not in the tutorial reading.)

SAMPLE OUTPUT
(gdb) until 8 
1 + 1 = 2
main (argc=1, argv=0x7fffffffe0e8) at test_calc.c:8
8               result = calculate("0xa", '+', "3");
(gdb) 

Step through the code, but this time use the command that steps into a function call, instead of over it. After entering that command once, you can simply press Enter to repeat it. Keep pressing Enter until it gets to line 59 of calc.c.

SAMPLE OUTPUT
(gdb) ████████████
calculate (lhs_str=0x400892 "0xa", operator=43 '+', rhs_str=0x400890 "3") at calc.c:10
10              int lhs_int = _parse_integer(lhs_str);
(gdb)
_parse_integer (s=0x400892 "0xa") at calc.c:29
29              const char* start = s; // address of first digit after "0x", "0b", or "-"
(gdb)
31              int sign  = 0;   // 1 if positive, -1 if negative
(gdb)
32              _find_sign(&start, &sign);
(gdb)
_find_sign (start=0x7fffffffdf80, sign=0x7fffffffdf7c) at calc.c:49
49              if(*start[0] == '-') {
(gdb)
54                      *sign = 1;   // No minus sign, so mark this as positive
(gdb)
56      }
(gdb)
_parse_integer (s=0x400892 "0xa") at calc.c:34
34              int base  = 0;   // 10 for decimal, 16 for hexadecimal, 2 for binary
(gdb)
35              _find_base(&start, &base);
(gdb)
_find_base (start=0x7fffffffdf80, base=0x7fffffffdf78) at calc.c:59
59              if(*start[0] == '0' && *start[1] == 'x') {
(gdb)

This looks interesting… Is the expression *start[1] correct?

In this function, start is the address of the address of a character in a string. The expression *start[1] is intended to get the address of the character in the string (i.e., *start), and then treat it like an array so we can get the next character. In that expression, * and […] are both operators. In arithmetic, most people know that / takes precedence over +, such that a + b / c is the same as a + (b / c), not (a + b) / c. However, the order of precedence for * (address dereference) and […] (array subscript) is less intuitive. It is the kind of thing programmers sometimes make mistakes with. Does the * take precedence over […], or vice versa? We could look it up, but let's dig around and understand this more directly.

Print the value of start. Since it is a const char**, it makes sense that its value is just a memory address.

SAMPLE OUTPUT
(gdb) ████████████
$1 = (const char **) 0x7fffffffdf80

Examine the memory at that address. For this, you want the x command. (It's in the tutorial, but this one has a lot of options, so we'll explain.) The simplest way to do this is x/8bx start which will print 8 bytes of memory starting at address start, displayed in hex notation. However, that's hard to read because the order in which bytes appear in memory is a little complicated.

SAMPLE OUTPUT
(gdb) x/8bx start
0x7fffffffdf80: 0x92    0x08    0x40    0x00    0x00    0x00    0x00    0x00

The x command has a special option for printing memory that contains addresses. All it does is group into 8 bytes at a time (assuming an address is 8 bytes, i.e., on a 64-bit system). The command to enter is x/1a start which prints 1 address worth of memory, starting at the address start.

SAMPLE OUTPUT
(gdb) x/1a start
0x7fffffffdf80: 0x0000000000400892

Notice that the right two digits (92) were on the left above. That's an issue of byte order which we won't go into right now.

Print the value of *start.

SAMPLE OUTPUT
(gdb) ████████████
$4 = 0x400892 "0xa"

That is the string we passed to calculate(…) in the main function, in test_calc.c, so that looks like what we would expect. Now, let's start to test our hypothesis about getting the operator precedence wrong. Let's try it with different parentheses.

Print the value of *start[0]. (Note the zero, not one.)

SAMPLE OUTPUT
(gdb) ████████████
$5 = 48 '0'
(gdb)

That is what we would expect.

Print the value of *start[1] (now with a one).

SAMPLE OUTPUT
(gdb) ████████████
Cannot access memory at address 0x1
(gdb)

Don't worry if you got some other output. As you can see, this is a memory problem. Memory problems often cause inconsistent, unpredictable behavior. You might find that it prints some other character. (In our testing, it sometimes printed 'I', among other things.)

To keep this discussion coherent, we will assume you got the memory error above.

If we weren't running under gdb, that would have been a segmentation fault. That may be the cause of our problem. Let's explore further.

Print the value of (*start)[1] (with the parentheses).

SAMPLE OUTPUT
(gdb) ████████████
$6 = 120 'x'
(gdb)

Now, let's try the parentheses the other way.

Print the value of *(start[1]).

SAMPLE OUTPUT
(gdb) ████████████
Cannot access memory at address 0x1
(gdb)

Based on this, it appears that this expression needs parentheses around (*start). Let us fix it.

Quit gdb. Like before, when it warns that your debugging session is active and asks if you really want to quit, answer "y".

SAMPLE OUTPUT
(gdb) ████████████
A debugging session is active.

        Inferior 1 [process 18729] will be killed.

Quit anyway? (y or n) y

aq@ecegrid-thin1 ~/264/hw05
$

8. Fix the operator precedence bug in the code

This part won't be scored, so if you're feeling lazy or rushed, you can stop here. However, fixing the code and seeing it work will be easy.

Open the code in vim and change every instance of *start[…] to (*start)[…]. You will find five occurrences, in _find_sign(…) and _find_base(…). When it is fixed, you should have the following:

⋯
49     if((*start)[0] == '-') {
⋯
59     if((*start)[0] == '0' && (*start)[1] == 'x') {
⋯
63     else if((*start)[0] == '0' && (*start)[1] == 'b') {
⋯

Save, compile, and run. You should have working code with output that matches up with test_calc.txt. Testing in the usual way (same as hw02 and hw03) verifies that the program output matches the expected output in test_calc.txt.

SAMPLE OUTPUT
aq@ecegrid-thin1 ~/264/hw05
$ gcc -o calc calc.c test_calc.c && ./calc | diff - test_calc.txt

aq@ecegrid-thin1 ~/264/hw05
$ echo $?
0

No output from diff means the program works as intended. The second part is just a double-check.

Submit

You will submit four files: gdb.1.log, gdb.1.history, gdb.2.log, and gdb.2.history.

Submit using the 264submit command.

aq@ecegrid-thin1 ~/264/hw05
$ 264submit hw05 gdb.*

Be careful about the filenames. You should have a total of four files. Make sure the filenames match exactly.

Requirements

Look at your files and make sure they contain all of the steps above. If anything goes wrong, you might need to redo this. It is okay if you experimented with commands or started, stopped, etc. in the middle, as long as those steps are in there. Differences in memory addresses will be ignored.

Q&A

  1. How will this be scored?
    We will look through your gdb.1.log and gdb.2.log files for the commands needed to follow the steps listed above. Make sure you have done all of the steps listed. It's okay if you have some junk in between, or if you have to repeat things now and then. This will not be scored using a "replay" mechanism (like the vim assignment).
  2. Will there be partial credit?
    Yes, but you shouldn't need it. As long as you read the tutorial carefully, and get clarification on anything you misunderstood, the rest of this should be pretty easy.
  3. Can I start over?
    Yes, of course. One way is to just delete your hw05 directory (rm -rf hw05) and then 264get hw05 again. If you want to redo just one session or the other, you can delete the gdb.#.log and gdb.#.history files.
  4. Can I exit a session and resume later?
    No. The history file is apparently overwritten (not appended) each time you start gdb. However, you definitely can do the two sessions (infinite loop and memory problem) separately and redo either of those without redoing the other.

Updates

9/11: Corrected the output in steps 5 and 7 because the calc.c had been edited since these instructions were first written; added command for starting gdb.