Advanced C Programming

Autumn 2016 :: ECE 264 :: Purdue University

This is for Fall 2016 (8 years ago)
Due 12/9

Buffer overflow attack

Goals

The goals of this assignment are as follows:
  1. Understand how your compiled C code operates at the instruction level.
  2. Appreciate the value of memory safety to security concerns.
  3. Get a brief introduction to application security at the binary level.

Overview

There are many ways to attack a vulnerable application, that is, to make it behave in a way that the author did not intend but the attacker did. One very common way is the buffer overflow attack. This is possible if a target program takes input from a user and loads it into a buffer (string) without checking whether the buffer is big enough to hold the input.

In this assignment, you are given a vulnerable program, including the source code. Your job is to create a malicious input string that will cause it to call another function.

You will not write any C code for this assignment. The main part of this assignment will be a very small text file, which will be piped as input to the target program, to cause the attack.

Besides the readings (below), many of the calculations for this assignment will be done as an in-class exercise. Be sure to come to every lecture (as always).

Start

This assignment has two required readings.

  1. Understanding C by learning assembly – blog post, ≈7 pages printed
  2. Introduction to x64 assembly language – PDF, 3 pages

These links were sent by email 11/23, 12/1, 12/2, and 12/4. We will assume everyone has read them.

Optional
The following may be of interest to those who desire a deeper understanding.

After completing the readings, type 264get hw15 in bash to get the starter files.

Try running it normally

To run the target program normally, just compile it (gcc -o boa boa.c) and then run it (./boa).

You may ignore the compiler warnings for this assignment.

Analyze the code

Compile the code as usual:  gcc -o boa boa.c

To run it normally, type ./boa from bash. Enter your name. You should get a message back. It is simple.

Next, run the code in gdb (gdb boa). Here are some commands you may find useful.

Try setting a breakpoint right before greet_visitor(…) returns (line 10). Then, check the registers and look at the stack. To view a single stack frame and no more, first type p $rbp - $rsp + 8 to get the right number of bytes to show. That will include everything between the base pointer and stack pointer, plus 8 bytes for the return address. Most likely, it will print 40, but you should double-check. Then, type x/40bx $rsp to print 40 bytes, starting at the stack pointer.
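The arithmetic behind p $rbp - $rsp + 8 can also be checked outside gdb. The following is a minimal sketch in bash. The two register values are made up for illustration and were chosen to be 32 bytes apart; read the real ones from your own gdb session.

```shell
# HYPOTHETICAL register values, as gdb might show them with "info registers".
# They are for illustration only; yours will differ.
rbp=0x7fffffffe0f0
rsp=0x7fffffffe0d0

# Same expression as "p $rbp - $rsp + 8" in gdb.
nbytes=$(( $rbp - $rsp + 8 ))
echo "$nbytes"

# The gdb command that would then dump that many bytes from the stack pointer.
echo "x/${nbytes}bx \$rsp"
```

With these sample values the result is 40, which matches the number mentioned above. If your registers are a different distance apart, use whatever number gdb prints.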

Note: The return address is located right below the base pointer.

Here is a PDF walk-through of the function call for this code, by Prof. Quinn. If any of this is unclear, study this walk-through. This was created specifically for HW15. Warning: Addresses may differ between your code and the PDF.

Plan your attack

First, let's test out a way to send arbitrary characters to the program on stdin. It will be as if the characters were typed by a user. From bash, type the following:

printf "Somebody" | ./boa

You should see something like this:

aq@ecegrid-thin1 ~/264/hw15
$ printf "Somebody" | ./boa
Hello.  What is your name?
Hello, Somebody.
aq@ecegrid-thin1 ~/264/hw15
$

Your buffer overflow attack will consist of sending the program an attack string that is longer than the name string buffer, so that it spills junk data all the way to the return address. (See the readings.) You will overwrite the return address with a different address in the code (in the text segment), so that when greet_visitor(…) tries to return, it jumps to scare_visitor(…) instead of back to main(…). Thus, you need to find the number of bytes from the beginning of name to the beginning of the return address. That is the number of filler characters you will need. After the filler, you will add a few more bytes: the address of the scare_visitor(…) function.
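The layout described above (filler first, then the address bytes) can be sketched in bash as follows. Both values are placeholders, not the answer: the offset of 24 is hypothetical, and the address is the example value 0x400123 used elsewhere on this page. The sketch only builds the string and counts its bytes; it does not run boa.

```shell
# PLACEHOLDER values; find the real ones in gdb.
offset=24                                  # hypothetical distance from name[0] to the return address
addr='\x23\x01\x40\x00\x00\x00\x00\x00'    # example address 0x400123, little endian

# Build a run of $offset filler characters ("A" repeated).
filler=$(printf '%*s' "$offset" '' | tr ' ' 'A')

# Emit the filler followed by the address bytes, and count what comes out.
# bash's printf expands the \x escapes because they are part of the format string.
total=$(printf "${filler}${addr}" | wc -c | tr -d ' ')
echo "filler bytes: ${#filler}, total bytes: $total"
```

Counting the bytes this way is a quick check that the number of filler characters matches the offset you measured before you try the string against the program.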

Remember that in memory, addresses are stored in little endian byte order, just like the numbers in the BMP file format in HW13 and HW14. For example, the address 0x400123 would be shown as 0x23 0x01 0x40 0x00 0x00 0x00 0x00 0x00 in the memory dump (via the gdb x command). When you craft your attack string, it will need to be little endian, as well.
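To make the byte order concrete, here is a small bash sketch that splits the example address 0x400123 into its eight bytes, least significant first. The variable names are only for illustration.

```shell
addr=0x400123   # the example address from the text

# Peel off one byte at a time, least significant byte first (little endian).
bytes=""
i=0
while [ "$i" -lt 8 ]; do
  byte=$(( ($addr >> (8 * $i)) & 0xff ))
  bytes="$bytes$(printf '0x%02x ' "$byte")"
  i=$(( i + 1 ))
done
bytes=${bytes% }   # drop the trailing space
echo "$bytes"
```

This prints 0x23 0x01 0x40 0x00 0x00 0x00 0x00 0x00, the same order as in the memory dump described above.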

You will need to dig around in gdb a bit to find these values (the offset and the address of scare_visitor(…)) and to get somewhat comfortable with all of the above.

Once you have an idea for an attack string, try it out using the same command (printf "████████████████████" | ./boa). If it says, "BRAH!!!" then you're done (assuming "BRAH!!!" doesn't appear in your attack string).

To include bytes that do not correspond to easily typable ASCII characters, you must use a hexadecimal escape sequence. For example, "\x41" is the same as "A"; both emit 1 byte when printed. Likewise, "\x41B\x43" is the same as "A\x42C", which is the same as "ABC"; all emit 3 bytes when printed. Using this notation, the address 0x400123 can be printed in little-endian byte order with printf "\x23\x01\x40\x00\x00\x00\x00\x00".
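The equivalence of these spellings is easy to confirm from bash. The sketch below assumes bash's built-in printf (other shells may not understand \x escapes) and uses od, a standard tool, to show the bytes in hex.

```shell
# Each command prints the same three bytes; od shows them in hex.
printf "\x41B\x43" | od -An -tx1
printf "A\x42C"    | od -An -tx1
printf "ABC"       | od -An -tx1

# Capture the results so they can be compared directly.
a=$(printf "\x41B\x43")
b=$(printf "A\x42C")
[ "$a" = "ABC" ] && [ "$b" = "ABC" ] && echo "all three are identical"
```

The same od pipeline works for any candidate attack string, so you can check the bytes before sending them to the program.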

You should see something like this:

aq@ecegrid-thin1 ~/264/hw15
$ printf "████████████████████" | ./boa
Hello.  What is your name?
Hello, █████████████▓▓▒▒░░BRAH!!!
Segmentation fault (core dumped)

aq@ecegrid-thin1 ~/264/hw15
$

The segmentation fault is acceptable for this assignment. Also, you might see some other junk in the output. As long as it includes "BRAH!!!" you're done.

Create your input.txt

To create your input.txt, just run a similar command to write your attack string to a file.

printf "████████████████████" > ./input.txt

To test it the way we will test it, run the following command:

./boa < input.txt
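Before submitting, it can help to confirm that the file holds exactly the bytes you intended, with no trailing newline. The sketch below writes a stand-in string (not a working attack string) to a scratch file and inspects it. The scratch file is used here only so that the example does not overwrite your real input.txt.

```shell
scratch=$(mktemp)            # scratch file; stands in for input.txt

# Stand-in payload: 4 filler bytes plus 3 non-printable bytes. NOT a real attack string.
printf "AAAA\x23\x01\x40" > "$scratch"

# Size in bytes: should equal the filler count plus the address bytes (4 + 3 here).
size=$(wc -c < "$scratch" | tr -d ' ')
echo "size: $size"

# Byte-by-byte view; a trailing 0a would indicate an unwanted newline.
od -An -tx1 "$scratch"

# Last byte of the file, as hex.
last=$(tail -c 1 "$scratch" | od -An -tx1 | tr -d ' \n')
echo "last byte: $last"

rm -f "$scratch"
```

For your real file, substitute input.txt for the scratch file and compare the size against your own offset plus the number of address bytes you wrote.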

Submit

In general, to submit any assignment for this course, you will use the following command:

264submit ASSIGNMENT FILES…

For HW15, you will type 264submit hw15 input.txt boa.c boa from inside your hw15 directory.

You can submit as often as you want, even if you are not finished with the assignment. That saves a backup copy which we can retrieve for you if you ever have a problem.

We do not plan to release a pre-tester for this assignment.

Requirements

  1. Your submission must contain each of the following files, as specified:
    file        contents
    input.txt   attack string
      • Supplying this string to the executable, boa, should cause it to redirect execution to scare_visitor(…), which will print BRAH!!!.
    boa.c       target code
      • Target code, submitted as is with no changes.
      • We will most likely not look at this. It is only a failsafe.
    boa         target executable
      • Compiled target code.
      • This is just the executable that you compiled using gcc.
      • You don't need to do anything except compile what you were given.
      • We will most likely not use this. It is only a failsafe, in case we can't reproduce your attack with ours.

How much work is this?

Assuming you have done the readings (≈2-4 hours) and the in-class exercise, creating the buffer overflow attack string is expected to take about 1-2 hours. (Some have reported even less, but don't count on that.)

Q&A

  1. Will the addresses of functions always stay the same?
    They are tied to a specific executable file. As long as you don't recompile, they will be the same.
  2. How can you be so sure the addresses won't change?
    The executable file specifies where in memory it should be loaded.
  3. Is there any other way to see the mapping between code and instructions, besides GDB?
    objdump -S --disassemble boa
    This prints everything in the executable, including other supporting code. Our code accounts for only 188 of the 8689 bytes in the boa executable.
    It is also possible to get the assembly from gcc using gcc -S boa.c -o boa.s, but this is less useful than objdump or gdb because it does not interleave the C code lines with the assembly.
  4. What if I recompile?
    If you recompile on the same machine with the same compiler, same options, and same boa.c, addresses will not change.
  5. How is the printf bash command related to the printf(…) C function?
    Name only… and a few format conventions (e.g., %d, %s, etc.) that were mimicked by the creators of the printf bash command. For C programmers, it can be handy to be able to format strings using percent codes (e.g., %d) and hexadecimal escape characters (e.g., \x07) from bash.
  6. I am getting the segmentation fault but no "BRAH!!!" message. What's wrong?
    You are on the right track. Most likely, the new return address isn't getting to the right location on the stack.
    Make sure you have the right address for scare_visitor(…), the right offset (distance from name[0] to return address), and corresponding number of filler characters. Also, keep in mind that gets(…) will write a null terminator ('\0') after your input string. Since the return address, when written in little endian, ends with a bunch of zeros, you may want to leave off the last zero in the return address. That way, the null terminator (==0) will simply be overwriting another zero.
  7. What's the best way to see the contents of my input.txt?
    xxd input.txt
  8. Can I edit input.txt in Vim?
    Yes, but you must open Vim in binary mode using the -b flag, like this:
    vim -b input.txt
    Without that, Vim (like most editors) automatically adds a newline (\n) at the end of the file if there isn't already one there. The -b flag tells Vim not to do that.
    To type non-printable ASCII characters (≤31 or ≥127), press Ctrl-V in insert mode, then x, then the two-digit hex value (e.g., 06).
    Another way to edit a binary file in Vim is to convert it to a hex dump and then back. First, open the file in binary mode (vim -b input.txt). Next, enter :%!xxd to convert the current buffer to a hex dump. Make any edits you like, but be sure to keep the hex dump format. Finally, enter :%!xxd -r to convert back from the hex dump to the binary file.
  9. Do I need to add any special flags to gcc when compiling boa.c?
    No. Initially, Prof. Quinn said in lecture that you should add -O0 since it is widely written online that turning off compiler optimizations using that flag is needed to keep the executable stable. Upon some experimentation, that does not seem to be the case. Likewise, it is often reported that for simple buffer overflow attacks like this to work, you must explicitly tell gcc to turn off certain protections. That does not seem to be necessary for our simple case. It works just fine with the default compiler flags that we use in this class (-g -Wall -Wshadow -Wvla -std=c99).
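
Returning to question 6 above, the point about the null terminator can be illustrated with a short bash sketch. It uses the example address 0x400123 from this page, not the real address of scare_visitor(…), and it appends the terminator by hand to stand in for what gets(…) would write.

```shell
# What the 8-byte return-address slot should hold: example address 0x400123, little endian.
want=$(printf "\x23\x01\x40\x00\x00\x00\x00\x00" | od -An -tx1 | tr -d ' \n')

# Send only the first 7 address bytes, leaving off the last zero.
sent=$(printf "\x23\x01\x40\x00\x00\x00\x00" | od -An -tx1 | tr -d ' \n')

# gets(…) appends a null terminator after the input; "00" stands in for it here.
got="${sent}00"

echo "want: $want"
echo "got:  $got"
[ "$want" = "$got" ] && echo "slot contents match"
```

With 7 address bytes the terminator completes the slot. With all 8, the terminator would instead land one byte past the return address.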