Advanced C Programming

Autumn 2015 :: ECE 264 :: Purdue University

This is for Fall 2015.
Due 12/12

Buffer overflow attack

Unlike all of the other assignments, the information about HW13 was sent entirely by email. Text from those emails is copied below, for convenience.

Requirements

HW13. For HW13, please submit three files: input.txt, boa.c, and boa (the executable). Most likely, we will only look at input.txt. Running ./boa < input.txt should cause it to call scare_visitor(..), and thus print "BRAH!!!". First, follow along with the instructions in Dr. Kak's article using his example. Then apply the same principle to the boa.c example we have used in class. (Run 264get hw13 to get it.) This hasn't been posted yet, but that's all there is to it.

How much work is this?

Time-wise, my general estimate would be about 2 hours reading Dr. Kak's notes (assigned 12/2), 1-2 hours reading the two short articles sent yesterday, 30 mins on exercise #1, and 1-2 hours on HW13 (due this Sat 12/12).

Required readings

Please read pages 6 to 43 of Prof. Kak's notes on buffer overflow attacks by Tue 12/8. You can skim the case studies on pages 8-14, but make sure you understand the basic idea of what went wrong. You should come away understanding why buffer overflows are such a treacherous kind of bug, how they work, how they could be exploited by malicious attackers, and how you can write secure code that resists such attacks.

Please read the following for class tomorrow (12/10/2015). This is ≈7 pages in total.

More information

Backtrace. The addresses listed by gdb in the backtrace (e.g., 0x400558 and 0x4005f8) are for the next instruction to be executed in those functions, not the beginning of those functions. main() begins at 0x4005d8. greet_visitor() begins at 0x400554. You will also find the address 0x400470, which is for _start(), a part of the "load system" that calls your main function.
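To see for yourself why a backtrace address differs from a function's starting address, here is a minimal sketch. It assumes GCC or Clang on x64 Linux, because __builtin_return_address and __attribute__((noinline)) are compiler extensions, not standard C. The name greet_visitor is borrowed from the text above, but the body is my own illustration and is not the code in boa.c; return_offset_in_caller is a hypothetical helper.

```c
#include <stdint.h>

/* Filled in by greet_visitor() with the address it will return to. */
static uintptr_t saved_return_address;

/* noinline keeps this a real call, so a return address is actually pushed.
   (Illustrative body only; this is NOT the greet_visitor() from boa.c.) */
__attribute__((noinline)) void greet_visitor(void)
{
    saved_return_address = (uintptr_t)__builtin_return_address(0);
}

/* Hypothetical helper: calls greet_visitor() and reports how many bytes
   past its own first instruction the saved return address lies. */
__attribute__((noinline)) uintptr_t return_offset_in_caller(void)
{
    greet_visitor();
    /* The return address points at the instruction after the call,
       so it falls inside this function, not at its beginning. */
    return saved_return_address - (uintptr_t)&return_offset_in_caller;
}
```

If you call return_offset_in_caller() from main() and print the result, you should see a small positive number. That is the same relationship as in the addresses above: 0x4005f8 is 0x20 bytes past the start of main() at 0x4005d8, and 0x400558 is 4 bytes past the start of greet_visitor() at 0x400554.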

From class. Here are a few things I referred to in class yesterday (Tue 12/8):

(optional) For those who are curious. If you wish to play with the boa program in gdb before tomorrow, you might also find the following useful:

What you need to know

Things you need to know from this week
  • What happens under the hood when we call a function in C?
    • What is the overall process?
    • What happens with the stack?
    • What is the role of the registers?
      • rip – aka $pc, instruction pointer, program counter, IP, PC
      • rsp – aka $sp, stack pointer
      • rbp – aka base pointer, frame pointer
      • general purpose – e.g., rdi, rsi, rdx, rcx, r8, …, r15
    • What happens in the prologue and epilogue of a function?
  • How are assembly instructions different from C code?
    • What kinds of operations do assembly instructions do?
    • What do the following categories of instructions do?
      • jump, call, return, arithmetic, push, pop, move
  • How do buffer overflow/overread attacks work?
    • How does an attacker perpetrate an attack?
      • simple buffer overflow
      • simple buffer overread
      • What is the role of a debugger (i.e., gdb)?
    • How can you write C code that is resistant to such attacks?
  • ... only as these apply to the x64 (aka AMD64, x86-64) architecture
Things you do NOT need to know (yet)
  • syntax of any specific instruction
  • size of particular registers
  • how to write programs in assembly language
  • directives (.intel_syntax)
  • any architecture other than AMD64 on Linux
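On the question of writing C code that resists these attacks, one common approach is to never read more bytes than the destination buffer can hold. Here is a minimal sketch of that idea using fgets. The function name read_line_bounded and its return convention are my own choices for illustration; they do not come from boa.c or from Dr. Kak's notes.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into dst, storing at most cap-1 characters plus
   a terminating '\0'. Any excess characters on the line are discarded.
   Returns 0 on success, -1 on end-of-file or error.
   (Hypothetical helper, for illustration only.) */
int read_line_bounded(char *dst, size_t cap, FILE *in)
{
    if (cap == 0 || fgets(dst, (int)cap, in) == NULL) {
        return -1;
    }
    size_t len = strlen(dst);
    if (len > 0 && dst[len - 1] == '\n') {
        dst[len - 1] = '\0';   /* the whole line fit: drop the newline */
    } else {
        int ch;                /* the line was too long: skip the rest */
        while ((ch = fgetc(in)) != EOF && ch != '\n') {
            /* discard */
        }
    }
    return 0;
}
```

With a declaration like char name[8], calling read_line_bounded(name, sizeof name, stdin) keeps at most 7 characters no matter how long the input line is, so the bytes beyond the buffer (including the saved return address) are never touched.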

Q&A

  1. Can you give an illustration of how function calls work?
    I made this example:
    https://engineering.purdue.edu/ece264/15au/static/stack_example.pdf

    Some of the details in the stack frame for main() seem to contradict my understanding of the x64 calling conventions. I suspect main() may be a special case, since it gets called by the load system. At any rate, your task for HW13 is not related to main(), so it shouldn't be an issue. You can see anything in that example for yourself by simply creating breakpoints and using the commands discussed in class (and given in the earlier email).

  2. How do I create the input.txt file?
    This was posted on Blackboard:
    Recommended method: vim -b and Ctrl-v x ░ ░

    Here's the method I showed in class. I find this the easiest. In this example, I'll show how to create the string used in Dr. Kak's notes (page 42).

    1. Open a file called input.txt in vim, in binary mode.
      vim -b input.txt
    2. Press i to start inserting characters.
    3. Press A 24 times to enter "AAAAAAAAAAAAAAAAAAAAAAAA".
    4. Press Ctrl-v x 8 e to enter the character 0x8e.
    5. Press Ctrl-v x 0 6 to enter the character 0x06.
    6. Press Ctrl-v x 4 0 to enter the character 0x40.
    7. Press Ctrl-v x 0 0 to enter the character 0x00.

    Alternative method: xxd -r

    If you don't like that method, another option is to create a hex dump in the format of xxd, and then use xxd -r to convert it back to binary, redirecting the output to input.txt.

    1. Open a file called input.hexdump in vim
      vim -b input.hexdump
    2. Enter the following:
      0000000: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41
      0000010: 41 41 41 41 41 41 41 41 8e 06 40 00
    3. Use xxd -r to reverse it and create your input.txt.
      xxd -r input.hexdump > input.txt

    Either way you should end up with a file called input.txt that, when viewed with xxd -g1, contains the following:

    0000000: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
    0000010: 41 41 41 41 41 41 41 41 c2 8e 06 40 00 0a        AAAAAAAA...@..
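If it helps to see why the address bytes appear in the order 8e 06 40 00, here is a minimal sketch in C of the layout used in input.hexdump (step 2 of the xxd method): 24 filler bytes, followed by the target address stored least-significant byte first, since x64 is little-endian. The function name build_payload and its parameters are hypothetical, and the value 0x0040068e is my reading of those four bytes, not something stated in the emails.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fills buf with `pad` filler bytes ('A') followed by the 4 bytes of `addr`,
   least-significant byte first (little-endian, as on x64).
   Returns the number of bytes written, or 0 if buf is too small.
   (Hypothetical helper, for illustration only.) */
size_t build_payload(unsigned char *buf, size_t cap, size_t pad, uint32_t addr)
{
    if (cap < pad + 4) {
        return 0;
    }
    memset(buf, 'A', pad);
    for (int i = 0; i < 4; i++) {
        /* Byte i of the address: shift right by 8*i bits, keep the low 8. */
        buf[pad + i] = (unsigned char)((addr >> (8 * i)) & 0xff);
    }
    return pad + 4;
}
```

Calling build_payload(buf, sizeof buf, 24, 0x0040068e) produces 28 bytes: 24 copies of 0x41, then 8e 06 40 00. You could write them out with fwrite to a file opened in "wb" mode. For boa.c, the filler length and the address will be different; finding them with gdb is the assignment.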

    I hope that helps.