Advanced C Programming

Spring 2020 ECE 264 :: Purdue University

⚠ This is a PAST SEMESTER (Spring 2020).
Due 4/14

Huffman 3: linked list of trees

Learning goals

You should learn:

  1. Practice using linked lists and trees.
  2. Combining linked lists with trees
  3. Learning objectives: structures and dynamic structures

Overview

This is part 3 in an 8-part series in which you are implementing the Huffman Coding algorithm to compress files.

  1. HW11: Implement priority queue for future use
  2. HW12: Calculate character frequencies from a file
  3. HW13: Create a priority queue of trees from the character frequencies in a file. ◀◀◀ this homework
  4. HW14: Build a single Huffman tree.
  5. HW15: Implement functions to write arbitrary bits to a file, for future use.
  6. HW16: Write the Huffman Tree to the disk.
  7. HW17: Write the encoded file contents to the disk.
  8. HW18: Create a command-line interface for compressing files.

Multiple files

In this homework, your code will be split into many files. You already have several files with code that serves coherent purposes. In HW12, you created frequencies.c and frequencies.h, for calculating character frequencies in files. In HW11, you created files for a priority queue. You will be renaming those to priority_queue.c and priority_queue.h for this homework.

You will definitely want to use your Makefile for this homework, even if you weren't doing so before. Unlike previous assignments, you will be compiling four (4) C files together. That's too much to type every time!

Getting Started

There is no starter code. By the time you leave this class you should be able to set up a project from scratch. We have given detailed instructions below.

  1. Set up your HW13 directory with the files you will need.
    1. Create a directory for HW13 and cd into it.
      you@ecegrid-thin1 ~/ $ mkdir hw13
    2. Copy your frequencies.c and frequencies.h into your HW13
      you@ecegrid-thin1 ~/ $ cp hw12/{frequencies.c,frequencies.h} hw13/
    3. Copy your priority queue code from HW11 into your HW13, changing the names from huffman.c to priority_queue.c, and huffman.h to priority_queue.h.
      you@ecegrid-thin1 ~/ $ cp hw11/huffman.c hw13/priority_queue.c
      you@ecegrid-thin1 ~/ $ cp hw11/huffman.h hw13/priority_queue.h
    4. Change into your HW13 directory.
      you@ecegrid-thin1 ~/ $ cd hw13
  2. Edit your priority_queue.c and priority_queue.h.
    1. In priority_queue.c, change #include "huffman.h" to #include "priority_queue.h" .
      you@ecegrid-thin1 hw13/ $ vim priority_queue.c
    2. In priority_queue.h, change the include guard from __HUFFMAN_H__ to __PRIORITY_QUEUE_H__.
  3. Create a new huffman.h
    you@ecegrid-thin1 hw13/ $ vim huffman.h
    1. Add an include guard.
    2. Add a #include statement for stdlib.h (for size_t).
    3. Add #include statements for frequencies.h, priority_queue.h. Be sure to use double quotes (#include "▒▒▒.h") not angle brackets (#include <▒▒▒.h>), since these are your header files.
    4. Add a struct type called TreeNode in huffman.h using typedef syntax. It should have four fields: character (uchar), frequency (size_t), left (TreeNode*), and right (TreeNode*).
    5. Add a function declaration, exactly as below:
      Node* make_huffman_pq(Frequencies freqs);
  4. Create a new huffman.c.
    you@ecegrid-thin1 hw13/ $ vim huffman.c
    1. Add #include statements for the usual standard header files (stdbool.h, stdio.h, stdlib.h, assert.h).
    2. Add #include statements for huffman.h. (You don't need to explicitly include priority_queue.h and frequencies.h since they are included by way of huffman.h).
  5. Modify your Makefile.
    you@ecegrid-thin1 hw13/ $ vim Makefile
    1. Set your SRC_C variable (at the top) to huffman.c priority_queue.c frequencies.c.
    2. Set your SRC_H variable (at the top) to huffman.h priority_queue.h frequencies.h miniunit.h clog.h.
    3. Set your TEST_C variable (at the top) to test_huffman.c.
    4. Set your EXECUTABLE variable (at the top) to test_huffman.
    5. Verify that running make results in this compilation command:
      gcc -std=c11 -g -Wall -Wshadow --pedantic -Wvla -Werror test_huffman.c huffman.c priority_queue.c frequencies.c -o test_huffman
      … and likewise for make test but with the addition of -DDEBUG.
      The order can be different (as long as -o is immediately before test_huffman).
    6. Create a new variable called SUBMIT_FILESbelow all of the aforementioned variable definitions—with all of the files you need to submit, using existing variable names wherever possible. It should look like this:
      SUBMIT_FILES=$(SRC_C) $(SRC_H) $(TEST_C) Makefile
    7. Modify your submit rule to use the new variable. (You may hand-copy this, but copy-pasting won't work.)
      submit:
          264submit $(ASG_NICKNAME) $(SUBMIT_FILES)
      

Example

Suppose you have a file called a.txt containing abc\nabca\n .

Your code calls calc_frequencies(freqs, "a.txt", &error) and then make_huffman_pq(freqs)

The resulting list should look like this:

┌────┬───┐ ┌───┬───┐ ┌───┬───┐ ┌───┬───┐ │ █ │ █─┼────▶│ █ │ █─┼────▶│ █ │ █─┼────▶│ █ │ █─┼──┐ └─┼──┴───┘ └─┼─┴───┘ └─┼─┴───┘ └─┼─┴───┘ ▼ │ │ │ │ NULL ▼ ▼ ▼ ▼ ┌─┴──┬───┬─┬─┐ ┌─┴─┬───┬─┬─┐ ┌─┴─┬───┬─┬─┐ ┌─┴─┬───┬─┬─┐ │'\n'│ 2 │█│█│ │'b'│ 2 │█│█│ │'c'│ 2 │█│█│ │'a'│ 3 │█│█│ └────┴───┴┼┴┼┘ └───┴───┴┼┴┼┘ └───┴───┴┼┴┼┘ └───┴───┴┼┴┼┘ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ NULL NULL NULL NULL NULL NULL NULL NULL

malloc(…) should be called eight (8) times.

free(…) should not be called at all.

a.txt should not be accessed in any way.

How much work is this?

In the instructor's solution, the body of make_huffman_pq(…) is 9 sloc. The body of the compare function is 4 sloc. That is not counting the function signature and ending brace of those functions.

The cloc command can measure sloc for individual files if you add the --by-file flag.

aq@ecegrid-thin1 asg_info.asg_nickname_lower $ cloc --by-file frequencies.? huffman.? priority_queue.?
6 text files. 6 unique files. 0 files ignored. github.com/AlDanial/cloc v 1.75 T=0.04 s (159.2 files/s, 4005.6 lines/s) ------------------------------------------------------------------------------- File blank comment code ------------------------------------------------------------------------------- priority_queue.c 7 6 42 huffman.c 2 0 23 frequencies.c 2 1 19 huffman.h 3 0 13 priority_queue.h 7 1 12 frequencies.h 4 1 8 ------------------------------------------------------------------------------- SUM: 25 9 117 -------------------------------------------------------------------------------

Requirements

  1. Your submission must contain each of the following files, as specified:
    file contents
    huffman.c functions
    make huffman pq(Frequencies freqs)
    return type: Node✶
    Create a priority queue where the value at each node is a tree node containing a character and its frequency.
    • freqs is as returned by calc_frequencies(…) in HW12.
      • Ex: freqs['a'] == 3 for the file containing abc\nabca\n .
    • Caller must free the memory using destroy_list(…) from your HW11.
    • Do not create nodes for characters with frequency zero.
    • Use your pq_enqueue(…) from your HW11.
    • You will need to create a compare function.
      • Sort by frequency (ascending). The least frequent nodes should come first.
      • For nodes with the same frequency, sort by character value (e.g., 'A' before 'B')
      • Compare function is not listed in the requirements table because it a helper function.
      • As with all helper functions:
        • Function name should start with '_'.
        • It should be declared as static.
        • We will not test your compare function directly.
    • Return the address of the head for the newly created list.
    • Do not make this complicated. These “trees” are size=1 (i.e., single node with no children).
    • Do not call free(…).
    • Do not open or close any files.
    huffman.h struct type
    TreeNode
    Struct type definition for a node in your Huffman tree.
    • Use typedef syntax.
    • It should have four fields:
      • character (uchar)
      • frequency (size_t)
      • left (TreeNode*)
      • right (TreeNode*)
    function declaration
    make huffman pq(Frequencies freqs)
    return type: Node✶
    Declaration of make_huffman_pq(…).
    • This should have no curly braces, and should end with a semicolon.
    test_huffman.c functions
    main(int argc, char✶ argv[])
    return type: int
    • 100% code coverage is required in huffman.c only.
    • Direct testing of frequencies.c and priority_queue.c is optional this time.
    • Use your miniunit.h.
  2. stdio.h fputc, fputs, stdout test_huffman.c printf, fputc test_huffman.c stdlib.h EXIT_SUCCESS, EXIT_FAILURE, free test_huffman.c assert.h assert huffman.c, test_huffman.c stdbool.h bool, true, false huffman.c, test_huffman.c Feel free to ask if there is something else you would like to use. -->
  3. Submissions must meet the code quality standards and other course policies on homework.

Submit

To submit HW13 from within your hw13 directory, type

make submit

That should result in the following command: 264submit HW13 huffman.c huffman.h frequencies.c frequencies.h priority_queue.c priority_queue.h test_huffman.c miniunit.h clog.h Makefile

Pre-tester

The pre-tester for HW13 has not yet been released. As soon as it is ready, this note will be changed.

Q&A

  1. ▒▒▒?
    ▒▒▒

Updates

4/11/2020 Fixed many issues. (Thanks HBB hunters!)
  • #include "huffman.h"
  • #include "frequencies.h"
  • #include "priority_queue.h"
  • Copy commands for priority_queue.c and priority_queue.h
  • Don't #include "huffman.h" from huffman.h.
  • Fixed example. (String did not match the frequencies shown the diagram.)
Also… lots more detail was added for updating your Makefile.