Huffman 3: linked list of trees
Learning goals
You should learn:
- Practice using linked lists and trees.
- Combining linked lists with trees
- Learning objectives: structures and dynamic structures
Overview
This is part 3 in an 8-part series in which you are implementing the Huffman Coding algorithm to compress files.
- huffman_1: Implement priority queue for future use
- HW16: Calculate character frequencies from a file
- HW17: Create a priority queue of trees from the character frequencies in a file. ◀◀◀ this homework
- HW18: Build a single Huffman tree.
- HW19: Implement functions to write arbitrary bits to a file, for future use.
- HW20: Write the Huffman Tree to the disk.
- HW21: Write the encoded file contents to the disk.
- huffman_8: Create a command-line interface for compressing files.
Multiple files
In this homework, your code will be split into many files. You already have several files with code that serves coherent purposes. In HW16, you created frequencies.c and frequencies.h, for calculating character frequencies in files. In HW10, you created files for a priority queue. You will be renaming those to priority_queue.c and priority_queue.h for this homework.
You will definitely want to use your Makefile for this homework, even if you weren't doing so before. Unlike previous assignments, you will be compiling four (4) C files together. That's too much to type every time!
Getting Started
There is no starter code. By the time you leave this class you should be able to set up a project from scratch. We have given detailed instructions below.
-
Set up your HW17 directory with the files you will need.
- Create a directory for HW17.
you@ecegrid-thin1 ~/ $
mkdir hw17
-
Copy your frequencies.c and frequencies.h into your HW17
you@ecegrid-thin1 ~/ $
cp hw16/{frequencies.c,frequencies.h} hw17/
-
Copy your priority queue code from HW10 into your HW17,
changing the names from huffman.c to priority_queue.c, and huffman.h to priority_queue.h.
you@ecegrid-thin1 ~/ $
cp hw10/huffman.c hw17/priority_queue.c
you@ecegrid-thin1 ~/ $
cp hw10/huffman.h hw17/priority_queue.h
-
Change into your HW17 directory.
you@ecegrid-thin1 ~/ $
cd hw17
- Create a directory for HW17.
-
Edit your priority_queue.c and priority_queue.h.
-
In priority_queue.c, change
#include "huffman.h"
to#include "priority_queue.h"
.you@ecegrid-thin1 hw17/ $
vim priority_queue.c
-
In priority_queue.h, change the include guard from
__HUFFMAN_H__
to__PRIORITY_QUEUE_H__
.
-
In priority_queue.c, change
- Create a new huffman.h
you@ecegrid-thin1 hw17/ $
vim huffman.h
- Add an include guard.
-
Add a
#include
statement for stdlib.h (forsize_t
). -
Add
#include
statements for frequencies.h, priority_queue.h. Be sure to use double quotes (#include "▒▒▒.h"
) not angle brackets (#include <▒▒▒.h>
), since these are your header files. -
Add a struct type called
TreeNode
in huffman.h usingtypedef
syntax. It should have four fields:character
(uchar
),frequency
(size_t
),left
(TreeNode*
), andright
(TreeNode*
). -
Add a function declaration, exactly as below:
Node* make_huffman_pq(Frequencies freqs);
- Create a new huffman.c.
you@ecegrid-thin1 hw17/ $
vim huffman.c
-
Add
#include
statements for the usual standard header files (stdbool.h, stdio.h, stdlib.h, assert.h). -
Add
#include
statements for huffman.h. (You don't need to explicitly include priority_queue.h and frequencies.h since they are included by way of huffman.h).
-
Add
-
Modify your Makefile.
you@ecegrid-thin1 hw17/ $
vim Makefile
-
Set your
SRC_C
variable (at the top) tohuffman.c priority_queue.c frequencies.c
. -
Set your
SRC_H
variable (at the top) tohuffman.h priority_queue.h frequencies.h miniunit.h clog.h
. -
Set your
TEST_C
variable (at the top) totest_huffman.c
. -
Set your
EXECUTABLE
variable (at the top) totest_huffman
. -
Verify that running
make
results in this compilation command:gcc -std=c11 -g -Wall -Wshadow --pedantic -Wvla -Werror test_huffman.c huffman.c priority_queue.c frequencies.c -o test_huffman… and likewise formake test
but with the addition of-DDEBUG
.The order can be different (as long as-o
is immediately beforetest_huffman
). -
Modify your
SUBMIT_FILES
variable to include all of the files you need to submit. As always, it should using existing variable names wherever possible. You will end up with something like this:SUBMIT_FILES=$(SRC_C) $(SRC_H) $(TEST_C) Makefile
-
Set your
Example
Suppose you have a file called a.txt containing
abc\nabca\n
.
Your code calls
calc_frequencies(freqs, "a.txt", &error)
and then
make_huffman_pq(freqs)
The resulting list should look like this:
malloc(…)
should be called eight (8) times.
free(…)
should not be called at all.
a.txt should not be accessed in any way.
How much work is this?
In the instructor's solution, the body of
make_huffman_pq(…)
is 9 sloc. The body of
the compare function is 4 sloc. That is not counting the function signature
and ending brace of those functions.
The cloc
command can measure sloc for individual files if you add the
--by-file
flag.
aq@ecegrid-thin1 asg_info.asg_nickname_lower
$
cloc --by-file frequencies.? huffman.? priority_queue.?
Requirements
- Your submission must contain each of the following files, as specified:
file contents huffman.c functions make huffman pq(Frequencies freqs)
→ return type: Node✶Create a priority queue where the value at each node is a tree node containing a character and its frequency.freqs
is as returned bycalc_frequencies(…)
in HW16.- Ex:
freqs['a'] == 3
for the file containingabc\nabca\n
.
- Ex:
- Caller must free the memory using
destroy_list(…)
from your HW10. - Do not create nodes for characters with frequency zero.
- Use your
pq_enqueue(…)
from your HW10. - You will need to create a compare function.
- Sort by frequency (ascending). The least frequent nodes should come first.
- For nodes with the same frequency, sort by character value (e.g.,
'A'
before'B'
) - Compare function is not listed in the requirements table because it a helper function.
- As with all helper functions:
- Function name should start with
'_'
. - It should be declared as
static
. - We will not test your compare function directly.
- Function name should start with
- Return the address of the head for the newly created list.
- ⚠ Do not make this complicated. These “trees” are size=1 (i.e., single node with no children).
- ⚠ Do not call
free(…)
. - ⚠ Do not open or close any files.
huffman.h struct type TreeNodeStruct type definition for a node in your Huffman tree.- Use
typedef
syntax. - It should have four fields:
character
(uchar
)frequency
(size_t
)left
(TreeNode*
)right
(TreeNode*
)
function declaration make huffman pq(Frequencies freqs)
→ return type: Node✶Declaration ofmake_huffman_pq(…)
.- This should have no curly braces, and should end with a semicolon.
test_huffman.c functions main(int argc, char✶ argv[])
→ return type: int- 100% code coverage is required in huffman.c only.
- Direct testing of frequencies.c and priority_queue.c is optional this time.
- Use your miniunit.h.
- Submissions must meet the code quality standards and other course policies on homework.
fputc
, fputs
, stdout
test_huffman.c
printf
, fputc
test_huffman.c
EXIT_SUCCESS
, EXIT_FAILURE
, free
test_huffman.c
assert
huffman.c
, test_huffman.c
bool
, true
, false
huffman.c
, test_huffman.c
Submit
To submit HW17 from within your hw17 directory, type
make submit
That should result in the following command:
264submit HW17 huffman.c huffman.h frequencies.c frequencies.h priority_queue.c priority_queue.h test_huffman.c miniunit.h clog.h Makefile
Pre-tester ●
The pre-tester for HW17 has been released and is ready to use.
Q&A
-
▒▒▒?
▒▒▒
Updates
Updates may be added here.