Huffman 6: encode tree to file
Overview
This is part 6 in an 8-part series in which you are implementing the Huffman Coding algorithm to compress files.
- HW15: Implement priority queue for future use
- HW16: Calculate character frequencies from a file
- HW17: Create a priority queue of trees from the character frequencies in a file.
- HW18: Build a single Huffman tree.
- HW19: Implement functions to write arbitrary bits to a file, for future use.
- HW20: Write the Huffman Tree to the disk. ◀◀◀ this homework
- HW21: Write the encoded file contents to the disk.
- huffman_8: Create a command-line interface for compressing files.
In this homework you will use your code from HW19 to write the Huffman coding table to a file.
The coding table consists of a mapping from each character found in the input, to a sequence of bits. It isn't actually stored in the form of a “table” (or array). It is part of the Huffman tree itself. You will traverse the tree to find each character and the bits used to encode it.
Input
The input to your code will be a Huffman tree like this:
Output
Your code will write the bits for the coding table to a file using the functions you wrote in HW19, according to the algorithm in article .
Here's how the Huffman tree above would be represented. You will write this only. Do not write anything before or after this.
In case of any discrepency between this example and the article , the article shall take priority.
Algorithm
You must read the article completely and carefully. Do not ask for help if you have not first read that.
See the article and Huffman walkthrough for a complete description of how the tree is converted to bits.
Here's a synopsis. If this doesn't make sense, go back to the article . You do a post-order traversal.
- Traverse the left subtree of the root (i.e., encode it to the file).
- Traverse the right subtree of the root (i.e., encode it to the file).
- Visit the root.
Every time you “visit” a root (including the root of a subtree):
-
If it is a leaf (i.e., character), you write one bit: 1. Then, you write
the entier character (8 bits). Example: If the character is
'A'
, you will write101000001
. The1
is to signify that it is a leaf. The01000001
is to specify the character itself. - If it is a non-leaf (i.e., cluster)), you write one bit: 0.
Getting Started
There is no starter code. However, we recommend creating a new directory for each step of the Huffman series.
- You should have already read the article and Huffman walkthrough . If you have not done so already, please do so before you proceed.
- Create a directory for HW20 and change into it.
you@ecegrid-thin1 ~ $
mkdir hw20
you@ecegrid-thin1 ~ $
cd hw20
-
Copy all of your files from HW18
and the bit_writer.[ch] files from your
HW19 into your
HW20 directory.
you@ecegrid-thin1 ~/hw20 $
cp ../hw18/{*.[ch],Makefile} -t ./
you@ecegrid-thin1 ~/hw20 $
cp ../hw19/bit_writer.[ch] -t ./
-
Update your Makefile with the new
ASG_NICKNAME
.You should now have this:you@ecegrid-thin1 ~/hw20 $
vim Makefile
ASG_NICKNAME=HW20
You also need to make sure all of the files listed in the Requirements table will be included when you runmake submit
. -
Edit your huffman.h to add the declarations for
write_coding_table(…)
. No other edits are necessary.you@ecegrid-thin1 ~/hw20 $
vim huffman.h
Requirements
- Your submission must contain each of the following files, as specified:
file contents huffman.c functions write coding table(TreeNode✶ root, BitWriter✶ a writer)
→ return type: voidWrite the coding table represented in the Huffman tree to the file referenced bya_writer
.- Follow the algorithm given in the article and Huffman walkthrough .
- Use the functions you created in HW19.
- Don't worry about what comes before or after this in the file. Just write the coding table.
All functions that were present in HW18huffman.h types All types that were present in HW18. declarations All declarations that were present in HW18, plus the one added for HW20 (specified above). test_huffman.c functions main(int argc, char✶ argv[])
→ return type: int- 100% code coverage is required for huffman.c.
- Direct testing of frequencies.c and priority_queue.c is optional for HW20.
- Use your miniunit.h.
priority_queue.h ⋯ same as beforepriority_queue.c ⋯ same as beforebit_writer.h ⋯ same as beforebit_writer.c ⋯ same as beforefrequencies.h ⋯ same as beforefrequencies.c ⋯ same as beforeminiunit.h ⋯ as usualclog.h ⋯ as usualMakefile ⋯ as usual - Submissions must meet the code quality standards and other course policies on homework.
Submit
To submit HW20 from within your hw20 directory, type
make submit
That should result in the following command:
264submit HW20 huffman.c huffman.h frequencies.c frequencies.h priority_queue.c priority_queue.h test_huffman.c bit_writer.c bit_writer.h miniunit.h clog.h Makefile
Pre-tester ●
The pre-tester for HW20 has been released and is ready to use.
Q&A
How will these bits be used? How can this possibly be used to deconstruct a Huffman tree?
Read the article ! We can see from the web logs that many of you are not doing that.
Updates
There are no updates to HW20 so far.