Huffman 7: encode data to file
Overview
This is part 7 in an 8-part series in which you are implementing the Huffman Coding algorithm to compress files.
- HW15: Implement priority queue for future use
- HW16: Calculate character frequencies from a file
- HW17: Create a priority queue of trees from the character frequencies in a file.
- HW18: Build a single Huffman tree.
- HW19: Implement functions to write arbitrary bits to a file, for future use.
- HW20: Write the Huffman Tree to the disk.
- HW21: Write the encoded file contents to the disk. ◀◀◀ this homework
- huffman_8: Create a command-line interface for compressing files.
In this homework you will use a Huffman tree created with your code from HW18 to compress some input data, and write it to a file using your bit_writer.c from HW19.
Input
The input to your code will be an input string and a Huffman tree.
The input string would be like this:
huffman fluffs many mums
The Huffman tree would look like this:
Output
Your code will compress the input string using Huffman coding, and write the compressed data to a file.
Here's how the Huffman tree above would be represented in terms of bits. You will write this only. Do not write anything before or after this.
Here is an illustration of how the parts fit together. (This has not been tested and is intended only as an illustration.) Notably, this does not write the code book or anything else.
const char* path_uncompressed = "…";
const char* path_compressed = "…";
unsigned char* contents_of_file = _read_file(path_uncompressed); // helper function; not shown
// contents_of_file would be a string like "huffman fluffs many mums"
// unsigned char is equivalent uint8_t
Frequencies freqs;
const char* error = NULL;
if(calc_frequencies(freqs, path_uncompressed, &error) ) {
Node* head = make_huffman_pq(freqs);
TreeNode* root = make_huffman_tree(head);
BitWriter writer = open_bit_writer(path_compressed);
write_compressed(root, &writer, contents_of_file);
close_bit_writer(writer);
}
free(contents_of_file);
destroy_huffman_tree(root);
If the file contained "huffman fluffs many mums", then the following would be written (and nothing more).
In case of any discrepency between this example and the article , the article shall take priority.
Algorithm
You must read the article completely and carefully. Do not ask for help if you have not first read that.
See the article and Huffman walkthrough for a complete description of how the data is compressed using the Huffman tree.
You can implement this in whatever way you like. One possibility is outlined in the Q&A. Use it only if it makes sense to you, and if you believe it is correct.
Getting Started
There is no starter code. However, we recommend creating a new directory for each step of the Huffman series.
- You should have already read the article and Huffman walkthrough . If you have not done so already, please do so before you proceed.
- Create a directory for HW21 and change into it.
you@ecegrid-thin1 ~ $
mkdir hw21
you@ecegrid-thin1 ~ $
cd hw21
-
Copy all of your files from HW20
into your HW21 directory.
you@ecegrid-thin1 ~/hw21 $
cp ../hw20/{*.[ch],Makefile} -t ./
-
Update your Makefile with the new
ASG_NICKNAME
.You should now have this:you@ecegrid-thin1 ~/hw21 $
vim Makefile
ASG_NICKNAME=HW21
You also need to make sure all of the files listed in the Requirements table will be included when you runmake submit
. -
Edit your huffman.h to add the declarations for
write_compressed(…)
. No other edits are necessary.you@ecegrid-thin1 ~/hw21 $
vim huffman.h
Requirements
- Your submission must contain each of the following files, as specified:
file contents huffman.c functions write compressed(TreeNode✶ root, BitWriter✶ a writer, uint8 t✶ uncompressed bytes)
→ return type: voidCompress and writeuncompressed_bytes
to a file using the Huffman tree atroot
and theBitWriter
ata_writer
.-
The parameter
uncompressed_bytes
should be of typeuint8_t*
. It is just an input string, like"huffman fluffs many mums"
. -
Note:
uint8_t
is equivalent tounsigned char
.uncompressed_bytes
is just a string. The only reason we didn't declare it aschar*
is because we need to support all character values. - Do not open or close the file.
- Use your bit_writer.c.
-
⚠ Do not call
malloc(…)
orfree(…)
.- ∴ You cannot write the compressed data to a string first.
-
Don't worry about what comes before or after this in the file. Just write the compressed data only.
- ⚠ Do not call
write_coding_table(…)
.
- ⚠ Do not call
All functions that were present in HW20huffman.h types All types that were present in HW20. declarations All declarations that were present in HW20, plus the one added for HW21 (specified above). test_huffman.c functions main(int argc, char✶ argv[])
→ return type: int- 100% code coverage is required for huffman.c.
- Direct testing of frequencies.c and priority_queue.c is optional for HW21.
- Use your miniunit.h.
priority_queue.h ⋯ same as beforepriority_queue.c ⋯ same as beforebit_writer.h ⋯ same as beforebit_writer.c ⋯ same as beforefrequencies.h ⋯ same as beforefrequencies.c ⋯ same as beforeminiunit.h ⋯ as usualclog.h ⋯ as usualMakefile ⋯ as usual -
The parameter
- Submissions must meet the code quality standards and other course policies on homework.
Submit
To submit HW21 from within your hw21 directory, type
make submit
That should result in the following command:
264submit HW21 huffman.c huffman.h frequencies.c frequencies.h priority_queue.c priority_queue.h test_huffman.c bit_writer.c bit_writer.h miniunit.h clog.h Makefile
Pre-tester ●
The pre-tester for HW21 has been released and is ready to use.
Q&A
How will these bits be used? How can this possibly be used to deconstruct a Huffman tree?
Read the article ! We can see from the web logs that many of you are not doing that.I'm lazy. What exactly should my code do?
Don't be lazy.I can't help it. I don't like to think. What should I do?
Did you come to a Top-10 Computer Engineering school so you could get a degree without learning to think‽‽‽ What were you thinking‽‽‽ (Don't answer that.)My brain was partially eaten by zombies. How am I supposed to survive???
Help is available.Do you call that help?
Yes… well they do.Did you actually look at those reviews? Would you trust your brain with those guys?
No. Good point.Okay, so give me my hint!!! Capeesh?
Fine. You win. Here's your hint.You will need to log in to view the hint. This helps us understand how many people needed it, while also keeping the hint out of search engines and such, in case we want to hold it back for future semesters. This isn't all that hard, but given the unusual circumstances, we don't want anyone to stay stuck for too long.
-
How should we handle compressing an empty string?
For Spring 2022, this will not be tested and need not be handled. -
How should we handle compressing a single character?
For Spring 2022, this will not be tested and need not be handled.
Updates
There are no updates to HW21 so far.
4/28/2022 |
|