Advanced C Programming

Spring 2022 ECE 264 :: Purdue University

⚠ This is a PAST SEMESTER (Spring 2022).
Due 4/29

Huffman 7: encode data to file

Overview

This is part 7 in an 8-part series in which you are implementing the Huffman Coding algorithm to compress files.

  1. HW15: Implement priority queue for future use
  2. HW16: Calculate character frequencies from a file
  3. HW17: Create a priority queue of trees from the character frequencies in a file.
  4. HW18: Build a single Huffman tree.
  5. HW19: Implement functions to write arbitrary bits to a file, for future use.
  6. HW20: Write the Huffman Tree to the disk.
  7. HW21: Write the encoded file contents to the disk. ◀◀◀ this homework
  8. huffman_8: Create a command-line interface for compressing files.

In this homework you will use a Huffman tree created with your code from HW18 to compress some input data, and write it to a file using your bit_writer.c from HW19.

Input

The input to your code will be an input string and a Huffman tree.

The input string would be like this:

huffman fluffs many mums

The Huffman tree would look like this:

Output

Your code will compress the input string using Huffman coding, and write the compressed data to a file.

Here's how the Huffman tree above would be represented in terms of bits. You will write this only. Do not write anything before or after this.

Here is an illustration of how the parts fit together. (This has not been tested and is intended only as an illustration.) Notably, this does not write the code book or anything else.

const char* path_uncompressed = "…";
const char* path_compressed = "…";
unsigned char* contents_of_file = _read_file(path_uncompressed);  // helper function; not shown
// contents_of_file would be a string like "huffman fluffs many mums"
// unsigned char is equivalent uint8_t
Frequencies freqs;
const char* error = NULL;
if(calc_frequencies(freqs, path_uncompressed, &error) ) {
  Node* head = make_huffman_pq(freqs);
  TreeNode* root = make_huffman_tree(head);
  BitWriter writer = open_bit_writer(path_compressed);
  write_compressed(root, &writer, contents_of_file);
  close_bit_writer(writer);
}
free(contents_of_file);
destroy_huffman_tree(root);

If the file contained "huffman fluffs many mums", then the following would be written (and nothing more).

0100('h')100('u')00('f')00('f')110('m')1011('a')1110('n')011(' ')00('f')0101('l')100('u')00('f')00('f')1111('s')011(' ')110('m')1011('a')1110('n')1010('y')011(' ')110('m')100('u')110('m')1111('s')0000((padding))

In case of any discrepency between this example and the article , the article shall take priority.

Algorithm

You must read the article completely and carefully. Do not ask for help if you have not first read that.

See the article and Huffman walkthrough for a complete description of how the data is compressed using the Huffman tree.

You can implement this in whatever way you like. One possibility is outlined in the Q&A. Use it only if it makes sense to you, and if you believe it is correct.

Getting Started

There is no starter code. However, we recommend creating a new directory for each step of the Huffman series.

  1. You should have already read the article and Huffman walkthrough . If you have not done so already, please do so before you proceed.
  2. Create a directory for HW21 and change into it.
    you@ecegrid-thin1 ~ $ mkdir hw21
    you@ecegrid-thin1 ~ $ cd hw21
  3. Copy all of your files from HW20 into your HW21 directory.
    you@ecegrid-thin1 ~/hw21 $ cp ../hw20/{*.[ch],Makefile} -t ./
  4. Update your Makefile with the new ASG_NICKNAME.
    you@ecegrid-thin1 ~/hw21 $ vim Makefile
    You should now have this: ASG_NICKNAME=HW21 You also need to make sure all of the files listed in the Requirements table will be included when you run make submit.
  5. Edit your huffman.h to add the declarations for write_compressed(…). No other edits are necessary.
    you@ecegrid-thin1 ~/hw21 $ vim huffman.h

Requirements

  1. Your submission must contain each of the following files, as specified:
    file contents
    huffman.c functions
    write compressed(TreeNode✶ root, BitWriter✶ a writer, uint8 t✶ uncompressed bytes)
    return type: void
    Compress and write uncompressed_bytes to a file using the Huffman tree at root and the BitWriter at a_writer.
    • The parameter uncompressed_bytes should be of type uint8_t*. It is just an input string, like "huffman fluffs many mums".
    • Note: uint8_t is equivalent to unsigned char. uncompressed_bytes is just a string. The only reason we didn't declare it as char* is because we need to support all character values.
    • Do not open or close the file.
    • Use your bit_writer.c.
    • Do not call malloc(…) or free(…).
      • ∴ You cannot write the compressed data to a string first.
    • Don't worry about what comes before or after this in the file. Just write the compressed data only.
      • Do not call write_coding_table(…).
    All functions that were present in HW20
    huffman.h types All types that were present in HW20.
    declarations All declarations that were present in HW20, plus the one added for HW21 (specified above).
    test_huffman.c functions
    main(int argc, char✶ argv[])
    return type: int
    • 100% code coverage is required for huffman.c.
      • Direct testing of frequencies.c and priority_queue.c is optional for HW21.
    • Use your miniunit.h.
    priority_queue.h
    same as before
    priority_queue.c
    same as before
    bit_writer.h
    same as before
    bit_writer.c
    same as before
    frequencies.h
    same as before
    frequencies.c
    same as before
    miniunit.h
    as usual
    clog.h
    as usual
    Makefile
    as usual
  2. Submissions must meet the code quality standards and other course policies on homework.

Submit

To submit HW21 from within your hw21 directory, type

make submit

That should result in the following command: 264submit HW21 huffman.c huffman.h frequencies.c frequencies.h priority_queue.c priority_queue.h test_huffman.c bit_writer.c bit_writer.h miniunit.h clog.h Makefile

Pre-tester

The pre-tester for HW21 has been released and is ready to use.

Q&A

  1. How will these bits be used? How can this possibly be used to deconstruct a Huffman tree?

    Read the article ! We can see from the web logs that many of you are not doing that.
  2. I'm lazy. What exactly should my code do?

    Don't be lazy.
  3. I can't help it. I don't like to think. What should I do?

    Did you come to a Top-10 Computer Engineering school so you could get a degree without learning to think‽‽‽ What were you thinking‽‽‽ (Don't answer that.)
  4. My brain was partially eaten by zombies. How am I supposed to survive???

    Help is available.
  5. Do you call that help?

    Yes… well they do.
  6. Did you actually look at those reviews? Would you trust your brain with those guys?

    No. Good point.
  7. Okay, so give me my hint!!! Capeesh?

    Fine. You win. Here's your hint.

    You will need to log in to view the hint. This helps us understand how many people needed it, while also keeping the hint out of search engines and such, in case we want to hold it back for future semesters. This isn't all that hard, but given the unusual circumstances, we don't want anyone to stay stuck for too long.

  8. How should we handle compressing an empty string?
    For Spring 2022, this will not be tested and need not be handled.
  9. How should we handle compressing a single character?
    For Spring 2022, this will not be tested and need not be handled.

Updates

There are no updates to HW21 so far.

4/28/2022
  • Fixed illustration code so that compressed (output) path is different from uncompressed (input) path.