Advanced C Programming

Spring 2019 :: ECE 264 :: Purdue University

⚠ This is for Spring 2019, not the current semester.
Due 4/8

BMP files

Goals

The goals of this assignment are as follows:
  1. Learn how to program with binary files
  2. Practice using structures

Overview

In this exercise, you will write code to read, write, and crop BMP image files.

The BMP file format

A BMP file has the following format:

Section             Size
Header              54 bytes
Palette (optional)  0 bytes (for 24-bit RGB images)
Image data          file size - 54 bytes (for 24-bit RGB images)

The header has 54 bytes, which are divided into the following fields. Note that the #pragma pack(1) directive ensures that the header structure is really 54 bytes long by using 1-byte alignment.

typedef struct {             // Total: 54 bytes
  uint16_t  type;             // Magic identifier: 0x4d42
  uint32_t  size;             // File size in bytes
  uint16_t  reserved1;        // Not used
  uint16_t  reserved2;        // Not used
  uint32_t  offset;           // Offset to image data in bytes from beginning of file (54 bytes)
  uint32_t  dib_header_size;  // DIB Header size in bytes (40 bytes)
  int32_t   width_px;         // Width of the image
  int32_t   height_px;        // Height of image
  uint16_t  num_planes;       // Number of color planes
  uint16_t  bits_per_pixel;   // Bits per pixel
  uint32_t  compression;      // Compression type
  uint32_t  image_size_bytes; // Image size in bytes
  int32_t   x_resolution_ppm; // Pixels per meter
  int32_t   y_resolution_ppm; // Pixels per meter
  uint32_t  num_colors;       // Number of colors  
  uint32_t  important_colors; // Important colors 
} BMPHeader;

Note that the number of bytes each field occupies can be obtained by dividing the number 16 or 32 by 8. For example, the field "type" occupies 2 bytes. These fields are all integers. A type name beginning with "uint" is unsigned, and one beginning with "int" is signed. For example, the fields width_px and height_px are signed integers. However, for simplicity, all the BMP files we have will contain only positive values in those fields. You may assume that in your code. Also, we are dealing with the uncompressed BMP format (compression field is 0).

Because of the packing specified in the bmp.h file, you should be able to use fread(…) to read in the first 54 bytes of a BMP file and store 54 bytes in a BMPHeader structure.
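As a rough illustration of that idea, here is a minimal sketch that reads one packed header with a single fread(…) call. The struct is repeated so the example stands alone; the helper name read_header_sketch is hypothetical (it is not part of bmp.h), and the #pragma pack(push, 1) form is an assumption about how the packing might be written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#pragma pack(push, 1)            // 1-byte alignment, so no gaps between fields
typedef struct {
  uint16_t type;
  uint32_t size;
  uint16_t reserved1;
  uint16_t reserved2;
  uint32_t offset;
  uint32_t dib_header_size;
  int32_t  width_px;
  int32_t  height_px;
  uint16_t num_planes;
  uint16_t bits_per_pixel;
  uint32_t compression;
  uint32_t image_size_bytes;
  int32_t  x_resolution_ppm;
  int32_t  y_resolution_ppm;
  uint32_t num_colors;
  uint32_t important_colors;
} BMPHeader;
#pragma pack(pop)

// Compile-time check that the packing really produced a 54-byte struct.
_Static_assert(sizeof(BMPHeader) == 54, "BMPHeader must be exactly 54 bytes");

// Hypothetical helper: read one header starting at the current file position.
// Returns true only if all 54 bytes were read.
bool read_header_sketch(FILE* fp, BMPHeader* a_header) {
  assert(fp != NULL && a_header != NULL);  // programmer errors, not run-time errors
  size_t num_read = fread(a_header, sizeof(*a_header), 1, fp);
  return num_read == 1;
}
```

Asking fread(…) for one item of 54 bytes (rather than 54 items of 1 byte) means the return value is 1 only when the whole header was read, which makes a short or empty file easy to detect.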

Among all these fields in the BMPHeader structure, you have to pay attention to the following fields:

bits_per_pixel    number of bits per pixel
width_px          number of pixels per row
height_px         number of rows
size              file size
image_size_bytes  the size of image data (file size - size of header)

We will further explain bits_per_pixel, width_px, height_px, and image_size_bytes later. You should use the following structure to store a BMP file. The header field holds the first 54 bytes of a given BMP file, and data should point to a location that is big enough (image_size_bytes bytes) to store the image data (the color information of each pixel).

typedef struct {
    BMPHeader header;
    unsigned char* data; 
} BMPImage;

Effectively, the BMPImage structure stores the entire BMP file.

Let's examine the fields bits_per_pixel, width_px, height_px, and image_size_bytes in greater detail. The bits_per_pixel field records the number of bits used to represent a pixel. For this assignment, we are dealing with BMP files with only 24 bits per pixel. A 24-bit representation uses 8 bits (1 byte) for RED, 8 bits for GREEN, and 8 bits for BLUE.

The width_px field gives you the number of pixels per row. Therefore, the total number of bytes required to represent a row of pixels in the 24-bit representation is width_px * 3. However, the BMP format requires each row to be padded at the end so that each row occupies a multiple of 4 bytes. For example, if there is only one pixel in each row (3 bytes), we need 1 additional byte to pad the row. If there are two pixels per row (6 bytes), we need 2 additional bytes. If there are three pixels per row (9 bytes), we need 3 additional bytes. If there are four pixels per row (12 bytes), we don't have to perform padding. We require you to assign the value 0 to each padding byte.
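The padding rule can be written as a small piece of arithmetic. The following is only a sketch; the helper names row_size_bytes and padding_bytes and the constant BYTES_PER_PIXEL are hypothetical and are not part of bmp.h.

```c
#include <assert.h>

enum { BYTES_PER_PIXEL = 3 };   // 24 bits per pixel

// Hypothetical helper: bytes occupied by one row, including padding.
int row_size_bytes(int width_px) {
  assert(width_px > 0);
  int unpadded = width_px * BYTES_PER_PIXEL;
  return (unpadded + 3) / 4 * 4;   // round up to the next multiple of 4
}

// Hypothetical helper: number of zero bytes appended to each row.
int padding_bytes(int width_px) {
  return row_size_bytes(width_px) - width_px * BYTES_PER_PIXEL;
}
```

For widths of 1, 2, 3, and 4 pixels this gives 1, 2, 3, and 0 padding bytes, matching the examples in the paragraph above. This sketch uses int for brevity and does not guard against overflow for very wide images.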

The height_px field gives you the number of rows. Row 0 is the bottom of the image. The file is organized such that the bottom row follows the header, and the top row is at the end of the file. Within each row, the leftmost pixel has the lowest index. Therefore, the first byte in data, i.e., data[0], belongs to the bottom-left pixel.

The image_size_bytes field is height_px * (number of bytes per row). Note that the number of bytes per row includes the padding at the end of each row.

You can visualize the one-dimensional data as a three-dimensional array, which is organized as rows of pixels, with each pixel represented by 3 bytes of colors (24-bit representation). However, because of padding, you cannot easily typecast the one-dimensional data as a 3-dimensional array. Instead, you can first typecast it as a two dimensional array, rows of pixels. For each row of data, you can typecast it as a two-dimensional array, where the first dimension captures pixels from left to right, the second dimension is the color of each pixel (3 bytes or 2 bytes). You don't need to think of this in terms of multi-dimensional arrays.
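Instead of casting to multi-dimensional arrays, one option is to compute the byte offset of a pixel directly. The sketch below assumes a 24-bit image; the helper names pixel_offset and pixel_offset_from_top are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: byte offset within the image data of the pixel in
// column x of the row that is row_from_bottom rows above the bottom row
// (0 = bottom row, which is stored first in the file).
size_t pixel_offset(int width_px, int x, int row_from_bottom) {
  assert(width_px > 0 && x >= 0 && x < width_px && row_from_bottom >= 0);
  size_t row_size = ((size_t)width_px * 3 + 3) / 4 * 4;   // includes padding
  return (size_t)row_from_bottom * row_size + (size_t)x * 3;
}

// If y is counted from the top edge instead (0 = top row), convert it first.
size_t pixel_offset_from_top(int width_px, int height_px, int x, int y) {
  assert(y >= 0 && y < height_px);
  return pixel_offset(width_px, x, height_px - 1 - y);
}
```

The three bytes starting at the returned offset hold that pixel's color. Because each row is stepped over by its padded size, the padding bytes never need special handling when locating a pixel.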

The Wikipedia article on the BMP file format has a nice diagram and more complete details about this format.

Examining a BMP file directly

The best way to inspect binary data is with a hex dump. From bash, you can type xxd myimage.bmp. Since it will probably be long, you will want to view in vim. One way to do that is type xxd myimage.bmp | vim - from bash. Another way is to open the file in vim and then type :%!xxd. (Do not save!)

Suppose you have a tiny 6x6 BMP image named 6x6_24bit.bmp. (Yes, it really is only 6 pixels by 6 pixels. Don't worry. A larger version is included in one of the diagrams below.)

To get a hex dump right on the command line, you could type this at bash:

$ xxd 6x6_24bit.bmp

It will be more convenient to view in vim, so we type this from bash instead. (Don't forget the "-" at the end!)

$ xxd 6x6_24bit.bmp | vim -

Here is the hex dump, as you will see it. Don't worry if this looks cryptic. Read on and you will understand it completely.


You can break this apart using the BMP header specification (copied from above).

typedef struct {             // Total: 54 bytes
  uint16_t  type;             // Magic identifier: 0x4d42
  uint32_t  size;             // File size in bytes
  uint16_t  reserved1;        // Not used
  uint16_t  reserved2;        // Not used
  uint32_t  offset;           // Offset to image data in bytes from beginning of file (54 bytes)
  uint32_t  dib_header_size;  // DIB Header size in bytes (40 bytes)
  int32_t   width_px;         // Width of the image
  int32_t   height_px;        // Height of image
  uint16_t  num_planes;       // Number of color planes
  uint16_t  bits_per_pixel;   // Bits per pixel
  uint32_t  compression;      // Compression type
  uint32_t  image_size_bytes; // Image size in bytes
  int32_t   x_resolution_ppm; // Pixels per meter
  int32_t   y_resolution_ppm; // Pixels per meter
  uint32_t  num_colors;       // Number of colors  
  uint32_t  important_colors; // Important colors 
} BMPHeader;

Here is the same hex dump, this time with some annotations.


For this and other binary file formats, you can understand what value goes where by simply looking at the specification and a hex dump of the binary file.

Big-endian vs. little-endian

You may have noticed in the hex dump of the image that the file size (174) is represented as “ae00 0000” (instead of “0000 00ae”). This is because the BMP format is a little endian format. In short, that means the number 0x12345678 (305419896 in decimal notation) will be stored in memory as 0x78 0x56 0x34 0x12.

Remember that two hex digits are one byte. For example, 0x12345678 consist of four bytes: 0x12, 0x34, 0x56, and 0x78. When we store it using little endian, we store the least significant byte (LSB) first in the file (or memory). For that reason, little endian can also be called LSB first.

This may seem counter-intuitive because our own writing system in the physical world is the opposite: big endian. Thus, if you write, “I have 57 pints of ice cream in my freezer,” 5 is the most significant digit of 57, and we write it first.

Little endian architectures include Intel x86 and x86-64 (used on desktop computers and most laptops), as well as most ARM-based CPUs used in phones and tablets. Notably, ecegrid (x86-64) uses little-endian.

Big endian architectures are harder to find, and mostly confined to very old architectures (e.g., IBM System/360, Motorola 68000-series) and relatively exotic server architectures that are based on those old ones (e.g., IBM z/Architecture, Freescale ColdFire).

Important: Endianness only pertains to the order in which the individual bytes comprising multi-byte numeric types (e.g., int, long, etc.) are stored. It does not affect the order of individual bits within a byte, or the elements in normal arrays.

See the Q&A for more.

Checking your understanding

To check your understanding of the file format, try asking yourself the following questions:

  1. How big is this BMP image (in terms of pixels)?
  2. How many bytes is the file?
  3. For the BMP header:
    1. How many bytes does it take up?
    2. Where in the diagram is it?
    3. What is its file offset (i.e., number of bytes from the start of the file)?
  4. Consider the pixel in the lower-left corner of the image.
    1. What color is it?
    2. Where is the data (3 bytes) in the diagram?
    3. How many bytes is that from the beginning of the image data (pixels)?
    4. How many bytes is that from the beginning of the file?
    5. What are its (x,y) coordinates? (Hint: See the small representation of the image in the lower-right corner of that diagram.)
  5. Consider the pixel in the upper-right corner, … and finally the lower-right corner. Ask yourself the same questions about each of those pixels.
  6. Find some "padding" in the diagram. How many bytes is it?
  7. How many bytes does a single row of pixels take up in this image (including the padding)?
  8. Will the number of padding bytes per row always be the same within a given image?
  9. … for all BMP images?
  10. For an arbitrary image that is w pixels wide, at what file offset will the ith row begin?

Handling run-time errors

In this assignment, you will need to handle run-time errors. These are not the same as bugs in your code. These are problems (or special conditions) that might arise due to the inputs from the caller of a function that you write. For example, a file may be inaccessible or corrupt, or malloc(…) may fail and return NULL.

For purposes of this assignment, the error-handling strategy will be two-pronged:

  1. Return a special value if the operation failed. For functions that return an address (e.g., read_bmp(…), which returns a BMPImage*), you will return NULL if the operation failed.
  2. Return an error message via pass-by-address. Normally, the caller will pass the address of a char*. If the operation is successful, the callee will do nothing with it. However, if there is a failure, the callee will set it to the address of an error message on the data segment.

You will do this for all functions in this assignment that take a parameter called a_error.

Here is a sketch of the basic pattern we are describing:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <assert.h>

bool do_something(int a, int b, char** a_error) {
  // ...
  bool success = (a < b);  // stand-in for the outcome of the real work

  if(success == false) {
    if(*a_error == NULL) {
      // String literal lives on the data segment; the caller must not free it.
      *a_error = "do_something(..) failed because success == false";
    }
    return false;
  }

  // ...

  return true;
}

int main(int argc, char* argv[]) {
  char* error = NULL;
  bool do_something_succeeded = do_something(10, 11, &error);
  if(! do_something_succeeded) {
    assert(error != NULL);           // check before using the message
    fprintf(stderr, "%s\n", error);  // never pass the message as the format string
    return EXIT_FAILURE;
  }
  else {
    assert(error == NULL);
  }
  return EXIT_SUCCESS;
}

The error message will be on the data segment (and/or as returned by strerror(errno)), not on the heap.
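For failures that come from the C library (e.g., a file that cannot be opened), the message can come from strerror(errno). Here is a minimal sketch; the function can_open_sketch is hypothetical and exists only to show the pattern, and it sets *a_error to NULL on success, which is one of the behaviors described in this assignment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical example function; the name and behavior are illustrative only.
// Returns true if the file at path could be opened for reading.
bool can_open_sketch(const char* path, const char** a_error) {
  assert(path != NULL && a_error != NULL);  // programmer errors, not run-time errors
  FILE* fp = fopen(path, "r");
  if(fp == NULL) {
    *a_error = strerror(errno);   // message owned by the C library; do not free
    return false;
  }
  fclose(fp);
  *a_error = NULL;                // no error
  return true;
}
```

Note that errno is read immediately after the failing call, before any other library call has a chance to change it.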

Warm-up exercises

This assignment includes a warm-up exercise to help you get ready. This accounts for 10% of your score for HW11. Scoring will be relatively light, but the usual base requirements apply.

  1. Read a text file
    Create a function that reads the contents of a file and returns it as a string on the heap. The caller is responsible for freeing that memory. Use fopen(…), fread(…), and fclose(…).
  2. Write a text file
    Create a function that writes the given content to a file at the specified path. Use fopen(…), fwrite(…), and fclose(…).
  3. Write a Point to a binary file
    Write a function that writes a single Point to a binary file at the specified path. Use fopen(…), fwrite(…), and fclose(…). For the Point type, please copy-paste the following struct type into your file:
    typedef struct { int x; int y; } Point;
    This will be a binary file. This is preparation for working with binary files for images, which work the same way. In general, you will use fwrite(…) for binary files.
  4. Read a Point from a binary file
    Create a function that reads a Point from the binary file at the specified path into a Point on the stack. Do not call malloc(…) or free(…) for this one. Use fopen(…), fread(…), and fclose(…).

The structure of the warmup.c file is described in the Requirements table below. You should write your own warmup.c. You may add helper functions, if you wish.
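The general technique behind exercises 3 and 4 is a round trip of a struct's raw bytes through a binary file. The sketch below illustrates that technique only; the function names save_point_sketch and load_point_sketch and their bool return types are hypothetical and do not match the signatures required for warmup.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct { int x; int y; } Point;

// Hypothetical helper: write one Point as raw bytes. Returns true on success.
bool save_point_sketch(const char* path, Point p) {
  assert(path != NULL);
  FILE* fp = fopen(path, "wb");
  if(fp == NULL) {
    return false;
  }
  bool ok = (fwrite(&p, sizeof(p), 1, fp) == 1);
  return (fclose(fp) == 0) && ok;   // fclose(…) can also report a write failure
}

// Hypothetical helper: read one Point into a caller-owned Point (no malloc).
bool load_point_sketch(const char* path, Point* a_point) {
  assert(path != NULL && a_point != NULL);
  FILE* fp = fopen(path, "rb");
  if(fp == NULL) {
    return false;
  }
  bool ok = (fread(a_point, sizeof(*a_point), 1, fp) == 1);
  fclose(fp);
  return ok;
}
```

The file produced this way contains exactly sizeof(Point) bytes and is not human-readable; viewing it with xxd shows the two integers in the host's byte order.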

Opt out.

In a hurry, and don't need the practice? This warm-up is here to help you learn what you need to succeed on the rest of this assignment, not to add additional work. Therefore, we give you an option. Those who feel that they do not need the practice may "opt out" by modifying warmup.c so that it does nothing but print the following message exactly and then exit:

I already know this and do not need to practice.

If you do that, then your score for HW11 will be based solely on the rest of this assignment. If you leave the warmup.c undone, if you do not turn in a warmup.c, or if the message it prints does not match perfectly, then you will receive 0 for the warmup portion of this assignment (10%).

bmp.h and test files

We have provided a bmp.h file and some test image files. You must use our bmp.h. To obtain these, run 264get hw11. If you wish to use other images, besides the ones provided, you may generate them from your test_bmp.c. (This is how our tester works, as well. It includes no BMP files. Instead, it generates temporary BMP files when it runs.) Do not submit any image files.

Requirements

In all functions that accept FILE* fp, you may assume that it is valid and open in the correct mode. However, you may not assume anything about the amount of data or the position of the file pointer. Also, you may not assume that every call to fread(…) or fwrite(…) will succeed.

In all functions that accept char** a_error, you must handle run-time errors as described in the section above (Handling run-time errors).

  1. Your submission must contain each of the following files, as specified:
  2. file contents
    bmp.c functions
    read_bmp(FILE* fp, const char** a_error)
    return type: BMPImage*
    Read a BMP image from an already open file.
    • Return the image as a BMPImage on the heap.
    • Use your check_bmp_header(…) to check the integrity of the file.
    • Handle run-time errors using the method described in requirement #4 (below).
      • In case of a run-time error, read_bmp(…) should return NULL.
    • Do not attempt to open or close the file. It is already open.
      • fopen(…) and fclose(…) are not allowed in bmp.c. (See the table of allowed functions/symbols below.)
    check_bmp_header(const BMPHeader* bmp_hdr, FILE* fp)
    return type: bool
    Test if the BMPHeader is consistent with itself and the already open image file.
    • Return true if and only if the given BMPHeader is valid.
    • A header is valid if ① its magic number is 0x4d42, ② image data begins immediately after the header data (header->offset == BMP_HEADER_SIZE), ③ the DIB header is the correct size (DIB_HEADER_SIZE), ④ there is only one image plane, ⑤ there is no compression (header->compression == 0), ⑥ num_colors and important_colors are both 0, ⑦ the image has 24 bits per pixel, ⑧ the size and image_size_bytes fields are correct in relation to the bits_per_pixel, width_px, and height_px fields and in relation to the file size.
    • Hints: Q28 and Q29 offer some optional suggestions on how you can structure this, and the relation to read_bmp(…)
    write_bmp(FILE* fp, BMPImage* image, const char** a_error)
    return type: bool
    Write an image to an already open file.
    • Return true if and only if the operation succeeded.
    • Handle run-time errors using the method described in requirement #4 (below).
    • Do not attempt to open the file. It is already open.
      • fopen(…) and fclose(…) are not allowed in bmp.c. (See the table of allowed functions/symbols below.)
    • Do not close the file.
    • You may assume the file is initially empty.
    free_bmp(BMPImage* image)
    return type: void
    Free all memory referred to by the given BMPImage.
    crop_bmp(const BMPImage* image, int x, int y, int w, int h, const char** a_error)
    return type: BMPImage*
    Create a new image containing the cropped portion of the given image.
    • x is the start index, from the left edge of the input image.
    • y is the start index, from the top edge of the input image.
    • w is the width of the new image.
    • h is the height of the new image.
    • You may assume the values of x, y, w, and h are in the right bounds.
    • This will create a new BMPImage, including a BMPHeader that reflects the width and height (and related fields) for the new image (w and h). Copy in pixel data from the original image into the new image data.
    • Handle run-time errors using the method described in requirement #4 (below).
      • Since crop_bmp(…) does not read or write any files, error handling will be simpler here.
    expected.txt test output
    1. If you are using miniunit.h (still optional, but highly recommended) you may simply redirect the output of your working tests to expected.txt to create this.
    test_bmp.c function
    main(…)
    return type: int
    Test all of the required functions in your bmp.c.
    • Using miniunit.h is strongly encouraged (but not required). Diff testing is not very convenient or practical with this assignment.
    • Running your main(…) with no command line arguments should cover all of your code in bmp.c, except for the parts that deal with allocation failures. (You do need to test other kinds of errors.)
    • 100% code coverage is required.
    • We should be able to compile and run your test_bmp.c with someone else's bmp.c.
    • Your test should return EXIT_SUCCESS and produce the output in expected.txt if and only if the implementation code (bmp.c) is correct.
    • If the bmp.c is incorrect, this should return EXIT_FAILURE and/or print output different from your expected.txt.
    • If you are using miniunit.h, your test need not print anything using printf(…). If you are using diff testing, your test should print some sort of human-readable output to stdout (i.e., using printf(…)) to help you determine if your code is working properly.
    • Tests may not depend on any image files other than what was provided with the starter. However, if your tests create any new image files, you may use those to test. In other words, you may dynamically generate any other image files you need, but do not include them as .bmp files with your submission.
    warmup.c functions
    main(int argc, char* argv[])
    This function is optional. If included, it will not be tested, but it must have a return EXIT_SUCCESS at the end to avoid problems with our tester.
    read_file(const char* path, const char** a_error)
    return type: char*
    Reads the contents of a file and returns it as a string on the heap.
    • The caller is responsible for freeing that memory.
    • Use fopen(…), fread(…), and fclose(…).
    write_file(const char* path, const char* contents, const char** a_error)
    return type: void
    Write the given content to a file at the specified path.
    • Use fopen(…), fwrite(…), and fclose(…).
    write_point(const char* path, Point p, const char** a_error)
    return type: void
    Writes a single Point to a file at the specified path.
    • Use fopen(…), fwrite(…), and fclose(…).
    • For the Point type, please copy-paste the following struct type into your file:
      typedef struct { int x; int y; } Point;
    • You may assume the file can be written to successfully (i.e., no error handling).
      • That means you can ignore a_error for write_point(…).
    read_point(const char* path, const char** a_error)
    return type: Point
    Read a Point from the file at the specified path into a Point on the stack.
    • No malloc(…) or free(…) are necessary for this one.
    • Use fopen(…), fread(…), and fclose(…).
    • You may assume the file can be read successfully (i.e., no error handling).
      • That means you can ignore a_error for read_point(…).
  3. Only the following externally defined functions and constants are allowed in your .c files.
    header | functions/symbols | allowed in…
    errno.h | errno | bmp.c, test_bmp.c, warmup.c
    assert.h | assert(…) | bmp.c, test_bmp.c, warmup.c
    string.h | strcat(…), strlen(…), strcpy(…), strcmp(…), strerror(…), strncpy(…), memcpy(…), memcmp(…) | bmp.c, test_bmp.c, warmup.c
    stdbool.h | true, false | bmp.c, test_bmp.c, warmup.c
    stdlib.h | malloc(…), free(…), NULL, EXIT_SUCCESS, EXIT_FAILURE | bmp.c, test_bmp.c, warmup.c
    stdio.h | clearerr(…), feof(…), ferror(…), fgetpos(…), FILE, fread(…), fseek(…), ftell(…), ftello(…), fwrite(…), fflush(…), EOF, SEEK_CUR, SEEK_SET, SEEK_END | bmp.c, test_bmp.c, warmup.c
    stdio.h | fopen(…), fclose(…), printf(…), fprintf(…), stdout | test_bmp.c, warmup.c
  4. For all functions that take a parameter called a_error:
    • In case of any error with the file operations (e.g., file not found, unreadable, etc.), set *a_error to the message string returned by strerror(errno).
    • In case of malformed data in the image header or pixel data, return a descriptive error message on the data segment. The exact text is up to you, but it should be descriptive. Do not simply return the same string for all errors (e.g., "ERROR").
    • In case of no error, set *a_error = NULL.
    • For operations that you do not know how to test, you may ensure 100% test coverage by using the ternary operator.
      Elephant* elephant = malloc(sizeof(*elephant)); *a_error = (elephant != NULL ? NULL : strerror(errno));
      There are no elephants in this assignment. This is just an illustration.
      Do not use this form for conditions that you do know how to test (e.g., malformed header, missing file, etc.). It is a workaround to trick gcov into thinking you have tested everything (since gcov goes by lines).
  5. Make no assumptions about the maximum image size.
  6. No function in bmp.c except for free_bmp(…) may modify its arguments or any memory referred to directly or indirectly by its arguments.
  7. Do not modify the bmp.h file.
  8. Do not commit any memory faults (e.g., memory leaks, buffer overflows, etc.), even in case of run-time errors. (This is always the expectation, whether explicitly stated in the requirements or not.)
  9. All files must be in one directory. Do not put images in subdirectories.
  10. Submissions must meet the code quality standards and the policies on homework and academic integrity.
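Condition ⑧ in the check_bmp_header(…) requirement (the size fields must agree with the image dimensions and with the actual file size) comes down to arithmetic. The following is a sketch of that arithmetic only, not a complete header check. The function name sizes_are_consistent is hypothetical, the literal 54 stands in for the header size, and file_size_bytes is assumed to have been measured by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: do the size fields agree with a 24-bit image of the
// given dimensions and with the actual file size? The first four parameters
// mirror BMPHeader fields; file_size_bytes is measured by the caller.
// Example: a 6x6 image has 20-byte rows, 120 bytes of data, and 174 bytes total.
bool sizes_are_consistent(int32_t width_px, int32_t height_px,
                          uint32_t image_size_bytes, uint32_t size,
                          long file_size_bytes) {
  if(width_px <= 0 || height_px <= 0) {
    return false;
  }
  // 64-bit arithmetic so that very large dimensions cannot overflow.
  uint64_t row_size            = ((uint64_t)width_px * 3 + 3) / 4 * 4;
  uint64_t expected_image_size = row_size * (uint64_t)height_px;
  uint64_t expected_file_size  = expected_image_size + 54;
  return image_size_bytes == expected_image_size
      && size == expected_file_size
      && file_size_bytes >= 0
      && (uint64_t)file_size_bytes == expected_file_size;
}
```

Doing the multiplication in a 64-bit type is one way to respect the requirement to make no assumptions about the maximum image size.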

Submit

To submit HW11, type 264submit HW11 bmp.c test_bmp.c expected.txt warmup.c miniunit.h clog.h from inside your hw11 directory. If your code does not depend on miniunit.h or clog.h, those may be omitted. Do not submit any image files.

Pre-tester

The pre-tester for HW11 has not yet been released. As soon as it is ready, this note will be changed.

Q&A

  1. Do I need to manually reverse the bytes when I read/write a BMP file?
    No, not on our platform.

    For this assignment you are copying bytes directly from a file to memory, and back. Luckily for you, the x86 and x64 architectures, which power most Linux and Windows computers, happen to use little-endian for storing numbers in memory. You may have noticed this when using the x/… command in gdb.

    In fact, the choice of little- or big-endian is actually a bit arbitrary, at least for fixed-width storage in memory or binary files.
  2. Are the bits reversed, too?
    How would you know? … Remember, it's all just bytes.
  3. Would a string representation of an integer (e.g., char* s = "abc") also be reversed in memory?
    No.

    Long version: A string is just an array of characters, and each character is really just a number. Endian-ness only affects how a single number is stored, not bigger structures such as arrays or structures. Also, since a char is only one byte on our platform (actually, all platforms by virtue of a special provision in the standard, but I digress...), endian-ness would not affect even an individual character because it only applies to the order of bytes within a number requiring multiple bytes to store in a file or memory.

    You can observe this directly from gdb.
  4. Why are the RGB color components written in blue-green-red order (instead of red-green-blue)?
    This also has to do with BMP being a little endian format.
  5. What are uint16_t and uint32_t?
    These are special types that have a guaranteed size of 2 and 4 bytes, respectively.
  6. Do I still need to use sizeof(…) when referring to the size of uint16_t and uint32_t?
    Yes.
  7. What is unsigned?
    The unsigned keyword is a part of some type names (e.g., unsigned int, unsigned char, etc.) and indicates that the type cannot be negative.
  8. Are there other types I should know about?
    “Should” is relative, but yeah, there are many other numeric types. Wikipedia has a decent list.
  9. Will all of this be on the exam?
    All of the concepts in this and the other assignments can be considered within scope. That includes this Q&A.
  10. What does “valid” mean for check_bmp_header(…)?
    You are checking that none of the information in the header contradicts itself or the contents of the file (esp. the file size).
  11. Is the gray_earth.bmp image valid?
    Yes and no. That file follows the BMPv5 format specification. For this assignment we are following the simpler BMPv3 format specification. You may want to ignore that file.
  12. Can we use assert(…) in our check_bmp_header(…)?
    Yes and no… but mostly no. You may use assert(…) wherever you like, but only for detecting errors in your code; it should never be used to check for errors in the inputs or anything else. In other words, it is not for run-time checking.

    Real-world software development operations typically use compiler features to effectively remove all assert(…) statements prior to shipping a product. Thus, you should use assert(…) only for things that need not be checked after the product is completed and deployed to users.
  13. Why do we need ftell(…)?
    It might help you get the file size. (We'll leave it up to you to discover how, and why that would be needed in the first place.)
  14. Why does check_bmp_header(…) take a FILE* fp as a parameter?
    You should use that to make sure the actual file size matches the information in the BMP header. See Q13.
  15. Why does fwrite(…) add a newline character (0x0a) to the end of my file?
    It doesn't. If you open the binary file in vim and then use :%!xxd you may see an extraneous 0x0a at the end. That's because vim (like many code editors) adds a newline to the end of a file, if there's not one there already. Some solutions:
    1. (BEST) Use xxd myfile.bmp | vim - from bash. (Don't forget the '-' at the end!)
    2. Ignore the 0x0a.
    3. Open the file with vim -b myfile.bmp and then use :%!xxd to convert to the hex dump. The -b tells vim to open it in binary mode, which (among other things) disables the newline at the end.
    4. From within vim, open the file with :tabe ++binary myfile.bmp and then use :%!xxd to convert to the hex dump. This also opens it in binary mode.
  16. How can read_bmp(…) return a descriptive error message when check_bmp_header(…) does not?
    You may want to use a helper function to do the checking for both. Your helper would be a lot like check_bmp_header(…) but return an error message (i.e., via a char** a_error). This isn't a requirement—just a suggestion. Use it only if you find it helpful.
  17. How are the pixels in a BMP numbered?
    For purposes of the BMP image format, the pixels start in the lower-left, and progress left-to-right then bottom-to-top. See the diagram for a more concrete example.

    For purposes of most image processing APIs and discussion, we generally designate (0,0) as the upper-left.
  18. What are some good helpers to use?
    It's up to you. Here are some ideas to get you thinking:
    1. long int _get_file_size(…)
    2. _Color _get_pixel(BMPImage* image, int x, int y)
      typedef struct { unsigned char r, g, b; } _Color;
    3. int _get_image_row_size_bytes(BMPHeader* bmp_hdr)
    4. int _get_image_size_bytes(BMPHeader* bmp_hdr)
    You may use/copy those if you find them helpful, but we make no guarantees as to whether they will work for you.
  19. How do I read an image?
    The structure of the BMP file format is given above under The BMP file format. In short, a BMP file consists of a header (BMPHeader struct object stored in a binary file) plus an array of unsigned char (one byte each). Each pixel is three bytes, with one for each of red, green, and blue. You won't need to worry about the individual colors for this assignment.
    An example is shown in Q1 in this Q&A.
    The following explanation omits details about error checking.
    Your read_bmp(…) will first read a BMPHeader from the file using fread(…). This is just like the warm-up.
    Based on the information in the BMPHeader object, you will know how many bytes are in the image pixel data. With that, you will read an array of unsigned char using fread(…).
    For read_bmp(…), this is all you need. You won't need to do much with the pixel data, except for crop_bmp(…). However, you do need to understand the fields in the image header in order to test it.
    The primary focus of this assignment is on testing. The reading and writing of images is not intended to be very challenging.
    Your write_bmp(…) will be very similar. It will write a BMPHeader object using fwrite(…) and then use fwrite(…) to write the pixel data.
  20. What should my error messages say?
    There is no standard or required text. Just make sure the error message describes the problem specifically in a way that would make sense to a user. “Error” would be a poor error message. Examples of better error messages include “Unable to write to file”, “Corrupt: invalid padding between rows”, and so forth. Do not simply set the error message to the same thing (e.g., ERROR) in all cases. Think about how you would like it if GCC just returned "ERROR" for all compiler errors, with no explanation.
  21. How do the pieces of this file relate to the image?
    Images are made up of small dots called "pixels". Images can be stored in a variety of ways. For this assignment, we focus on the BMP file format—and in particular, the 24-bit color format. In this format, each pixel consists of 3 bytes: red, green, and blue. This image data (pixels) is stored right after the header (54 bytes).
    The header is just a struct object containing many details about the image, such as the width, height, and file format. After the header, the pixels are laid out in a row in the file, starting with the lower-left pixel.
    Each row must be a multiple of 4 bytes. To ensure that is the case, 0 to 3 bytes of padding may be added to the end of each row.
    All of this is illustrated in the diagram (annotated xxd output) in Q1 above. Be sure you understand the diagram before you proceed.
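    The row layout above can be turned into a little arithmetic. The following is a minimal sketch for 24-bit images only; the function names are hypothetical, and sketch_pixel_offset assumes that y = 0 refers to the top row of the picture as displayed, which is stored last because the file starts with the lower-left pixel.

```c
#include <assert.h>
#include <stdint.h>

enum { BYTES_PER_PIXEL = 3 };  // 24-bit color: 3 bytes per pixel

// Number of padding bytes (0 to 3) at the end of each row.
int32_t sketch_row_padding(int32_t width_px) {
    assert(width_px > 0);
    int32_t used = width_px * BYTES_PER_PIXEL;
    return (4 - used % 4) % 4;
}

// Total bytes per row in the file, including padding (a multiple of 4).
int32_t sketch_row_size(int32_t width_px) {
    return width_px * BYTES_PER_PIXEL + sketch_row_padding(width_px);
}

// Byte offset of pixel (x, y) from the start of the pixel data.
// Assumes y = 0 is the top row on screen; rows are stored bottom-up.
int32_t sketch_pixel_offset(int32_t x, int32_t y, int32_t width_px, int32_t height_px) {
    assert(x >= 0 && x < width_px && y >= 0 && y < height_px);
    int32_t row_in_file = height_px - 1 - y;
    return row_in_file * sketch_row_size(width_px) + x * BYTES_PER_PIXEL;
}
```

    For example, an image 3 pixels wide uses 9 bytes per row for pixels, so each row carries 3 bytes of padding and occupies 12 bytes in the file.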
  22. What is the value of the padding bytes?
    The padding bytes are intentionally wasted space in the file. They are there to make the rows of pixel data line up. Their values won't be used, but for consistency, they must all be zero.
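    One way to verify this rule is to walk over the end of every row and look at the padding bytes. The sketch below assumes a 24-bit image whose pixel data is already in memory; the name sketch_padding_is_zero is hypothetical and is not part of the assignment's interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Returns true if every padding byte in the pixel data is zero.
// Assumes 24-bit color (3 bytes per pixel) and rows padded to a multiple of 4.
bool sketch_padding_is_zero(const unsigned char* data, int32_t width_px, int32_t height_px) {
    assert(data != NULL && width_px > 0 && height_px > 0);
    size_t used = (size_t)width_px * 3;   // bytes of real pixel data per row
    size_t padding = (4 - used % 4) % 4;  // 0 to 3 padding bytes per row
    size_t row_size = used + padding;
    for (int32_t row = 0; row < height_px; row++) {
        const unsigned char* pad_start = data + (size_t)row * row_size + used;
        for (size_t i = 0; i < padding; i++) {
            if (pad_start[i] != 0) {
                return false;
            }
        }
    }
    return true;
}
```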
  23. Is there any example code I can refer to?
    This code from Prof. Lu's book is similar in some respects to this assignment. There are many differences between that code and this assignment, but you may find it useful for understanding the high-level structure. In particular, note how the error checking code is designed to prevent memory leaks, even when reading corrupt BMP files. Do not copy that code (or any other code).
  24. Should my program terminate from inside read_bmp(…)?
    No. It should terminate only from the return statement at the end of your main(…).
  25. Should my main(…) take image file names from the command line (e.g., via argv)?
    No. Hard code the image filenames in your main(…) and include the image files with your submission.
  26. What if read_file(…) (or others) are called with NULL for fp?
    You may assume fp is not NULL.
  27. What unit are x, y, width, and height measured in?
    Pixels.
  28. Should read_bmp(…) call check_bmp_header(…)?
    You have freedom to design the implementations of these functions as you wish, but here is a hint. Since your read_bmp(…) needs to check the header and return an error message, you will probably want to make a helper function (_check_bmp_header(BMPHeader* bmp_hdr, FILE* fp, char** a_error)), which you would call from both read_bmp(…) and this function (check_bmp_header(…)).
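    The structure in that hint might look roughly like the sketch below. Only the name _check_bmp_header comes from the hint; the const qualifiers, the signature of check_bmp_header(…), the function sketch_read_header, and the message texts are all assumptions made for illustration. The helper here checks just two fields, so it is far from a complete header check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#pragma pack(push, 1)  // 1-byte alignment, so the struct is exactly 54 bytes
typedef struct {
    uint16_t type;
    uint32_t size;
    uint16_t reserved1;
    uint16_t reserved2;
    uint32_t offset;
    uint32_t dib_header_size;
    int32_t  width_px;
    int32_t  height_px;
    uint16_t num_planes;
    uint16_t bits_per_pixel;
    uint32_t compression;
    uint32_t image_size_bytes;
    int32_t  x_resolution_ppm;
    int32_t  y_resolution_ppm;
    uint32_t num_colors;
    uint32_t important_colors;
} BMPHeader;
#pragma pack(pop)

// Shared helper: on failure, stores a string literal in *a_error.
static bool _check_bmp_header(const BMPHeader* bmp_hdr, FILE* fp, const char** a_error) {
    assert(bmp_hdr != NULL && a_error != NULL);
    (void)fp;  // a fuller version would also compare against the real file size
    if (bmp_hdr->type != 0x4d42) {
        *a_error = "Corrupt: magic number is not 0x4d42";
        return false;
    }
    if (bmp_hdr->offset != sizeof(BMPHeader)) {
        *a_error = "Corrupt: image data offset is not 54";
        return false;
    }
    return true;
}

// Public checker (assumed signature): same checks, message discarded.
bool check_bmp_header(const BMPHeader* bmp_hdr, FILE* fp) {
    const char* ignored = NULL;
    return _check_bmp_header(bmp_hdr, fp, &ignored);
}

// Hypothetical fragment of a reader: same checks, message passed to the caller.
bool sketch_read_header(FILE* fp, BMPHeader* a_hdr, const char** a_error) {
    assert(fp != NULL && a_hdr != NULL && a_error != NULL);
    if (fread(a_hdr, sizeof(*a_hdr), 1, fp) != 1) {
        *a_error = "Unable to read the 54-byte header";
        return false;
    }
    return _check_bmp_header(a_hdr, fp, a_error);
}
```

    Keeping all of the checks in one helper means the two callers cannot drift apart: any rule added to the helper applies to both.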
  29. How can I make the error checking code tighter, so it doesn't obscure my program logic?
    This snippet illustrates a way to structure your test code for this (and other) functions without obscuring the code logic.
    That snippet assumes the error message is stored on the heap. For HW11, the error message must be stored on the data segment (and/or as returned by strerror(errno)), not on the heap. You do not have to use that snippet. If you choose to use it, it is your responsibility to ensure that your code meets the specification for this semester.
  30. What does criterion ⑧ mean, under check_bmp_header(…)?
    It says:
    the size and image_size_bytes fields are correct in relation to the bits, width, and height fields and in relation to the file size.
    In other words, you should be able to calculate the .size and .image_size_bytes fields from the .bits, .width, and .height fields. Also, you should be able to calculate the file size from those fields. These should agree.
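    The calculation could be sketched as follows, assuming an uncompressed image with a 54-byte header and no palette. All three function names are hypothetical. Measuring the file with fseek(…) and ftell(…) is one common approach on typical systems, though it is not the only way to find the file size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

// Expected value of image_size_bytes: padded row size times number of rows.
uint32_t sketch_expected_image_size(int32_t width_px, int32_t height_px, uint16_t bits_per_pixel) {
    assert(width_px > 0 && height_px > 0 && bits_per_pixel % 8 == 0);
    uint32_t bytes_per_pixel = bits_per_pixel / 8u;
    uint32_t used = (uint32_t)width_px * bytes_per_pixel;
    uint32_t row_size = (used + 3) / 4 * 4;  // round up to a multiple of 4
    return row_size * (uint32_t)height_px;
}

// Expected value of size: the 54-byte header plus the pixel data.
uint32_t sketch_expected_file_size(int32_t width_px, int32_t height_px, uint16_t bits_per_pixel) {
    return 54 + sketch_expected_image_size(width_px, height_px, bits_per_pixel);
}

// Actual size of the open file in bytes, or -1 on failure.
// The current file position is restored before returning.
long sketch_actual_file_size(FILE* fp) {
    assert(fp != NULL);
    long saved = ftell(fp);
    if (saved < 0 || fseek(fp, 0, SEEK_END) != 0) {
        return -1;
    }
    long size = ftell(fp);
    if (fseek(fp, saved, SEEK_SET) != 0) {
        return -1;
    }
    return size;
}
```

    For a 3 × 2 image at 24 bits per pixel, each row is 12 bytes, so image_size_bytes should be 24 and both size and the actual file size should be 78.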
  31. What if the width or height passed to crop_bmp(…) is out of bounds?
    You may ignore this case. We will not test with values of width or height that are out of bounds.

Updates

3/31/2019 Removed “preview” designation after making the following changes: ① added const where possible; ② specified that error messages are on the data segment (and/or as returned by strerror(…)); ③ clarified that miniunit is the best way to test (but still not required); ④ disallowed submitting your own .bmp files (to avoid clogging the server); ⑤ clarified that tests must provide 100% line coverage; and ⑥ provided a trick to allow you to achieve 100% code coverage for error conditions that you don't know how to test.
4/1/2019
  • Clarification: Error messages returned via a_error should be on the data segment (and/or as returned by strerror(errno)), not on the heap. (This was already specified in many locations and was mentioned in an email to the class, but there were a few errant mentions leftover from the “preview” version of the homework.)
  • Corrected minor typos: The word “error” had been inadvertently dropped from several sentences.
  • Readability: Removed some redundant text to make the requirements table more readable.
  • Correction: write_bmp(…, const char** a_error). The bmp.h was already correct.
  • Correction: Corrected a few field names in the explanation toward the top.
  • Clarification: read_bmp(…) should return NULL in case of a run-time error.
4/5/2019 OK to assume x, y, w, h passed to crop_bmp(…) are in the right bounds. Linked to snippet and edited Q29.
4/8/2019 Corrected typo in Q&A #28. This is very minor and does not affect the requirements in any way.
4/10/2019 Added memcmp(…)