Due 4/19

BMP image files

Goals

The goals of this assignment are as follows:

Learn how to program with binary files
Practice using structures

Overview

In this exercise, you will write code to read, write, and crop BMP image files.

The BMP file format

A BMP file has the following format:

Header	54 bytes
Palette (optional)	0 bytes (for 24-bit RGB images)
Image Data	file size - 54 (for 24-bit RGB images)

The header has 54 bytes, which are divided into the following fields. Note that the #pragma pack(1) directive selects 1-byte alignment, which ensures that the header structure really is 54 bytes long.

typedef struct {             // Total: 54 bytes
  uint16_t  type;             // Magic identifier: 0x4d42
  uint32_t  size;             // File size in bytes
  uint16_t  reserved1;        // Not used
  uint16_t  reserved2;        // Not used
  uint32_t  offset;           // Offset to image data in bytes from beginning of file (54 bytes)
  uint32_t  dib_header_size;  // DIB Header size in bytes (40 bytes)
  int32_t   width_px;         // Width of the image
  int32_t   height_px;        // Height of image
  uint16_t  num_planes;       // Number of color planes
  uint16_t  bits_per_pixel;   // Bits per pixel
  uint32_t  compression;      // Compression type
  uint32_t  image_size_bytes; // Image size in bytes
  int32_t   x_resolution_ppm; // Pixels per meter
  int32_t   y_resolution_ppm; // Pixels per meter
  uint32_t  num_colors;       // Number of colors  
  uint32_t  important_colors; // Important colors 
} BMPHeader;

Note that the number of bytes each field occupies can be obtained by dividing the number 16 or 32 by 8. For example, the field "type" occupies 2 bytes. These fields are all integers. A "uint" type is unsigned, and an "int" type is signed. For example, the fields width_px and height_px are signed integers. However, for simplicity, all the BMP files we have will contain only positive integers. You may assume that in your code. Also, we are dealing with the uncompressed BMP format (the compression field is 0).

Because of the packing specified in the bmp.h file, you should be able to use fread(…) to read in the first 54 bytes of a BMP file and store 54 bytes in a BMPHeader structure.
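As a rough illustration of why the packing matters, here is a minimal sketch. DemoHeader and demo_read_header(…) are hypothetical names used only for this example; in your own code you must use the BMPHeader from our bmp.h. The sketch checks at compile time that the packed struct is exactly 54 bytes, then reads one header from the start of an already open file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#pragma pack(push, 1)  // 1-byte alignment: no padding is inserted between fields
typedef struct {       // Stand-in for BMPHeader, repeated so this sketch compiles alone
  uint16_t type;
  uint32_t size;
  uint16_t reserved1;
  uint16_t reserved2;
  uint32_t offset;
  uint32_t dib_header_size;
  int32_t  width_px;
  int32_t  height_px;
  uint16_t num_planes;
  uint16_t bits_per_pixel;
  uint32_t compression;
  uint32_t image_size_bytes;
  int32_t  x_resolution_ppm;
  int32_t  y_resolution_ppm;
  uint32_t num_colors;
  uint32_t important_colors;
} DemoHeader;
#pragma pack(pop)

// Compile-time check that the packing worked (C11).
_Static_assert(sizeof(DemoHeader) == 54, "DemoHeader must be exactly 54 bytes");

// Hypothetical helper: read the first 54 bytes of fp into *a_hdr.
// Returns true only if a complete header was read.
bool demo_read_header(FILE* fp, DemoHeader* a_hdr) {
  assert(fp != NULL && a_hdr != NULL);
  if(fseek(fp, 0, SEEK_SET) != 0) {
    return false;
  }
  return fread(a_hdr, sizeof(*a_hdr), 1, fp) == 1;
}
```

Without the packing directive, most compilers would insert 2 bytes of padding after the type field, and a single fread(…) of the whole struct would no longer line up with the bytes in the file.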

Among all these fields in the BMPHeader structure, you have to pay attention to the following fields:

bits_per_pixel	number of bits per pixel
width_px	number of pixels per row
height_px	number of rows
size	file size
image_size_bytes	the size of image data (file size - size of header)

We will further explain bits_per_pixel, width_px, height_px, and image_size_bytes later. You should use the following structure to store a BMP file. The header field holds the first 54 bytes of the given BMP file, and data should point to a block of memory that is big enough (image_size_bytes bytes) to store the image data (the color information of each pixel).

typedef struct {
    BMPHeader header;
    unsigned char* data; 
} BMPImage;

Effectively, the BMPImage structure stores the entire BMP file.

Let's examine the fields bits_per_pixel, width_px, height_px, and image_size_bytes in greater detail. The bits_per_pixel field records the number of bits used to represent a pixel. For this assignment, we are dealing only with BMP files that have 24 bits per pixel. A 24-bit representation uses 8 bits (1 byte) for RED, 8 bits for GREEN, and 8 bits for BLUE.

The width_px field gives you the number of pixels per row. Therefore, the total number of bytes required to represent a row of pixels in a 24-bit representation is width_px * 3. However, the BMP format requires each row to be padded at the end so that each row occupies a multiple of 4 bytes. For example, if there is only one pixel in each row, we need 1 additional byte to pad a row. If there are two pixels per row, 2 additional bytes. If there are three pixels per row, 3 additional bytes. If there are four pixels per row, we don't have to perform padding. We require you to assign the value 0 to each of the padding bytes.
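The padding rule can be sketched as follows. The names demo_padding_bytes(…) and demo_row_size_bytes(…) are hypothetical, and the code assumes a 24-bit image (3 bytes per pixel) with a positive width.

```c
#include <assert.h>

// Hypothetical helper: number of padding bytes at the end of each row.
int demo_padding_bytes(int width_px) {
  assert(width_px > 0);
  int unpadded = width_px * 3;       // 3 bytes per pixel, no padding yet
  return (4 - unpadded % 4) % 4;     // 0..3 bytes to reach a multiple of 4
}

// Hypothetical helper: total bytes in one row, including padding.
int demo_row_size_bytes(int width_px) {
  return width_px * 3 + demo_padding_bytes(width_px);
}
```

For widths 1, 2, 3, and 4 this gives 1, 2, 3, and 0 padding bytes, matching the examples above. A row that is 6 pixels wide has 18 bytes of pixel data plus 2 bytes of padding, for 20 bytes in total.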

The height_px field gives you the number of rows. Row 0 is the bottom of the image. The file is organized such that the bottom row follows the header, and the top row is at the end of the file. Within each row, the leftmost pixel has the lowest index. Therefore, the first byte in data, i.e., data[0], belongs to the bottom-left pixel.

The image_size_bytes field is height_px * amount of data per row. Note that the amount of data per row includes the padding at the end of each row.
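Here is a small sketch of that arithmetic, assuming a 24-bit image with a 54-byte header and no palette. The function names are hypothetical and are not part of bmp.h.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: size of the pixel data, including row padding.
uint32_t demo_image_size_bytes(int32_t width_px, int32_t height_px) {
  assert(width_px > 0 && height_px > 0);
  // Round the unpadded row size (width * 3) up to the next multiple of 4.
  uint32_t row_size = ((uint32_t)width_px * 3 + 3) / 4 * 4;
  return row_size * (uint32_t)height_px;
}

// Hypothetical helper: expected size of the whole file.
uint32_t demo_file_size_bytes(int32_t width_px, int32_t height_px) {
  return 54 + demo_image_size_bytes(width_px, height_px);
}
```

For a 6x6 image, each row is 20 bytes, so the image data is 120 bytes and the file is 174 bytes.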

You can visualize the one-dimensional data as a three-dimensional array, which is organized as rows of pixels, with each pixel represented by 3 bytes of colors (24-bit representation). However, because of padding, you cannot easily typecast the one-dimensional data as a 3-dimensional array. Instead, you can first typecast it as a two dimensional array, rows of pixels. For each row of data, you can typecast it as a two-dimensional array, where the first dimension captures pixels from left to right, the second dimension is the color of each pixel (3 bytes or 2 bytes). You don't need to think of this in terms of multi-dimensional arrays.
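If you prefer plain index arithmetic over typecasting, the position of a pixel within data can be computed directly. This is only a sketch: demo_pixel_offset(…) is a hypothetical name, it assumes a 24-bit image, and it counts rows from the bottom of the image, as the file does.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: index into data of the first byte of the pixel in
// column x (0 = leftmost) of the given row (0 = bottom row of the image).
size_t demo_pixel_offset(int width_px, int x, int row_from_bottom) {
  assert(width_px > 0 && x >= 0 && x < width_px && row_from_bottom >= 0);
  size_t row_size = ((size_t)width_px * 3 + 3) / 4 * 4;  // includes padding
  return (size_t)row_from_bottom * row_size + (size_t)x * 3;
}
```

In a 6-pixel-wide image, the bottom-left pixel starts at data[0], the pixel to its right at data[3], and the leftmost pixel of the next row up at data[20], because the 2 padding bytes at the end of each row are skipped.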

The Wikipedia article on the BMP file format has a nice diagram and more complete details about this format.

Examining a BMP file directly

The best way to inspect binary data is with a hex dump. From bash, you can type xxd myimage.bmp. Since it will probably be long, you will want to view in vim. One way to do that is type xxd myimage.bmp | vim - from bash. Another way is to open the file in vim and then type :%!xxd. (Do not save!)

Suppose you have a tiny 6x6 BMP image named 6x6_24bit.bmp. (Yes, it really is only 6 pixels by 6 pixels. Don't worry. A larger version is included in one of the diagrams below.)

To get a hex dump right on the command line, you could type this at bash:

$ xxd 6x6_24bit.bmp

It will be more convenient to view in vim, so we type this from bash instead. (Don't forget the "-" at the end!)

$ xxd 6x6_24bit.bmp | vim -

Here is the hex dump, as you will see it. Don't worry if this looks cryptic. Read on and you will understand it completely.

You can break this apart using the BMP header specification (copied from above).

typedef struct {             // Total: 54 bytes
  uint16_t  type;             // Magic identifier: 0x4d42
  uint32_t  size;             // File size in bytes
  uint16_t  reserved1;        // Not used
  uint16_t  reserved2;        // Not used
  uint32_t  offset;           // Offset to image data in bytes from beginning of file (54 bytes)
  uint32_t  dib_header_size;  // DIB Header size in bytes (40 bytes)
  int32_t   width_px;         // Width of the image
  int32_t   height_px;        // Height of image
  uint16_t  num_planes;       // Number of color planes
  uint16_t  bits_per_pixel;   // Bits per pixel
  uint32_t  compression;      // Compression type
  uint32_t  image_size_bytes; // Image size in bytes
  int32_t   x_resolution_ppm; // Pixels per meter
  int32_t   y_resolution_ppm; // Pixels per meter
  uint32_t  num_colors;       // Number of colors  
  uint32_t  important_colors; // Important colors 
} BMPHeader;

Here is the same hex dump, this time with some annotations.

For this and other binary file formats, you can understand what value goes where by simply looking at the specification and a hex dump of the binary file.

Big-endian vs. little-endian

You may have noticed in the hex dump of the image that the file size (174) is represented as “ae00 0000” (instead of “0000 00ae”). This is because the BMP format is a little endian format. In short, that means the number 0x12345678 (305419896 in decimal notation) will be stored in memory as 0x78 0x56 0x34 0x12.

Remember that two hex digits are one byte. For example, 0x12345678 consists of four bytes: 0x12, 0x34, 0x56, and 0x78. When we store it using little endian, we store the least significant byte (LSB) first in the file (or memory). For that reason, little endian can also be called LSB first.
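To make the byte order concrete, here is a sketch that rebuilds a 32-bit value from four bytes stored LSB first. You should not need this in your own code on a little-endian machine, since fread(…) into a uint32_t already gives the right value there. The name demo_decode_le32(…) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: interpret 4 bytes, stored least significant byte
// first, as one unsigned 32-bit number. Works on any platform.
uint32_t demo_decode_le32(const unsigned char* bytes) {
  assert(bytes != NULL);
  return  (uint32_t)bytes[0]
       | ((uint32_t)bytes[1] << 8)
       | ((uint32_t)bytes[2] << 16)
       | ((uint32_t)bytes[3] << 24);
}
```

Given the bytes ae 00 00 00 from the hex dump, this returns 174 (0xae). Given 78 56 34 12, it returns 0x12345678.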

This may seem counter-intuitive because our own writing system in the physical world is the opposite: big endian. Thus, if you write, “I have 57 pints of ice cream in my freezer,” 5 is the most significant digit of 57, and we write it first.

Little endian architectures include Intel x86 and x86-64 (used on desktop computers and most laptops), as well as most ARM-based CPUs used in phones and tablets. Notably, ecegrid (x86-64) uses little-endian.

Big endian architectures are harder to find, and mostly confined to very old architectures (e.g., IBM System/360, Motorola 68000-series) and relatively exotic server architectures that are based on those old ones (e.g., IBM z/Architecture, Freescale ColdFire).

Important: Endianness only pertains to the order in which the individual bytes comprising multi-byte numeric types (e.g., int, long, etc.) are stored. It does not affect the order of individual bits within a byte, or the elements in normal arrays.

See the Q&A for more.

Checking your understanding

To check your understanding of the file format, try asking yourself the following questions:

How big is this BMP image (in terms of pixels)?
How many bytes is the file?
For the BMP header:

How many bytes does it take up?
Where in the diagram is it?
What is its file offset (i.e., number of bytes from the start of the file)?

Consider the pixel in the lower-left corner of the image.

What color is it?
Where is the data (3 bytes) in the diagram?
How many bytes is that from the beginning of the image data (pixels)?
How many bytes is that from the beginning of the file?
What are its (x,y) coordinates? (Hint: See the small representation of the image in the lower-right corner of that diagram.)

Consider the pixel in the upper-right corner. … and finally the lower-left corner. Ask yourself the same questions about each of those pixels.
Find some "padding" in the diagram. How many bytes is it?
How many bytes does a single row of pixels take up in this image (including the padding)?
Will the number of padding bytes per row always be the same within a given image?
… for all BMP images?
For an arbitrary image that is w pixels wide, at what file offset will the i^th row begin?

Handling run-time errors

In this assignment, you will need to handle run-time errors. These are not the same as bugs in your code. These are problems (or special conditions) that might arise due to the inputs from the caller of a function that you write. For example, a file may be inaccessible or corrupt, or malloc(…) may fail and return NULL.

For purposes of this assignment, the error-handling strategy will be two-pronged:

Return a special value if the operation failed. For functions that return an address (e.g., read_bmp(…)), you will return NULL if the operation failed.
Return an error message via pass-by-address. Normally, the caller will pass the address of a char*. If the operation is successful, the callee will do nothing with it. However, if there is a failure, the callee will set it to the address of an error message stored on the data segment.

You will do this for all functions in this assignment that take a parameter called a_error.

Here is a sketch of the basic pattern we are describing:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <assert.h>

bool do_something(int a, int b, const char** a_error) {
  // ...
  bool success = (a <= b);  // placeholder for the real work, for illustration only

  if(success == false) {
    if(*a_error == NULL) {
      *a_error = "do_something(..) failed because success == false";
    }
    return false;
  }

  // ...

  return true;
}

int main(int argc, char* argv[]) {
  const char* error = NULL;
  bool do_something_succeeded = do_something(10, 11, &error);  // pass the address of error
  if(! do_something_succeeded) {
    assert(error != NULL);           // a failure must always come with a message
    fprintf(stderr, "%s\n", error);  // never pass the message as the format string
    return EXIT_FAILURE;
  }
  else {
    assert(error == NULL);
  }
  return EXIT_SUCCESS;
}

Warm-up exercises

This assignment includes a warm-up exercise to help you get ready. This accounts for 10% of your score for HW16. Scoring will be relatively light, but the usual base requirements apply.

Read a text file
Create a function that reads the contents of a file and returns it as a string on the heap. The caller is responsible for freeing that memory. Use fopen(…), fread(…), and fclose(…).
Write a text file
Create a function that writes the given content to a file at the specified path. Use fopen(…), fwrite(…), and fclose(…).
Write a Point to a binary file
Write a function that writes a single Point to a binary file at the specified path. Use fopen(…), fwrite(…), and fclose(…). For the Point type, please copy-paste the following struct type into your file:
```
typedef struct { int x; int y; } Point;
```
This will be a binary file. This is preparation for working with binary files for images, which work the same way. In general, you will use fwrite(…) for binary files.
Read a Point from a binary file
Create a function that reads a Point from the binary file at the specified path into a Point on the stack. Do not call malloc(…) or free(…) for this one. Use fopen(…), fread(…), and fclose(…).

The structure of the warmup.c file is described in the Requirements table below. You should write your own warmup.c. You may add helper functions, if you wish.

Opt out.

In a hurry, and don't need the practice? This warm-up is here to help you learn what you need to succeed on the rest of this assignment, not to add additional work. Therefore, we give you an option. Those who feel that they do not need the practice may "opt out" by modifying warmup.c so that it does nothing but print the following message exactly and then exit:

I already know this and do not need to practice.

If you do that, then your score for HW16 will be based solely on the rest of this assignment. If you leave the warmup.c undone, if you do not turn in a warmup.c, or if the message it prints does not match perfectly, then you will receive 0 for the warmup portion of this assignment (10%).

bmp.h and test files

We have provided a bmp.h file and some test image files. You must use our bmp.h. To obtain these, run 264get hw16. If you wish to use other images, besides the ones provided, you may generate them from your test_bmp.c. (This is how our tester works, as well. It includes no BMP files. Instead, it generates temporary BMP files when it runs.) Do not submit any image files.

TDD

As always, you must use TDD, but this time, you choose your own steps.

The requirements are as follows:

You must have at least 6 submissions, where each step…

Adds a significant amount of code, relative to the previous step.
Passes your own tests.
Your tests have 100% line coverage, as reported by gcov. (Your make coverage rule should do this for you.)

Q: What does “significant amount of code” mean?
A: Your effort should be spread out in a way that reflects that you are doing TDD.

Be reasonable. In other words, do not circumvent the purpose. In case of any regrade requests, we will evaluate whether your process demonstrated a good faith effort to follow TDD.

Requirements

In all functions that accept FILE* fp, you may assume that it is valid and open in the correct mode. However, you may not assume anything about the amount of data or the position of the file pointer. Also, you may not assume that every call to fread(…) or fwrite(…) will succeed.

In all functions that accept char** a_error, you must handle run-time errors as described in the section above (Handling run-time errors).

Follow the TDD procedure described above.
Your submission must contain each of the following files, as specified:

file	contents
bmp.c	functions	`read_bmp(FILE* fp, const char** a_error)` → return type: BMPImage* Read a BMP image from an already open file. Return the image as a `BMPImage` on the heap. Use your `check_bmp_header(…)` (or whatever helper it calls, i.e., `_check_bmp_header(…)`) to check the integrity of the file. Handle run-time errors using the method described in requirement #3 (below). In case of a run-time error, `read_bmp(…)` should return `NULL`. Do not attempt to open or close the file. It is already open. `fopen(…)` and `fclose(…)` are not allowed in bmp.c. (See the table of allowed functions/symbols below.)
		`check_bmp_header(const BMPHeader* bmp_hdr, FILE* fp)` → return type: bool Test if the `BMPHeader` is consistent with itself and the already open image file. Return true if and only if the given `BMPHeader` is valid. A header is valid if ① its magic number is 0x4d42, ② image data begins immediately after the header data (`header->offset == BMP_HEADER_SIZE`), ③ the DIB header is the correct size (`DIB_HEADER_SIZE`), ④ there is only one image plane, ⑤ there is no compression (`header->compression == 0`), ⑥ `num_colors` and `important_colors` are both 0, ⑦ the image has 24 bits per pixel, ⑧ the `size` and `image_size_bytes` fields are correct in relation to the `bits_per_pixel`, `width_px`, and `height_px` fields and in relation to the file size. Hints: Q28 and Q29 offer some optional suggestions on how you can structure this, and the relation to `read_bmp(…)`.
		`write_bmp(FILE* fp, BMPImage* image, const char** a_error)` → return type: bool Write an image to an already open file. Return true if and only if the operation succeeded. Handle run-time errors using the method described in requirement #3 (below). Do not attempt to open the file. It is already open. `fopen(…)` and `fclose(…)` are not allowed in bmp.c. (See the table of allowed functions/symbols below.) Do not close the file. You may assume the file is initially empty.
		`free_bmp(BMPImage* image)` → return type: void Free all memory referred to by the given `BMPImage`.
		`crop_bmp(const BMPImage* image, int x, int y, int w, int h, const char** a_error)` → return type: BMPImage* Create a new image containing the cropped portion of the given image. `x` is the start index, from the left edge of the input image. `y` is the start index, from the top edge of the input image. `w` is the width of the new image. `h` is the height of the new image. You may assume the values of `x`, `y`, `w`, and `h` are in the right bounds. This will create a new `BMPImage`, including a `BMPHeader` that reflects the width and height (and related fields) for the new image (`w` and `h`). Copy in pixel data from the original image into the new image data. Handle run-time errors using the method described in requirement #3 (below). Since `crop_bmp(…)` does not read or write any files, error handling will be simpler here.
expected.txt	test output	If you are using miniunit.h (still optional, but highly recommended) you may simply redirect the output of your working tests to expected.txt to create this.
test_bmp.c	function	`main(…)` → return type: int Test all of the required functions in your bmp.c. Using miniunit.h is strongly encouraged (but not required). Diff testing is not very convenient or practical with this assignment. Running your `main(…)` with no command line arguments should cover all of your code in bmp.c, except for the parts that deal with allocation failures. (You do need to test other kinds of errors.) 100% code coverage is required. We should be able to compile and run your test_bmp.c with someone else's bmp.c. Your test should return `EXIT_SUCCESS` if and only if the implementation code (bmp.c) is correct. If the bmp.c is incorrect, this should return `EXIT_FAILURE` and/or print output different from your expected.txt. If you are using miniunit.h, your test need not print anything using `printf(…)`. If you are using diff testing, your test should print some sort of human-readable output to stdout (i.e., using `printf(…)`) to help you determine if your code is working properly. Tests may not depend on any image files other than what was provided with the starter. However, if your tests create any new image files, you may use those to test. In other words, you may dynamically generate any other image files you need, but do not include them as .bmp files with your submission.
warmup.c	functions	`main(int argc, char* argv[])` This function is optional. If included, it will not be tested, but it must have a `return EXIT_SUCCESS` at the end to avoid problems with our tester.
		`read_file(const char* path, const char** a_error)` → return type: char* Reads the contents of a file and returns it as a string on the heap. The caller is responsible for freeing that memory. Use `fopen(…)`, `fread(…)`, and `fclose(…)`.
		`write_file(const char* path, const char* contents, const char** a_error)` → return type: void Write the given content to a file at the specified path. Use `fopen(…)`, `fwrite(…)`, and `fclose(…)`.
		`write_point(const char* path, Point p, const char** a_error)` → return type: void Writes a single Point to a file at the specified path. Use `fopen(…)`, `fwrite(…)`, and `fclose(…)`. For the `Point` type, please copy-paste the following struct type into your file: `typedef struct { int x; int y; } Point;` You may assume the file can be written to successfully (i.e., no error handling). That means you can ignore `a_error` for `write_point(…)`.
		`read_point(const char* path, const char** a_error)` → return type: Point Read a `Point` from the file at the specified path into a `Point` on the stack. No `malloc(…)` or `free(…)` are necessary for this one. Use `fopen(…)`, `fread(…)`, and `fclose(…)`. You may assume the file can be read successfully (i.e., no error handling). That means you can ignore `a_error` for `read_point(…)`.

Only the following externally defined functions and constants are allowed in your .c files.

header	functions/symbols	allowed in…
errno.h	`errno`	`bmp.c`, `test_bmp.c`, `warmup.c`
assert.h	`assert(…)`	`bmp.c`, `test_bmp.c`, `warmup.c`
string.h	`strcat(…)`, `strlen(…)`, `strcpy(…)`, `strcmp(…)`, `strerror(…)`, `strncpy(…)`, `memcpy(…)`, `memcmp(…)`	`bmp.c`, `test_bmp.c`, `warmup.c`
stdbool.h	`true`, `false`	`bmp.c`, `test_bmp.c`, `warmup.c`
stdlib.h	`malloc(…)`, `free(…)`, `NULL`, `EXIT_SUCCESS`, `EXIT_FAILURE`	`bmp.c`, `test_bmp.c`, `warmup.c`
stdio.h	`clearerr(…)`, `feof(…)`, `ferror(…)`, `fgetpos(…)`, `FILE`, `fread(…)`, `fseek(…)`, `ftell(…)`, `ftello(…)`, `fwrite(…)`, `fflush(…)`, `EOF`, `SEEK_CUR`, `SEEK_SET`, `SEEK_END`	`bmp.c`, `test_bmp.c`, `warmup.c`
stdio.h	`fopen(…)`, `fclose(…)`, `printf(…)`, `fprintf(…)`, `stdout`	`test_bmp.c`, `warmup.c`

For all functions that take a parameter called a_error:
- In case of any error with the file operations (e.g., file not found, unreadable, etc.), set *a_error to the message string returned by strerror(errno).
- In case of malformed data in the image header or pixel data, return a descriptive error message on the data segment. The exact text is up to you, but it should be descriptive. Do not simply return the same string for all errors (e.g., "ERROR").
- In case of no error, set *a_error = NULL.
- For operations that you do not know how to test, you may ensure 100% test coverage by using the ternary operator.
  Elephant* elephant = malloc(sizeof(*elephant)); *a_error = (elephant != NULL ? NULL : strerror(errno));
  
  There are no elephants in this assignment. This is just an illustration.
  
  Do not use this form for conditions that you do know how to test (e.g., malformed header, missing file, etc.). It is a workaround to trick gcov into thinking you have tested everything (since gcov goes by lines).
Make no assumptions about the maximum image size.
No function in bmp.c except for free_bmp(…) may modify its arguments or any memory referred to directly or indirectly by its arguments.
Do not modify the bmp.h file.
Do not commit any memory faults (e.g., memory leaks, buffer overflows, etc.), even in case of run-time errors. (This is always the expectation, whether explicitly stated in the requirements or not.)
All files must be in one directory. Do not put images in subdirectories.
Submissions must meet the code quality standards and the policies on homework and academic integrity.
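The a_error conventions listed above can be sketched with a small example. The helper demo_open_for_reading(…) is hypothetical and is not one of the required functions. It uses fopen(…), so something like it could only appear in test_bmp.c or warmup.c, not in bmp.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: open the file at path for reading in binary mode.
// On failure, returns NULL and sets *a_error to the strerror(errno) message.
// On success, returns the open file and sets *a_error to NULL.
FILE* demo_open_for_reading(const char* path, const char** a_error) {
  assert(path != NULL && a_error != NULL);  // catches bugs in the caller's code
  FILE* fp = fopen(path, "rb");
  if(fp == NULL) {
    *a_error = strerror(errno);  // run-time error: report it, do not assert
  }
  else {
    *a_error = NULL;
  }
  return fp;
}
```

A missing file is a condition you know how to test, so this sketch uses an ordinary if/else rather than the ternary workaround.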

Submit

To submit HW16 from within your hw16 directory, type 264submit HW16 bmp.c test_bmp.c warmup.c miniunit.h clog.h

If your code does not depend on miniunit.h or clog.h, those may be omitted. Do not submit any image files.

In general, to submit any assignment for this course, you will use the following command:

264submit ASSIGNMENT FILES…

Submit often and early, even well before you are finished. Doing so creates a backup that you can retrieve in case of a problem (e.g., accidentally deleted your files).

To retrieve your most recent submission, type 264get --restore ASSIGNMENT (e.g., 264get --restore hw16).

To retrieve an earlier submission, first type 264get --list ASSIGNMENT to view your past submissions and find the timestamp of the one you want to retrieve. Then, type 264get --restore -t TIMESTAMP ASSIGNMENT.

Scores will be posted to the Scores page after the deadline for each assignment.

Pre-tester

The pre-tester for HW16 has been released and is ready to use.

Using the pretester

The pretester is a tool for checking your work after you believe you are done, and before we have scored it. It is not a substitute for your own checking, but it may help you avoid big surprises by letting you know if your checking was not adequate. To use the pre-tester, first submit your code. Then, type the following command. (Do this only after you have submitted, and only after you believe your submission is perfect.)

264test hw16

Do not ask TAs or instructors which tests you failed.

Keep in mind:

Pre-testing is intended only for those who believe they are done and believe their submission is perfect.
The pre-tester is not part of the requirements of this or any other assignment.
You are responsible for reading the assignment carefully, and ensuring that your code meets all requirements.
Feedback is limited, to ensure that everyone learns to test their own code.
If your code is failing some tests, review your tests and make sure they are comprehensive enough to catch any bugs (deviations from requirements). Follow the tips given by the pre-tester.
Code quality issues are not reported by the pre-tester; writing clean code is something you must learn to do from the start, not a clean-up step to do at the end.

Logistics:

If we discover that we have not checked some significant part of the assignment requirements, we may add additional tests at any time up to the point when scores are released.
The pre-tester will only be enabled after much of the class has submitted the assignment, and at least a few people have submitted perfect submissions. This is to allow us to test the pre-tester.
The pre-tester checks your most recent submission. You must submit first.
You may be limited to running the pre-tester ≤24 times in a 24-hour period. (This is not implemented yet but will be added.)

Q&A

Do I need to manually reverse the bytes when I read/write a BMP file?
No, not on our platform.

For this assignment you are copying bytes directly from a file to memory, and back. Luckily for you, the x86 and x64 architectures, which power most Linux and Windows computers, happen to use little-endian for storing numbers in memory. You may have noticed this when using the x/… command in gdb.

In fact, the choice of little- or big-endian is actually a bit arbitrary, at least for fixed-width storage in memory or binary files.
Are the bits reversed, too?
How would you know? … Remember, it's all just bytes.
Would a string representation of an integer (e.g., char* s = "abc") also be reversed in memory?
No.

Long version: A string is just an array of characters, and each character is really just a number. Endian-ness only affects how a single number is stored, not bigger structures such as arrays or structures. Also, since a char is only one byte on our platform (actually, all platforms by virtue of a special provision in the standard, but I digress...), endian-ness would not affect even an individual character because it only applies to the order of bytes within a number requiring multiple bytes to store in a file or memory.

You can observe this directly from gdb.
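If you would rather check this from C than from gdb, here is a minimal sketch. The helper demo_bytes_in_memory_order(…) is a hypothetical name. It copies the raw bytes of any object into an array so you can look at them one at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy n bytes starting at src into out, in the order
// in which they sit in memory.
void demo_bytes_in_memory_order(const void* src, unsigned char* out, size_t n) {
  assert(src != NULL && out != NULL);
  memcpy(out, src, n);
}
```

Calling it on the string "abc" with n == 4 gives 'a', 'b', 'c', '\0' on every platform, because a string is an array of one-byte elements. Calling it on a uint32_t that holds 0x12345678 gives 0x78 first on a little-endian machine and 0x12 first on a big-endian machine.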
Why are the RGB color components written in blue-green-red order (instead of red-green-blue)?
This also has to do with BMP being a little endian format.
What are uint16_t and uint32_t?
These are special types that have a guaranteed size of 2 and 4 bytes, respectively.
Do I still need to use sizeof(…) when referring to the size of uint16_t and uint32_t?
Yes.
What is unsigned?
The unsigned keyword is a part of some type names (e.g., unsigned int, unsigned char, etc.) and indicates that the type cannot be negative.
Are there other types I should know about?
“Should” is relative, but yeah, there are many other numeric types. Wikipedia has a decent list.
What does “valid” mean for check_bmp_header(…)?
You are checking that none of the information in the header contradicts itself or the contents of the file (esp. the file size).
Is the gray_earth.bmp image valid?
Yes and no. That file follows the BMPv5 format specification. For this assignment we are following the simpler BMPv3 format specification. You may want to ignore that file.
Can we use assert(…) in our check_bmp_header(…)?
Yes and no… but mostly no. You may use assert(…) wherever you like, but only for detecting errors in your code; it should never be used to check for errors in the inputs or anything else. In other words, it is not for run-time checking.

Real-world software development operations typically use compiler features to effectively remove all assert(…) statements prior to shipping a product. Thus, you should use assert(…) only for things that need not be checked after the product is completed and deployed to users.
Why do we need ftell(…)?
It might help you get the file size. (We'll leave it up to you to discover how, and why that would be needed in the first place.)
Why does check_bmp_header(…) take a FILE* fp as a parameter?
You should use that to make sure the actual file size matches the information in the BMP header. See Q15.
Why does fwrite(…) add a newline character (0x0a) to the end of my file?
It doesn't. If you open the binary file in vim and then use :%!xxd you may see an extraneous 0x0a at the end. That's because vim (like many code editors) adds a newline to the end of a file, if there's not one there already. Some solutions:
1. (BEST) Use xxd myfile.bmp | vim - from bash. (Don't forget the '-' at the end!)
2. Ignore the 0x0a.
3. Open the file with vim -b myfile.bmp and then use :%!xxd to convert to the hex dump. The -b tells vim to open it in binary mode, which (among other things) disables the newline at the end.
4. From within vim, open the file with :tabe ++binary myfile.bmp and then use :%!xxd to convert to the hex dump. This also opens it in binary mode.
How can read_bmp(…) return a descriptive error message when check_bmp_header(…) does not?
You may want to use a helper function to do the checking for both. Your helper would be a lot like check_bmp_header(…) but return an error message (i.e., via a char** a_error). This isn't a requirement—just a suggestion. Use it only if you find it helpful.
How are the pixels in a BMP numbered?
For purposes of the BMP image format, the pixels start in the lower-left, and progress left-to-right then bottom-to-top. See the diagram for a more concrete example.

For purposes of most image processing APIs and discussion, we generally designate (0,0) as the upper-left.
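Converting between the two conventions is a one-line calculation. The sketch below uses a hypothetical helper name and assumes y is counted from the top edge, with 0 as the top row.

```c
#include <assert.h>

// Hypothetical helper: convert a row index counted from the top of the
// image (0 = top row) to the row index used in the BMP data (0 = bottom row).
int demo_row_from_bottom(int height_px, int y_from_top) {
  assert(height_px > 0 && y_from_top >= 0 && y_from_top < height_px);
  return height_px - 1 - y_from_top;
}
```

In a 6-pixel-tall image, the top row (y == 0) is row 5 of the BMP data, and the bottom row (y == 5) is row 0.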
What are some good helpers to use?
It's up to you. Here are some ideas to get you thinking:
1. long int _get_file_size(…)
2. _Color _get_pixel(BMPImage* image, int x, int y)
  typedef struct { unsigned char r, g, b; } _Color;
3. int _get_image_row_size_bytes(BMPHeader* bmp_hdr)
4. int _get_image_size_bytes(BMPHeader* bmp_hdr)
You may use/copy those if you find them helpful, but we make no guarantees as to whether they will work for you.
How do I read an image?
The structure of the BMP file format is given above under The BMP file format. In short, a BMP file consists of a header (BMPHeader struct object stored in a binary file) plus an array of unsigned char (one byte each). Each pixel is three bytes, with one for each of red, green, and blue. You won't need to worry about the individual colors for this assignment.
An example is shown in Q1 in this Q&A.

The following explanation omits details about error checking.

Your read_bmp(…) will first read a BMPHeader from the file using fread(…). This is just like the warm-up.

Based on the information in the BMPHeader object, you will know how many bytes are in the image pixel data. With that, you will read an array of unsigned char using fread(…).

For fread(…), this is all you need. You won't need to do much with the pixel data, except for crop_bmp(…). However, you do need to understand the fields in the image header in order to test it.

The primary focus of this assignment is on testing. The reading and writing of images is not intended to be very challenging.

Your write_bmp(…) will be very similar. It will write a BMPHeader object using fwrite(…) and then use fwrite(…) to write the pixel data.
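The two fread(…) steps described above can be sketched as follows. This is a minimal illustration, not a reference solution: the function name read_header_and_data is hypothetical, header validation is left out entirely, and the code assumes a little-endian machine so that the packed struct matches the byte layout of the file.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#pragma pack(push, 1)        // 1-byte alignment so the struct is exactly 54 bytes
typedef struct {
    uint16_t type;
    uint32_t size;
    uint16_t reserved1;
    uint16_t reserved2;
    uint32_t offset;
    uint32_t dib_header_size;
    int32_t  width_px;
    int32_t  height_px;
    uint16_t num_planes;
    uint16_t bits_per_pixel;
    uint32_t compression;
    uint32_t image_size_bytes;
    int32_t  x_resolution_ppm;
    int32_t  y_resolution_ppm;
    uint32_t num_colors;
    uint32_t important_colors;
} BMPHeader;
#pragma pack(pop)

// Hypothetical sketch: read the header into *a_header, then read the pixel
// data into a heap buffer. Returns the buffer (caller must free it) or NULL
// on failure. A real implementation would validate the header before
// trusting image_size_bytes.
unsigned char* read_header_and_data(FILE* fp, BMPHeader* a_header) {
    if (fread(a_header, sizeof(*a_header), 1, fp) != 1) {
        return NULL;  // file too short to contain a header
    }
    size_t num_bytes = a_header->image_size_bytes;
    if (num_bytes == 0) {
        return NULL;  // nothing to read
    }
    unsigned char* data = malloc(num_bytes);
    if (data == NULL) {
        return NULL;
    }
    if (fread(data, 1, num_bytes, fp) != num_bytes) {
        free(data);   // avoid leaking the buffer when the file is truncated
        return NULL;
    }
    return data;
}
```

Writing would mirror this with two fwrite(…) calls, one for the header and one for the pixel data.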
What should my error messages say?
There is no standard or required text. Just make sure the error message describes the problem specifically in a way that would make sense to a user. “Error” would be a poor error message. Examples of better error messages include “Unable to write to file”, “Corrupt: invalid padding between rows”, and so forth. Do not simply set the error message to the same thing (e.g., ERROR) in all cases. Think about how you would like it if GCC just returned "ERROR" for all compiler errors, with no explanation.
How do the pieces of this file relate to the image?

Images are made up of small dots called "pixels". Images can be stored in a variety of ways. For this assignment, we focus on the BMP file format—and in particular, the 24-bit color format. In this format, each pixel consists of 3 bytes: red, green, and blue. This image data (pixels) is stored right after the header (54 bytes).

The header is just a struct object containing many details about the image, such as the width, height, and file format. After the header, the pixels are laid out row by row in the file, starting with the lower-left pixel.

The size of each row in bytes must be a multiple of 4. To ensure that is the case, 0 to 3 bytes of padding may be added to the end of each row.

All of this is illustrated in the diagram (annotated xxd output) in Q1 above. Be sure you understand the diagram before you proceed.
What is the value of the padding bytes?
The padding bytes are intentionally wasted space in the file. They are there to make the rows of pixel data line up. The value won't be used, but for consistency, they must contain zero.
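The number of padding bytes follows directly from the width. The sketch below assumes 24-bit images (3 bytes per pixel); the function names are hypothetical.

```c
// Hypothetical helper: number of padding bytes (0 to 3) at the end of each
// row of a 24-bit image, so that the row size is a multiple of 4 bytes.
int padding_per_row(int width_px) {
    return (4 - (width_px * 3) % 4) % 4;
}

// Hypothetical helper: total bytes per row, including padding.
int row_size_bytes(int width_px) {
    return width_px * 3 + padding_per_row(width_px);
}
```

For example, a row of 6 pixels holds 18 bytes of color data, so it needs 2 padding bytes to reach 20.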
Is there any example code I can refer to?
This code from Prof. Lu's book is similar in some respects to this assignment. There are many differences between that code and this assignment, but you may find it useful for understanding the high-level structure. In particular, note how the error checking code is designed to prevent memory leaks, even when reading corrupt BMP files. Do not copy that code (or any other code).
Should my program terminate from inside read_bmp(…)?
No. It should terminate only from the return statement at the end of your main(…).
Should my main(…) take image file names from the command line (e.g., via argv)?
No. Hard code the image filenames in your main(…) and include the image files with your submission.
What if read_file(…) (or others) are called with NULL for fp?
You may assume fp is not NULL.
What unit are x, y, width, and height measured in?
Pixels.
Should read_bmp(…) call check_bmp_header(…)?
You have freedom to design the implementations of these functions as you wish, but here is a hint. Since your read_bmp(…) needs to check the header and return an error message, you will probably want to make a helper function (_check_bmp_header(BMPHeader* bmp_hdr, FILE* fp, char** a_error)), which you would call from both read_bmp(…) and this function (check_bmp_header(…)).
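One possible shape for such a shared helper is sketched below. Everything here is an assumption for illustration: MiniHeader is a stand-in holding only the fields this sketch inspects (it is not the real BMPHeader), the function names are hypothetical, and only three of the checks are shown. The messages are string literals, so they are not stored on the heap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Stand-in for BMPHeader with only the fields checked in this sketch.
typedef struct {
    uint16_t type;
    uint16_t bits_per_pixel;
    uint32_t compression;
} MiniHeader;

// Hypothetical shared helper: returns true if the header passes the checks.
// If a_error is not NULL, *a_error is set to a descriptive message on
// failure, or to NULL on success.
bool check_header_with_message(const MiniHeader* hdr, const char** a_error) {
    const char* message = NULL;
    if (hdr->type != 0x4d42) {
        message = "Corrupt: magic number is not 0x4d42";
    } else if (hdr->bits_per_pixel != 24) {
        message = "Unsupported: image is not 24 bits per pixel";
    } else if (hdr->compression != 0) {
        message = "Unsupported: image is compressed";
    }
    if (a_error != NULL) {
        *a_error = message;
    }
    return message == NULL;
}

// A checker that reports only true/false can simply discard the message.
bool check_header_only(const MiniHeader* hdr) {
    return check_header_with_message(hdr, NULL);
}
```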
How can I make the error checking code tighter, so it doesn't obscure my program logic?

This snippet illustrates a way to structure your test code for this (and other) functions without obscuring the code logic.

⚠ That snippet assumes the error message is stored on the heap. For HW16, the error message must be stored on the data segment (and/or as returned by strerror(errno)), not on the heap. You do not have to use that snippet. If you choose to use it, it is your responsibility to ensure that your code meets the specification for this semester.
What does criterion ⑧ mean, under check_bmp_header(…)?
It says:
the size and image_size_bytes fields are correct in relation to the bits, width, and height fields and in relation to the file size.
In other words, you should be able to calculate the .size and .image_size_bytes fields from the .bits, .width, and .height fields. Also, you should be able to calculate the file size from those fields. These should agree.
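The relationship can be written out as a short calculation. The sketch below is one way to express it, assuming an uncompressed image with no palette, so that the pixel data begins at the header's offset field (54 bytes for this assignment). The function name and parameter list are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical check: do the size fields agree with the dimensions and with
// the actual file size? Rows are padded to a multiple of 4 bytes.
bool sizes_are_consistent(uint32_t size, uint32_t image_size_bytes,
                          uint32_t offset, int32_t width_px, int32_t height_px,
                          uint16_t bits_per_pixel, long file_size) {
    if (width_px <= 0 || height_px <= 0 || file_size < 0) {
        return false;
    }
    // Round the row's bit count up to a whole number of 32-bit words.
    uint64_t row_bytes = (((uint64_t)width_px * bits_per_pixel + 31) / 32) * 4;
    uint64_t expected_image = row_bytes * (uint64_t)height_px;
    uint64_t expected_file = (uint64_t)offset + expected_image;
    return image_size_bytes == expected_image
        && size == expected_file
        && (uint64_t)file_size == expected_file;
}
```

For 6x6_24bit.bmp, each row is 20 bytes, so the image data is 120 bytes and the file is 174 bytes (0xae).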
What if the width or height passed to crop_bmp(…) is out of bounds?
You may ignore this case. We will not test with values of width or height that are out of bounds.
How can I test that my code works for malformed (invalid) image files?
There is no required method of doing this. This assignment gives you some flexibility.
Yeah, that's lovely, but can you give me some ideas?

Fine.

One way is to use the xxd command. Check the man page.

you@eceprog ~/264/hw16 $man xxd

You will find that xxd can go in reverse. Thus, you can take the hex dump of the 6x6_24bit.bmp, modify it in Vim, and then convert the hex dump back to a binary.

you@eceprog ~/264/hw16 $xxd -g1 6x6_24bit.bmp 6x6_24bit.wrong_filename.txt

You can modify any byte you want, just by changing the hex values for the bytes. In this case, you could change ae to af to mess up the file size in the header.

you@eceprog ~/264/hw16 $vim 6x6_24bit.wrong_filename.txt

Convert the modified hex dump into a binary.

you@eceprog ~/264/hw16 $xxd -r 6x6_24bit.wrong_filename.txt 6x6_24bit.wrong_filename.bmp

Now, you have a BMP file with the wrong size indicated in the header.

But there are simpler ways. For example, you could read a good BMP using your code, modify some header values, and then write it to a new filename.

Even simpler yet, in your test code, you could read a good header (from 6x6_24bit.bmp), modify it in memory, and then call check_bmp_header(…) and/or read_bmp(…).

You may also find the truncate command useful. It can shorten or lengthen a file.

you@eceprog ~/264/hw16 $man truncate

And you can also edit a BMP in Vim directly. Use -b to avoid some ickiness. But this isn't a great option. The above ideas will serve you better.

you@eceprog ~/264/hw16 $vim -b 6x6_24bit.wrong_filename.bmp
What about the test images included in the starter?
Most of those are left over from the earlier version of this assignment that this is based on. We keep them in there to give you more examples, but practically speaking, you can do everything you need using just the 6x6_24bit.bmp and that will be a lot simpler, since you can see and understand the whole file in one screen-full.

Updates

4/18/2024 You do not need to check for fwrite(…) failure. We can't think of an easy way for you to test that such a check is working so we won't require that you check for that.

Advanced C Programming

Spring 2024 ECE 26400 :: Purdue University