Due 12/7

Parallel adaptive threshold

Goals

The goals of this assignment are as follows:

Learn to write parallel programs using multiple threads and the pthread library.
Increase your depth in image programming.

Overview

You will create binarize(…), a function that works with your code from hw11 to convert an image to monochrome (black & white) using the “adaptive threshold” method. As an intermediate step, you will also need to create to_gray(…), a function which converts a color image to grayscale by averaging the red, blue, and green components.

Adaptive Thresholding

Thresholding is a technique used to convert an image to monochrome (black & white). It is important for problems such as scanned document processing, OCR, image segmentation, and medical imaging. Pixels brighter than some threshold are set to white. The rest are set to black. The key question becomes this: What should the threshold be?

If the image is color, then it must first be converted to grayscale. For purposes of this assignment, we will use regular 24-bit images where the red, blue, and green components are the same. In the image below (taken with a smartphone), the image at left is actually color. Notice that there is a little tint from the lighting in the room. The image right is true grayscale.

color

gray

// Credit: Donald Knuth; image is a photo of The Art of Computer Programming: Fundamental Algorithms, volume 1, 2nd edition, p. 272; photo by Alex Quinn

In the simplest case, you could use 50% gray ( ) so that would be converted to . That method—often called fixed threshold because it uses the same threshold value for all pixels—generally works well for scanned documents, as long as the brightness and contrast were set properly when they were scanned. However, in cases of photographs taken in uneven lighting or certain medical applications, a fixed threshold provides poor results, as you see below.

gray	monochrome, fixed threshold

Even if we were to use 30% gray or 70% gray, we would still have similar problems. Because the lighting is different in different regions of the image, there is no one good threshold to use for every pixel. The solution is adaptive threshold. The basic idea is that for each pixel in an image, you may use a different threshold value to decide whether that pixel should be black or white.

gray	monochrome, adaptive threshold

There are many good ways to decide the threshold, none of them perfect. For this assignment, we will keep it simple:

For a neighborhood radius r, a pixel at (x₀, y₀) shall be white if and only if its intensity is greater than the average intensity of all pixels in the (2r + 1) × (2r + 1) neighborhood surrounding that pixel. That neighborhood shall consist of all pixels at (x_i, y_i), such that |x_i - x₀| ≤ r, including the current pixel being thresholded. Pixels that are not white shall be black. In calculating the threshold, do not include pixels that are beyond the boundaries of the image. This means that for such edge pixels, the neighborhood will be smaller.

Example: Suppose we binarize the following 6x6 image with radius=1.

250	120	0	20	240	100
220	110	10	30	230	130
40	45	50	55	60	65
70	75	80	85	90	95
155	145	135	125	115	105
165	175	185	195	205	215

The pixel at (x=2, y=3) currently has intensity value of 80. The neighborhood (radius=1) consists of pixels having intensity values 45, 50, 55, 75, 80, 85, 145, 135, and 125. The average of those is 88⅓. Since 80 ≤ 88⅓, the pixel at (x=2, y=3) shall be black.

Always calculate the threshold based on only the values in the input image.

(Optional) More information about adaptive threshold:
Adaptive Threshold (Prof. Robert Fisher, University of Edinburgh)

Calculating in parallel

Calculating adaptive threshold can be computationally intensive, especially on high resolution images. Therefore, ou will implement the adaptive threshold algorithm in parallel by using the pthread library. How can parallelism be implemented for this algorithm? Each pixel in the output image must have its threshold calculated and its intensity value set. Threads can be created such that each thread operates on a particular region of pixels in the image. How you choose to allocate pixels to a particular thread is up to you!

Your to_gray(…) and binarize(…) functions will create copies, which makes your life a lot easier than it would be if they were modifying the image in place.

See the man pages for pthread_create(…) and pthread_join(…) for details on their parameters and how to pass data to your worker function.

Warm-up exercises

The following warm-up exercises will help you prepare for this assignment. These will account for 20% of the points for this assignment, unless you opt out (see below). Handle run-time errors using the method described in hw11.

Zero an array
Create a function boolXzero＿array(int✶Xarray,XintXsize,XintXnum＿threads,Xchar✶✶Xerror) that sets every element of the given array to zero using num_threads threads.
Shift an array
Create a function boolXshift(int✶Xarray,XintXsize,XintXoffset,XintXnum＿threads,Xchar✶✶Xerror) that sets each element to the element offset positions ahead of it, wrapping around the end, as needed. In other words, an element at position i will be changed to the value that was at position i + offset % size prior to the operation. For example, if you passed in {1, 2, 3, 4, 5} and called shift(…, 5, 2, …, …), then after calling the function, the array would contain {3, 4, 5, 1, 2}. You may copy the array, if you like, but the function must not leak memory.

Note: Unlike your binarize(…), this warm-up modifies the array in place, and wraps around the edges.

These should be in a single file, called warmups.c. A main(…) is optional. If included, it will not be tested, but must have a return 0 at the end (to avoid problems with our tester). Warm-ups will be tested only very lightly. These are for your benefit.

Opt out

The opt-out procedure is the same as for hw11.

Test-driven development

Use test-driven development to do this assignment incrementally. Follow the same instructions as in hw11.

Run-time error handling

Throughout this assignment (including the warm-ups), all functions that accept char** error must handle run-time errors as described in the section titled Handling run-time errors in hw11.

In particular, be sure to handle the cases where pthread_create(…) or malloc(…) return NULL, and the case where binarize(…) (or median(…) for the bonus) receive an image that is not a 24-bit grayscale image (with red=green=blue).

You may assume the BMPImage is valid, as defined by check_bmp(…), and that it was opened using read_bmp(…).

Files

Here are the files you will create for this assignment.

file	contents
mtat.c	functions	`to＿gray(BMPImage✶Ximage,XintXnum＿threads,Xchar✶✶Xerror)` → return type: BMPImage✶ Convert the given image to grayscale (24-bit image with red=blue=green) by setting the red, blue, and green components of each pixel to the average of the three. Handle run-time errors using the method described in hw11. The input and output `BMPImage` will be a 24-bit BMP (v3). The output must have red=blue=green for all pixels.
mtat.c	functions	`binarize(BMPImage✶Ximage,XintXradius,XintXnum＿threads,Xchar✶✶Xerror)` → return type: BMPImage✶ Convert the given image to monochrome (black & white) using the adaptive threshold algorithm described above. Handle run-time errors using the method described in hw11. The input and output `BMPImage` will be a 24-bit BMP (v3) with red=blue=green. In the output, the intensity value of red, blue, and green must be either 0 (black) or 255 (white).
mtat.h	declarations	both functions from mtat.c
test_mtat.c	functions	`main(intXargc,Xchar✶Xargv[])` → return type: int Test your `binarize(…)`. This must be structured using the same format as hw11. Your `main(…)` must return EXIT_SUCCESS (0).
warmups.c	functions	as described in the warm-ups section

Include all BMP files that your test_mtat.c relies on. Images must be G-rated and non-copyright-infringing.

You do not need a test_mtat.txt.

Compiling

This assignment relies on your code from hw11 but they must be in separate files. Do not copy your hw11 code into mtat.c or mtat.h. Your code should work with any working hw11 code. Your mtat.c must #include both mtat.h and bmp.h.

To compile thread that uses the pthread library, you must include the -pthread flag.

The following command should work:

gcc -pthread -o mtat mtat.c bmp.c test_mtat.c

Requirements

~~Your submission must contain two files, mtat.c and warmups.c, having the functions indicated in the table above.~~
Please turn in as follows:
264submit hw12 warmups.c mtat.c mtat.h bmp.c bmp.h test_mtat.c *.bmp
If your test_mtat.c depends on any other files—unlikely but possible— be sure to include them.
Similarly to hw11…
1. Make no assumptions about the maximum image size.
2. No function in mtat.c may modify its arguments or any memory referred to directly or indirectly by its arguments.
3. Do not modify the mtat.h file.
4. All files must be in one directory. Do not put images in subdirectories.

Only the following external header files, functions, and symbols are allowed. All others are prohibited unless approved by the instructor. This does not apply to your bmp.c and bmp.h. "*" means you may use anything in that header file.

header	functions/symbols	allowed in…
assert.h	`assert(…)`	`mtat.c`, `test_mtat.c`
math.h	`*`	`mtat.c`, `test_mtat.c`
pthread.h	`*`	`mtat.c`, `test_mtat.c`
stdbool.h	`true`, `false`	`mtat.c`, `test_mtat.c`
stdlib.h	`malloc(…)`, `free(…)`, `qsort(…)`, `NULL`, `EXIT_SUCCESS`, `EXIT_FAILURE`	`mtat.c`, `test_mtat.c`
stdio.h	`FILE`	`mtat.c`, `test_mtat.c`
stdio.h	`printf(…)`, `fprintf(…)`, `stdout`	`test_mtat.c`
string.h	`strcat(…)`, `strlen(…)`, `strcpy(…)`, `strcmp(…)`, `strerror(…)`, `memcpy(…)`	`mtat.c`, `test_mtat.c`

Feel free to ask if there is something you would like to use. For this assignment, we would expect most/all such requests to be approved, but please obtain approval before submitting anything using headers or symbols not listed above.

As usual…
1. Do not use dynamic arrays (C99 feature, e.g., char s[n];)
2. Code must successfully compile and run with your test_mtat.c and any other valid test file.
3. Code must meet all requirements in the Course Policies and Code Quality Standards.
  Note: If you choose to use assert(…), be sure you understand the specific guidelines that relate to it.
4. All code must be your own.
5. Your test code must be comprehensive (and original).

How much work is this?

This assignment does not require a lot of code. However, writing code that uses threads is different from most of what you have done in ECE 264. In particular, it can be challenging to debug. Give yourself time to understand multi-threaded programming.

Bonus #1: Median filter (2 bonus points)

Add a multi-threaded median filter using a very similar technique to the adaptive threshold. Do not use bubble sort or any other n² algorithm to calculate the median. You may however use qsort(…).

Your function signature will look like this:

BMPImage✶Xmedian(BMPImage✶Ximage,XintXradius,XintXnum＿threads,Xchar✶✶Xerror)

To indicate that you are attempting bonus #1, you must include the following at the top of your mtat.h file.

#define HW12_BONUS_1

Bonus #2: Constant time median filter (4 bonus points)

Two additional bonus points are possible for anybody implementing this constant time median filter algorithm described in this article by Simon Perreault & Patrick Hébert. Other very efficient algorithms may be considered for this, if you obtain approval from the instructor in advance of the submission deadline.

To indicate that you are attempting bonus #2, you must include the following at the top of your mtat.h file.

#define HW12_BONUS_2

You may receive credit for either bonus #1 or bonus #2 but not both. In addition, we may require an in-person walk-through of your work for either of these (and especially #2).

Bonus #3: Fastest adaptive threshold (▒ bonus points)

We are hoping to add one additional bonus, in which points would be given to the fastest implementation of the adaptive threshold in the class. This would be tested on large images, probably with a larger radius (e.g., maybe on a 10000x10000 image with radius=100 or something like that). However, we haven't yet figured out an efficient way to manage this. In the meantime, just be thinking about how you might do this more efficiently than someone else. We will let you know ≥2 days before the deadline if this materializes.

Q&A

How much code should I be writing for the zero_array(…) part of the warmup?
The core of it isn't much code. Here is is before adding in the error handling. (Do not copy.)

Adding in the error handling (as required) increases the amount of code. (Do not copy.)

Updates

12/3/2015: OK to use stdio.h in any file; median(…) takes a radius parameter

12/7/2015: OK to use qsort(…) in any file; update submission files; corrected compile command to include test_mtat.c

Advanced C Programming

Autumn 2015 :: ECE 264 :: Purdue University