Advanced C Programming

Summer 2024 ECE 26400 :: Purdue University

Due 8/3

Sorting 2: generic data structures

Goals

The goals of this assignment are as follows:
  1. Practice writing code using void* addresses.
  2. Practice writing code using function addresses (function pointers).

Overview

In this assignment, you will refactor one of your sorting algorithms from HW14 to support any data type. Alternatively, you can implement a new sorting algorithm, such as an insertion sort or a selection sort.

There are no starter files.

About void* addresses

In C, void* represents an address of any type of data, for instance it could be the address of an int or a complex struct like BSTNode from HW14. Since we do not know the type of data in the address, we are disallowed performing operations such as dereferencing (the unary * operator), indexing (the [] operator), and performing any sort of address arithmetic (for instance, using +).

C allows you to freely assign from any type of address into a void*, and allows you to freely assign from a void* into any other type (with the exception of cases that would discard a const qualifier). As a result, it is entirely up to the programmer to ensure these assignments make sense; as otherwise the program may have inconsistent behavior which possibly leads to memory violations or segmentation faults.

These addresses are typically used for one of three reasons:

  1. The type of the data has not yet been decided (for example, in malloc(…)).
  2. The type of the data does not matter (for example, in printf(…) with the %p argument).
  3. Any type of data is supported (for example, in qsort(…)).

We have worked with the first two cases quite a bit on previous assignments, subtly taking advantage of the ability to assign from void* to another address type and back. The third case was just briefly seen on HW14, which we will explore more in depth on this assignment.

When writing functions that are designed to support any type of data, the inability to dereference or perform address arithmetic are most notable. In some cases, this requires passing in the size of the data referenced by the address to process, as seen in fread(…) and fwrite(…). In other cases, we can instead take advantage of function addresses to tell the function how to perform certain required actions.

Function addresses

Code that deals with generic data types often needs to pass functions as parameters. However, the declarations are UGLY.

We will start with a couple examples.

Example #1: Declare local variable of type “address to function taking parameter types and returning return type

Let's start with an example. In the snippet below, we have two functions, _print_square(…) and _print_cube(…). Instead of calling them directly, we will assign the address of each function to a local variable and then call them by way of that local variable, just to demonstrate the syntax.

#include <stdio.h>
#include <stdlib.h>

void _print_square(int n) {
    printf("%d² is %d\n", n, n*n);
}

void _print_cube(int n) {
    printf("%d³ is %d\n", n, n*n*n);
}

int main(int argc, char* argv[]) {

    // Declare variable «fn» of type "address to function taking an int and returning void"
    void (*fn)(int);

    // Initialize fn to the address of _print_square(…).  Normally we delcare and initialize
    // on the same line, but we have separated the two for the clarity of this explanation.
    fn = _print_square;

    // Call _print_square(…) by way of variable «fn».
    fn(7);

    // Now set «fn» to the address of _print_cube.
    fn = _print_cube;

    // Call _print_cube(…) by way of variable «fn».
    fn(7);

    return EXIT_SUCCESS;
}
/* vim: set tabstop=4 shiftwidth=4 fileencoding=utf-8 noexpandtab: */

The type of variable fn is void(*)(int). In words, we will say, “«fn» is the address to a function taking an int and returning void.”

Example #2: Declare parameter of type “address to function taking parameter types and returning return type

Instead of creating a local variable, we can also use that same type (void(*fn)(int)) as a parameter to a function. In the example below, we call a function _call_print_fn(…) which then calls another function, specified by the second argument.

#include <stdio.h>
#include <stdlib.h>

void _print_square(int n) {
    printf("%d² is %d\n", n, n*n);
}

void _print_cube(int n) {
    printf("%d³ is %d\n", n, n*n*n);
}

void _call_print_fn(int n, void (*print_fn)(int)) {
    print_fn(n);
}

int main(int argc, char* argv[]) {

    // Call _print_square(…) by way of the helper function.
    _call_print_fn(7, _print_square);

    // Call _print_cube(…) by way of the helper function.
    _call_print_fn(7, _print_cube);

    return EXIT_SUCCESS;
}
/* vim: set tabstop=4 shiftwidth=4 fileencoding=utf-8 noexpandtab: */

Like the previous example, the type of parameter print_fn is void(*)(int). In words, we will say, “«print_fn» is the address to a function taking an int and returning void.”

The warmup included in the assingment will give you practice working with function addresses to ensure you are ready to use them on the homework. It is worth of your grade on this homework.

Basic form for function address variables

The basic form for declaring a variable called Nameof type “address to function taking Parameter Types and returning Return Type” is:

Return Type(*Name)(Parameter Types)

Yes, it's UGLY. The name of that variable is in the middle!!!

Do not use parentheses when passing a function address or assigning it to a variable. In the code above, _call_print_fn(7, _print_cube(…)) will not work.

Passing a function address is totally different from invoking your mu_run(…) macro with a function (e.g., mu_run(_test_empty_list)). Although mu_run(…) looks like a function, it is really just transforming your code so that function can be called elsewhere and/or its name can be printed. In contrast, passing a function address is passing an actual address. It is a number that can be printed using printf(…), if you wish.

Generic linked lists

As part of this homework, your task is to create a linked list which supports any type of data, then implement a function sort_list(…) which can sort that linked list based on the passed compare function address. Your code should be tested on at least two different types of data, so you are confident it will work with any arbitrary type of data we will test it on. Upon declaring the head of the linked list, you will decide on the type of data in the list, then remain consistent with that type until the list is destroyed.

We will make use of three different function addresses formats to help work with the data, print, destroy, and compare. All of these functions will take in either void* or const void* parameters. The type of the data can be assumed inside of implementations of the function, as long as we are consistent in areas they are used. That is, if we are working with a linked list of strings, all our function address arguments should assume the void* refers to a char*; we should not mix char* and int* function addresses on a list of strings.

Print function

The print function void (*print_fn)(const void*) is passed into print_list(…) to tell it how to print the data in the address in the eacn node. Most implementations of print functions will simply call printf(…) with different format codes. Note that for a particular type of data, there may be multiple reasonable print functions; as long as the type is consistent with the type of data in the linked list this is valid (that is, don't try to print a linked list of integers using a print function for strings).

Example: Print string

Note this is assuming the address type in the node is char*, it is sometimes useful to consider nodes containing char** which would need a subtly different print function.

void _print_string(const void* a_value) {
  const char* str = a_value;
  fputc(str, stdout);
}

Example: Print float value

void _print_double_two_decimals(const void* a_value) {
  const double* a_num = a_value;
  printf("%.2f", *a_num);
}

Destroy function

The destroy function void (*destroy_fn)(void*) is passed into empty_list(…) to tell it how to free up any memory associated with the address in each node.

Example: Destroy heap data

The C standard library function free(…) matches this signature and can be used as a destroy function for heap allocated data.

Example: destroy non-heap data

Since a destroy function will be required to make use of empty_list(…), we will often make use of an empty destroy function for the sake of stack allocated values or values on the data segment.

void _destroy_nothing(void* a_value) {
  // no operation
}

Example: destroy a linked list

As an extreme example, we may decide to create a nested linked list, where the inner lists contain heap allocated strings. You probably will not need to test anything this complicated, but its provided as an example of destroy functions.

void _destroy_list_of_heap_allocated_data(void* a_value) {
  List* a_list = a_value;
  empty_list(a_list, free);
}

Compare functions

The compare function int (*compare_fn)(const void*, const void*) is passed into sort_list(…) to describe which of the two elements is bigger. It should return a positive number if the first parameter is bigger, or a negative number if the second parameter is bigger. If the two parameters are the same size, it should return 0.

A compare function can be used by comparing its return value to 0 with <, >= and alike:

void _compare_example(const void* a_first, const void* a_second,
                      int (*compare_fn)(const void*, const void*)) {
  // equivelent to first < second
  if (compare_fn(a_first, a_second) < 0) {
    // do stuff if first is less than second
  }
}

Example: compare strings

strcmp(…) is setup perfectly to use as a compare function; however for our usecase the parameter types are wrong so we will need to construct a function to call it:

int _compare_strings(const void* a_left, const void* a_right) {
  const char* left_str = a_left;
  const char* right_str = a_right;
  return strcmp(left_str, right_str);
}

Example: compare strings reversed

By negating the return value, we can sort the list in decending order instead of ascending.

int _compare_strings_reversed(const void* a_left, const void* a_right) {
  const char* left_str = a_left;
  const char* right_str = a_right;
  return -strcmp(left_str, right_str);
}

Example: compare strings by length

We are not required to sort strings alphabetically, any consistent method for comparison is valid as long as we follow the specification above. For instance we can sort by string length:

int _compare_strings_by_length(const void* a_left, const void* a_right) {
  const char* left_str = a_left;
  const char* right_str = a_right;
  return strlen(left_str) - strlen(right_str);
}

Testing

You will likely want to test each function with at least 2 types of data in the list to ensure your code is properly generic.

Integer stack allocated list

int _test_integer_append_size_3() {
  mu_start();
  int numbers[] = { 1, 6, 2 };
  int numbers_len = sizeof(numbers) / sizeof(*numbers);
  List list = { .head = NULL, .tail = NULL, .size = 0 };
  for (int i = 0; i < numbers_len; i++) {
    append(&list, numbers + i);
  }
  // test list structure
  mu_check(list.size == 3);
  mu_check(list.head != NULL);
  mu_check(list.head->next != NULL);
  mu_check(list.head->next->next != NULL);
  mu_check(list.head->next->next->next == NULL);
  mu_check(list.head->next->next == list.tail);
  // test list contents
  mu_check(list.head->a_value == numbers + 0);
  mu_check(list.head->next->a_value == numbers + 1);
  mu_check(list.head->->next->next->a_value == numbers + 2);
  // list is stack allocated, no data inside the node to free 
  // however, still need to free up the nodes
  empty_list(&list, _destroy_nothing);
  mu_end();
}

String heap allocated list

int _test_heap_string_sort_list_size_4() {
  mu_start();
  const char* strings[] = { "Grape", "Apple", "Cranberry", "Blueberry" };
  int strings_len = sizeof(strings) / sizeof(*strings);
  List list = { .head = NULL, .tail = NULL, .size = 0 };
  for (int i = 0; i < strings_len; i++) {
    // we have written a lot of examples of this helper before,
    // using it to ensure heap allocation
    append(&list, _copy_string(strings[i]));
  }
  sort_list(&list, _compare_strings);
  // print list for debug
  print_list(&list, _print_string);
  // test list structure
  mu_check(list.size == 4);
  mu_check(list.head != NULL);
  mu_check(list.head->next != NULL);
  mu_check(list.head->next->next != NULL);
  mu_check(list.head->next->next->next != NULL);
  mu_check(list.head->next->next->next->next == NULL);
  mu_check(list.head->next->next->next != list.tail);
  // test list contents
  mu_check_strings_equal(list.head->a_value, "Apple");
  mu_check_strings_equal(list.head->next->a_value, "Blueberry");
  mu_check_strings_equal(list.head->next->next->a_value, "Cranberry");
  mu_check_strings_equal(list.head->next->next->next->a_value, "Grape");
  // list is heap allocated, so both the contents and the nodes need to be freed
  empty_list(&list, free);
  mu_end();
}

Warm-up exercises

This assignment includes a warm-up exercise to help you get ready. This accounts for of your score for HW17. Scoring will be relatively light, but the usual base requirements apply.

  1. Integer print function
    Create a print function for printing integers in decimal from a void*. This should only require a single line of code.
  2. Integer compare function
    Create a compare function for integers. This should be similar to the string length comparison example above.

The structure of the warmup.c file is described in the Requirements table below. You should write your own warmup.c. You may add helper functions, if you wish.

Opt out.

In a hurry, and don't need the practice? This warm-up is here to help you learn what you need to succeed on the rest of this assignment—not to add additional work. Therefore, we give you an option. Those who feel that they do not need the practice may "opt out". by modifying warmup.c so that it does nothing but print the following message exactly and then exit:

I already know this and do not need to practice.

If you do that, then your score for HW17 will be based solely on the rest of this assignment. If you leave the warmup.c undone, if you do not turn in a warmup.c, or if the message it prints does not match perfectly, then you will receive 0 for the warmup portion of this assignment ().

TDD

As always, you must use TDD, but this time, you choose your own steps.

The requirements are as follows:

You must have at least 4 submissions, where each step…

Q: What does “significant amount of code” mean?
A: Your effort should be spread out in a way that reflects that you are doing TDD.

Be reasonable. In other words, do not circumvent the purpose. In case of any regrade requests, we will evaluate whether your process demonstrated a good faith effort to follow TDD.

Requirements

  1. Your submission must contain each of the following files, as specified:
  2. file contents
    warmup.c function definitions
    main(int argc, char✶ argv[])
    return type: int
    Do not modify main(…) unless you are opting out.

    If you choose to opt out, it should contain the following. (You may copy this.)

    printf("I already know this and do not need to practice.");
        return EXIT_SUCCESS;
     print integer(const void✶ a value)
    return type: void
     compare integers(const void✶ a first, const void✶ a second)
    return type: int
    sorts.h type definitions
    List
    struct type with 3 fields: head (ListNode*), tail (ListNode*) and size (int)
    • Whenever size==0, head and tail must be NULL.
    ListNode
    struct type with 2 fields: a_value(void*), next (ListNode*)
    function declarations one for each required function in your sorts.c
    • Do not include helpers (if any) here. Also do not include any structs used as part of a helper.
    sorts.c function definitions
    append(List✶ a list, void✶ a value)
    return type: void
    Appends a new node to a_list containing a_value.
    • Create a new node to contain a_value and place it on the list after tail.
    • Update size and tail to be consistent with the new list structure.
    • When a_list->size==0, a_list->head and a_list->tail will be NULL.
    • When a_list->size!=0, a_list->head and a_list->tail will NOT be NULL.
    • malloc(…) may be called exactly once in this function.
    • Do not attempt to sort values here, you lack the proper context.
    • Do not try to copy a_value, if the caller wishes a copy they will copy it themselves.
    sort list(List✶ a list, int (✶compare fn)(const void✶, const void✶))
    return type: void
    Sorts a_list using any algorithm of your choice. We recommend merge sort as you can reuse your code from HW14.
    • Use compare_fn to sort data in the list. Do not sort by the addresses a_value directly.
    • Heap allocation (i.e., calls to malloc(…)) is allowed, provided you free up any objects that are not returned as part of a_list.
    • This can be implemented without any additional malloc(…) calls, which will likely be the easiest approach.
    • If you decide to sort by allocating a second linked list, you are free to replace the list in a_list with entirely new nodes if you wish, provided the old list is freed. Make sure you don't accidently free up a_value (use an empty destroy function with empty_list(…)).
    • If you wish to use another type of data structure (e.g. a binary search tree), define any relevant structs in sorts.c.
    • Do not call compare_fn on the nodes, only on a_value.
    print list(const List✶ a list, void (✶print fn)(const void✶))
    return type: void
    Print the data contained in the linked list.
    • Call print_fn on a_value to print the data.
    • Separate elements with a single newline chracter. Do not rely on the print function including a newline.
    • Do not call print_fn on the nodes, only on a_value.
    empty list(List✶ a list, void (✶destroy fn)(void✶))
    return type: void
    Free all the nodes in the list, and call destroy_fn on all a_value in the nodes.
    • Set head and tail to NULL, and size to 0.
    • free(…) may be called from a total of one place in this function and any it depends on, excluding calls inside destroy_fn.
    • Do not call free(…) on a_value, only call free(…) on the nodes.
    • Do not call destroy_fn on the nodes, only call destroy_fn on a_value
    test_sorts.c function definitions
    main(int argc, char✶ argv[])
    return type: int
    Test your functions in sorts.c.
    • This must cause every line of code in your sorts.c to be executed (that is, 100% code coverage).
    • Every public function in sorts.c must be called directly from main(…) and/or from a helper within test_sorts.c.
    Functions to pass in as function addresses for calling methods in sorts.c.
    • These functions should be in test_sorts.c, not sorts.c.
    • Your code in test_sorts.c should not assume a particular address function or type of data will be used in any method.
  3. Use the typedef syntax to declare all struct types.
  4. Only the following externally defined functions and constants are allowed in your .c files. (You may put the corresponding #include <…> statements in the .c file or in your sorts.h, at your option.)
    header functions/symbols allowed in…
    stdbool.h bool, true, false warmup.c, sorts.c, sorts.h, test_sorts.c
    stdio.h printf, fprintf, stdout, FILE warmup.c, sorts.c, test_sorts.c
    stdlib.h malloc, free, NULL, EXIT_SUCCESS, EXIT_FAILURE warmup.c, sorts.c, test_sorts.c, sorts.h
    assert.h assert warmup.c, sorts.c, test_sorts.c
    string.h strcmp, strlen, strcpy test_sorts.c
    Feel free to suggest additional header files or functions that you would like to use.
  5. Submissions must meet the code quality standards and the policies on homework and academic integrity.

Submit

To submit HW17 from within your hw17 directory, type 264submit HW17 sorts.h sorts.c test_sorts.c miniunit.h log_macros.h Makefile

Pre-tester

The pre-tester for HW17 has been released and is ready to use.

Q&A

  1. What sorting algorithm should I use?
    We recommend merge sort simply because you already wrote that code on HW14 and it will be fairly easy to adapt. If you wish, you could use another method such as insertion sort or selection sort.
  2. Can I us a binary search tree sort?
    You may, just make sure you do not modify sorts.h to include any references to binary search tree functions or structs. Instead, either declare them at the top of sorts.c or define a new header file (e.g. bst_sort.h). Defining a new header will make it easier to independently test your binary search tree, just make sure you don't forget to submit it when you run 264submit.
  3. Can I use qsort(…)?
    No. sort_list(…) does not get passed the proper context to call qsort(…) over a_value, and attempting to qsort(…) the nodes directly will be very difficult.
  4. Is compare_fn the same as in qsort(…)?
    The compare function behaves nearly the same, but there is one subtle difference with how we are constructing our lists. In qsort(…), the compare function is always passed the address of an array element, meaning we have to dereference the address once to get the element being sorted. Our generic linked lists support storing address types directly in the list rather than nested inside an array, meaning the functions get passed the type directly.
    Concretely, this means in qsort(…), an array of strings will be passed char** for the compare function, while a linked list of strings will be passed char*. We would get the same behavior if you intentionally stored char** in a_value.
  5. Why aren't we sorting arrays directly? Why are we just sorting linked lists?
    Working with arrays generically leads to some very ugly syntax and some slightly hacky code. We did not feel requiring that added anything to the assignment so just stopped at sorting linked lists.
  6. Can I add helper functions to sorts.c?
    Yes. Make sure the names begin with "_".

Updates

7/29/2024
  • Created assignment
7/31/2024
  • Corrected return type of append in the requirements table.
  • If you already wrote your append function with a return, it should still work just fine in our tests as C allows you to ignore a return type when calling. Just make sure the a_list is updated as per the requirements.