Advanced C Programming

Spring 2024 ECE 26400 :: Purdue University

This is a PAST SEMESTER (Spring 2024).
Due 3/10

Split string 2

Learning goals

In this assignment, you will learn to program with linked lists, while also strengthening your skills with addresses, multi-level data structures, and heap memory. Specifically, you will…

  1. Write code that uses linked lists (a dynamic structure).
  2. Modify existing code that uses one data structure to use another data structure.
  3. Free a linked list.
  4. Write unit tests for code that uses linked lists and heap memory.

Overview

In HW09, you wrote code to split a string according to one or more delimiters. In this assignment, you will adapt that code to use linked lists instead of arrays.

Most of the specification for this assignment is the same as HW09. There are a few differences:

How much work is this?

The instructor's solution (split_string.c) is 55 sloc (source lines of code, excluding comments and blank lines). Relative to HW09, 12 sloc were removed and 26 sloc were added.

Getting Started

Note: There is no starter code for HW10.

Copy your HW09 code.

you@eceprog ~ $ cd 264
you@eceprog ~/264 $ cp -R HW09/ HW10
you@eceprog ~/264 $ cd HW10

Delete try_struct_type_definitions.c.

you@eceprog ~/264 $ rm try_struct_type_definitions.c

It is not relevant for this assignment.

Update your Makefile.

you@eceprog ~/264 $ vim Makefile

Change ASG_NICKNAME to HW10. Leave BASE_NAME as split_string.

ASG_NICKNAME = HW10
BASE_NAME = split_string

Comment out the body of (almost) every test function in test_split_string.c.

You can probably reuse most of your test code from HW10. However, it will require some changes. To reduce the number of compiler errors you have to contend with, you will comment out most of the old code from HW09 and then bring it back incrementally.

You can leave your test code for copy_substring(…) alone. That function will not change, and so the test code need not change, either.

Your test code will look a little like this. (Names and number of functions will vary.)

int _test_copy_substring() {
	mu_start();
	//────────────────────
	........................................
	........................................
	........................................
	//────────────────────
	mu_end();
}

int _test_free_strings() {
	mu_start();
	//────────────────────
	/*
	........................................
	........................................
	........................................
	*/
	//────────────────────
	mu_end();
}

int _test_find_substring() {
	mu_start();
	//────────────────────
	/*
	........................................
	........................................
	........................................
	*/
	//────────────────────
	mu_end();
}

int _test_split_string() {
	mu_start();
	//────────────────────
	/*
	........................................
	........................................
	........................................
	........................................
	*/
	//────────────────────
	mu_end();
}

Comment out find_substring(…) and split_string(…) in split_string.c.

Again, you can reuse most of your code. To minimize the compiler errors you get when you start converting your old code, you want to comment out most of it, and then bring it back gradually.

Leave your copy_substring(…) function alone (not commented out). It will not need to change.

Delete the body of your free_strings(…) function.

It will need to change complete. Leave the bodies of your other functions alone.

Your split_string.c will look like this.

char* copy_substring(char const* src_string, size_t substring_len) {
  ........................................
  ........................................
  ........................................
}

/*
struct FindResult find_substring(char const** a_pos, struct Strings needles) {
  ........................................
  ........................................
  ........................................
}

struct Strings split_string(char const* text, struct Strings delimiters) {
  ........................................
  ........................................
  ........................................
}
*/

void free_strings(struct Strings* a_strings) {
}

Modify all code to use the short (typedef) syntax for struct type names and definitions

This will mostly just mean replacing "struct " with "" everywhere.

In Vim, you can use this command. (Press 'y' to accept one chance, or 'a' to accept all changes.)

:%s/struct //gc

Remember to do this in both all code files: split_string.c, split_string.h, and test_split_string.c.

Add a function declaration and stub function definition for append_substring(…)

In split_string.h, add the following just below the last struct type definition (and above the // DO NOT MODIFY… comment):

void append_substring(char const* string, size_t max_string_len, Strings* a_strings);

In split_string.c, add the following (anywhere you like, within reason).

void append_substring(char const* string, size_t max_string_len, Strings* a_strings) {
}

This is called a stub. It allows the code to compile while you are working on adding functionality.

Be careful to copy those lines exactly.

Your split_string.c will now look like this.

char* copy_substring(char const* src_string, size_t substring_len) {
  ........................................
  ........................................
  ........................................
}

void append_substring(char const* string, size_t max_string_len, Strings* a_strings) {
}

/*
FindResult find_substring(char const** a_pos, Strings needles) {
  ........................................
  ........................................
  ........................................
}

Strings split_string(char const* text, Strings delimiters) {
  ........................................
  ........................................
  ........................................
}
*/

void free_strings(Strings* a_strings) {
}

Make sure the new code can compile.

Requirements

  1. Follow the following development steps, in order—and submit after every step. You may submit more often, but be sure to have at least one submission for each step below. You cannot go back. There will be a penalty if you do not follow this.
    1. Set up your project as described in Getting started (above). Modify split_string.h with the required struct type definitions (described in the table below). Make sure everything compiles and your tests for copy_substring(…) still pass. Submit
    2. Add or update a test for the append_substring(…) function to add a single string to an empty Strings object (i.e., a Strings object that refers to an empty list). Implement just enough to add a single node to an empty list. At this point, it is okay if your code has a memory leak.
      Do not implement the entire append function, yet.
      Submit
    3. Add or update a test for the append_substring(…) function to add a string to a non-empty Strings object (i.e., a Strings object that refers to a linked list with at least one node). Implement the rest of the append_substring(…) function, so that it can append to any Strings object (i.e., empty or non-empty). At this point, it is okay if your code has a memory leak. Submit
    4. Add or update a test for free_strings(…) to free a Strings object with one string (i.e., one node in the linked list). Implement just enough in split_string.c to make your new test pass, i.e., to free a Strings object referring to a list with one node.
      Do not implement free_strings(…) for lists with multiple strings, yet.
      Submit
    5. Add or update a test for find_substring(…) in test_split_string.c. Implement just enough in split_string.c to make that test pass. Submit
    6. Add or update a test for split_string(…) in test_split_string.c. Implement just enough in split_string.c to make that test pass. Submit
    You will need to add other stuff, too. For example, there is no step written above for freeing a list of size ≥ 2 but obviously, you will need that.
  2. Your submission must contain each of the following files, as specified:
    file contents
    split_string.h struct types
    Node
    Define a struct type called Node with two fields: .next (Node*) and .string (char*).
    Strings
    Define a struct type called struct Strings with two three fields: .strings (char**) .head (Node*), .tail (Node*), and .num_strings (size_t).
    FindResult
    Define a struct type called struct FindResult with two fields: .needle (char const*) and .idx (size_t).
    function declarations The bottom of the file contains declarations for the functions that are required for split_string.c. Do not modify those declarations (unless directed in writing by the instructor).
    split_string.c function definitions
    find substring(char const✶✶ a pos, Strings needles)
    return type: struct FindResult
    Find the first occurrence of any of the strings in needles within the string beginning at *a_pos.
    • a_pos is the address of a char* indicating where to begin searching. If we call the full string being searched the haystack, *a_pos could be the beginning of the haystack or some address in the middle—wherever we want to begin searching.
    • If at least one of the needles is found within the string at *a_pos
      • Return a struct FindResult object of which the .idx field indicates the index within the string at *a_pos where the first needle was found, and the .needle is one of the strings within needles.
      • Set *a_pos to the address of the character immediately after the first needle found.
      • In case an of the needles overlap, the one with the earlier index takes priority. For example, when searching "abcdefghi" for the needles {"cde", "cd", "hi"}, the .needle field of the returned object would be "cde" because it comes first in the array of needles.
      • find_substring(…) must not result in any calls to malloc(…).
    • If none of the needles is found…
      • Return a struct FindResult object of which the .idx field is 0 and the .needle field is NULL.
      • Do not modify *a_pos.
    • This is a support function. It is included in this assignment to guide you in implementing split_string(…), though it could certainly be useful on its own.
    • compared to HW09): requirements are the same. body will be a little different. With instructor's solution, 3 lines were removed and 3 lines added (i.e., same total length).
    copy substring(char const✶ src string, size t substring len)
    return type: char✶
    Create a string on the heap containing the first (up to) substring_len characters from src_string.
    • Each call to copy_substring(…) should result in one call to malloc(…).
    • Caller is responsible for freeing the heap memory buffer allocated by this function.
      • Do not call free(…) in this function.
    • Compared to HW09): No changes.
    append substring(char const✶ string, size t max string len, Strings✶ a strings)
    Append a new node to the linked list referred to by *a_strings. The .string field of the new node should refer to a newly allocated string on the heap containing the first (up to) max_string_len characters from string.
    • You can (and should) call copy_substring(…) from append_substring(…).
    • Calling append_substring(…) should result in two (2) calls to malloc(…).
    • Caller is responsible for freeing the heap memory buffer allocated by this function.
      • Do not call free(…) in this function.
    • Compared to HW09): This function is new. Body of instructor's solution is 10 sloc (source lines of code). Yours may be longer.
    free strings(Strings✶ a strings)
    Free all memory referred to (directly or indirectly) by *a_strings.
    • The purpose of free_strings(…) is to free the result that will be returned by split_string(…).
    • This must free each of the strings (i.e., (*a_strings).strings[0], etc.) as well as the outer array (i.e., (*a_strings).strings).
    • Calling free_strings(…) should not result in any calls to malloc(…)
    • Compared to HW09): Requirements are the same. Body will be completely different. Instructor's solution is 7 sloc (source lines of code). Yours may be longer.
    split string(char const✶ text, Strings delimiters)
    return type: Strings
    Split the string text by each non-overlapping occurrence of any of the strings in delimiters.
    • Return a linked list of strings (as a Strings object), an array of strings (char*) on the heap, including the delimiters. Each string in the linked list this array (including the delimiters) should be newly created (malloc'd).
    • Each call to split_string(…) should result in 2n n + 1 calls to malloc, where n is the number of parts in the resulting array.
      • There will be one call to malloc(…) for the string and one call to malloc(…) for the linked list node.
    • The Strings object itself will be on the stack.
        You know it has to be on the stack because this returns a Strings object (not a Strings*).
    • In case a delimiter falls at the beginning or end of text, and/or consecutive occurrences of delimiters, the returned array will contain one or more empty strings.
    • Caller is responsible for freeing the heap memory buffer allocated by this function.
      • Do not call free(…) in this function.
    • Compared to HW09): Requirements are the same. Body will be a little different. With instructor's solution, 5 lines (out of 13) were removed, and 5 lines added (i.e., same total length).
    test_split_string.c functions
    main(int argc, char✶ argv[])
    return type: int
    Test the code in your split_string.c using your miniunit.h.
    • 100% code coverage is required for split_string.c.
     test ▒▒▒()
    return type: int
    • Use your mu_check_strings_equal(…) from HW06 to check that the code in your split_string.c is working correctly.
     test ▒▒▒()
    return type: int
     test ▒▒▒()
    return type: int
    miniunit.h macros Same as HW06. Okay to modify, if you wish.
    log_macros.h macros Same as HW05. Okay to modify, if you wish.
    Makefile macros Same as HW07. Okay to modify, if you wish.
  3. Do not use typecasts anywhere in HW10. The code quality standard says typecasts may not be used, except when they are truly necessary, safe, and you can articulate why. They are not necessary for anything in HW10.
  4. Only the following external header files, functions, and symbols are allowed in split_string.c.
    header functions/symbols allowed in…
    stdbool.h bool, true, false *
    stdio.h * test_split_string.c
    assert.h assert *
    stdlib.h EXIT_SUCCESS test_split_string.c
    stdlib.h malloc, free *
    string.h strlen, strcpy, strncpy, strcmp, strstr, strncmp, memcpy, memmove *
    All others are prohibited unless approved by the instructor. Feel free to ask if there is something you would like to use.
  5. Do not use strtok(…) (function from C standard libary).
  6. Submissions must meet the code quality standards and the policies on homework and academic integrity.

Submit

To submit HW10 from within your hw10 directory, type

make submit

That should result in the following command: 264submit HW10 split_string.c split_string.h test_split_string.c miniunit.h log_macros.h Makefile

Pre-tester

The pre-tester for HW10 has been released and is ready to use.

Q&A

Updates

3/9/2024
  • Requirements table was fixed to remove some lingering text from HW09 including type names and expected number of calls to malloc(…).