Advanced C Programming

Spring 2024 ECE 26400 :: Purdue University

This is a PAST SEMESTER (Spring 2024).
Due 3/4

Split string

Learning goals

In this assignment, you will practice some of the same skills you learned in HW08. Specifically, you will:

  1. Use malloc(…) to allocate buffers on the heap.
  2. Write bug-resistant code that uses malloc(…).
  3. Debug common problems that occur when using dynamic memory.

In addition, you will learn how to:

  1. Define a struct type.
  2. Use struct objects on the stack and on the heap.
  3. Free memory referred to by a struct object.
  4. Free elements of an array.
  5. Use const to avoid inadvertently modifying data that was not supposed to change.

Overview

In HW08, you wrote code to join an array of strings with some delimiter between each string. This assignment is (mostly) the reverse operation, with a twist, which we will get to.

You will create a function split_string(…), which splits a string according to one or more delimiters. (That's the twist.)

Example: Splitting the string "ABC--DEF;GHI=JKL" by the delimiters "--", ";", and "=" would produce the array {"ABC", "--", "DEF", ";", "GHI", "=", "JKL"}.
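One property worth noticing in this example: because the delimiters are kept in the output, concatenating the parts in order reproduces the original string. The sketch below checks that property. The helper name parts_rebuild_text is hypothetical (it is not part of the assignment), and it is offered only as an idea for a test, not as required code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not part of the assignment): returns true if
// concatenating parts[0] .. parts[num_parts - 1] in order yields exactly text.
bool parts_rebuild_text(char const* const* parts, size_t num_parts, char const* text) {
    char const* pos = text;
    for (size_t i = 0; i < num_parts; i++) {
        size_t part_len = strlen(parts[i]);
        // strncmp stops at the terminating '\0', so this never reads past text.
        if (strncmp(pos, parts[i], part_len) != 0) {
            return false;
        }
        pos += part_len;
    }
    // All parts matched; text must also be fully consumed.
    return *pos == '\0';
}
```

A test could pass the array from the example, {"ABC", "--", "DEF", ";", "GHI", "=", "JKL"}, along with the original text and expect true. Dropping the last part should make the check fail.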

Why would we split by multiple delimiters?

Originally, the idea for this assignment was to have you take a paragraph of text as input, and split it into sentences (i.e., by '.', '?', and '!'), and then split each sentence into words. The first step of that would be to split the text by '.', '?', and '!'.

As we were writing the solution and description for that assignment, we started by writing up a support function, split_string(…), which would split the paragraph text by any delimiters you specify (i.e., '.', '?', and '!'). It was getting complicated when we got to how to deal with varying amounts of whitespace, as well as the possibility of multiple consecutive delimiters. Since split_string(…) is the logical reverse of join_strings(…), we decided to shorten the assignment to just this.

This isn't a crazy thing to do in any case. For example, Python allows you to split a string by a regular expression (pattern), which effectively allows you to split by multiple delimiters (ex: re.split(r'[.!?]', 'No! But why? Because.')). It can be useful for… yeah, you guessed it!… splitting a paragraph by sentences (among many other things).

Getting Started

Get the starter code

you@eceprog ~/264/ $ 264get hw09
This will write the following files:
1. hw09/split_string.h
1. hw09/try_struct_type_definitions.c
Ok? (y/n)y
2 files were written
you@eceprog ~/264/ $ cd hw09
you@eceprog ~/264/hw09 $

Copy your log_macros.h, miniunit.h, and Makefile from previous assignments.

you@eceprog ~/264/hw09 $ cp ../hw08/log_macros.h ./
you@eceprog ~/264/hw09 $ cp ../hw08/miniunit.h ./
you@eceprog ~/264/hw09 $ cp ../hw08/Makefile ./
you@eceprog ~/264/hw09 $

Update your Makefile

you@eceprog ~/264/hw09 $ vim Makefile

Change ASG_NICKNAME to HW09 and BASE_NAME to split_string.

ASG_NICKNAME = HW09
BASE_NAME = split_string

Make sure split_string.h will be submitted when you type make submit. Your SUBMIT_FILES variable might vary in how it uses the other variables. Just make sure that when all variables have been expanded, SUBMIT_FILES includes all of the files that must be submitted with this assignment.

If your Makefile looks like the one in the Q&A for HW08, you could probably just change

SUBMIT_FILES = $(ALL_C_FILES) $(MORE_H_FILES) Makefile

… to …

SUBMIT_FILES = $(ALL_C_FILES) $(ALL_H_FILES) Makefile

In previous assignments, submitting the main header file (e.g., split_string.h) was not required. This time, it is. Make sure your Makefile includes all required files when you submit (make submit)—including split_string.h.

Requirements

  1. Complete the following development steps in order, and submit after every step. You may submit more often, but be sure to have at least one submission for each step below. You cannot go back. There will be a penalty if you do not follow this process.
    1. Fill in the two struct type definitions in split_string.h. Check for errors by compiling and running try_struct_type_definitions.c. Submit
    2. Add a test for copy_substring(…) in test_split_string.c. Implement just enough in split_string.c to make that test pass. Submit
    3. Add a test for free_strings(…). It will need to create an array of strings on the heap. Feel free to reuse your code from your HW08 and/or the try_struct_type_definitions.c in the HW09 starter code. Implement just enough in split_string.c to make your new test pass. Submit
    4. Add a test for find_substring(…) in test_split_string.c. Implement just enough in split_string.c to make that test pass. Hint: strstr(…). Google it or man strstr. Submit
    5. Add a test for split_string(…) in test_split_string.c. Implement just enough in split_string.c to make that test pass. Submit
  2. Your submission must contain each of the following files, as specified:
    file contents
    split_string.h struct types
    struct Strings
    Define a struct type called struct Strings with two fields: .strings (char**) and .num_strings (size_t).
    struct FindResult
    Define a struct type called struct FindResult with two fields: .needle (char const*) and .idx (size_t).
    function declarations The bottom of the file contains declarations for the functions that are required for split_string.c. Do not modify those declarations (unless directed in writing by the instructor).
    split_string.c function definitions
    find_substring(char const** a_pos, struct Strings needles)
    return type: struct FindResult
    Find the first occurrence of any of the strings in needles within the string beginning at *a_pos.
    • a_pos is the address of a char* indicating where to begin searching. If we call the full string being searched the haystack, *a_pos could be the beginning of the haystack or some address in the middle—wherever we want to begin searching.
    • If at least one of the needles is found within the string at *a_pos
      • Return a struct FindResult object of which the .idx field indicates the index within the string at *a_pos where the first needle was found, and the .needle is one of the strings within needles.
      • Set *a_pos to the address of the character immediately after the first needle found.
      • If any of the needles overlap, the one with the earlier index takes priority. For example, when searching "abcdefghi" for the needles {"cde", "cd", "hi"}, the .needle field of the returned object would be "cde" because it comes first in the array of needles.
      • find_substring(…) must not result in any calls to malloc(…).
    • If none of the needles is found…
      • Return a struct FindResult object of which the .idx field is 0 and the .needle field is NULL.
      • Do not modify *a_pos.
    • This is a support function. It is included in this assignment to guide you in implementing split_string(…), though it could certainly be useful on its own.
    copy_substring(char const* src_string, size_t substring_len)
    return type: char*
    Create a string on the heap containing the first (up to) substring_len characters from src_string.
    • Each call to copy_substring(…) should result in one call to malloc(…).
    • Caller is responsible for freeing the heap memory buffer allocated by this function.
      • Do not call free(…) in this function.
    free_strings(struct Strings* a_strings)
    Free all memory referred to (directly or indirectly) by *a_strings.
    • The purpose of free_strings(…) is to free the result that will be returned by split_string(…).
    • This must free each of the strings (i.e., (*a_strings).strings[0], etc.) as well as the outer array (i.e., (*a_strings).strings).
    • Calling free_strings(…) should not result in any calls to malloc(…).
    split_string(char const* text, struct Strings delimiters)
    return type: struct Strings
    Split the string text by each non-overlapping occurrence of any of the strings in delimiters.
    • Return an array of strings (char*) on the heap, including the delimiters. Each string in this array (including the delimiters) should be newly created (malloc'd).
    • Each call to split_string(…) should result in n + 1 calls to malloc, where n is the number of parts in the resulting array. The +1 is for the array of strings (i.e., the outer array).
    • If a delimiter falls at the beginning or end of text, or delimiters occur consecutively, the returned array will contain one or more empty strings.
    • Caller is responsible for freeing the heap memory buffer allocated by this function.
      • Do not call free(…) in this function.
    test_split_string.c functions
    main(int argc, char* argv[])
    return type: int
    Test the code in your split_string.c using your miniunit.h.
    • 100% code coverage is required for split_string.c.
    test_▒▒▒()
    return type: int
    • Use your mu_check_strings_equal(…) from HW06 to check that the code in your split_string.c is working correctly.
    test_▒▒▒()
    return type: int
    test_▒▒▒()
    return type: int
    miniunit.h macros Same as HW06. Okay to modify, if you wish.
    log_macros.h macros Same as HW05. Okay to modify, if you wish.
    Makefile macros Same as HW07. Okay to modify, if you wish.
  3. Do not use typecasts anywhere in HW09. The code quality standard says typecasts may not be used, except when they are truly necessary, safe, and you can articulate why. They are not necessary for anything in HW09.
  4. Only the following external header files, functions, and symbols are allowed in split_string.c.
    header      functions/symbols                                                  allowed in…
    stdbool.h   bool, true, false                                                  *
    stdio.h     *                                                                  test_split_string.c
    assert.h    assert                                                             *
    stdlib.h    EXIT_SUCCESS                                                       test_split_string.c
    stdlib.h    malloc, free                                                       *
    string.h    strlen, strcpy, strncpy, strcmp, strstr, strncmp, memcpy, memmove  *
    All others are prohibited unless approved by the instructor. Feel free to ask if there is something you would like to use.
  5. Do not use strtok(…) (a function from the C standard library).
  6. Submissions must meet the code quality standards and the policies on homework and academic integrity.

Submit

To submit HW09 from within your hw09 directory, type

make submit

That should result in the following command: 264submit HW09 split_string.c test_split_string.c split_string.h miniunit.h log_macros.h Makefile

Pre-tester

The pre-tester for HW09 has been released and is ready to use.

Q&A

Updates

There are no updates to HW09 so far.