Split string 2
Learning goals
In this assignment, you will learn to program with linked lists, while also strengthening your skills with addresses, multi-level data structures, and heap memory. Specifically, you will…
- Write code that uses linked lists (a dynamic structure).
- Modify existing code that uses one data structure to use another data structure.
- Free a linked list.
- Write unit tests for code that uses linked lists and heap memory.
Overview
In HW09, you wrote code to split a string according to one or more delimiters. In this assignment, you will adapt that code to use linked lists instead of arrays.
Most of the specification for this assignment is the same as HW09. There are a few differences:
- Use the short (typedef) form of all struct type definitions and struct type names.
- Modify the
Strings
(fkastruct Strings
) type to store the strings in a linked list instead of an array. - Add a support function:
append_substring(…)
. It will append a string to the end of the linked list referenced by an existingStrings
object.
How much work is this?
The instructor's solution (split_string.c) is 55 sloc (source lines of code, excluding comments and blank lines). Relative to HW09, 12 sloc were removed and 26 sloc were added.
Getting Started
Note: There is no starter code for HW10.
Copy your HW09 code.
you@eceprog ~
$
cd 264
you@eceprog ~/264
$
cp -R HW09/ HW10
you@eceprog ~/264
$
cd HW10
Delete try_struct_type_definitions.c.
you@eceprog ~/264
$
rm try_struct_type_definitions.c
It is not relevant for this assignment.
Update your Makefile.
you@eceprog ~/264
$
vim Makefile
Change ASG_NICKNAME
to HW10
.
Leave BASE_NAME
as split_string
.
ASG_NICKNAME = HW10
BASE_NAME = split_string
…
Comment out the body of (almost) every test function in test_split_string.c.
You can probably reuse most of your test code from HW10. However, it will require some changes. To reduce the number of compiler errors you have to contend with, you will comment out most of the old code from HW09 and then bring it back incrementally.
You can leave your test code for
copy_substring(…)
alone. That function will not change,
and so the test code need not change, either.
Your test code will look a little like this. (Names and number of functions will vary.)
int _test_copy_substring() {
mu_start();
//────────────────────
........................................
........................................
........................................
//────────────────────
mu_end();
}
int _test_free_strings() {
mu_start();
//────────────────────
/*
........................................
........................................
........................................
*/
//────────────────────
mu_end();
}
int _test_find_substring() {
mu_start();
//────────────────────
/*
........................................
........................................
........................................
*/
//────────────────────
mu_end();
}
int _test_split_string() {
mu_start();
//────────────────────
/*
........................................
........................................
........................................
........................................
*/
//────────────────────
mu_end();
}
Comment out find_substring(…)
and split_string(…)
in split_string.c.
Again, you can reuse most of your code. To minimize the compiler errors you get when you start converting your old code, you want to comment out most of it, and then bring it back gradually.
Leave your copy_substring(…)
function alone (not commented out).
It will not need to change.
Delete the body of your free_strings(…)
function.
It will need to change complete. Leave the bodies of your other functions alone.
Your split_string.c will look like this.
char* copy_substring(char const* src_string, size_t substring_len) {
........................................
........................................
........................................
}
/*
struct FindResult find_substring(char const** a_pos, struct Strings needles) {
........................................
........................................
........................................
}
struct Strings split_string(char const* text, struct Strings delimiters) {
........................................
........................................
........................................
}
*/
void free_strings(struct Strings* a_strings) {
}
Modify all code to use the short (typedef) syntax for struct type names and definitions
This will mostly just mean replacing "struct "
with ""
everywhere.
In Vim, you can use this command. (Press 'y' to accept one chance, or 'a' to accept all changes.)
:%s/struct //gc
⚠ Remember to do this in both all code files: split_string.c, split_string.h, and test_split_string.c.
Add a function declaration and stub function definition for append_substring(…)
In split_string.h, add the following just below the last struct type definition
(and above the // DO NOT MODIFY…
comment):
void append_substring(char const* string, size_t max_string_len, Strings* a_strings);
In split_string.c, add the following (anywhere you like, within reason).
void append_substring(char const* string, size_t max_string_len, Strings* a_strings) {
}
This is called a stub. It allows the code to compile while you are working on adding functionality.
⚠ Be careful to copy those lines exactly.
Your split_string.c will now look like this.
char* copy_substring(char const* src_string, size_t substring_len) {
........................................
........................................
........................................
}
void append_substring(char const* string, size_t max_string_len, Strings* a_strings) {
}
/*
FindResult find_substring(char const** a_pos, Strings needles) {
........................................
........................................
........................................
}
Strings split_string(char const* text, Strings delimiters) {
........................................
........................................
........................................
}
*/
void free_strings(Strings* a_strings) {
}
Make sure the new code can compile.
Requirements
-
Follow the following development steps, in order—and submit after every step.
You may submit more often, but be sure to have at least one submission for each
step below. You cannot go back. There will be a penalty if you do not follow this.
-
Set up your project as described in Getting started (above).
Modify split_string.h with the required struct type definitions (described in the table below).
Make sure everything compiles and your tests for
copy_substring(…)
still pass. Submit -
Add or update a test for the
append_substring(…)
function to add a single string to an emptyStrings
object (i.e., aStrings
object that refers to an empty list). Implement just enough to add a single node to an empty list. At this point, it is okay if your code has a memory leak.⚠ Do not implement the entire append function, yet.Submit -
Add or update a test for the
append_substring(…)
function to add a string to a non-emptyStrings
object (i.e., aStrings
object that refers to a linked list with at least one node). Implement the rest of theappend_substring(…)
function, so that it can append to anyStrings
object (i.e., empty or non-empty). At this point, it is okay if your code has a memory leak. Submit -
Add or update a test for
free_strings(…)
to free aStrings
object with one string (i.e., one node in the linked list). Implement just enough in split_string.c to make your new test pass, i.e., to free aStrings
object referring to a list with one node.⚠ Do not implementSubmitfree_strings(…)
for lists with multiple strings, yet. -
Add or update a test for
find_substring(…)
in test_split_string.c. Implement just enough in split_string.c to make that test pass. Submit -
Add or update a test for
split_string(…)
in test_split_string.c. Implement just enough in split_string.c to make that test pass. Submit
You will need to add other stuff, too. For example, there is no step written above for freeing a list of size ≥ 2 but obviously, you will need that. -
Set up your project as described in Getting started (above).
Modify split_string.h with the required struct type definitions (described in the table below).
Make sure everything compiles and your tests for
- Your submission must contain each of the following files, as specified:
file contents split_string.h struct types NodeDefine a struct type calledNode
with two fields:.next
(Node*
) and.string
(char*
).StringsDefine a struct type called
withstructStringstwothree fields:.strings
(char**
).head
(Node*
),.tail
(Node*
), and.num_strings
(size_t
).FindResultDefine a struct type calledstruct FindResult
with two fields:.needle
(char const*
) and.idx
(size_t
).function declarations The bottom of the file contains declarations for the functions that are required for split_string.c. Do not modify those declarations (unless directed in writing by the instructor). split_string.c function definitions find substring(char const✶✶ a pos, Strings needles)
→ return type: struct FindResultFind the first occurrence of any of the strings in needles within the string beginning at*a_pos
.- a_pos is the address of a
char*
indicating where to begin searching. If we call the full string being searched the haystack,*a_pos
could be the beginning of the haystack or some address in the middle—wherever we want to begin searching. -
If at least one of the needles is found within the string
at
*a_pos
…- Return a
struct FindResult
object of which the.idx
field indicates the index within the string at*a_pos
where the first needle was found, and the.needle
is one of the strings within needles. - Set
*a_pos
to the address of the character immediately after the first needle found. - In case an of the needles overlap, the one with the earlier index takes
priority. For example, when searching
"abcdefghi"
for the needles{"cde", "cd", "hi"}
, the.needle
field of the returned object would be"cde"
because it comes first in the array of needles. find_substring(…)
must not result in any calls tomalloc(…)
.
- Return a
- If none of the needles is found…
-
Return a
struct FindResult
object of which the.idx
field is 0 and the.needle
field isNULL
. - Do not modify
*a_pos
.
-
Return a
-
This is a support function. It is included in this assignment
to guide you in implementing
split_string(…)
, though it could certainly be useful on its own. - ⇆ compared to HW09): requirements are the same. body will be a little different. With instructor's solution, 3 lines were removed and 3 lines added (i.e., same total length).
copy substring(char const✶ src string, size t substring len)
→ return type: char✶Create a string on the heap containing the first (up to) substring_len characters from src_string.- Each call to
copy_substring(…)
should result in one call tomalloc(…)
. - Caller is responsible for freeing the heap memory buffer allocated by this function.
- ⚠ Do not call
free(…)
in this function.
- ⚠ Do not call
- ⇆ Compared to HW09): No changes.
append substring(char const✶ string, size t max string len, Strings✶ a strings)
*a_strings
. The.string
field of the new node should refer to a newly allocated string on the heap containing the first (up to) max_string_len characters from string.-
You can (and should) call
copy_substring(…)
fromappend_substring(…)
. -
Calling
append_substring(…)
should result in two (2) calls tomalloc(…)
. - Caller is responsible for freeing the heap memory buffer allocated by this function.
- ⚠ Do not call
free(…)
in this function.
- ⚠ Do not call
- ⇆ Compared to HW09): This function is new. Body of instructor's solution is 10 sloc (source lines of code). Yours may be longer.
free strings(Strings✶ a strings)
*a_strings
.- The purpose of
free_strings(…)
is to free the result that will be returned bysplit_string(…)
. - This must free each of the strings
(i.e.,
(*a_strings).strings[0]
, etc.) as well as the outer array (i.e.,(*a_strings).strings
). - Calling
free_strings(…)
should not result in any calls tomalloc(…)
- ⇆ Compared to HW09): Requirements are the same. Body will be completely different. Instructor's solution is 7 sloc (source lines of code). Yours may be longer.
split string(char const✶ text, Strings delimiters)
→ return type: StringsSplit the stringtext
by each non-overlapping occurrence of any of the strings indelimiters
.- Return
a linked list of strings (as a
Strings
object),an array of strings (including the delimiters. Each string in the linked listchar*
) on the heap,this array(including the delimiters) should be newly created (malloc'd). - Each call to
split_string(…)
should result in 2nn + 1calls to malloc, where n is the number of parts in the resulting array.-
There will be one call to
malloc(…)
for the string and one call tomalloc(…)
for the linked list node.
-
There will be one call to
- The
Strings
object itself will be on the stack.-
You know it has to be on the stack because this returns a
Strings
object (not aStrings*
). - In case a delimiter falls at the beginning or end of
text
, and/or consecutive occurrences of delimiters, the returned array will contain one or more empty strings. - Caller is responsible for freeing the heap memory buffer allocated by this function.
- ⚠ Do not call
free(…)
in this function.
- ⚠ Do not call
- ⇆ Compared to HW09): Requirements are the same. Body will be a little different. With instructor's solution, 5 lines (out of 13) were removed, and 5 lines added (i.e., same total length).
test_split_string.c functions main(int argc, char✶ argv[])
→ return type: intTest the code in your split_string.c using your miniunit.h.- 100% code coverage is required for split_string.c.
test ▒▒▒()
→ return type: int- Use your
mu_check_strings_equal(…)
from HW06 to check that the code in your split_string.c is working correctly.
test ▒▒▒()
→ return type: inttest ▒▒▒()
→ return type: intminiunit.h macros Same as HW06. Okay to modify, if you wish. log_macros.h macros Same as HW05. Okay to modify, if you wish. Makefile macros Same as HW07. Okay to modify, if you wish. - a_pos is the address of a
- ⚠ Do not use typecasts anywhere in HW10. The code quality standard says typecasts may not be used, except when they are truly necessary, safe, and you can articulate why. They are not necessary for anything in HW10.
-
Only the following external header files, functions, and symbols are allowed in split_string.c.
header functions/symbols allowed in… stdbool.h bool
,true
,false
*
stdio.h *
test_split_string.c
assert.h assert
*
stdlib.h EXIT_SUCCESS
test_split_string.c
stdlib.h malloc
,free
*
string.h strlen
,strcpy
,strncpy
,strcmp
,strstr
,strncmp
,memcpy
,memmove
*
-
⚠ Do not use
strtok(…)
(function from C standard libary). - Submissions must meet the code quality standards and the policies on homework and academic integrity.
Submit
To submit HW10 from within your hw10 directory, type
make submit
That should result in the following command:
264submit HW10 split_string.c split_string.h test_split_string.c miniunit.h log_macros.h Makefile
Pre-tester ●
The pre-tester for HW10 has been released and is ready to use.
Q&A
Updates
3/9/2024 |
|