JSON 5: parse objects
Learning goals
You will learn or practice how to:
- Work with binary search trees as part of a larger program.
- Refactor existing code, avoiding breaking previous tests.
- Combine all major learning objectives into a single program that demonstrates what you learned in this class.
Overview
In this homework, you will adapt your code from HW15 to allow parsing JSON objects. After completing objects, yu will have the majority of the functionality of a JSON parser working. Refer to the previous parts for specifications on valid and invalid JSON.
This assignment adds the major missing functionality to your JSON parser. If you complete this last step of the project, it might be good for a resume.
If you did not do well on HW15, you will likely have a hard time on this assignment. Consider instead doing RE02 to earn back some of the lost credit before trying to earn extra credit here.
A JSON file contains text that represents a number, a string, a list, an object, a boolean, or null. Here are some examples:
type | json | data |
---|---|---|
number | 10 | 10 |
string | "ten" | "ten" |
list | [2, 6, 4] | |
object | {"key1": "value", "key2": true} | Imagine a lovely picture of a binary search tree here. I plan to draw one during lecture. |
boolean | true | true |
null | null |
Nested elements
As in C, whitespace is ignored (except within a string). Lists and objects can both contain any type of data, thus all the following are valid:
type | json | data |
---|---|---|
list | [2, 6, 4] | |
list | [2, 6, "four"] | |
list | [2, [3, "+", 3], "four"] | |
list | [{"key": "value"}, {"key": "another value"}] | Also plan to draw this during lecture |
object | {"array": [1, 2, 3]} | And this one too, its a lot of drawing to do just to post this! |
object | {"nested object": {"key": 5}} | Yeah, pictures |
Getting Started on EC03
- The starter code for EC03 is the same as previous JSON assignment. You will be required to modify the header to add the new functions.
- Start by creating a directory and
cp
to copy your code from HW15. -
Spend some time fixing functionality from previous JSON parts that you did not finish in time for HW15. If there is a lot of previous functionality that needs fixing, consider working on RE02 first instead. The following functionality should be working before you attempt objects:
- Parsing integers, strings, boolean, and null: objects can contain all of these.
- Parsing list: objects are essentially more complicated lists. If your lists do not work, it will be hard for you to figure out objects.
- Reading and writing JSON files: we will be testing objects with files.
- Might as well make a submission to RE02 if you improved something.
-
Start by creating a helper to insert key and an element into a binary search tree.
Recommended signature:
void _bst_insert(BSTNode** a_root, char const* key, Element element)
. - We recommend finishing
get_element(…)
next. You can test it by creating binary trees manually using your helper function. - Test and submit.
- Next, update
print_element(…)
to support objects. This will help you debug. - Test and submit.
- Create
parse_object(…)
- Start by implementing code to parse an empty object.
- Test and submit.
- Next, create code to parse an element with a single key and value.
- Test and submit.
- Next, update the code to parse two key and value pairs.
- Test and submit.
- Update the code to parse any number of key and value pairs.
- Test and submit.
- Finally, add code to handle any invalid cases you did not handle before.
- Test and submit.
- Update
parse_element(…)
to support JSON objects. - Create recursive helper function to free the binary search tree. Recommended signature:
void _free_tree(BSTNode** a_root)
. - Finally, update
free_element(…)
to support JSON objects. - Test that there are no memory leaks, even for incorrect input.
you@ecegrid-thin1 ~/
$
cd 264
you@ecegrid-thin1 ~/264/
$
mkdir ec03
you@ecegrid-thin1 ~/264/
$
cp hw15/* -v -t ec03/
you@ecegrid-thin1 ~/264/
$
cd ec03
you@ecegrid-thin1 ~/264/ec03
$
Requirements
- Your submission must contain each of the following files, as specified:
file contents json.h type definitions Element- Everything that was in element before.
- To the type enum, add one new types:
ELEMENT_OBJECT
. - To the struct, add one new fields:
BSTNode* as_object
.
BSTNodestruct type with 4 fields:key
(char*
),element
(Element
),left
, andright
(BSTNode*
). Sort the tree usingkey
, no need to considerelement
when sorting.function declarations One for each required function in your json.c - Do not include helpers (if any) here.
json.c functions All functions that were required in json.c previously. Only new functions or functions that were changed will be listed below. get element(BSTNode✶ root, char const✶ key)
→ return type: Element✶Finds the node in the binary search tree which matches the key.- If found, return the address of the element in the node.
- If not found, return
NULL
. - Hint: you can write this either using recursion or a loop.
- Hint: this function will be helpful in testing
parse_object(…)
.
parse object(BSTNode✶✶ a root, char const✶✶ a pos)
→ return type: boolSet*a_root
to the root of a tree ofElement
objects.- Caller is responsible for freeing the memory if
parse_object(…)
returnstrue
. - Objects consist of a
'{'
, followed by 0 or more key-value pairs separated by commas, and finally a'}'
. Each key-value pair consists of a string key, a character, then a JSON-encoded element (integer, string, list, object, boolean, or null). See the examples above. (There will be no HBB on this definition, unless there is something truly wrong and/or grossly unclear.) There may be any number/amount of whitespace characters (
' '
,'\n'
, or'\t'
), before/after any of the element and before/after the key. - Return
true
if a properly formed object was found. *a_pos should be set to the next character in the input string, after the object that was just parsed.- Ex:
parse_object(…)
should returntrue
for {}, {"number": 123}, {"string": "Hello", "list": [ 1, 2, 3 ]}, { "nested_object" : { "integer": 5 } }, and { "string_contains": "trailing_characters" }ABC.
- Ex:
- Return
false
if an object was not found (i.e., syntax error in the input string). *a_pos should refer to the unexpected character that indicated a problem.- Ex:
parse_object(…)
should returnfalse
for A{}, {"key": 1,,"value": 2}, { {, "key": 1 { "key": 1 "key2": 2}. and { "key" 1 }.
- Ex:
- The parsed elements are stored in a binary search tree sorted by the key values.
-
If duplicate keys are found, replace the value in the old node with the new element parsed.
For example, {"duplicate": 123, "duplicate": 456}
should produce the same result as {"duplicate": 456},
⚠ do not forget to free the old element. - Whenever
parse_object(…)
returnsfalse
, do not modify*a_root
, and free any heap memory that was allocated prior to discovery of the error. - Hint: you already have logic written to parse a string, reuse that logic for the key instead of writing it again.
parse element(Element✶ a element, char const✶✶ a pos)
→ return type: bool- First, eat any whitespace at
*a_pos
.- “Eat whitespace” just means to skip over any whitespace characters (i.e., increment
*a_pos
untilisspace(**a_pos)==false
).
- “Eat whitespace” just means to skip over any whitespace characters (i.e., increment
- Next, decide what kind of element this is.
- If it's a digit (
isdigit(**a_pos)
) or hyphen ('-'
), set the element'stype
toELEMENT_INT
and callparse int(&(a element -> as int), a pos)
. - If it's a string (
**a_pos=='"'
), then set the element'stype
toELEMENT_STRING
and callparse string(&(a element -> as string), a pos)
. - If it's a list (
**a_pos == '['
), then set the element'stype
toELEMENT_LIST
and call:parse list(&(a element -> as list), a pos)
. - If it's a boolean (
**a_pos == 't'
or**a_pos == 'f'
), then set the element'stype
toELEMENT_BOOL
and call:parse bool(&(a element -> as bool), a pos)
. - If it's null (
**a_pos == 'n'
), then set the element'stype
toELEMENT_NULL
, set the element'sas_null
toNULL
, and call:parse null(a pos)
. - If it's an object (
**a_pos == '{'
), then set the element'stype
toELEMENT_OBJECT
and call:parse object(&(a element -> as object), a pos)
.
- If it's a digit (
- Return whatever was returned by the relevant function.
- If none of those functions was called—i.e., if the next character was none of the expected characters, then return
false
.
- If none of those functions was called—i.e., if the next character was none of the expected characters, then return
- Do not modify
*a_pos
directly inparse_element(…)
, except for eating whitespace.*a_pos
can—and should—be modified in the relevant parse functions.
- Caller is responsible for freeing memory by calling
free_element(…)
wheneverparse_element(…)
returnstrue
. - Whenever
parse_element(…)
returnsfalse
, do not modify*a_element
, and free any heap memory that was allocated prior to discovery of the error.
print element to file(Element element, FILE✶ file)
→ return type: voidGiven anElement
object, print it in JSON notation to the passed file.- Spacing is up to you, as long as it is valid JSON.
- If element is an integer, print it using
fprintf(…)
. - If element is a string, then print it (with double-quotes) using
fprintf(…)
. - If element is a list, print a
'['
. Then print each element in the list usingprint_element_to_file(…)
(recursively), separated by commas. Finally, print']'
. - If element is an object, print a
'{'
. Then print each key-value pair (in any order) in the object usingfprintf(…)
, followed by':
, followed by the element usingprint_element_to_file(…)
(recursively). Separate each key-value pair using commas. Finally, print'}'
. - If element is a boolean or null, print it using a string constant.
free element(Element element)
→ return type: voidFree the contents of theElement
, as needed.- If it contains a string, free the string.
- If it contains a linked list, free the list, including all elements.
- If it contains a binary search tree, free each node in the tree, including all elements.
- ⚠ Do not attempt to free the
Element
object itself.free_element(element)
only frees dynamic memory that element refers to.
test_json.c functions main(int argc, char✶ argv[])
→ return type: intTest your all of the above functions using yourminiunit.h.
.- This should consist primarily of calls to
mu_run(_test_▒▒▒)
. - 100% code coverage is required.
- Your main(…) must return EXIT_SUCCESS.
- You may ignore any trailing characters in the input string in any function starting
with
parse_
, as long as the input starts with a well-formed JSON element.- Acceptable: 123AAA, "12"AAA, "12",[,
read_json(…)
, other than whitespace. - You only need to support the specific features of JSON that are explicitly required in this assignment description. You do not need to support unicode (e.g., "萬國碼", "يونيكود", "യൂണികോഡ്"), backslash escapes (e.g., "\n"), embedded quotes (e.g., "He said, \"Roar!\""), floating point numbers (e.g., 3.1415), non-decimal notations (e.g., 0xdeadbeef, 0600)
- Do not add helper functions to json.h.
- There may be no memory faults (e.g., leaks, invalid read/write, etc.), even when
parse_▒▒▒(…)
orread_json(…)
return false. -
The following external header files, functions, and symbols are
allowed.
header functions/symbols allowed in… stdbool.h bool
,true
,false
json.c
,test_json.c
,json.h
stdio.h fprintf
,fputc
,fputs
,stdout
,fflush
,fopen
,fclose
,fseek
,ftell
,rewind
,fread
,fgetc
json.c
,test_json.c
,json.h
assert.h assert
json.c
,test_json.c
ctype.h isdigit
,isspace
json.c
,test_json.c
stdlib.h EXIT_SUCCESS
,abs
,malloc
,free
,size_t
json.c
,test_json.c
string.h strncpy
,strchr
,strlen
,strcmp
json.c
,test_json.c
limits.h INT_MIN
,INT_MAX
json.c
,test_json.c
miniunit.h anything
test_json.c
clog.h anything
json.c
,test_json.c
- Submissions must meet the code quality standards and the course policies on homework and academic integrity.
Submit
To submit EC03 from within your ec03 directory,
type
264submit EC03 json.c json.h test_json.c miniunit.h clog.h Makefile *.json
If your code does not depend on miniunit.h or clog.h, those may be omitted. Your json.h will most likely be identical to the starter. Makefile will not be checked, but including it may help in case we need to do any troubleshooting. Make sure to submit all required json files required by your tests. It is your responsibility to ensure the tester has all the files required to test. You are encouraged to modify your Makefile to do so.
Pre-tester ●
The pre-tester for EC03 has been released and is ready to use.
Q&A
How do I ensure I am not printing too many commas when printing objects?
The easiest method to do this is to just pass in a booleandid_print_first_element
by address to your recursive printing logic. If the boolean is true, print the comma before your key-value pair; else set the boolean to true and do not print a comma.How do I ensure the keys are correct when printing?
JSON objects do not preserve the order of keys when parsing, so printing in the original order is not required.
We recommend just doing an in-order traversal for printing, as that will make all the keys in alphabetical order. However, any order is valid as long as all keys are eventually printed.
How can I structure my tests?
Here's a start. (We may add to this at some point.)// OK TO COPY / ADAPT this snippet---but ONLY if you understand it completely. // ⚠ Do not copy blindly. // // This test is nowhere near adequate on its own. It is provided to illustrate how to // use helper functions to streamline your test code. #include <stdio.h> #include <stdlib.h> #include "json.h" #include "miniunit.h" // helper headers you wish to use from json.h void _bst_insert(BSTNode** a_root, char* key, Element element); // all tests from JSON 4, do not delete old tests!!! static int _test_parse_object_two_values() { mu_start(); //────────────────────────────────────────────────────── char const* input = "{\"key1\": 123, \"key2\": \"value\"}ABC"; char const* pos = input; BSTNode* root; bool is_success = parse_object(&root, &pos); mu_check(is_success); // because the input is valid // I was too lazy to count the characters, so made C count it for me // the 3 is the trailing characters mu_check(pos == input + strlen(input) - 3); mu_check(root != NULL); // we could check specific locations in the tree, // or we could just use the helper which does not care Element* a_element = get_element(root, "key1"); mu_check(a_element != NULL); mu_check(a_element->type == ELEMENT_INT); mu_check(a_element->as_int == 123); element = get_element(root, "key2"); mu_check(a_element != NULL); mu_check(a_element->type == ELEMENT_STRING); mu_check_strings_equal(a_element->as_string, "value"); free_element((Element){.as_object = root, .type = ELEMENT_OBJECT}); //────────────────────────────────────────────────────── mu_end(); } int main(int argc, char* argv[]) { mu_run(_test_parse_object_two_values); mu_run(_test_write_int_zero); return EXIT_SUCCESS; }
That string looks super messy, is there a nicer way we can write it?
You may find it easier to create JSON objects as files. This will prevent the need to escape all the double quotes. You could create helper functions to make your tests easier to write. Some examples are below, Example JSON object file contents:{ "key1": "This is a string", "key2": 12345, "nested_object": { "with a nested list": [ 1, 2, true, null ], "okay, this object is probably a bit complex": true, "you probably want to test simplier ones :)": false } }
Can I submit this homework without doing objects?
You will earn little to no credit on this homework without an attempt at objects. Note you do not need to finish everything to get credit, just finishing some of the required functions will earn credit for this assignment.
If you did not finish HW15, consider instead working on RE02.
Can I turn in this assignment late?
Due to how close we are to the end of the semester, late work is unlikely to be accepted. Submit early and often and follow TDD to ensure you get as much credit for your work as possible.
Updates
Updates will be posted here if any.