Advanced C Programming

Summer 2023 ECE 264 :: Purdue University

This is a past semester (Summer 2023)
Due 8/4

JSON 5: parse objects

Learning goals

You will learn or practice how to:

  1. Work with binary search trees as part of a larger program.
  2. Refactor existing code, avoiding breaking previous tests.
  3. Combine all major learning objectives into a single program that demonstrates what you learned in this class.

Overview

In this homework, you will adapt your code from HW15 to allow parsing JSON objects. After completing objects, yu will have the majority of the functionality of a JSON parser working. Refer to the previous parts for specifications on valid and invalid JSON.

This assignment adds the major missing functionality to your JSON parser. If you complete this last step of the project, it might be good for a resume.

If you did not do well on HW15, you will likely have a hard time on this assignment. Consider instead doing RE02 to earn back some of the lost credit before trying to earn extra credit here.

A JSON file contains text that represents a number, a string, a list, an object, a boolean, or null. Here are some examples:

type json data
number 10 10
string "ten" "ten"
list [2, 6, 4] linked list with values 2, 6, 4
object {"key1": "value", "key2": true} Imagine a lovely picture of a binary search tree here. I plan to draw one during lecture.
boolean true true
null null

Nested elements

As in C, whitespace is ignored (except within a string). Lists and objects can both contain any type of data, thus all the following are valid:

type json data
list [2, 6, 4] linked list with values 2, 6, 4
list [2, 6, "four"] linked list with values 2, 6, "four"
list [2, [3, "+", 3], "four"] linked list with values 2, (3, "+", 3), "four"
list [{"key": "value"}, {"key": "another value"}] Also plan to draw this during lecture
object {"array": [1, 2, 3]} And this one too, its a lot of drawing to do just to post this!
object {"nested object": {"key": 5}} Yeah, pictures

Getting Started on EC03

  1. The starter code for EC03 is the same as previous JSON assignment. You will be required to modify the header to add the new functions.
  2. Start by creating a directory and cp to copy your code from HW15.
  3. you@ecegrid-thin1 ~/ $ cd 264
    you@ecegrid-thin1 ~/264/ $ mkdir ec03
    you@ecegrid-thin1 ~/264/ $ cp hw15/* -v -t ec03/
    you@ecegrid-thin1 ~/264/ $ cd ec03
    you@ecegrid-thin1 ~/264/ec03 $
  4. Spend some time fixing functionality from previous JSON parts that you did not finish in time for HW15. If there is a lot of previous functionality that needs fixing, consider working on RE02 first instead. The following functionality should be working before you attempt objects:
    • Parsing integers, strings, boolean, and null: objects can contain all of these.
    • Parsing list: objects are essentially more complicated lists. If your lists do not work, it will be hard for you to figure out objects.
    • Reading and writing JSON files: we will be testing objects with files.
  5. Might as well make a submission to RE02 if you improved something.
  6. Start by creating a helper to insert key and an element into a binary search tree. Recommended signature: void _bst_insert(BSTNode** a_root, char const* key, Element element).
  7. We recommend finishing get_element(…) next. You can test it by creating binary trees manually using your helper function.
  8. Test and submit.
  9. Next, update print_element(…) to support objects. This will help you debug.
  10. Test and submit.
  11. Create parse_object(…)
    1. Start by implementing code to parse an empty object.
    2. Test and submit.
    3. Next, create code to parse an element with a single key and value.
    4. Test and submit.
    5. Next, update the code to parse two key and value pairs.
    6. Test and submit.
    7. Update the code to parse any number of key and value pairs.
    8. Test and submit.
    9. Finally, add code to handle any invalid cases you did not handle before.
    10. Test and submit.
  12. Update parse_element(…) to support JSON objects.
  13. Create recursive helper function to free the binary search tree. Recommended signature: void _free_tree(BSTNode** a_root).
  14. Finally, update free_element(…) to support JSON objects.
  15. Test that there are no memory leaks, even for incorrect input.

Requirements

  1. Your submission must contain each of the following files, as specified:
    file contents
    json.h type definitions
    Element
    • Everything that was in element before.
    • To the type enum, add one new types: ELEMENT_OBJECT.
    • To the struct, add one new fields: BSTNode* as_object.
    BSTNode
    struct type with 4 fields: key (char*), element (Element), left, and right (BSTNode*). Sort the tree using key, no need to consider element when sorting.
    function declarations One for each required function in your json.c
    • Do not include helpers (if any) here.
    json.c functions All functions that were required in json.c previously. Only new functions or functions that were changed will be listed below.
    get element(BSTNode✶ root, char const✶ key)
    return type: Element✶
    Finds the node in the binary search tree which matches the key.
    1. If found, return the address of the element in the node.
    2. If not found, return NULL.
    3. Hint: you can write this either using recursion or a loop.
    4. Hint: this function will be helpful in testing parse_object(…).
    parse object(BSTNode✶✶ a root, char const✶✶ a pos)
    return type: bool
    Set *a_root to the root of a tree of Element objects.
    1. Caller is responsible for freeing the memory if parse_object(…) returns true.
    2. Objects consist of a '{', followed by 0 or more key-value pairs separated by commas, and finally a '}'. Each key-value pair consists of a string key, a character , then a JSON-encoded element (integer, string, list, object, boolean, or null). See the examples above. (There will be no HBB on this definition, unless there is something truly wrong and/or grossly unclear.) There may be any number/amount of whitespace characters (' ', '\n', or '\t'), before/after any of the element and before/after the key.
    3. Return true if a properly formed object was found.  *a_pos should be set to the next character in the input string, after the object that was just parsed.
      1. Ex: parse_object(…) should return true for {}, {"number": 123}, {"string": "Hello", "list": [ 1, 2, 3 ]}, { "nested_object" : { "integer": 5 } }, and { "string_contains": "trailing_characters" }ABC.
    4. Return false if an object was not found (i.e., syntax error in the input string).  *a_pos should refer to the unexpected character that indicated a problem.
      1. Ex: parse_object(…) should return false for A{}, {"key": 1,,"value": 2}, { {, "key": 1 { "key": 1 "key2": 2}. and { "key" 1 }.
    5. The parsed elements are stored in a binary search tree sorted by the key values.
    6. If duplicate keys are found, replace the value in the old node with the new element parsed. For example, {"duplicate": 123, "duplicate": 456} should produce the same result as {"duplicate": 456},
      do not forget to free the old element.
    7. Whenever parse_object(…) returns false, do not modify *a_root, and free any heap memory that was allocated prior to discovery of the error.
    8. Hint: you already have logic written to parse a string, reuse that logic for the key instead of writing it again.
    parse element(Element✶ a element, char const✶✶ a pos)
    return type: bool
    1. First, eat any whitespace at *a_pos.
      1. “Eat whitespace” just means to skip over any whitespace characters (i.e., increment *a_pos until isspace(**a_pos)==false).
    2. Next, decide what kind of element this is.
      1. If it's a digit (isdigit(**a_pos)) or hyphen ('-'), set the element's type to ELEMENT_INT and call parse int(&(a element -> as int), a pos).
      2. If it's a string (**a_pos=='"'), then set the element's type to ELEMENT_STRING and call parse string(&(a element -> as string), a pos).
      3. If it's a list (**a_pos == '['), then set the element's type to ELEMENT_LIST and call: parse list(&(a element -> as list), a pos).
      4. If it's a boolean (**a_pos == 't' or **a_pos == 'f'), then set the element's type to ELEMENT_BOOL and call: parse bool(&(a element -> as bool), a pos).
      5. If it's null (**a_pos == 'n'), then set the element's type to ELEMENT_NULL, set the element's as_null to NULL, and call: parse null(a pos).
      6. If it's an object (**a_pos == '{'), then set the element's type to ELEMENT_OBJECT and call: parse object(&(a element -> as object), a pos).
    3. Return whatever was returned by the relevant function.
      • If none of those functions was called—i.e., if the next character was none of the expected characters, then return false.
    4. Do not modify *a_pos directly in parse_element(…), except for eating whitespace.
      • *a_pos can—and should—be modified in the relevant parse functions.
    5. Caller is responsible for freeing memory by calling free_element(…) whenever parse_element(…) returns true.
    6. Whenever parse_element(…) returns false, do not modify *a_element , and free any heap memory that was allocated prior to discovery of the error.
    print element to file(Element element, FILE✶ file)
    return type: void
    Given an Element object, print it in JSON notation to the passed file.
    1. Spacing is up to you, as long as it is valid JSON.
    2. If element is an integer, print it using fprintf(…).
    3. If element is a string, then print it (with double-quotes) using fprintf(…).
    4. If element is a list, print a '['. Then print each element in the list using print_element_to_file(…) (recursively), separated by commas. Finally, print ']'.
    5. If element is an object, print a '{'. Then print each key-value pair (in any order) in the object using fprintf(…), followed by ':, followed by the element using print_element_to_file(…) (recursively). Separate each key-value pair using commas. Finally, print '}'.
    6. If element is a boolean or null, print it using a string constant.
    free element(Element element)
    return type: void
    Free the contents of the Element, as needed.
    1. If it contains a string, free the string.
    2. If it contains a linked list, free the list, including all elements.
    3. If it contains a binary search tree, free each node in the tree, including all elements.
    4. Do not attempt to free the Element object itself. free_element(element) only frees dynamic memory that element refers to.
    test_json.c functions
    main(int argc, char✶ argv[])
    return type: int
    Test your all of the above functions using your miniunit.h..
    • This should consist primarily of calls to mu_run(_test_▒▒▒).
    • 100% code coverage is required.
    • Your main(…) must return EXIT_SUCCESS.
  2. You may ignore any trailing characters in the input string in any function starting with parse_, as long as the input starts with a well-formed JSON element.
    • Acceptable: 123AAA, "12"AAA, "12",[,
    Do not ignore trailing characters in read_json(…), other than whitespace.
  3. You only need to support the specific features of JSON that are explicitly required in this assignment description. You do not need to support unicode (e.g., "萬國碼", "يونيكود", "യൂണികോഡ്"), backslash escapes (e.g., "\n"), embedded quotes (e.g., "He said, \"Roar!\""), floating point numbers (e.g., 3.1415), non-decimal notations (e.g., 0xdeadbeef, 0600)
  4. Do not add helper functions to json.h.
  5. There may be no memory faults (e.g., leaks, invalid read/write, etc.), even when parse_▒▒▒(…) or read_json(…) return false.
  6. The following external header files, functions, and symbols are allowed.
    header functions/symbols allowed in…
    stdbool.h bool, true, false json.c, test_json.c, json.h
    stdio.h fprintf, fputc, fputs, stdout, fflush, fopen, fclose, fseek, ftell, rewind, fread, fgetc json.c, test_json.c, json.h
    assert.h assert json.c, test_json.c
    ctype.h isdigit, isspace json.c, test_json.c
    stdlib.h EXIT_SUCCESS, abs, malloc, free, size_t json.c, test_json.c
    string.h strncpy, strchr, strlen, strcmp json.c, test_json.c
    limits.h INT_MIN, INT_MAX json.c, test_json.c
    miniunit.h anything test_json.c
    clog.h anything json.c, test_json.c
    For miniunit.h and clog.h, you can use anything from HW05. You are welcome to change them to your liking, and/or add more in the same spirit. All others are prohibited unless approved by the instructor. Feel free to ask if there is something you would like to use.
  7. Submissions must meet the code quality standards and the course policies on homework and academic integrity.

Submit

To submit EC03 from within your ec03 directory, type 264submit EC03 json.c json.h test_json.c miniunit.h clog.h Makefile *.json

If your code does not depend on miniunit.h or clog.h, those may be omitted. Your json.h will most likely be identical to the starter. Makefile will not be checked, but including it may help in case we need to do any troubleshooting. Make sure to submit all required json files required by your tests. It is your responsibility to ensure the tester has all the files required to test. You are encouraged to modify your Makefile to do so.

Pre-tester

The pre-tester for EC03 has been released and is ready to use.

Q&A

  1. How do I ensure I am not printing too many commas when printing objects?

    The easiest method to do this is to just pass in a boolean did_print_first_element by address to your recursive printing logic. If the boolean is true, print the comma before your key-value pair; else set the boolean to true and do not print a comma.
  2. How do I ensure the keys are correct when printing?

    JSON objects do not preserve the order of keys when parsing, so printing in the original order is not required.

    We recommend just doing an in-order traversal for printing, as that will make all the keys in alphabetical order. However, any order is valid as long as all keys are eventually printed.

  3. How can I structure my tests?

    Here's a start. (We may add to this at some point.)
    // OK TO COPY / ADAPT this snippet---but ONLY if you understand it completely.
    // ⚠ Do not copy blindly.
    // 
    // This test is nowhere near adequate on its own.  It is provided to illustrate how to
    // use helper functions to streamline your test code.
    
    #include <stdio.h>
    #include <stdlib.h>
    #include "json.h"
    #include "miniunit.h"
    
    // helper headers you wish to use from json.h
    void _bst_insert(BSTNode** a_root, char* key, Element element);
    
    // all tests from JSON 4, do not delete old tests!!!
    
    static int _test_parse_object_two_values() {
      mu_start();
      //──────────────────────────────────────────────────────
    
      char const* input = "{\"key1\": 123, \"key2\": \"value\"}ABC";
      char const* pos = input;
      BSTNode* root;
      bool is_success = parse_object(&root, &pos);
      mu_check(is_success);   // because the input is valid
      // I was too lazy to count the characters, so made C count it for me
      // the 3 is the trailing characters
      mu_check(pos == input + strlen(input) - 3); 
      mu_check(root != NULL);
      // we could check specific locations in the tree,
      // or we could just use the helper which does not care
      Element* a_element = get_element(root, "key1");
      mu_check(a_element != NULL);
      mu_check(a_element->type == ELEMENT_INT);
      mu_check(a_element->as_int == 123);
      element = get_element(root, "key2");
      mu_check(a_element != NULL);
      mu_check(a_element->type == ELEMENT_STRING);
      mu_check_strings_equal(a_element->as_string, "value");
    
      free_element((Element){.as_object = root, .type = ELEMENT_OBJECT});
      //──────────────────────────────────────────────────────
      mu_end();
    }
    
    int main(int argc, char* argv[]) {
      mu_run(_test_parse_object_two_values);
      mu_run(_test_write_int_zero);
      return EXIT_SUCCESS;
    }
    
  4. That string looks super messy, is there a nicer way we can write it?

    You may find it easier to create JSON objects as files. This will prevent the need to escape all the double quotes. You could create helper functions to make your tests easier to write. Some examples are below, Example JSON object file contents:
          {
            "key1": "This is a string",
            "key2": 12345,
            "nested_object": {
                "with a nested list": [ 1, 2, true, null ],
                "okay, this object is probably a bit complex": true,
                "you probably want to test simplier ones :)": false
            }
          }
        
  5. Can I submit this homework without doing objects?

    You will earn little to no credit on this homework without an attempt at objects. Note you do not need to finish everything to get credit, just finishing some of the required functions will earn credit for this assignment.

    If you did not finish HW15, consider instead working on RE02.

  6. Can I turn in this assignment late?

    Due to how close we are to the end of the semester, late work is unlikely to be accepted. Submit early and often and follow TDD to ensure you get as much credit for your work as possible.

Updates

Updates will be posted here if any.