Advanced C Programming

Summer 2022 ECE 264 :: Purdue University

⚠ This is a PAST SEMESTER (Summer 2022).
Due 7/30

JSON 4: text files

Learning goals

You will learn or practice how to:

  1. Read and write text files using FILE* and the related functions.
  2. Refactor existing code, avoiding breaking previous tests.

Overview

In this homework, you will adapt your code from HW13 to allow reading and writing JSON files. In addition, you will add support for 2 new types: booleans and null. As you add new functionality, you will also be ensuring existing functionality does not break. Refer to the previous parts for specifications on valid and invalid JSON.

JSON is a file format for exchanging hierarchical data between programs, often written in different programming languages and/or on different computers. In web programming, this allows a server application written in any language (e.g., Python or even C) to pass complex structures of information to the user's browser, which then renders it. Most often when we work with JSON, it is not through string constants in code, but rather via JSON files, which have the .json file extension.

A JSON file contains text that represents a number, a string, a list, an object, a boolean, or null. However, for the sake of this assignment, you do not need to worry about objects. We will handle them in EC04. Here are some examples:

type json data
number 10 10
string "ten" "ten"
list [2, 6, 4] linked list with values 2, 6, 4
boolean true true
null null

Linked lists

As in C, whitespace is ignored (except within a string). Thus, the following are all equivalent.

type json data
list [2, 6, 4] linked list with values 2, 6, 4
list [2, 6, "four"] linked list with values 2, 6, "four"
list [2, [3, "+", 3], "four"] linked list with values 2, (3, "+", 3), "four"

Getting Started on HW15

  1. The starter code for HW15 is the same as for the previous JSON assignment. You will be required to modify the header to add the new functions.
  2. Start by creating a directory and using cp to copy your code from HW13.
  3. Use 264get HW15 to fetch the starter code.
    you@ecegrid-thin1 ~/ $ cd 264
    you@ecegrid-thin1 ~/264/ $ mkdir hw15
    you@ecegrid-thin1 ~/264/ $ cp hw13/* -v -t hw15/
    you@ecegrid-thin1 ~/264/ $ cd hw15
    you@ecegrid-thin1 ~/264/hw15 $
  4. Create print_element_to_file(…)
    1. Rename print_element(…) to print_element_to_file(…) and add the extra FILE* parameter. Make no other changes yet.
    2. Recreate print_element(…) by having it call print_element_to_file(…). You should only need a single line of code.
    3. Ensure all tests for print_element(…) still pass.
    4. Submit.
    5. Add tests for print_element_to_file(…). You will need to use fopen(…) and fclose(…).
    6. Ensure your new tests are failing.
    7. Implement support for the FILE* parameter in print_element_to_file(…).
    8. Ensure both your new test and your previous tests for print_element(…) work.
    9. Submit.
  5. Implement write_json(…).
    1. Implement tests for the function. They can be similar to your tests for print_element_to_file(…).
    2. Implement write_json(…). It should call print_element_to_file(…).
    3. Submit.
    4. Consider if there are any edge cases for the function. If so, add more tests.
  6. Submit.
  7. Implement read_json(…).
    1. We recommend creating a helper that reads all characters in a file and returns a malloced string of the contents.
    2. Declare your helper function, but only implement the minimal code so it compiles.
    3. Create tests for your helper function.
    4. Implement enough code in the helper function to pass your first test.
    5. Ensure tests are passing.
    6. ...
    7. Implement the final code for your helper function.
    8. Ensure all tests are passing for the helper function.
    9. Add tests for read_json(…).
    10. Implement read_json(…). With the helper function, the implementation should be trivial.
    11. Ensure your tests are passing.
    12. Submit.
  8. Implement parse_null(…).
    1. Add support for null to the enum and union for Element.
    2. Ensure no previous tests broke.
    3. Add some tests for valid null.
    4. Implement enough code to pass the tests.
    5. Submit.
    6. Add some tests for invalid null.
    7. Implement enough code to pass the tests.
    8. Submit.
    9. Add tests for other JSON functions using null.
    10. Implement code in other functions for null.
    11. Ensure all tests are passing, both new and old.
  9. Implement parse_bool(…).
    1. Add support for booleans to the enum and union for Element.
    2. Ensure no previous tests broke.
    3. Add some tests for valid booleans.
    4. Recommended: add a helper to allow reusing some logic from parse_null(…).
    5. Implement enough code to pass the tests.
    6. Submit.
    7. Add some tests for invalid booleans.
    8. Implement enough code to pass the tests.
    9. Submit.
    10. Add tests for other JSON functions using booleans.
    11. Implement code in other functions for booleans.
    12. Ensure all tests are passing, both new and old.
  10. Test that there are no memory leaks, even for incorrect input.
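
For step 7.1, one possible shape for the file-reading helper is sketched below. The name read_file_to_string_sketch is hypothetical, and the approach (measure the file with fseek and ftell, then fread it in one call) is only one option; a loop over fgetc would work too. Treat it as a minimal sketch under those assumptions, not as the instructor's solution.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: returns a malloc'd, null-terminated copy of the file's
// contents, or NULL if the file cannot be opened or memory cannot be allocated.
char* read_file_to_string_sketch(char const* filename) {
  assert(filename != NULL);
  FILE* file = fopen(filename, "r");
  if (file == NULL) {
    return NULL;  // e.g., the file does not exist
  }

  // Measure the file: jump to the end, ask for the offset, then go back.
  fseek(file, 0, SEEK_END);
  long num_bytes = ftell(file);
  rewind(file);

  char* contents = NULL;
  if (num_bytes >= 0) {
    contents = malloc((size_t)num_bytes + 1);  // +1 for the '\0'
  }
  if (contents != NULL) {
    size_t num_read = fread(contents, sizeof(char), (size_t)num_bytes, file);
    contents[num_read] = '\0';  // terminate after what was actually read
  }
  fclose(file);
  return contents;  // the caller must free this
}
```

Because the helper returns NULL for a missing file, a caller such as read_json(…) could use that to detect the "file does not exist" case before parsing.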

How much work is this?

You must be disciplined in your development approach. If you try to build this all at once, success is extremely unlikely.

Instructor's solution for parse_null(…) and parse_bool(…) is 15 sloc (5 sloc in the body of a helper called by both).

Instructor's solution for write_json(…) and read_json(…) is 27 sloc (in addition to the sloc from the body of print_element_to_file(…)).

Changes to existing code in JSON is 13 sloc.

SLOC means “source lines of code” and does not include lines that are blank or contain only comments. The figures above do not count test_json.c. Also, those figures are based on a tight (but defensible) solution that meets code quality standards and uses no methods or language features we have not yet covered.

Requirements

  1. Your submission must contain each of the following files, as specified:
    file contents
    json.h type definitions
    Element
    • Everything that was in element before.
    • To the type enum, add two new types: ELEMENT_NULL and ELEMENT_BOOL.
    • To the struct, add two new fields: void* as_null and bool as_bool.
    function declarations one for each required function in your json.c
    • Do not include helpers (if any) here.
    json.c functions All functions that were required in json.c previously. Only new functions or functions that were changed will be listed below.
    parse_bool(bool* a_value, char const** a_pos)
    return type: bool
    Set *a_value to whatever boolean value is found at *a_pos.
    1. *a_pos is initially the address of the first character of the boolean literal in the input string.
    2. *a_value is the (already allocated) location where the parsed boolean should be stored.
    3. Return true if a properly formed boolean literal was found at *a_pos. *a_pos should then refer to the next character in the input string, i.e., after the last character of the boolean literal. Note that trailing characters are acceptable, just like previous functions.
      1. Ex: parse_bool(…) should return true for true, false, and falsehood
      Note that the last example is valid because it starts with false.
    4. Return false if a boolean literal was not found. *a_pos should refer to the unexpected character that indicated a problem.
      1. Ex: parse_bool(…) should return false for A, 123, "true", and truly
      Note that the third example is a string literal. Also note that in the last example, *a_pos will refer to the 'l', as that is the first invalid character.
    5. Calling parse_bool(…) should not result in any calls to malloc(…).
    6. Do not confuse the boolean return value with the result. The return value determines whether a boolean was found, while *a_value is the boolean that was found.
    7. Whenever parse_bool(…) returns false, *a_value should not be modified.
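
The rules above can be met in several ways. The sketch below shows one of them, assuming a shared helper (hypothetically named match_literal here) that walks *a_pos across an expected literal and stops on the first mismatch. The same helper would also serve parse_null(…), specified next. This is an illustration under those assumptions, not the instructor's solution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper (not part of the required interface): advance *a_pos
// across `literal`, stopping at the first character that does not match.
static bool match_literal(char const* literal, char const** a_pos) {
  for (char const* expected = literal; *expected != '\0'; expected++) {
    if (**a_pos != *expected) {
      return false;  // *a_pos is left on the unexpected character
    }
    (*a_pos)++;
  }
  return true;  // *a_pos is now just past the literal
}

bool parse_bool(bool* a_value, char const** a_pos) {
  assert(a_value != NULL && a_pos != NULL && *a_pos != NULL);
  bool is_true_literal = (**a_pos == 't');
  bool is_found = match_literal(is_true_literal ? "true" : "false", a_pos);
  if (is_found) {
    *a_value = is_true_literal;  // only written on success
  }
  return is_found;
}

bool parse_null(char const** a_pos) {
  assert(a_pos != NULL && *a_pos != NULL);
  return match_literal("null", a_pos);
}
```

With this shape, the input truly fails with *a_pos left on the 'l', and falsehood succeeds with *a_pos left on the 'h'. Neither function calls malloc(…).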
    parse_null(char const** a_pos)
    return type: bool
    Determine whether null is found at *a_pos.
    1. *a_pos is initially the address of the first character of the null literal in the input string.
    2. Return true if a properly formed null literal was found at *a_pos. *a_pos should then refer to the next character in the input string, i.e., after the last character of the null literal. Note that trailing characters are acceptable, just like previous functions.
      1. Ex: parse_null(…) should return true for null, null literal, and nullify
      Note that the last two examples are valid because they start with null.
    3. Return false if a null literal was not found. *a_pos should refer to the unexpected character that indicated a problem.
      1. Ex: parse_null(…) should return false for A, 123, "null", and nil
      Note that the third example is a string literal. Also note that in the last example, *a_pos will refer to the 'i', as that is the first invalid character.
    4. Calling parse_null(…) should not result in any calls to malloc(…).
    5. This function has no result, as the only possible result is null.
    parse_element(Element* a_element, char const** a_pos)
    return type: bool
    1. First, eat any whitespace at *a_pos.
      1. “Eat whitespace” just means to skip over any whitespace characters (i.e., increment *a_pos until isspace(**a_pos)==false).
    2. Next, decide what kind of element this is.
      1. If it's a digit (isdigit(**a_pos)) or hyphen ('-'), set the element's type to ELEMENT_INT and call parse_int(&(a_element -> as_int), a_pos).
      2. If it's a string (**a_pos=='"'), then set the element's type to ELEMENT_STRING and call parse_string(&(a_element -> as_string), a_pos).
      3. If it's a list (**a_pos == '['), then set the element's type to ELEMENT_LIST and call: parse_list(&(a_element -> as_list), a_pos).
      4. If it's a boolean (**a_pos == 't' or **a_pos == 'f'), then set the element's type to ELEMENT_BOOL and call: parse_bool(&(a_element -> as_bool), a_pos).
      5. If it's null (**a_pos == 'n'), then set the element's type to ELEMENT_NULL, set the element's as_null to NULL, and call: parse_null(a_pos).
    3. Return whatever was returned by the relevant function.
      • If none of those functions was called—i.e., if the next character was none of the expected characters, then return false.
    4. Do not modify *a_pos directly in parse_element(…), except for eating whitespace.
      • *a_pos can—and should—be modified in the relevant parse functions.
    5. Caller is responsible for freeing memory by calling free_element(…) whenever parse_element(…) returns true.
    6. Whenever parse_element(…) returns false, do not modify *a_element, and free any heap memory that was allocated prior to discovery of the error.
    print_element_to_file(Element element, FILE* file)
    return type: void
    Given an Element object, print it in JSON notation to the passed file.
    1. Spacing is up to you, as long as it is valid JSON.
    2. If element is an integer, print it using fprintf(…).
    3. If element is a string, then print it (with double-quotes) using fprintf(…).
    4. If element is a list, print a '['. Then print each element in the list using print_element_to_file(…) (recursively), separated by commas. Finally, print ']'.
    5. If element is a boolean or null, print it using a string constant.
    print_element(Element element)
    return type: void
    This function should be exactly one line. It should simply call print_element_to_file(…) with the proper argument for printing to the console.
    free_element(Element element)
    return type: void
    Free the contents of the Element, as needed. Note that this is the same description from before, just copied over to emphasize that booleans and null do not need to be freed.
    1. If it contains a string, free the string.
    2. If it contains a linked list, free the list, including all elements.
    3. Do not attempt to free the Element object itself. free_element(element) only frees dynamic memory that element refers to.
    write_json(char const* filename, Element element)
    return type: void
    Writes a JSON element to the file defined by filename.
    1. Opens the file named filename.
    2. Create the file if it does not exist; replace its contents if it does exist.
    3. Write the contents of element to the file.
    4. Hint: use print_element_to_file(…).
    5. Do not add the .json extension automatically. The caller will pass in the full filename they wish to create. If the caller passes in "example", create a file named example, not example.json.
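
The relationship between print_element_to_file(…), print_element(…), and write_json(…) might look roughly like the sketch below. It assumes a reduced stand-in for Element that covers only integers, booleans, and null (your real json.h also has strings and lists), and it silently returns if the file cannot be opened, which is a choice of this sketch rather than a stated requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

// ASSUMPTION: a reduced stand-in for the Element type in json.h. The real
// type also has string and list members, which are omitted here.
// (Anonymous unions need C11 or a GNU extension.)
typedef enum { ELEMENT_INT, ELEMENT_NULL, ELEMENT_BOOL } ElementType;
typedef struct {
  ElementType type;
  union {
    int   as_int;
    void* as_null;
    bool  as_bool;
  };
} Element;

void print_element_to_file(Element element, FILE* file) {
  assert(file != NULL);
  if (element.type == ELEMENT_INT) {
    fprintf(file, "%d", element.as_int);
  } else if (element.type == ELEMENT_BOOL) {
    fputs(element.as_bool ? "true" : "false", file);  // string constants
  } else {
    fputs("null", file);
  }
}

void print_element(Element element) {
  print_element_to_file(element, stdout);  // the console is just another FILE*
}

void write_json(char const* filename, Element element) {
  assert(filename != NULL);
  FILE* file = fopen(filename, "w");  // "w" creates the file or truncates it
  if (file == NULL) {
    return;  // could not open the file; nothing is written
  }
  print_element_to_file(element, file);
  fclose(file);
}
```

Since all output goes through print_element_to_file(…), any test that passes for print_element(…) exercises the same code path that write_json(…) uses.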
    read_json(char const* filename, Element* a_element)
    return type: bool
    Reads a JSON element from the file defined by filename.
    1. Opens the file named filename and reads in the contents as a JSON element.
    2. Returns true if the contents are valid JSON, and the file contains no other text (other than whitespace).
    3. Returns false and prints an appropriate error message to stderr if:
      1. The file does not exist.
      2. The file does not contain a valid JSON element.
      3. The file starts with a valid JSON element but has non-whitespace trailing characters.
    4. Caller is responsible for freeing memory in the returned element by calling free_element(…) whenever read_json(…) returns true.
    5. Whenever read_json(…) returns false, do not modify *a_element, and free any heap memory that was allocated prior to discovery of the error.
    6. You are free to use malloc(…) as needed for parsing the element, as long as you have no memory leaks. The caller is only responsible for freeing the element (if valid); freeing any other memory you allocate is the responsibility of read_json(…).
    7. Hint: you should use parse_element(…) to parse the JSON rather than duplicating logic. We talked in lecture about how to read all text from a file into a string.
    8. The last requirement for making a file valid JSON is different from the other parse functions. It is possible that parse_element(…) will return true for a string, and the same string as a file will lead read_json(…) to return false.
    9. Do not add the .json extension automatically. The caller will pass in the full filename they wish to read. If the caller passes in "example", open a file named example, not example.json.
    test_json.c functions
    main(int argc, char* argv[])
    return type: int
    Test all of the above functions using your miniunit.h.
    • This should consist primarily of calls to mu_run(_test_▒▒▒).
    • 100% code coverage is required.
    • Your main(…) must return EXIT_SUCCESS.
  2. You may ignore any trailing characters in the input string in any function starting with parse_, as long as the input starts with a well-formed JSON element.
    • Acceptable: 123AAA, "12"AAA, "12",[,
    Do not ignore trailing characters in read_json(…), other than whitespace.
  3. You only need to support the specific features of JSON that are explicitly required in this assignment description. You do not need to support Unicode (e.g., "萬國碼", "يونيكود", "യൂണികോഡ്"), objects/dictionaries (e.g., {"a":1, "b":2}), backslash escapes (e.g., "\n"), embedded quotes (e.g., "He said, \"Roar!\""), floating point numbers (e.g., 3.1415), or non-decimal notations (e.g., 0xdeadbeef, 0600).
  4. Do not add helper functions to json.h.
  5. There may be no memory faults (e.g., leaks, invalid read/write, etc.), even when parse_▒▒▒(…) or read_json(…) return false.
  6. The following external header files, functions, and symbols are allowed.
    header functions/symbols allowed in…
    stdbool.h bool, true, false json.c, test_json.c
    stdio.h fprintf, fputc, fputs, stdout, fflush, fopen, fclose, fseek, ftell, rewind, fgetc, fread json.c, test_json.c
    assert.h assert json.c, test_json.c
    ctype.h isdigit, isspace json.c, test_json.c
    stdlib.h EXIT_SUCCESS, abs, malloc, free, size_t json.c, test_json.c
    string.h strncpy, strchr, strlen, strcmp json.c, test_json.c
    limits.h INT_MIN, INT_MAX json.c, test_json.c
    miniunit.h anything test_json.c
    clog.h anything json.c, test_json.c
    For miniunit.h and clog.h, you can use anything from HW05. You are welcome to change them to your liking, and/or add more in the same spirit. All others are prohibited unless approved by the instructor. Feel free to ask if there is something you would like to use.
  7. Submissions must meet the code quality standards and the course policies on homework and academic integrity.

Submit

To submit HW15 from within your hw15 directory, type 264submit HW15 json.c json.h test_json.c miniunit.h clog.h Makefile *.json

If your code does not depend on miniunit.h or clog.h, those may be omitted. Your json.h will most likely be identical to the starter. Makefile will not be checked, but including it may help in case we need to do any troubleshooting. Make sure to submit all JSON files required by your tests. It is your responsibility to ensure the tester has all the files required to test. You are encouraged to modify your Makefile to do so.

Pre-tester

The pre-tester for HW15 has been released and is ready to use.

Q&A

  1. How can I structure my tests?

    Here's a start. (We may add to this at some point.)
    // OK TO COPY / ADAPT this snippet---but ONLY if you understand it completely.
    // ⚠ Do not copy blindly.
    // 
    // This test is nowhere near adequate on its own.  It is provided to illustrate how to
    // use helper functions to streamline your test code.
    
    #include <stdio.h>
    #include <stdlib.h>
    #include "json.h"
    #include "miniunit.h"
    
    // helper to create a file in a test. You are also free to manually create files
    // as long as you include them in your submission
    static void _write_string_to_file(char const* filename, char const* contents) {
      FILE* file = fopen(filename, "w");
      fputs(contents, file);
      fclose(file);
    }
    
    static char* _read_file_to_string(char const* filename) {
      // TODO: will implement this in class, and add example back here after class
    }
    
    // all tests from JSON 3, do not delete old tests!!!
    
    static int _test_read_int_zero() {
      mu_start();
      //──────────────────────────────────────────────────────
      // create a text file with the contents 0
      char const* filename = "_test_read_int_zero.json";
      _write_string_to_file(filename, "0");
    
      // test the read function
      Element result;  // will be initialized in read_json(…)
      bool is_success = read_json(filename, &result);
      mu_check(is_success);   // because the input is valid
      mu_check(result.type == ELEMENT_INT);
      mu_check(result.as_int == 0);
      free_element(result);
      //──────────────────────────────────────────────────────
      mu_end();
    }
    
    static int _test_write_int_zero() {
      mu_start();
      //──────────────────────────────────────────────────────
      Element element = { .type = ELEMENT_INT, .as_int = 0 };
    
      // create the JSON using our logic
      const char* filename = "_test_write_json_int_zero.json";
      write_json(filename, element);
      free_element(element);
    
      // read the file in to validate its contents
      char* contents = _read_file_to_string(filename);
      mu_check_strings_equal(contents, "0");
      free(contents);
      //──────────────────────────────────────────────────────
      mu_end();
    }
    
    int main(int argc, char* argv[]) {
      mu_run(_test_read_int_zero);
      mu_run(_test_write_int_zero);
      return EXIT_SUCCESS;
    }
    
  2. That's a lot of duplication! Can we make our tests more concise?

    You could create helper functions to make your tests easier to write. Some examples are below, but do not feel limited to these examples. You may copy/adapt this code—but only if you understand it completely. ⚠ Do not copy blindly.
    // FANCY way of testing, using a helper function that returns a mu_check result.
    //
    // Okay to copy/adapt, but ONLY IF YOU UNDERSTAND THIS CODE COMPLETELY.
    // ⚠ Do not copy blindly.
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "json.h"
    #include "miniunit.h"
    
    // other helpers
    
    static bool _compare_elements(Element a, Element b) {
      // check element types match, check values match
      // will be recursive for lists
      // feel free to print additional debug information about what did not match,
      // this function is to help you find bugs
    }
    
    static void _write_string_to_file(char const* filename, char const* contents) {
      FILE* file = fopen(filename, "w");
      fputs(contents, file);
      fclose(file);
    }
    
    // functions to generate tests
    
    static int _make_valid_read_test(char const* filename, char const* jsonString, Element expected) {
       mu_start();
    
       _write_string_to_file(filename, jsonString);
    
       Element actual;
       bool is_success = read_json(filename, &actual);
       mu_check(is_success);
       mu_check(_compare_elements(expected, actual));
       free_element(actual);
       // you can leave out freeing expected if you ensure expected contains only stack-allocated data
       // it's possible to stack allocate a linked list, just a bit more awkward
       free_element(expected);
    
       mu_end();
    }
    
    static int _make_invalid_read_test(char const* filename, char const* jsonString) {
      mu_start();
    
       _write_string_to_file(filename, jsonString);
    
       Element actual;
       bool is_success = read_json(filename, &actual);
       mu_check(!is_success);
    
       mu_end();
    }
    
    
    // actual tests
    
    static int _test_read_int_zero() {
      return _make_valid_read_test("_test_read_int_zero.json", "0", (Element){
        .type = ELEMENT_INT,
        .as_int = 0
      });
    }
    
    static int _test_read_string_multicharacter() {
      return _make_valid_read_test("_test_read_string_multicharacter.json", "\"hello\"", (Element){
        .type = ELEMENT_STRING,
        .as_string = _copy_string("hello") // heap allocate so free_element works; _copy_string is a helper you would write
      });
    }
    
    
    static int _test_read_invalid_trailing() {
      return _make_invalid_read_test("_test_read_invalid_trailing.json", "-123ABC");
    }
    
    int main(int argc, char* argv[]) {
        mu_run(_test_read_int_zero);
        mu_run(_test_read_string_multicharacter);
        mu_run(_test_read_invalid_trailing);
        return EXIT_SUCCESS;
    }
    
    If you do not understand this code, do not use it.

    The code above covers just the read_json(…) part of this assignment, but you could create something similar for other functions.

  3. Why does read_json(…) not allow trailing characters?

    Most of the parse functions support trailing characters to make it easier to implement nested JSON structures such as lists and objects. Allowing trailing characters means "1, 2]" is a valid int, allowing parse_list(…) to simply call parse_element(…).
    Since read_json(…) is not recursive, it disallows trailing characters to be consistent with the JSON spec.
  4. But how am I supposed to use parse_element(…) in read_json(…)? It supports trailing characters!

    With how pos works, it's trivial to check if we reached the end of a string from parse_element(…). If you are stuck, think about how we have been testing it with miniunit.
  5. What is "an appropriate error message" for read_json(…) returning false?
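
One way to picture that check: after parse_element(…) returns, pos refers to the first character it did not consume, so the file is acceptable only if everything from there to the end of the string is whitespace. The helper below is a minimal sketch of that idea; its name is hypothetical, and it is not necessarily how the instructor's solution does it.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper: true if nothing but whitespace lies between pos and
// the end of the string.
bool is_only_whitespace_left(char const* pos) {
  assert(pos != NULL);
  // The cast avoids undefined behavior in isspace for negative char values.
  while (isspace((unsigned char)*pos)) {
    pos++;
  }
  return *pos == '\0';
}
```

For the file contents "-123ABC", parse_element(…) would stop with pos on the 'A', and this check would then report that non-whitespace text remains.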

    Print a message that describes the reason you returned false. There are no strict requirements. For example, if you encounter an invalid element, you could print something like:
          Unexpected %c at position %d in JSON file %s.
        
    Use pos and the full file contents to determine the character that caused the error and its position.
  6. What is stderr? How do I print to it?

    When writing output to the console, we can actually write to either stdout or stderr. Both go to the console by default. However, you can, for example, redirect stdout to one file and stderr to a second file. If you understand stdout, using stderr should be trivial.
    However, do not get hung up on stderr. If you cannot figure it out, print errors to stdout.
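
To tie the last two answers together, here is a small sketch of printing the example message from question 5. The function name and its stream parameter are assumptions made for illustration; taking a FILE* makes the helper easy to test, and in read_json(…) you would pass stderr.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: report the character at pos and its offset within
// contents. Both pointers must refer to the same string.
void report_unexpected_char(FILE* stream, char const* filename,
                            char const* contents, char const* pos) {
  assert(stream != NULL && filename != NULL && contents != NULL);
  assert(pos >= contents);
  int position = (int)(pos - contents);  // offset of the offending character
  fprintf(stream, "Unexpected %c at position %d in JSON file %s.\n",
          *pos, position, filename);
}
```

A call such as report_unexpected_char(stderr, filename, contents, pos) prints to the console like any other output. Running ./test_json 2> errors.txt in the shell would send only the stderr output to errors.txt.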

Updates

7/29/2022
  • Clarified requirement of read_json to state trailing whitespace is acceptable.
  • Added note to free_element to clarify what changed.