Data Structures

Fall 2023 ECE 36800 :: Purdue University

Due 11/21

B-Tree 1

Learning goals

You should learn how to:

  1. Write code that manipulates b-tree structures.
  2. Test code for non-trivial data structures.

Overview

In class, we discussed an implementation of the b-tree data structure. For this assignment, you will extend that code with a few capablities:

  1. Find the smallest/largest (min/max value) key in the b-tree.
  2. Find the smallest/largest k keys in the b-tree (as an array).
  3. Delete a node.

The first two parts are essentially a warm-up. Most of the effort here will be the deletion.

⚠ Implement and test the support functions and get them working before doing the delete function.

The algorithm for deleting a node is described here:
https://www.geeksforgeeks.org/delete-operation-in-b-tree/

You may follow the same approach shown in the example code at that web site, or any other. However, you may not "copy" code from anywhere. Do not use copy-paste of any kind, even if you modify it afterward. But you may look at code and follow a similar flow, if it helps you.

Getting Started

Get the starter code

Change to your directory for this course.

you@ecegrid-thin1 ~/ $ cd 368

Get the starter code.

you@ecegrid-thin1 ~/368/ $ 368get hw05
This will write the following files: … Ok? (y/n)y … files were written
you@ecegrid-thin1 ~/368/ $ cd hw05
you@ecegrid-thin1 ~/368/hw05 $

For your convenience, we are providing a Makefile to make it more convenient to compile, test, and submit your code. To see the commands that are available, open it up.

you@ecegrid-thin1 ~/368/hw05 $ vim Makefile

In summary:

  • make … compiles your code.
  • make test … compiles your code and runs ./test_btree.
  • make submit … submits the required files. (Double-check!!!)
  • make valgrind … compiles your code (if needed) and checks for memory faults using Valgrind.

Requirement: incremental submissions

You are required to submit at least 6 times, at least 10 minutes apart with significant differences in the code—and the amount of working functionality—between these submissions. Submitting even more often is strongly encouraged as it saves backups of your work, in case something goes wrong.

Requirements

  1. Your submission must contain all files included in the starter code. We will only check the following two files:
    file contents
    btree_query.c functions
    query min(BTreeNode✶ root)
    return type: int
    Return the smallest (minimum value) key in the b-tree.
    • Do not use recursion.
    • Running time must be Θ(h).
    • You may assume the tree is not empty.
    query max(BTreeNode✶ root)
    return type: int
    Return the largest (maximum value) key in the b-tree.
    • Do not use recursion.
    • Running time must be Θ(h).
    • You may assume the tree is not empty.
    query k smallest(BTreeNode✶ root, size t k)
    return type: KeyQueryResult
    Return an array of the smallest (up to) k keys in the b-tree.
    • Running time must be Θ(k + h).
      • That is equivalent to Θ(max(k, h)).
    • You may use recursion for this function (or any other, unless specifically disallowed).
    query k largest(BTreeNode✶ root, size t k)
    return type: KeyQueryResult
    Return an array of the largest (up to) k keys in the b-tree.
    • Running time must be Θ(k + h).
      • That is equivalent to Θ(max(k, h)).
    • You may use recursion for this function (or any other, unless specifically disallowed).
    btree_delete.c functions
    get in order successor(KeyLocation key location)
    return type: KeyLocation
    Return the key-location of the key that directly follows the given key-location in an in-order traversal.
    get in order predecessor(KeyLocation key location)
    return type: KeyLocation
    Return the key-location of the key that directly precedes the given key-location in an in-order traversal.
    shift key forward(BTreeNode✶ parent, int src subtree idx)
    return type: void
    • You may shift with the root (i.e., last key of src becomes root key, root key becomes first key of dst) or without the root (i.e., last key of src becomes first key of dst).
    • You may assume this will not cause overflow or underflow (i.e., src has > MIN_KEYS keys and dst has < MAX_KEYS keys).
    • Okay to modify assert statements.
      • Do not modify the function signature.
    merge subtree with next(BTreeNode✶ parent, int subtree 1 idx)
    return type: void
    shift key backward(BTreeNode✶ parent, int src subtree idx)
    return type: void
    delete key(BTreeNode✶✶ a root, int key)
    return type: bool
    Delete the given key from the b-tree rooted at *a_root.
    • If key does not exist in the b-tree…
      • Return false.
    • If key exists in the b-tree…
      • Delete the key.
      • Perform any necessary restructuring to ensure that the b-tree is still valid.
      • Set *a_root to the new root (if the root of the b-tree changes).
      • Return true.
    • After delete_key(…) returns, the deleted key should not be anywhere in the btree, and the btree should be a valid btree. assert_node_correct(*a_root) should pass. That function simply ensures that for every node in the btree, the following conditions are met:
      • .num_keys is between MIN_KEYS and MAX_KEYS for every node, except the root, which may have as few as 1 keys.
      • Keys are in ascending order in every node
      • Every key is less than all keys within the subtree to the right of it.
      • Every key is greater than all keys within the subtree to the left of it.
      • (These are just the normal b-tree constraints. This is nothing new.)
    • Do not call assert_node_correct(…) from within delete_key(…).
    • There is some room for variation in implementations. Be reasonable and follow the spirit. You do not have to follow geeksforgeeks method exactly.
    • Your delete should be O(log n).
    We ask that you include the other files from the start to aid in troubleshooting.
  2. Your code must work with the original starter files (except btree_query.c and btree_delete).
  3. You are responsible for testing your code. You should put any tests in test_btree.c.
  4. In case of any corrections announced by email, apply them and ensure that your code works.
  5. Follow all instructions given in the starter files. For example:
    • Do not modify files that have a // DO NOT MODIFY… comment at the top.
    • Do not directly access the .subtrees field of any node. Use get_subtree(…), set_subtree(…), or get_addresss_of_subtree(…), instead.
  6. Do not access (read or write) more nodes than necessary for the given operation.
  7. Only the following external header files, functions, and symbols are allowed in btree.c.
    header functions/symbols allowed in…
    stdbool.h bool, true, false *
    stdio.h * test_btree.c
    stdio.h printf is_partition_correct(…) (in btree.c)
    assert.h assert *
    stdlib.h EXIT_SUCCESS test_btree.c
    All others are prohibited unless approved by the instructor. Feel free to ask if there is something you would like to use.
    1. malloc(…) is not allowed anywhere in btree.c.
  8. Follow the policies (including the base requirements) and academic integrity. For example:
    • Do not copy code from generated by Chat-GPT (or other GenAI tools).
    • Do not copy code found on the web.

Submit

To submit HW05 from within your hw05 directory, type

make submit

That should result in the following command: 368submit HW05 btree_query.c btree_delete.c btree.h btree.c test_btree.c btree_assert.c btree_print.c btree_insert.c btree_wrappers.c Makefile

Q&A

Updates

There are no updates to HW05 so far.