CS24 Week 8 Lab: Binary Search Trees

Due Date: Tuesday, August 19 at 11:59:59 PM

In this lab, you will implement a variety of core routines on a binary search tree (BST). For the majority of these routines, a recursive solution is likely to be the most straightforward, though iterative solutions are possible with a substantial amount of additional code.

After completing this lab, you should:

Pair Programming

You may work with a partner for this lab. If you choose a partner, both members must contribute to the same work. That is, you cannot simply divide the lab in half, as everyone is responsible for all the content. Instead, you should use a pair programming style. In pair programming, one person acts as a driver, who takes control of the keyboard. The other person acts as a navigator, who reviews what the driver writes and offers advice. Oftentimes, the driver is occupied with details of the problem (e.g., variables, control flow, etc.), whereas the navigator is concerned with bigger-picture problems (e.g., functions, design, interactions). This works well for problems which are simply too large to fit into one person's head. If you're unfamiliar with the concept of pair programming, you may wish to watch this video on the topic.

With pair programming, oftentimes it works best if both participants are of approximately the same skill level. Generally, if the disparity in skill is large, the end result is frustrating for both parties - it really only works when both parties operate at approximately the same speed. With this in mind, you should select someone of approximately the same skill level.

Ground Rules and Tips

Unlike the previous lab where you were intentionally restricted to use a recursive solution, you may implement iterative solutions here. That said, you are strongly encouraged to use recursive solutions. For most of the operations, for an iterative solution you'll need to implement a stack data structure in addition to everything else, whereas recursive solutions need no such data structures.

Because you are relatively unrestricted here, you may add in additional instance, global, or static variables as you see fit. Again, we highly recommend sticking to a pure recursive style without these features, since such solutions are altogether less complex (and easier to debug). However, you're free to do as you see fit.

Notice that we have already provided implementations for the destructor, the copy constructor, and code to print the tree out (printTree). Many of the operations you must implement will have similar structure, so you may want to study this provided code carefully before you try implementing anything.

You may not change the signatures of any of the provided methods. You may, however, add in your own private helper methods, if you so desire. If you're implementing things recursively, almost assurredly you'll need to add in such private helpers.

We are working with true binary search trees here, which do not allow duplicate elements.

Note on assert: the tests in the provided main.cpp make use of assert to do their work. If you get an assertion failure for one of these, it means that something has gone wrong with a test. You may want to comment these out for testing simpler parts of your code.

Suggested Plans of Attack

There are two general approaches one can take here, with pros and cons of each:

  1. Implement from least to most complex
  2. Implement in a manner to be able to test fully as soon as possible

With the first approach, you should go in the same order as the lab is written. The advantage of this is that the difficulty ramps up slowly, and in many places you'll find that code you wrote previously can help guide subsequent code. The downside of this approach is that one of the most complex methods, insert, will end up being implemented last, and insert is needed in order to test much of the code. If you end up choosing this approach, then for early testing you'll need to use the doNonInsertTesting function in the provided main.cpp, which will construct a BST directly from a sorted input array. This way, you can test operations on your BST without needing insert.

With the second approach, the goal is to get full testing up as soon as possible. To this end, you should implement insert first, which will enable downstream testing. The problem here is that insert is arguably the most difficult method to implement, so you may be more likely to get stuck with this approach. If you do get stuck, we recommend implementing some of the simpler methods first (as with the other suggested plan of attack), especially contains.

Part A: Implement isEmpty

Part A-1: Create Directories and Copy Files

-bash-4.2$ cd
-bash-4.2$ pwd

/cs/student/yourusername

-bash-4.2$ cd cs24
-bash-4.2$ pwd

/cs/student/yourusername/cs24
-bash-4.2$ mkdir lab8
-bash-4.2$ cd lab8
-bash-4.2$ pwd

/cs/student/yourusername/cs24/lab8

Now copy over the files for this week. They are kept in the directory: /cs/student/kyledewey/cs24/labs/8

-bash-4.2$ cp /cs/student/kyledewey/cs24/labs/8/* .

Note that there is a '.' at the end of that line. cp means copy. /cs/student/kyledewey/cs24/labs/8/* means all of the files contained in the directory /cs/student/kyledewey/cs24/labs/8. The '.' means to copy from that location to the current location.

Now check to make sure the copy worked correctly:

-bash-4.2$ ls

You should see a set of files listed that will be used for this warm-up.

Part A-2: Implement isEmpty

In the files BST.h and BST.cpp, you'll find a variety of unimplemented methods which you will need to implement, ordered by difficulty level. The first method you should implement, which is of the lowest complexity, is that of isEmpty. isEmpty should return whether or not the tree is empty in a bool (true for empty and false if not). This method doesn't need to traverse the tree at all, and can be implemented with a one-liner.

Part B: Implement maxElement

maxElement should return the largest element in the tree, or -1 if the tree is empty (even though this is not a good solution). While this will require traversing the tree, you can usually avoid traversing the whole tree. Keep in mind for a binary search tree, the element on the right of an internal node is always greater than the element on the left. This applies recursively, so that the element at the furthest right position of the tree should always be greater than any other element in the tree.

Part C: Implement numInternalNodes

numInternalNodes returns the number of internal nodes are in the tree. To be absolutely clear, in this definition, only the leaf nodes (i.e., non-internal nodes) are NULL. Even if both of a node's children are NULL, it should still be considered an internal node towards this count.

Part D: Implement contains

contains determines if the provided element exists within the tree, returning true if it is contained within, else false. While this could be implemented by exhaustively searching the whole tree, for full credit you must take advantage of the structure of the tree. That is, you must only search those portions of the tree which could possibly contain the element. For example, if we are looking for 7, and the current node holds value 6, then we need to check only the rightmost node of 6, since 7 > 6.

Part E: Implement insert

insert attempts to insert a provided element into the tree at its appropriate position. If the tree already has the element within, then it should simply return false and do nothing to the structure to the tree. Otherwise it should put the provided element at the proper position and return true.

To be clear, insert should never overwrite any elements. That is, the number of nodes in the tree should always be increased by one whenever we provide an element to insert which is not already in the tree.

Part F: Implement sortedElements

sortedElements is by far the most complex operation you must implement. sortedElements will return an array of ints holding the elements of the tree in sorted order. For full credit, you may not simply add the elements in any order and then sort them through some auxilliary means, but instead put the elements into the array directly in sorted order.

For tackling this operation, we recommend defining a recursive helper method with the following signature:

int sortedElements(Node* node, int* array, int pos) const;

The parameters represent the following:

  1. node specifies the starting node to process from. This is different for each recursive call.
  2. array specifies the array which we are incrementally adding elements to. It is assumed that array is just long enough to hold all the elements in the tree.
  3. pos specifies the next open position in the array. That is, this is the next position where we can put something into the array.

The value returned from the above function represents the next available position in the array. The basic idea with this signature is to chain along the next available array position, using an in-order traversal to put elements into the array. Pseudocode demonstrating this idea follows:

def sortedElements(node, array, pos):
  if node is NULL:
    return pos
  else
    newPos = sortedElements(getLeft(node), array, pos)
    array[newPos] = getElem(node)
    return sortedElements(getRight(node), array, newPos + 1)

As for setting up the recursive call to sortedElements, initially we should start at the root of the BST and at the first (0th element) of the array. length is passed as an int pointer here because sortedElements wants to return two things - the sorted array and the length of the sorted array. Because C++ does not let us easily return multiple things at the same time, we instead return the array directly and then “return” the length by setting what the length pointer points to to the length. In this setup, to “return” a length of 7, then we would say:

*length = 7;

To get a better idea of how this works in action, you should consult the portion of the provided main.cpp where sortedElements is called.

Submitting your Work

First, fill out the required fields in the provided README.txt file. Once you have filled this in, you may submit everything with the following command:

-bash-4.2$ turnin lab8@cs24 README.txt Node.h Node.cpp BST.h BST.cpp main.cpp

Acknowledgements

This lab is based on another graciously provided by Professor Diana Franklin.