Lecture Notes 17: 07/18/00 Michael Brudno Trees ----- A tree is a graph where there is only one path from the root to any node. (A graph is a collection of nodes connected by edges.) Files and Directories Java Libraries C:\ (root) java / \ / | \ Games Windows lang util io A tree node with no children is known as a leaf, all others are known as internal nodes. The depth of a node is how many links it is below the root of the tree. The maximum depth of all nodes in a tree is known as the height. In defining a Java Tree class, each node will have an int label and some number of children: class Tree { /** A tree node with label l and k empty children */ public Tree (int l, int k) {...} /** Same as above, but no children (leaf) */ public Tree (int l) {...} /** The label of this node */ public int label() {...} /** The number of non-empty children of this node */ public int degree() {...} /** The maximum child number (arg k to the constructor) */ public int maxChild() {...} /** Child number k of this */ public Tree getChild(int k) {...} /** Set child number k of this to c. C must not already be in this tree, or vice-versa. */ public void setChild(Tree c) {...} } Traversals ---------- Now that we have a tree we may want to go through all of the elements of this tree. The process is known as a traversal. First, lets introduce a pre-order traversal: we "visit" each label before we visit any of the children: void preOrder(Tree t) { visit(t.label()); for (int i = 0; i < t.degree(); i++) preOrder(t.getChild(i)); } Secondly, we have postOrder traversals: each label gets visited after all of its children: void postOrder(Tree t) { for (int i = 0; i < t.degree(); i++) postOrder(t.getChild(i)); visit(t.label()); } Although many other traversal are possible, the only other one worthy of note here is the level-order traversal, one where all nodes at depth i are visited before any elements at depth i+1. The implementation is non-trivial, and will be covered when we talk about graphs. Binary Trees & Inorder Traversals ------------ - ------- ---------- A binary tree has at most two children per node. In order to simplify the notation, we will define a new BinaryTree class: class BinaryTree { public BinaryTree(int l, BinaryTree left, BinaryTree right) {...} public BinaryTree getLeft() {...} public void setLeft(BinaryTree t) {...} public BinaryTree getRight() {...} public void setRight(BinaryTree t) {...} } With Binary trees we have one more type of traversal: in-order. Here we visit the label after visiting the left child, and before the right one: void inOrder(Tree t) { if (t.getLeft() != null) inOrder(t.getLeft()) visit(t.label()); if (t.getRight() != null) inOrder(t.getRight()) } Expression Trees ---------- ----- We can represent mathematical operations such as R = (6 + 1) * 7 as a binary tree: * / \ + 7 / \ 6 1 In order to evaluate the result, we evaluate all of the nodes: a literal evaluates as its value, an expressions as the result of applying the operation to the children. Note the results of doing the different traversals on expressions trees, where visit will just print the label. For clarity we will put parentheses around all nodes with expressions: preorder: (* (+ 6 1) 7) postorder: ((6 1 +) 7 *) inorder: ((6 + 1) * 7) Note that the first method gives polish (prefix syntax), the second gives reverse-polish (RPN, post-fix) and the third standard (in-fix) notation. Binary Search Trees ------ ------ ----- A binary search tree (BST) is a binary tree defined as follows. For any node, all of its left children are less than or equal to the current node and all of its right children are greater than the current node. BSTs are useful because we can insert and remove items in O(log n) time, in the best case. In the worst case, insertion and removal will take Theta(n) time. To insert an element, determine if it is greater than the current node. If so, call insert recursively on the right side. If not, on the left. If you are at a leaf just insert the node as either the left or the right child. Example: after inserting 10 after inserting 14 9 9 9 / \ / \ / \ 5 13 5 13 5 13 /\ / / \ / / \ / \ 3 6 12 3 6 12 3 6 12 14 / / 10 10 Removals take a little more thought. What if we want to remove 9? We can't simply remove a node that is not a leaf, so we must replace it. There are two ways to replace it and still maintain the BST properties. -- Use the next smallest element, which is the rightmost element in the left subtree, or -- Use the next largest element, which is the rightmost element in the right subtree. In this example, we could replace 9 with 10. What do we do if the right subtree's leftmost child has a right child? We would reattach the right child to the leftmost child's parent, as follows: 9 10 / \ / \ 5 13 becomes 5 13 /\ / \ / \ / \ 3 6 12 14 3 6 12 14 / / 10 11 \ 11 Putting all of this into code: class BST extends BinaryTree() { public BST(int l, BST left, BST right) { super(l,left,right); } public void insert (int newlabel) { if (newlabel < t.label()) { if ( getLeftChild() == null) setLeftChild(new BST(newlabel, null, null); else getLeftChild.insert(newlabel); } else { if ( getRightChild() == null) setRightChild(new BST(newlabel, null, null); else getRightChild.insert(newlabel); } } /** find is similar to above */ public boolean find (int findlabel) {...} ... } Balancing --------- A binary search tree is defined to be balanced if, for each of its nodes, the height of the node's subtrees differ at most by one. If a tree is balanced (or almost balanced) it is possible to search, insert, etc. in log time. If it becomes unbalanced this is no longer possible. For instance, lets' insert 5, 4, 3, 2, 1 into a BST in that order. The resulting tree will be 5 -> 4 -> 3 -> 2 -> 1 This is not good, as the search time will be linear. Hence we need to balance the tree. Balancing a tree involves an operation called rotation. If we take the tree shown on the left and rotate it right, we end up with the tree shown on the right. A B / \ / \ B C rotate right --> D A / \ / \ D E E C To rotate right, we take the root node's left child (B) and make it the new root. The old root (A) becomes B's right child, and what was formerly B's right child (E) becomes A's left child. The resulting tree still maintains the BST properties, because A must be greater than B, and E is less than A but greater than B. A symmetric operation can be performed to rotate the tree to the left. Given the tree above, we get: 5 4 3 / / \ / \ 4 3 5 2 4 / rotate --> / rotate --> / \ 3 right 2 right 1 5 / / 2 1 tree is balanced / 1 Although it is hard to balance a previously unbalanced tree, it is relatively easier to keep a tree balanced by adjusting after each insertion. One of the most famous of these "self-balancing" trees are the AVL trees, named after the inventors, Adelson-Velskiy and Landis. There are many algorithms for keeping trees balanced, one of which we will look at tomorrow or Thursday.