Don't like this style? Click here to change it! blue.css

Class 20: Balanced Binary Search Trees

Most of the operations that we care about in a binary search tree have \(\mathcal{O(h)}\) complexity. The goal of a balanced binary search tree is to keep \(h = \mathcal{O}(\log{n})\).

I'll start with two approaches for perfectly balancing any BST. Just to be clear my goal is to go from something like this:

Unbalanced Binary Search Tree

To something like this:

Balanced Binary Search Tree

Approach 1: Use an array

Pretty simple idea. Put the data into an array, sort it, and make the middle element the root of a new BST (recurse to make the root of the left and right childten).

Trace Task 1: Take the above unbalanced BST. Write down the numbers in order. Sketch the balanced BST which is mild from this middle-root idea.

Complexity Task 2: Given a bad BST what is the worst case complexity of this approach? What happens when you add new elements? How much space/memory does this approach take?

Approach 2: Rotations

This is a neat idea (which is the heart of today's lecture) for shifting the balance of power at a node. We can use a rotation to preserve the Binary Search Tree property but shift (in constant time) some height from the left subtree to the right subtree at a node.

Warm-up Task 3: Take a look at this image and sort the following nodes in part (a), then again in part (b): Q, Ch, Gr, Par, R, P

Rotation

So the image we just analyzed is a right rotation at the node Par. It is basically a swap that preserves the Binary Search Tree Property.

The mechanics of it are that the grandparent will have it's right child point to the child node. The child node will have its right child become the parent node. And the parent will have it's left child become the child's right child.

This next task let us watch a rotation take place.

Visual Task 4: Use Visualgo (embedded or link). Click "AVL" at the top of the screen (more on what that means later). Clear the tree. Insert 10, 5, 30, 20, 35, 1, 15, 25 at high speed. Then slow it down or pause and insert 14. This will let you watch a rotation at node 30.

The DSW balancing algorithm: given an unbalanced BST first make a backbone. Here is how: create a temp pointer to the root. Until you are pointing at null check if there is a left child under your pointer rotate right and move "up" else move to the right child.

Creating a Backbone

Trace Task 5: At which nodes (and in what order) were there right rotations to create this backbone?

Now we can walk down the backbone making left rotations at every other node (stopping and restarting when the math tells us to) and we'll have a perfect binary search tree.

Balancing A Backbone

The official rule is to figure out how many "extra elements" beyond perfect you'll have and do that many to start. Then do one for each layer of your perfect tree and half the number each iteration.

Here is my impementation of rotate_right, it is a method called on a node, and it returns its replacement.

Coding Warm-Up Task 6: Implement rotate_left()

Backbone Task 7: Now Implement a make_backbone() method on a node which uses right rotation to flatten the tree (return the replacement node).

AVL Trees

Many of our data structures are property-based. That is, we take a normal construct like an array or linked-list then toss in a defining characteristic that we want. That constraint dictates how we do the standard actions (insert, remove, find).

An AVL tree is no different, the magic property to work with is: the height of the left and right subtree cannot differ by more than 1.

To see that this property is enough to give us a balanced BST think about the following:

The fewest number of nodes that a height \(h\) AVL tree can have (\(n_h\)) is at least \(n_{h-1} + n_{h-2} + 1\). (By the way this is also called the Leonardo sequence.)

Conceptual Task 8: How fast does the fibonacci sequence grow asymptotically? Is the size of \(n_h\) larger or smaller than the \(h\)th Fibonacci number?

So just like the other property-preserving data structures an AVL tree needs to have a special insert and a special remove to maintain its nearly equal height child policy.

In our case we can do the normal BST insertion/removal followed by a rebalancing at any place which has a 2 or -2 (it can't be worse than that).

For rebalancing there are four cases to deal with (conceptually two) I lean left and my child leans left (LL), I lean left and my child leans rigth (LR). (The other two are RR, RL.)

From ZyBooks Data Structures