Deque (Deck), Double-Ended Queue

Build Task 4: Take the two implementations from the previous class and combine them. Make a container which has $\mathcal{O}(1)$ versions of push_front, push_back, pop_front and pop_back.

Here is the STL version of deque:

Class 15: Heaps, Heapsort, Priority Queues

OK, so we've explored a Stack (LIFO) and a Queue (FIFO). What if you are running a restauraunt in Hollywood and want to seat someone, not based on arrival time but based on fame level? Now I need the ability to pop_min() or pop_max() quickly.

This abstract data type is called a max-priority queue (likewise a min-priority queue can be kept). The interface is:

Insert(x)
Max() read the max value.
Extract-Max() remove and return the max entry.

Here is the STL version with interface push, top, and pop:

Invention? Task 5: Did we just discover a new sorting algorithm? Based on what you know about sorting, what can you say about the cost of pop, top, push? (You don't yet know how C++ does its priority_queue.)

NOTE TO SINGLE STUDENTS: this is the right ADT to store phone numbers for your little black book!

Another note: You can adjust the criteria for evaluating the inputs. Including just setting a priority for each inserted item. In that case you might consider one additional method: update_priority(x, p) to adjust the priority of item x to value p.

Design Task 6: If you were only given a priority queue how would you implement a stack? A queue?

Heap: a priority data structure

So this is a new data structure for you guys, custom built for priority queues. It uses a single array but sees itself as a little "binary tree" (we'll study these far more in-depth during the last third of this course).

Heap Picture

The idea is simple-ish:

The HEAP property: Any element must be bigger than its "children"
The PACKED property: There should be no empty spot
Fast child lookup The element at index i has children at 2*i and 2*i + 1.
Fast parent lookup The parent of an element at index i is $\lfloor \frac{i}{2} \rfloor$

So to move from parent to child is to "double" and to move from child to parent is to "half". Now we just have to work to preserve the HEAP and PACKED properties.

Convince Your Neighbor Task 7: Pick a partner, one of you be the "skeptic" and one of you be the "cool dude" (AKA the mathematician). The cool dude needs to convince the skeptic that the "root" of any "sub-tree" of a heap contains the largest element in that subtree. (The skeptic needs to be hard to convince without being belligerent.)

Think Task 8: Does a reverse sorted array make a heap?

Order Task 9: Where might the smallest element exist (assume distince values)?

Pay attention to these properties as the rest of the lecture is going to be working through how we can keep a heap.

Insert (Bubble Up)

Let's imagine you've got a heap already. For instance:

[100, 19, 36, 17, 3, 25, 1, 2, 7]

We want it to stay packed and heaped. Let's add the number 53.

Add to the back of the array
While the number is larger than parent, swap with parent.

Heap Problem

Trace Task 10: So the above diagram is the heap problem I provided. Insert 53 and "bubble it" to the top. Which indices does it visit on the way up? What is the final state after bubbling up?

Here is a heap with "bubble up". It will always be "HEAPED" and "PACKED".

Complexity Task 11: What insertion order is best for bubbling up? What insertion order is worst? What is the specific number of worst-case steps in a bubble-up into a heap of size n?

Watch the puny elements fight to be "king in the castle":

Borat

Remove/Pop/Bubble Down

Ashes to ashes, dust to dust, the once great must eventually fall:

All must fall

To remove the top element from a heap we have to pack everything back in. We really don't want to shift all of the elements over one. Also we don't want to lose our HEAP property. So here is the way we do it:

Swap the first and last elements.
Remove/return the new last element (the biggest).
The new first is not large.
So swap it with the largest child until it is larger than both children.

Bubble Down Arrows

Here is a bubble down gif for a min-heap:

Bubble Down Min

Bubble Down Task 12: Remove the top element from [9, 8, 7, 6, 5, 4, 3, 2, 1] and then perform bubble down. What is the final result?

Here is my simple Heap:

Heap Sort!

Now we know how to build a heap and pop over and over to sort:

Load a heap (fast way in a minute)
While not empty, pop the largest
(Note that the way we implemented it the array is actually sorted in-place from smallest to largest)

So here is the animated gif for heap sort:

Heap Sort Gif

Complexity Task 13: So how expensive do you think it is to create a heap from an array of things to sort? How expensive is each Pop() as we find the max each iteration? What's our worst-case complexity?

Here is the proof that the array ends up sorted "in-place":

Speed Load a Heap

If you happen to be passed an array of elements to begin, you can actually load your heap in time $\mathcal{O}(n)$ and not $\mathcal{O}(n\log{n})$

This trick won't get you a linear time sort (that's kind of impossible, more later). But it can get you a $\mathcal{O}(n + k\log{n})$ top $k$ element algorithm!

Imagine Task 14: Imagine I give you an array of 32 unsorted elements. Which ones are guaranteed to be "roots" of a sub-heap already?

So the algorithm for fast loading is this:

Start with i at the half way point ($\lfloor n/2 \rfloor$)
Call BubbleDown(i)
i--
When i == 0 the array is heaped.

To see why this works, just confirm for yourself that after each BubbleDown every index after i is the root of a sub-heap.

Now the real trick is the mathematical analysis of the complexity:

$$\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}, |x| \lt 1 $$

A simple derivative gives:

$$\sum_{k=0}^{\infty} kx^{k-1} = \frac{1}{(1-x)^2}, |x| \lt 1 $$

Multiply by $x$ and replace $x$ with $1/2$ to get:

$$\sum_{k=0}^{\infty} \frac{k}{2^k} = \frac{1/2}{(1-1/2)^2 = 2}$$

Hmmm… OK so here is how this works: the cost of BubbleDown is $\mathcal{O}(h)$ if the sub-heap has height $h$. So how many sub-heaps have height $h$ for each value of $h$ from 0 to $\log{n}$?

Think Task 15: In our array of 32 elements how many are on top of heaps of size 1, 2, 3, 4, and 5?

So the total cost is this:

$$ \sum_{h=1}^{\log{n}} \lceil \frac{n}{2^{h}} \rceil \cdot \mathcal{O}(h) = \mathcal{O}(n \sum \frac{h}{2^h})$$

Finally that gives us:

$$ \lt \mathcal{O}(n \sum_{h=1}^{\infty} \frac{h}{2^h}) = \mathcal{O}(n)$$

So you're not going to have to recreate that analysis on a test, but please do know that you can load a heap in linear time!

Coding Task 16: Code a function which consumes an array of integers (and its length) and an integer k. Return the k largest values in your array. Use $\mathcal{O}(n + k \log{n})$ steps.

Warm-Up: