12.17. Heaps and Priority Queues

12.17.1. Heaps and Priority Queues

There are many situations, both in real life and in computing applications, where we wish to choose the next “most important” from a collection of people, tasks, or objects. For example, doctors in a hospital emergency room often choose to see next the “most critical” patient rather than the one who arrived first. When scheduling programs for execution in a multitasking operating system, at any given moment there might be several programs (usually called jobs) ready to run. The next job selected is the one with the highest priority. Priority is indicated by a particular value associated with the job (and might change while the job remains in the wait list).

When a collection of objects is organized by importance or priority, we call this a priority queue. A normal queue data structure will not implement a priority queue efficiently because search for the element with highest priority will take \(\Theta(n)\) time. A list, whether sorted or not, will also require \(\Theta(n)\) time for either insertion or removal. A BST that organizes records by priority could be used, with the total of \(n\) inserts and \(n\) remove operations requiring \(\Theta(n \log n)\) time in the average case. However, there is always the possibility that the BST will become unbalanced, leading to bad performance. Instead, we would like to find a data structure that is guaranteed to have good performance for this special application.

This section presents the heap [1] data structure. A heap is defined by two properties. First, it is a complete binary tree, so heaps are nearly always implemented using the array representation for complete binary trees. Second, the values stored in a heap are partially ordered. This means that there is a relationship between the value stored at any node and the values of its children. There are two variants of the heap, depending on the definition of this relationship.

[1] Note that the term “heap” is also sometimes used to refer to free store.

A max heap has the property that every node stores a value that is greater than or equal to the value of either of its children. Because the root has a value greater than or equal to its children, which in turn have values greater than or equal to their children, the root stores the maximum of all values in the tree.

A min heap has the property that every node stores a value that is less than or equal to that of its children. Because the root has a value less than or equal to its children, which in turn have values less than or equal to their children, the root stores the minimum of all values in the tree.

Note that there is no necessary relationship between the value of a node and that of its sibling in either the min heap or the max heap. For example, it is possible that the values for all nodes in the left subtree of the root are greater than the values for every node of the right subtree. We can contrast BSTs and heaps by the strength of their ordering relationships. A BST defines a total order on its nodes in that, given the positions for any two nodes in the tree, the one to the “left” (equivalently, the one appearing earlier in an inorder traversal) has a smaller key value than the one to the “right”. In contrast, a heap implements a partial order. Given their positions, we can determine the relative order for the key values of two nodes in the heap only if one is a descendant of the other.
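As a small illustration (the values are ours, not from the text), here is one max heap, written in its array form, where every value in the root’s left subtree happens to exceed every value in its right subtree:

// Array layout: the children of index i sit at indices 2i+1 and 2i+2.
//
//              10
//            /    \
//           9      3
//          / \    / \
//         7   8  2   1
int[] heap = {10, 9, 3, 7, 8, 2, 1};
// Every parent dominates its own children (10 > 9, 3;  9 > 7, 8;  3 > 2, 1),
// yet nothing orders siblings: all left-subtree values (9, 7, 8)
// are larger than all right-subtree values (3, 2, 1).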

Min heaps and max heaps both have their uses. For example, the Heapsort uses the max heap, while the Replacement Selection algorithm used for external sorting uses a min heap. The examples in the rest of this section will use a max heap.

Be sure not to confuse the logical representation of a heap with its physical implementation by means of the array-based complete binary tree. The two are not synonymous because the logical view of the heap is actually a tree structure, while the typical physical implementation uses an array.

Here is an implementation for max heaps. The class uses records that support the Comparable interface to provide flexibility.

// Max-heap implementation
// use `java -ea` to enable assertions that check valid heap positions
class MaxHeap {
  private Comparable[] heap; // Pointer to the heap array
  private int maxSize;       // Maximum size of the heap
  private int n;             // Number of things now in heap

  // Constructor supporting preloading of heap contents
  MaxHeap(Comparable[] h, int heapSize, int max) {
    heap = h;
    n = heapSize;
    maxSize = max;
    buildHeap();
  }

  // Return current size of the heap
  public int heapSize() {
    return n;
  }

  // Return true if pos a leaf position, false otherwise
  public boolean isLeaf(int pos) {
    return (n / 2 <= pos) && (pos < n);
  }

  // Return position for left child of pos
  public static int leftChild(int pos) {
    return 2 * pos + 1;
  }

  // Return position for right child of pos
  public static int rightChild(int pos) {
    return 2 * pos + 2;
  }

  // Return position for parent
  public static int parent(int pos) {
    return (pos - 1) / 2;
  }

  // Insert val into heap
  public void insert(Comparable key) {
    assert n < maxSize : "Heap is full; cannot insert";
    heap[n] = key;
    n++;
    siftUp(n - 1);
  }

  // Heapify contents of Heap
  private void buildHeap() {
    for (int i = parent(n - 1); i >= 0; i--) {
      siftDown(i);
    }
  }

  // Moves an element down to its correct place
  private void siftDown(int pos) {
    assert (0 <= pos && pos < n) : "Invalid heap position";
    while (!isLeaf(pos)) {
      int child = leftChild(pos);
      if ((child + 1 < n) && isGreaterThan(child + 1, child)) {
        child = child + 1; // child is now index with the greater value
      }
      if (!isGreaterThan(child, pos)) {
        return; // stop early
      }
      swap(pos, child);
      pos = child; // keep sifting down
    }
  }

  // Moves an element up to its correct place
  private void siftUp(int pos) {
    assert (0 <= pos && pos < n) : "Invalid heap position";
    while (pos > 0) {
      int parent = parent(pos);
      if (isGreaterThan(parent, pos)) {
        return; // stop early
      }
      swap(pos, parent);
      pos = parent; // keep sifting up
    }
  }

  // Remove and return maximum value
  public Comparable removeMax() {
    assert n > 0 : "Heap is empty; cannot remove";
    n--;
    if (n > 0) {     // If elements remain, repair the heap
      swap(0, n);    // Swap maximum with last value
      siftDown(0);   // Put new heap root val in correct place
    }
    return heap[n];
  }

  // Remove and return element at specified position
  public Comparable remove(int pos) {
    assert (0 <= pos && pos < n) : "Invalid heap position";
    n--;
    if (pos != n) {  // If not the last element, repair the heap
      swap(pos, n);  // Swap with last value
      update(pos);   // Move other value to correct position
    }
    return heap[n];
  }

  // Modify the value at the given position
  public void modify(int pos, Comparable newVal) {
    assert (0 <= pos && pos < n) : "Invalid heap position";
    heap[pos] = newVal;
    update(pos);
  }

  // The value at pos has been changed, restore the heap property
  private void update(int pos) {
    siftUp(pos);   // priority goes up
    siftDown(pos); // unimportant goes down
  }

  // swaps the elements at two positions
  private void swap(int pos1, int pos2) {
    Comparable temp = heap[pos1];
    heap[pos1] = heap[pos2];
    heap[pos2] = temp;
  }

  // does comparison used for checking heap validity
  private boolean isGreaterThan(int pos1, int pos2) {
    return heap[pos1].compareTo(heap[pos2]) > 0;
  }
}

This class definition makes two concessions to the fact that an array-based implementation is used. First, heap nodes are indicated by their logical position within the heap rather than by a pointer to the node. In practice, the logical heap position corresponds to the identically numbered physical position in the array. Second, the constructor takes as input a pointer to the array to be used. This approach provides the greatest flexibility for using the heap because all data values can be loaded into the array directly by the client. The advantage of this comes during the heap construction phase, as explained below. The constructor also takes an integer parameter indicating the initial size of the heap (based on the number of elements initially loaded into the array) and a second integer parameter indicating the maximum size allowed for the heap (the size of the array).
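A minimal usage sketch (the values are illustrative and not from the text; it assumes the MaxHeap class above is on the classpath and is run with assertions enabled via java -ea):

public class MaxHeapDemo {
  public static void main(String[] args) {
    Comparable[] arr = new Comparable[16];           // backing array with room to grow
    Integer[] vals = {73, 6, 57, 88, 60, 42, 83};
    System.arraycopy(vals, 0, arr, 0, vals.length);  // client preloads the data directly
    // The constructor records the initial size (7) and capacity (16), then heapifies
    MaxHeap h = new MaxHeap(arr, vals.length, arr.length);
    System.out.println(h.heapSize());                // 7
    System.out.println(h.removeMax());               // 88, the largest preloaded value
  }
}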

Method heapSize returns the current size of the heap. H.isLeaf(pos) returns true if position pos is a leaf in heap H, and false otherwise. Members leftChild, rightChild, and parent return the position (actually, the array index) for the left child, right child, and parent of the position passed, respectively.

One way to build a heap is to insert the elements one at a time. Method insert will insert a new element \(V\) into the heap.

You might expect the heap insertion process to be similar to the insert function for a BST, starting at the root and working down through the heap. However, this approach is not likely to work because the heap must maintain the shape of a complete binary tree. Equivalently, if the heap takes up the first \(n\) positions of its array prior to the call to insert, it must take up the first \(n+1\) positions after. To accomplish this, insert first places \(V\) at position \(n\) of the array. Of course, \(V\) is unlikely to be in the correct position. To move \(V\) to the right place, it is compared to its parent’s value. If the value of \(V\) is less than or equal to the value of its parent, then it is in the correct place and the insert routine is finished. If the value of \(V\) is greater than that of its parent, then the two elements swap positions. From here, the process of comparing \(V\) to its (current) parent continues until \(V\) reaches its correct position.
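As a hedged illustration of this sift-up process (the starting values are ours), inserting 9 into a four-element max heap moves it past two smaller parents:

Comparable[] arr = new Comparable[8];
Integer[] start = {7, 5, 6, 2};                    // already a valid max heap
System.arraycopy(start, 0, arr, 0, start.length);
MaxHeap h = new MaxHeap(arr, start.length, arr.length);
h.insert(9);
// 9 is appended at index 4 and then sifted up:
//   [7, 5, 6, 2, 9] -> [7, 9, 6, 2, 5] -> [9, 7, 6, 2, 5]
System.out.println(h.removeMax());                 // prints 9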

Since the heap is a complete binary tree, its height is guaranteed to be the minimum possible. In particular, a heap containing \(n\) nodes will have a height of \(\Theta(\log n)\). Intuitively, we can see that this must be true because each level that we add will slightly more than double the number of nodes in the tree (level \(i\) has \(2^i\) nodes, and levels \(0\) through \(i\) together hold \(2^{i+1}-1\) nodes). Starting at 1, we can double only \(\log n\) times to reach a value of \(n\). To be precise, the height of a heap with \(n\) nodes is \(\lceil \log (n+1) \rceil\).
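As a quick check of that formula (our own arithmetic): a heap with \(n = 7\) nodes exactly fills three levels, while adding an eighth node opens a fourth level.

\[
\lceil \log(7+1) \rceil = \lceil 3 \rceil = 3,
\qquad
\lceil \log(8+1) \rceil = \lceil 3.17\ldots \rceil = 4.
\]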

Each call to insert takes \(\Theta(\log n)\) time in the worst case, because the value being inserted can move at most the distance from the bottom of the tree to the top of the tree. Thus, inserting \(n\) values into the heap one at a time will take \(\Theta(n \log n)\) time in the worst case.

12.17.2. Building a Heap

If all \(n\) values are available at the beginning of the building process, we can build the heap faster than just inserting the values into the heap one by one. Consider this example, with two possible ways to heapify an initial set of values in an array.


Figure 12.17.1: Two series of exchanges to build a max heap. (a) This heap is built by a series of nine exchanges in the order (4-2), (4-1), (2-1), (5-2), (5-4), (6-3), (6-5), (7-5), (7-6). (b) This heap is built by a series of four exchanges in the order (5-2), (7-3), (7-1), (6-1).

From this example, it is clear that the heap for any given set of numbers is not unique, and we see that some rearrangements of the input values require fewer exchanges than others to build the heap. So, how do we pick the best rearrangement?

One good algorithm stems from induction. Suppose that the left and right subtrees of the root are already heaps, and \(R\) is the name of the element at the root. This situation is illustrated by this figure:

Figure 12.17.2: Final stage in the heap-building algorithm. Both subtrees of node \(R\) are heaps. All that remains is to push \(R\) down to its proper level in the heap.

In this case there are two possibilities.

  1. \(R\) has a value greater than or equal to its two children. In this case, construction is complete.

  2. \(R\) has a value less than one or both of its children.

\(R\) should be exchanged with the child that has greater value. The result will be a heap, except that \(R\) might still be less than one or both of its (new) children. In this case, we simply continue the process of “pushing down” \(R\) until it reaches a level where it is greater than its children, or is a leaf node. This process is implemented by the private method siftDown.

This approach assumes that the subtrees are already heaps, suggesting that a complete algorithm can be obtained by visiting the nodes in some order such that the children of a node are visited before the node itself. One simple way to do this is to work from the high index of the array to the low index. Actually, the build process need not visit the leaf nodes (they can never move down because they are already at the bottom), so the building algorithm can start in the middle of the array, with the first internal node.

(The online module includes an interactive visualization of the heap build process.)

Method buildHeap implements the building algorithm.
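A small sketch of this bottom-up construction (the values are ours, chosen so that both subtrees of the root already satisfy the heap property and only the root value needs to move):

Integer[] arr = {1, 7, 6, 4, 5, 2, 3};
MaxHeap h = new MaxHeap(arr, arr.length, arr.length);
// buildHeap calls siftDown from the last internal node (index 2) back to index 0:
//   siftDown(2) and siftDown(1) change nothing -- those subtrees are already heaps;
//   siftDown(0) pushes the 1 down:
//   [1, 7, 6, 4, 5, 2, 3] -> [7, 1, 6, 4, 5, 2, 3] -> [7, 5, 6, 4, 1, 2, 3]
System.out.println(h.removeMax());   // prints 7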

What is the cost of buildHeap? Clearly it is the sum of the costs for the calls to siftDown. Each siftDown operation can cost at most the number of levels it takes for the node being sifted to reach the bottom of the tree. In any complete tree, approximately half of the nodes are leaves and so cannot be moved downward at all. One quarter of the nodes are one level above the leaves, and so their elements can move down at most one level. At each step up the tree we get half the number of nodes as were at the previous level, and an additional height of one. The maximum sum of total distances that elements can go is therefore

\[\sum_{i=1}^{\log n} (i-1)\frac{n}{2^i}= \frac{n}{2}\sum_{i=1}^{\log n} \frac{i-1}{2^{i-1}}.\]

The summation on the right is known to have a closed-form solution of approximately 2, so this algorithm takes \(\Theta(n)\) time in the worst case. This is far better than building the heap one element at a time, which would cost \(\Theta(n \log n)\) in the worst case. It is also faster than the \(\Theta(n \log n)\) average-case time and \(\Theta(n^2)\) worst-case time required to build the BST.
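For completeness, that closed form follows from the standard identity \(\sum_{j \ge 0} j/2^{j} = 2\) (our own step, substituting \(j = i-1\)):

\[
\sum_{i=1}^{\log n} \frac{i-1}{2^{i-1}}
\;=\; \sum_{j=0}^{\log n - 1} \frac{j}{2^{j}}
\;<\; \sum_{j=0}^{\infty} \frac{j}{2^{j}} \;=\; 2,
\qquad\text{so the total is less than } \frac{n}{2}\cdot 2 = n.
\]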


12.17.3. Removing from the heap or updating an object’s priority


Removing the maximum value is done by swapping the root with the last element in the heap, reducing the heap size by one, and then sifting the new root value down to its proper position; this is what removeMax does. Because the heap is \(\log n\) levels deep, the cost of deleting the maximum element is \(\Theta(\log n)\) in the average and worst cases.
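As a brief hedged sketch (the values are illustrative), repeatedly calling removeMax yields the stored values in descending order, with each removal costing \(\Theta(\log n)\):

Integer[] arr = {42, 7, 88, 15, 3};
MaxHeap h = new MaxHeap(arr, arr.length, arr.length);
while (h.heapSize() > 0) {
  System.out.print(h.removeMax() + " ");   // prints: 88 42 15 7 3
}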


For some applications, objects might get their priority modified. One solution in this case is to remove the object and reinsert it. To do this, the application needs to know the position of the object in the heap. Another option is to change the priority value of the object, and then update its position in the heap. Note that a remove operation implicitly has to do this anyway, since when the last element in the heap is swapped with the one being removed, that value might be either too small or too big for its new position. So we use a utility method called update in both the remove and modify methods to handle this process.
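A hedged sketch of modify in action (the values and positions are ours): raising the priority of an interior element makes update sift it up to the root.

Integer[] arr = {9, 7, 6, 2, 5};                   // already a max heap
MaxHeap h = new MaxHeap(arr, arr.length, arr.length);
h.modify(3, 10);                                   // raise the key at index 3 from 2 to 10
// update(3) sifts 10 up past 7 and 9, then finds nothing to sift down
System.out.println(h.removeMax());                 // prints 10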

12.17.4. Priority Queues

The heap is a natural implementation for the priority queue discussed at the beginning of this section. Jobs can be added to the heap (using their priority value as the ordering key) when needed. Method removeMax can be called whenever a new job is to be executed.

Some applications of priority queues require the ability to change the priority of an object already stored in the queue. This might require that the object’s position in the heap representation be updated. Unfortunately, a max heap is not efficient when searching for an arbitrary value; it is only good for finding the maximum value. However, if we already know the index for an object within the heap, it is a simple matter to update its priority (including changing its position to maintain the heap property) or remove it. The remove method takes as input the position of the node to be removed from the heap. A typical implementation for priority queues requiring updating of priorities will need to use an auxiliary data structure that supports efficient search for objects (such as a BST). Records in the auxiliary data structure will store the object’s heap index, so that the object’s priority can be updated. Priority queues can be helpful for solving graph problems such as single-source shortest paths and minimal-cost spanning tree.
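One hedged sketch of that auxiliary-index idea, here using a hash map rather than a BST (the HeapIndex class and its hooks are ours, not part of the MaxHeap class above; the heap would have to call moved from every operation that relocates an element, such as swap and insert):

import java.util.HashMap;
import java.util.Map;

// Tracks where each record currently sits in the heap array, so the application
// can look up a position in O(1) and then call modify(pos, newPriority) or
// remove(pos) on the heap.
class HeapIndex<K> {
  private final Map<K, Integer> position = new HashMap<>();

  // Called by the heap whenever it places or moves a record
  void moved(K key, int newPos) { position.put(key, newPos); }

  // Called by the heap when a record leaves the heap
  void removed(K key) { position.remove(key); }

  // Current heap index of the record, or -1 if it is not in the heap
  int positionOf(K key) { return position.getOrDefault(key, -1); }
}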

For a story about Priority Queues and dragons, see Computational Fairy Tales: Stacks, Queues, Priority Queues, and the Prince's Complaint Line.
