Categories
Computer Science Data Structures

Heaps and Heap Sort

Imagine a queue that doesn’t operate based on who got there first. It rather operates based on the importance of a person in the queue. 

For instance, think of a hypothetical passport control. A government official would take the highest priority, followed by a citizen, and then finally a foreigner. In essence, this isn’t a normal queue where people in the queue are processed in the order they enter the queue; the queue is processed based on the importance of a person in the queue.

As an example, say there are 5 people in a queue and assume that the order they lined up in the queue is as follows: A tourist, a foreign student, a citizen, another citizen, and a government official (so the tourist is first in line and the government official is last in line). 

When processing the queue, the government official would be taken care of first. Then, the two citizens. And then only would the foreign student and tourist get their passports checked.

Likewise, computers also have this type of queue, and they’re called priority queues. They’re like queues, but instead of being based on an item’s order in the line, items are processed by their level of importance.

Now, imagine using an array or a list to implement a priority queue. 

For example, let’s say our priority queue consisted of the elements: [1, 2, 5, 4, 3].

Let’s say we wanted to process the elements based on who has the highest value. If the array wasn’t sorted, we’d have to search the entire array and look for the maximum element. Once we found the maximum element, we’d then have to remove that element from the array and then shift everything to its right to the left like so:

[1, 2, 5, 4, 3] -> [1, 2, 4, 3]

This may require up to 10 operations in the worst case. 5 because we iterate through the entire array looking for the biggest element, then assume the biggest element was the first one in the array. Everything to its right would have to be shifted to the left. So, deleting the element would be one operation and shifting the remaining 4 would be 4 operations for a total of 10 operations.

Not that efficient, right?

So, let’s explore heaps.

In essence, a heap is a complete binary tree which has the highest or lowest value element in the root.

If you’re not sure what a complete binary tree is, it’s basically a tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible. (Definition gotten from here).

Notice how all the levels but the last one are fully filled. And they’re all filled from left to right.

Here’s another example:

The tree on the left is not a complete binary tree because c’s left child is missing. The tree on the right is a complete binary tree because only c’s right child is missing. Basically, remember that children on the lowest level must exist from left to right. If there’s a gap between a left and a right child, that tree won’t be a complete tree. Or, if there’s a gap on any level up, that automatically makes it not a complete binary tree.

Moving on, let’s view an example of a min heap and a max heap.

The one on the left is a min heap, and the one on the right is a max heap. As you can see, the lowest item is at the top in the min heap, and the highest item is on the top in the max heap. 

Basically, in a heap, we only care about the biggest or smallest element depending on the type of heap it is. Whenever we add something to the heap, we want to make sure to place the element appropriately in the heap. For instance, in a min heap, if the item that we are adding is smaller than the smallest item, we’ll have to reposition the value that we added to the top of the heap. Or, if we’re deleting the minimum element from the heap, we’ll have to find the next smallest element and place it at the top.

Let’s walk through examples of both adding and deleting.

Example of adding an element to a heap.

We’ll assume we’re working with a min-heap. Let’s add 0 to the heap. When we add, we add to the first free child. So, we’ll add to the right child of 7.

The value we added isn’t in the right position, so let’s compare it with its parent. It looks like it’s smaller than its parent, so let’s swap.

Now, let’s compare 0 with the its parent. Again, it’s smaller, so let’s swap values.

And, we’re done. Note this procedure is calling “bubbling up.” (This is because we are swapping our way up).

Example of deleting an element from a heap.

Now, let’s work with the same example from above and delete 0. 

Our top root is now empty, let’s replace it with the lowest and right-most child.

Now, we have to reposition the element. We check the left and right child. The child on the right is smaller, so let’s swap.

Now, we check if any of 7’s children are less than it. The left child is greater and the right child doesn’t exist, so we’re done. This process is known as “bubbling down.” (This is because we are swapping our way down.)

And that’s the basics of heaps. Let’s try implementing one.

Heap Implementation

For this implementation of the heap, we’ll have the following functions:

peek() -> It’ll return the first (top-most) element in the heap.

swap(index1, index2) -> It’ll swap the elements of the two indices provided.

get_min_child_index(index) -> It’ll return the smaller child from a given index.

get_max_child_index(index) -> It’ll return the bigger child from a given index.

bubble_down() -> We’ll call this after deleting the first element from the heap. Remember, after deleting something from the heap, we set the first element equal to the last element of the heap then swap that element down to its appropriate position. We swap our way down, thus the name “bubbling down”.

bubble_up() -> We’ll call this after adding a new element to the heap. Since the new item will be added to the end of the heap, we’ll have to set it to an appropriate position in the heap. Since we’ll swap our way up, this function will be called “bubble up.”

add(value) -> This will add the value provided to the heap.

pop() -> This will remove the heap head, reassign a new heap head, and return the previous heap head. Note that when I mean head, I just mean the first value of the heap. The first value would either be the smallest or bigger value in the heap depending on whether it’s a max or min heap.

print_heap() -> This will provide a visual representation of the heap.

heap_sort() -> This will return a sorted array with the elements in the heap.

Cool. With all that being said, let’s dive right into the code.

Constructor

Let’s construct our heap and create our Heap class.

class Heap:
    def __init__(self, isMinHeap=True):
        """
        Constructor for the heap.
        :param isMinHeap: If true, it'll construct a min-heap; else, it'll construct a max-heap.
        """
        self.heap: List[None or int] = [None]
        self.capacity = 0
        self.isMinHeap = isMinHeap

Our self.heap variable will hold the list of elements in the heap. Note how the first one is None. In order for the math to work out as a complete binary tree, we’ll have to start our list from index 1. That’s why we have None in index 0.

Our self.capacity will hold the size of the heap.

And finally, our self.isMinHeap boolean will determine whether it’s a min heap or a max heap. 

Peek
def peek(self) -> int or None:
    """
    If there's an item in the heap, it'll return the smallest or biggest element depending on the type of heap.
    :return: Smallest or biggest element in the heap (if there's at least one item in the heap).
    """
    if self.capacity > 0:
        return self.heap[1]

This will first check if our heap holds at least one item. If it does, it’ll return the first item.

Swap
def swap(self, index1, index2):
    """
    Swaps two elements with indices provided.
    :param index1: Item to be swapped with index2.
    :param index2: Item to be swapped with index1.
    :return: None
    """
    temp = self.heap[index1]
    self.heap[index1] = self.heap[index2]
    self.heap[index2] = temp

This will just swap the two indices’ values. For instance, if our array is [1, 2, 3, 4, 5], and we wanted to swap the values from index 1 and index 3, we would call swap(1, 3). The result is: [1, 4, 3, 2, 5]. (Remember, indices start from 0. 2 is in index 1 and 4 is in index 3)

Get Min Child Index
def get_min_child_index(self, index) -> int:
    """
    Returns the smaller child from a given index.
    :param index: Parent index to check its children.
    :return: Child index with the smaller value.
    """
    if index * 2 + 1 > self.capacity:
        return index * 2
    else:
        if self.heap[index * 2] > self.heap[index * 2 + 1]:
            return index * 2 + 1
        else:
            return index * 2

For instance, take a look at the following heap.

Say we wanted to get the min child of 0. 7 is less than 9, so the function would return the index of the right child (since 7 is the right child, and 9 is the left child)

To return a left child index, we multiply the parent’s index by 2. To return a right child index, we multiply the parent’s index by 2 then add 1. Since 7 is the right child and 0’s index is 3, 7’s index is 3 * 2 + 1 which is 7.

Get Max Child Index
def get_max_child_index(self, index) -> int:
    """
    Returns the bigger child from a given index.
    :param index: Parent index to check its children.
    :return: Child index with the bigger value.
    """
    if index * 2 + 1 > self.capacity:
        return index * 2
    else:
        if self.heap[index * 2] < self.heap[index * 2 + 1]:
            return index * 2 + 1
        else:
            return index * 2

This is the same logic as above, but it’ll return the bigger child’s index. With the same example of parent 0 in mind, this function would instead return 9’s index. (9’s index would be 6. It’s 6 because 0’s index is 3, and since this is the left child, it’d return 3 * 2 which is 6).

Bubble Down
def bubble_down(self) -> int:
    """
    After deleting something and inserting the last element to the root, we'll reposition that
    item to its appropriate place.
    :return: None
    """
    index = 1
    while index * 2 <= self.capacity:
        if self.isMinHeap:
            childIndex = self.get_min_child_index(index)
            if self.heap[index] > self.heap[childIndex]:
                self.swap(index, childIndex)
        else:
            childIndex = self.get_max_child_index(index)
            if self.heap[index] < self.heap[childIndex]:
                self.swap(index, childIndex)
        index = childIndex

Recall that once an item is deleted from the heap, the last element is swapped with the first item. This function will then reposition that item down to its correct position in the heap. Note how there is a condition that checks whether it’s a min or max heap. If it’s a min heap, it’ll bubble down if the new item is bigger than its parents; however, if it’s a max heap, it’ll bubble down if the new item is smaller than its parents.

Bubble Up
def bubble_up(self) -> int:
    """
    After we add something, we'll reposition that item up to its appropriate position.
    :return: None
    """
    index = self.capacity
    while index // 2 > 0:
        if self.isMinHeap:
            if self.heap[index] < self.heap[index // 2]:
                self.swap(index, index // 2)
        else:
            if self.heap[index] > self.heap[index // 2]:
                self.swap(index, index // 2)
        index = index // 2

This is the opposite of bubbling down. Once we add a new item to the last position in the heap, we’ll reposition that element up to its appropriate position. Note how there is a condition that checks whether it’s a min or max heap. If it’s a min heap, it’ll bubble up if the new item is smaller than its parents; however, if it’s a max heap, it’ll bubble up if the new item is bigger than its parents.

Add
def add(self, value):
    """
    Adds a value to the min-heap.
    :param value: Value to add to the min heap.
    :return: None
    """
    self.heap.append(value)
    self.capacity += 1
    self.bubble_up()

This function will add the value provided to the heap, increase the capacity variable by one, then bubble the new element up to its appropriate position (if it has to). 

Pop
def pop(self) -> int:
    """
    Pops smallest value from the min-heap and returns it.
    :return: Smallest element in the min-heap.
    """
    if self.capacity > 0:
        value = self.heap[1]
        self.heap[1] = self.heap[self.capacity]
        self.capacity -= 1
        self.heap.pop()
        self.bubble_down()
        return value

This will first check if the heap contains at least one item. If it does, then it’ll first save the heap’s first value (as we won’t have access to it later to return it as we’re deleting this value). It’ll then set the heap’s first value equal to the heap’s last value. It’ll then get rid of the last value then bubble that newly position value down to its appropriate position. Finally, it’ll return the temporary variable value we saved.

Print Heap
def print_heap(self):
    """
    Prints the heap.
    """
    print(self.heap[1:])

This will just give us a visual representation of the heap.

Heap Sort
def heap_sort(self, reverse=False) -> list:
    """
    Returns a sorted array with current items from the heap.
    :param reverse: Boolean that'll determine whether data is sorted in ascending or descending order. If reverse is
    False, then data will be returned in ascending order. If it's True, data will be returned in descending order.
    :return: Sorted list with heap elements.
    """
    tempHeap = copy.deepcopy(self)
    tempList = [tempHeap.pop() for _ in range(tempHeap.capacity)]

    if reverse:
        if tempHeap.isMinHeap:
            return tempList[::-1]
        else:
            return tempList
    else:
        if tempHeap.isMinHeap:
            return tempList
        else:
            return tempList[::-1]

Did you notice how a heap always has the biggest or smallest item in the front? We can use this fact as an advantage to have it sort values for us. 

So, to create a sort function out of this, all we have to do is keep popping off the heap until it’s empty and then add all the popped values to a new list.

That new list should then have the values sorted. If it’s a min heap, it’ll be sorted in ascending order. If it’s a max heap, it’ll be sorted in descending order (this is because the biggest elements get popped off first).

The function also has a parameter for reverse. If reverse is True, the function will return the list sorted in descending order.

Test Run
>>> h = Heap()
>>> h.add(10)
>>> h.add(9)
>>> h.add(8)
>>> h.add(7)
>>> h.add(6)
>>> h.add(5)
>>> h.heap_sort()
[5, 6, 7, 8, 9, 10]
>>> h.heap_sort(reverse=True)
[10, 9, 8, 7, 6, 5]
>>> h.pop()
5
>>> h.pop()
6
>>> h.pop()
7
>>> h.pop()
8
>>> h.pop()
9
>>> h.pop()
10

We add 10 through 5 to this min heap, then call heap_sort(). It returns values from the heap in ascending order as expected.

Next, we call heap_sort() with reverse equal to True. It returns values from the heap in descending order as expected.

Finally, since this is a min heap, all the elements that we pop are popped based on whatever is smallest as expected.

Full Code
from typing import List
import copy


class Heap:
    def __init__(self, isMinHeap=True):
        """
        Constructor for the heap.
        :param isMinHeap: If true, it'll construct a min-heap; else, it'll construct a max-heap.
        """
        self.heap: List[None or int] = [None]
        self.capacity = 0
        self.isMinHeap = isMinHeap

    def peek(self) -> int or None:
        """
        If there's an item in the heap, it'll return the smallest or biggest element depending on the type of heap.
        :return: Smallest or biggest element in the heap (if there's at least one item in the heap).
        """
        if self.capacity > 0:
            return self.heap[1]

    def swap(self, index1, index2):
        """
        Swaps two elements with indices provided.
        :param index1: Item to be swapped with index2.
        :param index2: Item to be swapped with index1.
        :return: None
        """
        temp = self.heap[index1]
        self.heap[index1] = self.heap[index2]
        self.heap[index2] = temp

    def get_min_child_index(self, index) -> int:
        """
        Returns the smaller child from a given index.
        :param index: Parent index to check its children.
        :return: Child index with the smaller value.
        """
        if index * 2 + 1 > self.capacity:
            return index * 2
        else:
            if self.heap[index * 2] > self.heap[index * 2 + 1]:
                return index * 2 + 1
            else:
                return index * 2

    def get_max_child_index(self, index) -> int:
        """
        Returns the bigger child from a given index.
        :param index: Parent index to check its children.
        :return: Child index with the bigger value.
        """
        if index * 2 + 1 > self.capacity:
            return index * 2
        else:
            if self.heap[index * 2] < self.heap[index * 2 + 1]:
                return index * 2 + 1
            else:
                return index * 2

    def bubble_down(self):
        """
        After deleting something and inserting the last element to the root, we'll reposition that
        item to its appropriate place.
        :return: None
        """
        index = 1
        while index * 2 <= self.capacity:
            if self.isMinHeap:
                childIndex = self.get_min_child_index(index)
                if self.heap[index] > self.heap[childIndex]:
                    self.swap(index, childIndex)
            else:
                childIndex = self.get_max_child_index(index)
                if self.heap[index] < self.heap[childIndex]:
                    self.swap(index, childIndex)
            index = childIndex

    def bubble_up(self):
        """
        After we add something, we'll reposition that item up to its appropriate position.
        :return: None
        """
        index = self.capacity
        while index // 2 > 0:
            if self.isMinHeap:
                if self.heap[index] < self.heap[index // 2]:
                    self.swap(index, index // 2)
            else:
                if self.heap[index] > self.heap[index // 2]:
                    self.swap(index, index // 2)
            index = index // 2

    def add(self, value):
        """
        Adds a value to the min-heap.
        :param value: Value to add to the min heap.
        :return: None
        """
        self.heap.append(value)
        self.capacity += 1
        self.bubble_up()

    def pop(self) -> int:
        """
        Pops smallest value from the min-heap and returns it.
        :return: Smallest element in the min-heap.
        """
        if self.capacity > 0:
            value = self.heap[1]
            self.heap[1] = self.heap[self.capacity]
            self.capacity -= 1
            self.heap.pop()
            self.bubble_down()
            return value

    def print_heap(self):
        """
        Prints the heap.
        """
        print(self.heap[1:])

    def heap_sort(self, reverse=False) -> list:
        """
        Returns a sorted array with current items from the heap.
        :param reverse: Boolean that'll determine whether data is sorted in ascending or descending order. If reverse is
        False, then data will be returned in ascending order. If it's True, data will be returned in descending order.
        :return: Sorted list with heap elements.
        """
        tempHeap = copy.deepcopy(self)
        tempList = [tempHeap.pop() for _ in range(tempHeap.capacity)]

        if reverse:
            if tempHeap.isMinHeap:
                return tempList[::-1]
            else:
                return tempList
        else:
            if tempHeap.isMinHeap:
                return tempList
            else:
                return tempList[::-1]

Conclusion

If you made it this far, thank you for reading. As always, if you have any questions, feel free to let me know down in the comments below. 

Leave a Reply

Your email address will not be published. Required fields are marked *