Data structure - Heap

Heap is a type of Binary tree data structure that exhibits the following properties:

  1. Max of 2 child nodes.
  2. A complete tree, a Binary tree that is balanced (all leaf nodes are filled before next layer is filled).
  3. Parent node is always greater or equal that of the leaf nodes. (max-heap)

Note that the heap’s the left/right nodes do not need to be ordered. (right node does not have to be greater than left node)

A Heap data structure can be represented in an array. The first element of the array, array[0] is the root element of the Heap. The parent/child nodes can be represented as follows:

  • Parent i/2
  • key/current: i
  • Left: 2i + 1
  • Right 2i + 2

Below shows the Heap data structure as an array.

Heap data structure represented in an array

Building a Heap data structure

Heaps have the following operations:

  1. Build max heap, buildMaxHeap - converts any array into Heap array.
  2. max heapify, maxHeapify - operation used to fix a single node in the Heap tree.
  3. max/min, getMax - retrieves max element in the Heap.
  4. heapSize, getCount - retrieves size of the Heap.

maxHeapify()

To ensure that a Heap conforms to the max-heap property, all parent nodes must remaining greater than left and right child nodes. Therefore whenever a Heap tree is changed, the Heap tree must ensure tree that any max-heap violations are “fixed”. This operation we call maxHeapify, also known as heapify.

The following example shows how we can fix heap data structure [16,4,10,14,7,9,3,2,8,1] using maxHeapify(array, 2).

Example showing how maxHeapify operations fixes max-heap property.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
function maxHeapify(array = [], heapSize, i) {
let largest = i;
let left = 2 * i + 1;
let right = 2 * i + 2;

// check if you have reached leaf nodes or nodes with no childeren.
// check if left / right node is bigger than parent node (considered as root in this recursion)
if (left < heapSize && array[left] > array[largest]) {
largest = left;
}
if (right < heapSize && array[right] > array[largest]) {
largest = right;
}

// If largest element is not the root element, i.e.: Heap tree changed
if (largest != i) {
const swap = array[i];
array[i] = array[largest];
array[largest] = swap;
// Recursively fix the subtree
maxHeapify(array, heapSize, largest);
}
};

(() => {
const test_1 = [1, 2, 5];
// i=0, since i want the parent(root of the itteration) to be at index i=0.
maxHeapify(test_1, test_1.length, 0);
console.log(test_1); // [5, 1, 2]

const test_2 = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1];
// i=2, since i want the parent(root of the itteration) to be where I start to fix for max-heap.
maxHeapify(test_2, test_2.length, 1);
console.log(test_2); // [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
})();

Note that the above function is recursive, the operation is called for downstream subtrees for each changed node level.

maxHeapify() Time complexity

The maxHeapify() operation time complexity is O(log(2)n). This is because the number of maxHeapify operation is proportional to the height of the tree.


build_max_heap() Psuedocode

To convert an array into a max-heap array, each element in the array has to be incorporated into the Heap. This means each element has to exhibit max-heap property. To achieve this, we pass each element into the maxHeapify() operation. So buildMaxHeap(array) should do the following …

  1. Itterate through all elements of array[n] starting from array[0]…
  2. Call maxHeapify(array, array.length, 0).

buildMaxHeap operation with optimization involves passing all elements through maxHeapify, starting with maxHeapify(array, array.length, 0). With optimization all non-leaf nodes pass through maxHeapify, starting with n/2 and moving towards the root.

Note the code below show buildMaxHeap() without optmization (optimized code is shown below as commented out code).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
function buildMaxHeap(array) {
// Non optimized buildMaxHeap operation.
for (let i = 0; i < array.length; i++) {
maxHeapify(array, array.length, i);
}
// Optimized buildMaxHeap
// const startIndex = (array.length/2 -1)
// for (let i = startIndex; i>=0; i--) {
// maxHeapify(array, array.length, i);
// }
};

(() => {
const test_1 = [1, 2, 5];
buildMaxHeap(test_1);
console.log(test_1); // [5, 1, 2]

const test_2 = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1];
buildMaxHeap(test_2);
console.log(test_2); // [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

const test_3 = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7];
buildMaxHeap(test_2);
console.log(test_2); // [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
})();

build_max_heap() optimization

A Heap data structure has most of the nodes represented as leaf nodes. Since a leaf node by definition do not have childeren, leaf nodes do not need maxHeapify “fixing”. This means we can limit maxHeapify operations to none leaf nodes.

To optimize the buildMaxHeap, we can start with an elements of n where n is the start of leaf nodes. since we are finding the location in the array where leaf nodes start, we need to traverse through all the non-leaf nodes towards root.

Leaf nodes (shown in red) in relation to Heap data structure array.

The above image visually shows that location of leaf nodes in relation to the array positions in a Heap data structure.

build_max_heap() Time complexity

Building of a heap takes time complexity of O(n) and not O(nlog(2)n). This can be derived from the observation that maxHeapify takes constant time for each level of nodes above the leaf. The higher the level of node, the more maxHeapify operations, but there are less nodes. For example [4, 1, 3, 2, 16, 9, 10, 14, 8, 7], the array forms a Heap with 3 levels of node above the leaf, so …

Total work in the for loop of buildMaxHeap:
n/4(c) + n/8(c) + n/16(c)+ ... + 1(log(2)n (c))

c is a constant representing a contant amount of work for that given level. The expression can be rewriting by substituting n/4 = 2^k (helps shows mathmatical expression of the operation..)

Total work in the for loop of buildMaxHeap:
c2^k (1/2^0 + 2/2^1 + 3/2^2 + .... (k+1)/2^k)

From this, we can see that 1/2^0 + 2/2^1 + 3/2^2 + ...(k+1)/2^k is really a constant with a converging value between ~2 to ~3. So this means n is bounded by a constant since 2^k is n, so …

c.2^k (constant with value between ~2 to ~3), where 2^k is n/4


delete_node()

To delete a node in the Heap data structure we will need to identify which element we would like to delete. The nodes of the Heap downstream of the changed node will need to be fixed when the Heap data structure is changed. The parent nodes of the changed node will remain unchanged. In summary:

Deleting a node form a heap.

  1. swap the array[indexToDelete] with the last node, array[n]
  2. Run maxHeapify(array, array.length, indexToDelete) to fix heap
  3. Reduce the heapSize by 1 so that the node is de-referenced.

Note that when we apply the deleteNode() operation to all elements in the tree, a sequence of ordered nodes/elements in produced. This leads to what we know as the algorithm-sort-heap.

delete_node() Time complexity

The time complexity of a deleting a single node operation is O(1 + log(2)n).

  • 1, for the constant time swap operation
  • log(2)n for fixing the heap.