# Data structure - Heap

Heap is a type of Binary tree data structure that exhibits the following properties:

1. Max of 2 child nodes.
2. A complete tree, a Binary tree that is balanced (all leaf nodes are filled before next layer is filled).
3. Parent node is always greater or equal that of the leaf nodes. (max-heap)

Note that the heap’s the left/right nodes do not need to be ordered. (right node does not have to be greater than left node)

A Heap data structure can be represented in an array. The first element of the array, array is the root element of the Heap. The parent/child nodes can be represented as follows:

• Parent `i/2`
• key/current: `i`
• Left: `2i + 1`
• Right `2i + 2`

Below shows the Heap data structure as an array. ## Building a Heap data structure

Heaps have the following operations:

1. Build max heap, buildMaxHeap - converts any array into Heap array.
2. max heapify, maxHeapify - operation used to fix a single node in the Heap tree.
3. max/min, getMax - retrieves max element in the Heap.
4. heapSize, getCount - retrieves size of the Heap.

## maxHeapify()

To ensure that a Heap conforms to the max-heap property, all parent nodes must remaining greater than left and right child nodes. Therefore whenever a Heap tree is changed, the Heap tree must ensure tree that any max-heap violations are “fixed”. This operation we call maxHeapify, also known as heapify.

The following example shows how we can fix heap data structure `[16,4,10,14,7,9,3,2,8,1]` using `maxHeapify(array, 2)`. Note that the above function is recursive, the operation is called for downstream subtrees for each changed node level.

### maxHeapify() Time complexity

The maxHeapify() operation time complexity is `O(log(2)n)`. This is because the number of maxHeapify operation is proportional to the height of the tree.

### build_max_heap() Psuedocode

To convert an array into a max-heap array, each element in the array has to be incorporated into the Heap. This means each element has to exhibit max-heap property. To achieve this, we pass each element into the maxHeapify() operation. So buildMaxHeap(array) should do the following …

1. Itterate through all elements of array[n] starting from array…
2. Call maxHeapify(array, array.length, 0). Note the code below show buildMaxHeap() without optmization (optimized code is shown below as commented out code).

### build_max_heap() optimization

A Heap data structure has most of the nodes represented as leaf nodes. Since a leaf node by definition do not have childeren, leaf nodes do not need maxHeapify “fixing”. This means we can limit maxHeapify operations to none leaf nodes.

To optimize the buildMaxHeap, we can start with an elements of n where n is the start of leaf nodes. since we are finding the location in the array where leaf nodes start, we need to traverse through all the non-leaf nodes towards root. The above image visually shows that location of leaf nodes in relation to the array positions in a Heap data structure.

### build_max_heap() Time complexity

Building of a heap takes time complexity of `O(n)` and not `O(nlog(2)n)`. This can be derived from the observation that maxHeapify takes constant time for each level of nodes above the leaf. The higher the level of node, the more maxHeapify operations, but there are less nodes. For example `[4, 1, 3, 2, 16, 9, 10, 14, 8, 7]`, the array forms a Heap with 3 levels of node above the leaf, so …

Total work in the for loop of buildMaxHeap:
`n/4(c) + n/8(c) + n/16(c)+ ... + 1(log(2)n (c))`

c is a constant representing a contant amount of work for that given level. The expression can be rewriting by substituting `n/4 = 2^k` (helps shows mathmatical expression of the operation..)

Total work in the for loop of buildMaxHeap:
`c2^k (1/2^0 + 2/2^1 + 3/2^2 + .... (k+1)/2^k)`

From this, we can see that `1/2^0 + 2/2^1 + 3/2^2 + ...(k+1)/2^k` is really a constant with a converging value between ~2 to ~3. So this means n is bounded by a constant since `2^k` is n, so …

`c.2^k (constant with value between ~2 to ~3), where 2^k is n/4`

### delete_node()

To delete a node in the Heap data structure we will need to identify which element we would like to delete. The nodes of the Heap downstream of the changed node will need to be fixed when the Heap data structure is changed. The parent nodes of the changed node will remain unchanged. In summary: 1. swap the array[indexToDelete] with the last node, array[n]
2. Run maxHeapify(array, array.length, indexToDelete) to fix heap
3. Reduce the heapSize by 1 so that the node is de-referenced.

Note that when we apply the deleteNode() operation to all elements in the tree, a sequence of ordered nodes/elements in produced. This leads to what we know as the algorithm-sort-heap.

### delete_node() Time complexity

The time complexity of a deleting a single node operation is `O(1 + log(2)n)`.

• `1`, for the constant time swap operation
• `log(2)n` for fixing the heap.