Fenwick Trees


In this post I will talk about Fenwick Trees aka Binary Indexed Trees. I got to use them recently while solving a problem in a coding challenge.

Problem:
Let’s say we have an array of N positive integers. We need to be able to serve the following two operations efficiently:

  • GetPrefixSum(i) – Gets the cumulative sum of all numbers from index [0] to index [i] in the given array.
  • UpdateElement(int i, int k) – Updates the element at index [i] by delta “k” where k need not be necessarily positive.

Example Array: {2, 5, 1, 3, 3, 7, 9, 12, 12, 13, 15, 11, 16, 21, 8} where N = 15

Solution 1:

GetPrefixSum(i) – run a loop till the given index and calculate the running sum.

GetPrefixSum(i)
{
  sum = 0;
  for(count=0; count<=i; count++)
  {
     sum += array[count];
  }  
  return count;
}

UpdateElement(int i, int k) – simply do array[i] += k.

Time Complexity:
GetPrefixSum(i) – O(N); need to iterate through the array and compute the sum.
UpdateElement(int i, int k) – O(1); constant time update of a value in the array.

Solution 2:

Maintain an auxiliary array of same size and store the prefix/cumulative sums. Let’s say the auxiliary array is prefixsum[]. So prefixsum[i] is the cumulative sum of elements in array from index [0] to index [i].

For the above mentioned array, our auxiliary array would be:

{2, 7, 8, 11, 14, 21, 30, 42, 54, 67, 82, 93, 109, 130, 138}

GetPrefixSum(i) – return the value of prefixsum[i]
UpdateElement(int i, int k) – array[i] += k, and update all the prefix sums from index [i] to [N-1] in prefixsum array.

UpdateElement(int i, int k)
{
  array[i] += k;

  for(count=i; count<=N-1; count++)
  {
     prefixsum[count] += k;
  }
}

Time Complexity:
GetPrefixSum(i) – O(1); the operation no longer needs to iterate through the array and compute the sum.
UpdateElement(int i, int k) – O(N); update becomes costly as we need to update the prefix sums for all the subsequent elements after index [i].

Solution 3: Using Fenwick Trees

Fenwick Trees can help us serve both types of operations in logarithmic O(logN) time.
The main idea behind Fenwick Trees is that any natural number “K” can be obtained from sum of other natural numbers (less than K), where these natural numbers are powers of 2. That is:

K = K1 + K2 + K3 where:
K1 = (2^p)
K2 = (2^q)
K3 = (2^r)
p, q, r >= 0

7 = 4 + 2 + 1
19 = 16 + 2 + 1
11 = 8 + 2 + 1
13 = 8 + 4 + 1
and so on.

In solution 1, updates took O(1), but GetPrefixSum(i) was costly(O(N)). In solution 2, GetPrefixSum(i) took O(1) at the expense of costly O(N) updates. If we are planning to do both these in O(logN), some initial thoughts that we should have are:

  1. The algorithm and data structure should be designed in a way such that GetPrefixSum(i) should get us the cumulative sum till index [i] by calculating cumulative sums for indexes (<= i) just enough to keep the O(logN) as the upper bound.
  2. Consequently, an update to an element at index [i] should not require us to update prefix sums of all elements from index [i] all the way to [N-1]. Instead, only as many prefix sums should be updated as are required to restrict the upper bound of update operation to O(logN) as well.
  3. Having said this, its obvious that there has to be a way for both the operations to know what all indexes should be used to,
    1. Compute the cumulative sum till index [i] during a GetPrefixSum(i) operation.
    2. Update the prefix sums for elements at index > [i] during UpdateElement(i, k) operation.

The natural number theory mentioned above will help us in designing data structure and algorithms for both GetPrefixSum() and UpdateElement() operation with the required time complexity.

Fenwick Tree is implemented using an auxiliary array of size one more than the given array. This is similar to what we did in solution 2.  However, unlike solution 2, any given index [i] in our auxiliary array doesn’t necessarily contain the prefix/cumulative sum up to that index. Instead:

  1. If an index [i] is a power of 2, it will store the complete cumulative sum; from index [1] to index [i].
  2. If an index [i] is not a power of 2, it will store the partial cumulative sum of some group of elements from index[1] to index[i].
  3. And any index [i] in the auxiliary array will store the cumulative sum of “m” elements where “m” is a power of 2.

Let’s call this auxiliary array as FTree[N+1]. Index [0] in FTree[] is just a dummy index that does not store anything. Actual values are stored from index [1] to [N] just to simplify index calculation using the natural number theory mentioned above. Doing this is not a mandatory requirement though.

Suppose we need to serve the operation GetPrefixSum(7); i.e get the prefix sum of all elements from index [1] to index [7].
We know that [7 = 4 + 2 + 1]. This equation actually means that:

GetPrefixSum(7) = FTree[4] + FTree[6] + FTree[7], where:

FTree[4] = prefix sum of elements from index [1] to [4].
FTree[6] = prefix sum of elements from index [5] to [6].
FTree[7] = the element at index [7].

Suppose we now update the element at index [7]; i.e UpdateElement(7, k). We will of course do array[7] += k. But unlike solution 2, we are not going to update the prefix sums of all elements from [7] to [15] in the auxiliary array FTree[]. Instead we just need to update FTree[7] and FTree[8].

Why is that so ?

FTree[7] is updated for obvious reason. The reason for updating only FTree[8] and not other subsequent indexes (9, 10, 11, 12…..) is again the equations I mentioned above. Taking the equation [9 = 8 + 1], we can write:

GetPrefixSum(9) = FTree[8] + FTree[9], where:
FTree[8] = prefix sum of elements from index [1] to [8].
FTree[9] = the element at index [9].

Similarly:

10 = 8 + 2, and GetPrefixSum(10) = FTree[8] + FTree[10], where:
FTree[8] = prefix sum of elements from index [1] to [8].
FTree[10] = prefix sum of elements from index [9] to [10].

11 = 8 + 2 + 1, and GetPrefixSum(11) = FTree[8] + FTree[10] + FTree[11], where:
FTree[8] = prefix sum of elements from index [1] to [8].
FTree[10] = prefix sum of elements from index [9] to [10].
FTree[11] = the element at index [11].

12 = 8 + 4, and GetPrefixSum(12) = FTree[8] + FTree[12], where:
FTree[8] = prefix sum of elements from index [1] to [8].
FTree[12] = prefix sum of elements from index [9] to [12].

We can write the similar equations for other indexes. We see that prefix sums for all the indexes (9, 10, 11, 12…) are anyways using the prefix sum up to index [8], and FTree[8] is already updated with the change “k”. So there is no reason for us to update the subsequent indexes. The algorithm has to just do the math to get to the required indexes for both GetPrefixSum(i) and UpdateElement(i, k) operations.

How do we really compute the required indexes using the natural number rule ? The binary representation of the index will be used to compute the next desired index for both GetPrefixSum(i) and UpdateElement(i, k) operations. This is why Fenwick Tree is also referred to as Binary Indexed Tree.

Computing Indexes when going forward – We need to do this for UpdateElement(i, k) operation to update the prefix sums of subsequent elements.

1. UpdateElement(7, k)
The binary representation of 7 is [00000111]. To get to the next index that we need to update in FTree[], we add the decimal value corresponding to the first set bit (from right) in the binary representation of current index to the current index. We repeat this process until the resulting index goes beyond the array.

current index = 7 (00000111)
Decimal value for first set bit = 1
next index = 7 + 1 = 8 (00001000)
current index = 8
Decimal value for first set bit = 8
next index = 8 + 8 = 16 (STOP)

We just need to update FTree[7] and FTree[8].

So the algorithm for UpdateElement(i, k) using Fenwick Tree becomes:

UpdateElement(index, k, boolean construct)
{
   if(!construct) array[index] += k;

   // FTree[] has data from index [1].
   index += 1; 

   while(index <= N)
   {
      FTree[index] += k;
      index += (index & (-index));
   }
}

Note that there is a small change in the function signature. A boolean variable is added as a third argument. I will explain the reason in implementation section.

2. UpdateElement(9, k)

current index = 9 (00001001)
Decimal value for first set bit = 1
next index = 9 + 1 = 10 (00001010)
current index = 10
Decimal value for first set bit = 2
next index = 10 + 2 = 12 (00001100)
current index = 12
Decimal value for first set bit = 4
next index = 12 + 4 = 16 (STOP)

The algorithm does not update more than logN indexes in the array. Hence, the time complexity of UpdateElement() is O(logN).

Computing Indexes when going backward – We need to do this for GetPrefixSum(i) operation to compute the cumulative sum of elements up to index [i].

1. GetPrefixSum(7)
Here we do the opposite. We subtract the decimal value corresponding to the first set bit (from right) in the binary representation of current index from the current index. This is also equivalent to clearing the first set bit (from right) in the current index. Repeat the process as long as index is greater than 0.

current index = 7 (00000111)
Decimal value for first set bit = 1
next index = 7 – 1 = 6 (00000110)
current index = 6
Decimal value for first set bit = 2
next index = 6 – 2 = 4 (00000100)
current index = 4
Decimal value for first set bit = 4
next index = 4 – 4 = 0 (STOP)

Hence, we can write GetPrefixSum(7) = FTree[4] + FTree[6] + FTree[7]

So the algorithm for GetPrefixSum(i) using Fenwick Tree becomes:

GetPrefixSum(index)
{
   sum = 0;

   // FTree[] has data from index [1].
   index += 1;  
 
   while(index > 0)
   {
      sum += FTree[index];
      index -= (index & (-index));
   }
   return sum;
}

The algorithm does not touch more than logN indexes to compute the prefix sum. Hence, the time complexity of GetPrefixSum() is O(logN).

Why are trying to manipulate the indexes using the first set bit ? It is because:

  1. Our goal is to exponentially increase the index for UpdateElement() operation as opposed to updating prefix sums for all increasing indexes.
  2. Our goal is to exponentially decay the index in powers of 2 for GetPrefixSum() operation as opposed to computing cumulative sum for all decreasing indexes.

And there are at most logN set bits in a binary representation of a number N. So the upper bound has to be logN for both the operations on the data structure.

Implementation

Given an array of size N, we first construct the FTree array of size N+1 using the UpdateElement(i, k) operation for every element in the given array.

int *FTree;
void constructFTree(int a[], int N)
{
   // initialize the FTree[] to 0.
   FTree = (int *)calloc(sizeof(int) * (N+1));
   for(i=0; i<N; i++)
   {
       UpdateElement(i, a[i], true);
   }
}

Time complexity for building the data structure is O(NlogN) as UpdateElement() is called for all of the elements in the array.

The reason for passing the “construct” variable as “true” is that the UpdateElement() operation should not update the given array index when building the FTree[] array itself. It implies that the FenwickTree API that receives the UpdateElement request from client should definitely pass the “construct” variable as false.

Visualization in the form of a tree

Until now I have just discussed the Fenwick Tree data structure, but haven’t really shown it. The abstraction for this tree is different from other known trees like BST, AVL Tree, Trie, n-ary trees etc. It is because the tree is inherently an array and there is no actual parent-child relationship. In fact, the way we look at the tree depends on the operation we are performing. I believe the diagrams shown below will now make much more sense as I have already discussed the principles and logic behind a Fenwick Tree.

Once we have constructed the FTree[] array using the above function, it can be visualized in the form of a tree as:

update

The forward computation of indexes should become clear with the above diagram. For example, when updating the value at index [1], we need to update prefix sums for indexes (1, 2, 4, 8) in FTree[] array. Similarly for index [5], we need to do this for indexes (5, 6, 8). Basically update all the nodes along the path to dummy root node.

The way we will look at the tree for GetPrefixSum() operation will be different from what is shown above. It is because this operation is based on backward computation of indexes.

update1

The above tree shows us that in order to GetPrefixSum(i), we need to sum the values on path from root node to node with index [i]. For example, to do GetPrefixSum(11), we would calculate Ftree[8] + FTree[10] + FTree[11].

Public API

void update(int index, int k)
{
   UpdateElement(index, k, false);
}

int GetSum(int index)
{
   return GetPrefixSum(index);
}

You can also look here for more information and other sources.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

Up ↑

%d bloggers like this: