Pages

Sunday, October 9, 2011

A Simple Merge Sort

Its easier to understand merge sort if you try to write a simple code on how you'd merge two sorted arrays into a new array which needs to be in sorted order. Lets try to code that first

A. Merge two sorted array
The complex part of merge sort is the merge routine/method, especially having to follow multiple indices inside this routine. So lets try to simplify that first.

Lets say we have 2 SORTED arrays

int[] leftArray;
int[] rightArray;

Now we have to merge these 2 into a new array called
int resultArray[];

The length of resultArray, would be

(length of leftArray) + (length of rightArray).

Let's try to write a function/method/routine to do the merge now, the signature of this method would look something like this

int[] sampleMerge(int[] leftArray, int[] rightArray)

Now that we have defined the signature, lets define the indices for each of these 3 arrays which we'll use later in the code
  • '*Begin' marks the beginning of each array index and 
  • '*End' marks where that index would end (end of that array).
int leftArrayBegin = 0;
int rightArrayBegin = 0;
int resultArrayBegin = 0;
 
int leftArrayEnd = leftArray.length - 1;
int rightArrayEnd = rightArray.length - 1;

int[] resultArray = new int[leftArray.length + rightArray.length];

Now that we have defined these 3 indices, lets move the the logic where we merge the left and the right array into the result array. The merge logic is simple as follows
  1. Take an element from leftArray and an element from rightArray and compare both these elements.
  2. Whichever element is less, put that element into the resultArray and advance the begin index of the array from which we picked the element.
  3. Repeat the above step till atleast one of "*Begin" array index reaches the end i.e "*End" index.
  4. Copy the rest of the elements of the other array whose index is not reached the end into the result array.
Here is the sample case with the diagrams that shows the step by step movement of indices in the merge routine.

Step 1 : In the diagram below, we have 2 arrays, left and right arrays. We pick the first element i.e leftArray[0] and rightArray[0] of both arrays and compare them. Since left array value (1) is less than right array value (2), We move left array value (1) to resultArray and increment the leftArrayBegin index [to 1].

The below diagram in step 2 has the result array and index position after step 1 is complete.

Step 2 : We pick the current begin index position element of left array i.e. leftArray[1] and begin index position element of right array i.e. rightArray[0] and compare them. Since right array value (2) is less than left array value (3), we move right array value (2) to resultArray and increment the rightArrayBegin index [ to 1].

The below diagram in step 3 has the result array and index position after step 2 is complete.
(Rest of the below steps are repeat of the above 2 steps till we reach the end of leftArray in this case, since leftArray has less number of elements than rightArray - So feel free to skip rest of the steps and go directly to Step 7).


Step 3 : We pick the current begin index position element of left array i.e leftArray[1] and current begin index position element of right array i.e rightArray[1] and compare them. Since left array value (3) is less than right array value (4), we move left array value (3) to result array and increment the leftArrayBegin index [ to 2].

The below diagram in step 4 has the result array and index position after step 3 is complete.

Step 4 : We pick the current begin index position element of left array i.e leftArray[2] and current begin index position element of right array i.e rightArray[1] and compare them. Since right array value (4) is less than left array value (5), we move right array value (4) to resultArray and increment the rightArrayBegin index [ to 2].

The below diagram in step 5 has the result array and index position after step 4 is complete.

Step 5 : We pick the current begin index position element of left array i.e leftArray[2] and current begin index position element of right array i.e rightArray[2] and compare them. Since left array value(5) is less than right array value(6), we move left array value(5) to resultArray and increment the leftArrayBegin index [ to 3].

The below diagram in step 6 has the result array and index position after step 5 is complete,

Step 6 : We pick the current begin index position element of left array i.e. leftArray[3] and current begin index position element of right array i.e. rightArray[2] and compare them. Since right array  value(6) is less than left array value(7), we move right array value(6) to resultArray and increment the rightArrayBegin index [ to 3 ].

The below diagram in step 7 has the result array and index position after step 6 is complete.

Step 7 : We pick the current begin index position element of left array i.e. leftArray[3] and current begin index position element of right array i.e rightArray[3] and compare them. Since left array value(7) is less than right array value(8), we move left array value(7) to resultArray and increment the leftArrayBegin index [ to 4] (which is more the left array size).

The below diagram in step 8 has the result array and index position after step 7 is complete.

Step 8 : The leftArray index has ended or merged into the resultArray. Now we have few pending elements only in the rightArray. So we start copying the rightArray elements into the resultArray. The below diagram we copy pending element rightArray[3] of value 8 to the resultArray and increment the rightArrayBegin index to 4.

The below diagram in step 9 has the result array and index position after step 8 is complete.

Step 9 : We copy the next right array element value (10) into resultArray and increment the rightArrayBegin index [to 5], which is the end of the rightArray.
The below diagram in step 9 has the result array and index position after step 9 is complete.
The merge ends, since there are no more elements to be merged either in the left or the right array. We have sucessfully merged both the sorted arrays into the result array which in itself is sorted.

B. Sample merge code
Here is a simple code that implements the above sample merge.
/*
     * A sample merge method to help understand the merge routine.
     * This below function is not used by the merge sort
     *
     * This is here only for explanation purpose
     */
    public int[] sampleMerge(int[] leftArray, int[] rightArray) {

        int leftArrayEnd = leftArray.length - 1;
        int rightArrayEnd = rightArray.length - 1;
        int leftArrayBegin = 0;
        int rightArrayBegin = 0;

        int numElements = leftArray.length + rightArray.length;
        int[] resultArray = new int[numElements];
        int resultArrayBegin = 0;

        // Find the smallest element in both these array and add it to the temp
        // array i.e you may have a array of the form [1,5] [2,4]
        // We need to sort the above as [1,2,4,5]
        while (leftArrayBegin <= leftArrayEnd && rightArrayBegin <= rightArrayEnd) {
            if (leftArray[leftArrayBegin] <= rightArray[rightArrayBegin]) {
                resultArray[resultArrayBegin++] = leftArray[leftArrayBegin++];
            } else {
                resultArray[resultArrayBegin++] = rightArray[rightArrayBegin++];
            }
        }

        // After the main loop completed we may have few more elements in
        // left array so copy them
        while (leftArrayBegin <= leftArrayEnd) {
            resultArray[resultArrayBegin++] = leftArray[leftArrayBegin++];
        }

        // After the main loop completed we may have few more elements in
        // right array so copy them
        while (rightArrayBegin <= rightArrayEnd) {
            resultArray[resultArrayBegin++] = rightArray[rightArrayBegin++];
        }

        return resultArray;
    }

Now we have understood the merge routine/method, lets look at merge sort algorithm itself.

C. Merge Sort algorithm
Here is how merge sort is often explained.
  1. Divide the unsorted array, of size N, into two halves.
  2. Sort each sub array, each of size N/2.
  3. Merge the two sorted sub arrays into the sorted, full array.
Note the base case is : If the array is of length 0 or 1, then it is already sorted.

Merge sort, sorts based on the above principle i.e an array of length 0 or 1 is already sorted. We keep splitting the input array until we reach a state where the array has only one element. Then we merge 2 of these single element arrays where merge routine takes care of sorting.  Then we merge two of these '2' element arrays and so on... until we merge two halves of the input array.

Now lets try to implement merge sort algorithm in the simplest possible way as explained above.

D. Merge sort code
 
/*
     * The mergeSort algorithm implementation
     */
    private void mergeSort(int[] array, int left, int right) {

        if (left < right) {

            //split the array into 2
            int center = (left + right) / 2;

            //sort the left and right array
            mergeSort(array, left, center);
            mergeSort(array, center + 1, right);

            //merge the result
            merge(array, left, center + 1, right);
        }
    }
    
Done, that's it. All we are doing in the above code is as explained in the algorithm i.e if an array is of length greater than 1 then divide the array into two sub arrays and sort them independently and merge.

As you can see the algorithm can be implemented in few lines, easy to remember once you know how to implement merge routine.

Here is the implementation of the merge() routine/method used in the above mergeSort() method. The code is similar as that of the above sampleMerge() - The difference is, rather than taking two arrays and returning a sorted array, it takes a single array which has both the leftArray and rightArray is start/end indices of these arrays are either passed/calculated by the merge routine/method.

Now we know the basics spend some time understanding the indices and the arrays in the below code.

/*
     * The merge method used by the mergeSort algorithm implementation.
     */
    private void merge(int[] array, int leftArrayBegin,
            int rightArrayBegin, int rightArrayEnd) {

        int leftArrayEnd = rightArrayBegin - 1;

        int numElements = rightArrayEnd - leftArrayBegin + 1;
        int[] resultArray = new int[numElements];
        int resultArrayBegin = 0;

        // Find the smallest element in both these array and add it to the result
        // array i.e you may have a array of the form [1,5] [2,4]
        // We need to sort the above as [1,2,4,5]
        while (leftArrayBegin <= leftArrayEnd && rightArrayBegin <= rightArrayEnd) {
            if (array[leftArrayBegin] <= array[rightArrayBegin]) {
                resultArray[resultArrayBegin++] = array[leftArrayBegin++];
            } else {
                resultArray[resultArrayBegin++] = array[rightArrayBegin++];
            }
        }

        // After the main loop completed we may have few more elements in
        // left array copy them.
        while (leftArrayBegin <= leftArrayEnd) {
            resultArray[resultArrayBegin++] = array[leftArrayBegin++];
        }

        // After the main loop completed we may have few more elements in
        // right array copy.
        while (rightArrayBegin <= rightArrayEnd) {
            resultArray[resultArrayBegin++] = array[rightArrayBegin++];
        }

        // Copy resultArray back to the main array
        for (int i = numElements - 1; i >= 0; i--, rightArrayEnd--) {
            array[rightArrayEnd] = resultArray[i];
        }
    }


E. What is the complexity of the above algorithm ?

E.1 Time Complexity
Merge Sort is a divide and conquer algorithm and it uses recursion which makes it difficult for us to quickly guess the complexity of the given code by checking the nested loops as explained in how to find the complexity of the given code.

The time complexity of merge sort is
O(n log(n))

How ?
To calculate complexity of a recursive function, We need to consider two things
  1. The number of recursive calls required to solve the problem.
  2. Amount of work done in each recursive call.
Number of recursive call in merge sort
Consider a case if the input array is of size 4, then the number of recursive calls made by merge sort is 7.

Here is how it is, each mergeSort() call results in two recursive more calls - one call for each halves of the array since we split the array into two.
  • The first call starts off by making 2 calls. 
  • Each of these 2 calls in-turn make 2 more calls which makes it 4.
  • and so forth. till we hit the base case (which is single element array). 
So here is the pictorial representation of the above recursive calls for merge sort, you'll see that its like a balanced binary tree. An easy way to visualize merge sort is as a tree of recursive calls.

Here is the tree for input array of size 4.

So we see that for an array of four elements, we have a tree of depth 3.

Now try doing it for doubled the number of elements in the array i.e array size of 8, We'll have the same tree with one more level with depth increased to 4.

This suggests that the total depth of the tree is log(n), is the number of times we need to call sort routing/method recursively to reach the base case.

Now what is the work we do in each recursion ?
In the code above the only work we do is to call merge routine/method - So what is the complexity of merge routine/method?

Considering that given 2 arrays we are merging two array into the result array. If we just take the merge routine/method and see the complexity it would be O(n1 + n2). 

where
n1 is length of left array and
n2 is length of right array.

based on the above we may think the complexity of merge routine to be O(n) at every merge, but its NOT true.

We are dividing the array along the recursive path. So length n1 and n2 keeps changing and the complexity of merge routine/method at each call.
  1. It is 'n' on the first call to recursion (leftArray of size n/2 and rightArray of size n/2) 
  2. It is 'n/2' on the second call to recursion. (we merge two n/4 arrays) 
  3. It is 'n/4' on the third call to recursion. (we merge two n/8 arrays) 
  4. ....... and so on ..... 
  5.  till we merge just 2 items ( leftArray of size 1 and rightArray of size 1) and base case there is just one item, so no need to merge.
As you can see the first call to recursion is the only time the entire array is merged together at each level, a total of 'n' operations take place in the merge routine.

Based on the above,
  1. there are log(n) levels 
  2. a total of 'n' operations takes place in the merge routine/method. 
Hence the time complexity is O(n * log(n)).

E2. Space Complexity
Merge sort requires additional memory check the merge routine where we allot new memory to store the merged result.

Hence the space complexity of merge sort is
O(n) 

Check the other article on in this series in the Table of Contents page.

F. Source code 
Here is the complete merge sort sample java file used in this post.

/*
 *  You are free to use this code anywhere, copy, modify and redistribute at your
 *  own risk.
 *  Your are solely responsibly for any damage that may occur by using this code.
 *
 *  This code is not tested for all boundary conditions. Use it at your own risk
 */
package codenlearn;

public class MergeSort {

    public void sort(int[] array) {
        mergeSort(array, 0, array.length - 1);
    }

    /*
     * The mergeSort algorithm implementation
     */
    private void mergeSort(int[] array, int left, int right) {

        if (left < right) {

            //split the array into 2
            int center = (left + right) / 2;

            //sort the left and right array
            mergeSort(array, left, center);
            mergeSort(array, center + 1, right);

            //merge the result
            merge(array, left, center + 1, right);
        }
    }

    /*
     * The merge method used by the mergeSort algorithm implementation.
     */
    private void merge(int[] array, int leftArrayBegin,
            int rightArrayBegin, int rightArrayEnd) {

        int leftArrayEnd = rightArrayBegin - 1;

        int numElements = rightArrayEnd - leftArrayBegin + 1;
        int[] resultArray = new int[numElements];
        int resultArrayBegin = 0;

        // Find the smallest element in both these array and add it to the result
        // array i.e you may have a array of the form [1,5] [2,4]
        // We need to sort the above as [1,2,4,5]
        while (leftArrayBegin <= leftArrayEnd && rightArrayBegin <= rightArrayEnd) {
            if (array[leftArrayBegin] <= array[rightArrayBegin]) {
                resultArray[resultArrayBegin++] = array[leftArrayBegin++];
            } else {
                resultArray[resultArrayBegin++] = array[rightArrayBegin++];
            }
        }

        // After the main loop completed we may have few more elements in
        // left array copy them first
        while (leftArrayBegin <= leftArrayEnd) {
            resultArray[resultArrayBegin++] = array[leftArrayBegin++];
        }

        // After the main loop completed we may have few more elements in
        // right array copy them
        while (rightArrayBegin <= rightArrayEnd) {
            resultArray[resultArrayBegin++] = array[rightArrayBegin++];
        }

        // Copy resultArray back to the main array
        for (int i = numElements - 1; i >= 0; i--, rightArrayEnd--) {
            array[rightArrayEnd] = resultArray[i];
        }
    }

    /*
     * A sample merge method to help understand the merge routine.
     * This below function is not used by the merge sort
     *
     * This is here only for explanation purpose
     */
    public int[] sampleMerge(int[] leftArray, int[] rightArray) {

        int leftArrayEnd = leftArray.length - 1;
        int rightArrayEnd = rightArray.length - 1;
        int leftArrayBegin = 0;
        int rightArrayBegin = 0;

        int numElements = leftArray.length + rightArray.length;
        int[] resultArray = new int[numElements];
        int resultArrayBegin = 0;

        // Find the smallest element in both these array and add it to the temp
        // array i.e you may have a array of the form [1,5] [2,4]
        // We need to sort the above as [1,2,4,5]
        while (leftArrayBegin <= leftArrayEnd && rightArrayBegin <= rightArrayEnd) {
            if (leftArray[leftArrayBegin] <= rightArray[rightArrayBegin]) {
                resultArray[resultArrayBegin++] = leftArray[leftArrayBegin++];
            } else {
                resultArray[resultArrayBegin++] = rightArray[rightArrayBegin++];
            }
        }

        // After the main loop completed we may have few more elements in
        // left array copy them first
        while (leftArrayBegin <= leftArrayEnd) {
            resultArray[resultArrayBegin++] = leftArray[leftArrayBegin++];
        }

        // After the main loop completed we may have few more elements in
        // right array copy them
        while (rightArrayBegin <= rightArrayEnd) {
            resultArray[resultArrayBegin++] = rightArray[rightArrayBegin++];
        }

        return resultArray;
    }

    public static void main(String args[]) {

        System.out.println("Running mergeSort....");
        System.out.println("Running merge sort on..{7, 1, 8, 2, 0, 12, 10, 7, 5, 3}..");
        int array[] = {7, 1, 8, 2, 0, 12, 10, 7, 5, 3};

        MergeSort mergeSort = new MergeSort();

        mergeSort.sort(array);

        dumpArray(array);

        System.out.println("Now demo a simple merge routine....");
        System.out.println("Merging..{1, 3, 5, 7} and ..{2, 4, 6, 8, 10}..");
        int leftArray[] = {1, 3, 5, 7};
        int rightArray[] = {2, 4, 6, 8, 10};

        int[] mergedArray = mergeSort.sampleMerge(leftArray, rightArray);

        dumpArray(mergedArray);
    }

    /*
     * Utility for dumping the array
     */
    public static void dumpArray(int[] array) {

        for (int value : array) {
            System.out.println(value);
        }
    }
}

16 comments:

  1. Very well explained. Good Job

    ReplyDelete
  2. Excellent explanation. Few sites have this simplicity. Congrats.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Great job in explaining.

    ReplyDelete
  5. the best explanation I have ever come across.

    ReplyDelete
  6. Best explanation of merge sort. love it.

    ReplyDelete
  7. I have a question after calling the last mergesort when the left=0 and center=1,then c=0+1/2 ,c=0,
    hence mergesort(arr,0,0) what will this result end int0

    ReplyDelete
  8. Here in the last call to mergesort l=0 and c=1,
    mergesort(arr,0,1)
    l<r
    hence c=0+1/2 c=0
    mergesort(arr,0,0) what will happen after this

    ReplyDelete
  9. very nice article.Once I had written merge for two arrays.Algo in itself become so much clear.Earlier i used to feel very uncomfortable in implementing merge sort.After this article it is the work of few minutes.Thanks a lot sir.

    ReplyDelete
  10. Best one for merge sort :) Thanks!!

    ReplyDelete
  11. thank you very much for explaining merge sort this much easily...

    ReplyDelete
  12. THANK YOU SO MUCH! This really helped

    ReplyDelete
  13. This is by far the best explanation. thank you very much sir.

    ReplyDelete