Pages

Thursday, July 7, 2011

A Simple Counting Sort

Counting sort is one of the most efficient sorting algorithm, yet most books and articles choose to ignore this algorithm because of it simplicity or they assume that you know already know this algorithm. Here is a attempt to emphasize the importance and elegance of this sorting mechanism

A. Know your input
Counting sort exploits the fact, that you have some knowledge about the input array we are going to sort, especially the range of the elements in the array.

B. Coins example
Say we have thousand coins and each coin can be one of these values, say 5 cents, 10 cents, 25 cents (quarter dollar) or 100 cents (one dollar). Now we need to sort it

About the input, in the above case we know that we have 4 types of coins, So if we count
  1. the number of 5 cents coins (say in a variable counter1)
  2. the number of 10 cents coins (say in a variable counter2)
  3. the number of 25 cents coins (say in a variable counter3)
  4. the number of 100 cents coins (say in a variable counter4)
We can sort the array.

How ?

After counting we simply overwrite the original array with the counter values we have above . Here is how we do it.

After counting, take input array and
  1. Make the first counter 1 number of elements as 5 ( i.e. array indices 0 to counter1 - 1 have value 5)
  2. Make the next counter 2 number of elements as 10 ( i.e. array indices counter1 to counter2 - 1 have value 10)
  3. Make the next counter 3 number of elements as 25 ( i.e. array indices counter2 to counter3 - 1 have value 25)
  4. Make the next counter 4 number of elements as 100 ( i.e. array indices counter3 to counter4 - 1 have value 100)
Hence we have a sorted output by just counting each elements in array. A simple and elegant sorting solution.
Note: We did not compare any element with other element in the array.

C. Counting elements
The key to understand counting sort is first to write a simple function that would find the number of times an element 'e' occurs in an array. Here is a simple code to do it.
/*
     * Finds the number of times an element occurs in an array
     *
     * @ returns the count or number of times an element occurs
     */
    public int countOccurences(int value, int array[]) {

        int count = 0;

        for (int i = 0; i < array.length; i++) {
            if (value == array[i]) {
                count++;
            }
        }

        return count;
    }

D. Counting Sort algorithm : 
As explained in the earlier sample, counting sort algorithm does two things
  1. Count the number of times each type of element occurs in an array. 
  2. Overwrite the input array to have elements based on the above counter.
In the coins example above we have 4 types of coins ( 5, 10, 25 & 100 all in cents) - To count each type of coin, we could have 4 variables/counters and count them, rather than doing it that way.,

We'll do it with an array i.e. Say we declare an array as
int counter[] = new int[4];
and
  1. Use counter[0] as a counter to have number of 5 cents.
  2. use counter[1] as a counter to have number of 10 cents.
  3. use counter[2] as a counter to have number of 25 cents.
  4. use counter[3] as a counter to have number of 100 cents.
E. Code
Here is the code that implements the above algorithm.
/*
     * Does the counting sort on the input array
     */
    public void sort(int array[]) {

        // Step 1 : Count the elements
        int counter[] = new int[4];

        for (int i = 0; i < array.length; i++) {
            int value = array[i];
            if (value == 5) {
                counter[0]++;
            } else if (value == 10) {
                counter[1]++;
            }
            if (value == 25) {
                counter[2]++;
            }
            if (value == 100) {
                counter[3]++;
            }
        }

        // Step 2: Now lets overwrite the input array
        int index = 0;
        for (int j = 0; j < counter.length; j++) {
            for (int i = 0; i < counter[j]; i++) {
                if (j == 0) {
                    array[index] = 5;
                } else if (j == 1) {
                    array[index] = 10;
                } else if (j == 2) {
                    array[index] = 25;
                } else if (j == 3) {
                    array[index] = 100;
                }
                index++;
            }
        }
    }

Wow what is happening in Step 2 ? 

For easier understanding, step 2 is nothing but an optimization of below code (rather than pasting the same code 4 times, We have made it as a nested loop)
// Make first few elements as 5  
        int index = 0;
        for (int i = 0; i < counter[0]; i++) {
            array[index] = 5;
            index++;
        }
        
        // Make next few elements as 10
        for (int i = 0; i < counter[1]; i++) {
            array[index] = 10;
            index++;
        }
        
        // Make first few elements as 25
        for (int i = 0; i < counter[2]; i++) {
            array[index] = 25;
            index++;
        }
        
        // Make first few elements as 100
        for (int i = 0; i < counter[3]; i++) {
            array[index] = 100;
            index++;
        }

The above counting sort code is useless, Its not a general purpose Counting Sort.

Yes, the intent of this post is to help you understand counting sort esp basics. Now that you know the algorithm, you can write your own which finds the minimum value in the array the maximum value in the array and allot number of counters for this range and do the rest.

F. What is the complexity of the above algorithm ?

Time Complexity :
O(n)

How ? 

Even though there is a doubly-nested loop, the upper loop is on counter length which say is 'k' and inner loop on number of element 'n', iterates over it only once.

If you closely check the loop, you'll see that even though there is a nested loop, we'd be iterating over the input array in step 2 only once. The exploded sample under section "What is happening in Step2" would make that obvious.

Space Complexity :
O(k)

Checkout the other article on How to find the complexity of a code.

G. Source code

Here is the complete counting sort sample java file used in this post.

2 comments:

  1. How about if we increase the area size to include {4,9,8,12,13,21,23,43,21,1, 23,45,24,6,5}, are we gonna create a counter to each value?

    ReplyDelete
  2. We cannot generalize counting sort to all sorting problems, However if your input dataset is from a finite set (i.e you know the range or number of distinct input elements) - yes it would be a perfect fit

    ReplyDelete