Counting Sort | Dheerendra Varma

Counting sort skips comparisons entirely by tallying how many times each value appears, then using those tallies to place every element directly into its final position. Every comparison-based algorithm must make at least Ω(n log n) comparisons to sort n arbitrary elements; counting sort sidesteps that lower bound by exploiting a constraint — the input values are non-negative integers bounded by some known maximum k. When k is small relative to n, this buys a dramatic speedup.

Think of it as counting votes. You do not need to compare ballot A against ballot B for every pair — you just tally each candidate's total and read results in order.

Step-by-Step Trace

Input: [4, 2, 2, 8, 3, 3, 1], k = 8 (max value)

Pass 1 — build count array (index = value, cell = occurrences)

value:  0  1  2  3  4  5  6  7  8
count:  0  1  2  2  1  0  0  0  1

Pass 2 — prefix-sum transform (each cell now holds "last output position + 1" for that value)

count[i] += count[i-1] for i in 1..=k

value:  0  1  2  3  4  5  6  7  8
count:  0  1  3  5  6  6  6  6  7

Each count[v] is now the number of elements ≤ v, which equals the 1-based index of the last position that value v can occupy in the output.

Pass 3 — place elements in reverse order (right-to-left through input)

Process 1  → count[1]=1 → output[0]=1  → count[1] becomes 0
Process 3  → count[3]=5 → output[4]=3  → count[3] becomes 4
Process 3  → count[3]=4 → output[3]=3  → count[3] becomes 3
Process 8  → count[8]=7 → output[6]=8  → count[8] becomes 6
Process 2  → count[2]=3 → output[2]=2  → count[2] becomes 2
Process 2  → count[2]=2 → output[1]=2  → count[2] becomes 1
Process 4  → count[4]=6 → output[5]=4  → count[4] becomes 5

output: [1, 2, 2, 3, 3, 4, 8]

Iterating right-to-left preserves the original relative order of equal elements — making the sort stable. Stability is required when counting sort is used as the inner pass of radix sort.

Naive Implementation

The naive version just reconstructs the output by iterating over the count array. It is not stable — when multiple identical values exist, you cannot tell which original element goes first.

fn counting_sort_naive(input: &[usize], k: usize) -> Vec<usize> {
    // k is the maximum value (inclusive)
    let mut count = vec![0usize; k + 1];
 
    // Pass 1: tally occurrences
    for &x in input {
        count[x] += 1;
    }
 
    // Pass 2: reconstruct output by reading count array in order
    let mut output = Vec::with_capacity(input.len());
    for (value, &freq) in count.iter().enumerate() {
        for _ in 0..freq {
            output.push(value);
        }
    }
 
    output
}
 
fn main() {
    let data = vec![4usize, 2, 2, 8, 3, 3, 1];
    let sorted = counting_sort_naive(&data, 8);
    println!("{:?}", sorted); // [1, 2, 2, 3, 3, 4, 8]
}

This works for plain integers but loses stability: if the input were pairs (key, payload), you could not preserve payload order for equal keys.

Optimized Implementation

The optimized version uses the prefix-sum trick so that each element's exact output index is computed before placement. Iterating right-to-left through the input then guarantees stability.

fn counting_sort_stable(input: &[usize], k: usize) -> Vec<usize> {
    let n = input.len();
    let mut count = vec![0usize; k + 1];
 
    // Pass 1: tally
    for &x in input {
        count[x] += 1;
    }
 
    // Pass 2: prefix-sum — count[v] becomes number of elements <= v
    for i in 1..=k {
        count[i] += count[i - 1];
    }
 
    // Pass 3: place — iterate input right-to-left for stability
    let mut output = vec![0usize; n];
    for &x in input.iter().rev() {
        // count[x] is a 1-based count; subtract 1 for 0-based index
        count[x] -= 1;
        output[count[x]] = x;
    }
 
    output
}
 
fn main() {
    let data = vec![4usize, 2, 2, 8, 3, 3, 1];
    let sorted = counting_sort_stable(&data, 8);
    println!("{:?}", sorted); // [1, 2, 2, 3, 3, 4, 8]
}

Why reverse iteration? After the prefix-sum pass, count[v] points one past the last slot allocated for value v. The rightmost occurrence of v in the input should land in count[v] - 1, then the next occurrence (moving left) lands in count[v] - 2, and so on. Iterating right-to-left through the input fills slots from right to left for each value, preserving the original order among duplicates.

Time Complexity

Case	Time	Explanation
Best	O(n + k)	Single pass to count, single pass to reconstruct
Average	O(n + k)	No branching on values; always three linear passes
Worst	O(n + k)	Same three passes regardless of input order

The constant hidden inside O(n + k) is very small — roughly 3n + k operations. When k is O(n), this reduces to O(n), beating any comparison sort. When k grows much larger than n — for example, sorting 100 numbers in the range [0, 10⁹) — the k term dominates and counting sort becomes impractical, both in time and memory.

Space Complexity

O(n + k): the count array of size k + 1 plus the output array of size n. Neither can be avoided with this approach. For large k (say, 32-bit integers) you would need 4 GB just for the count array, which is why counting sort is typically restricted to small k or used digit-by-digit inside radix sort.