Bucket Sort | Dheerendra Varma

Bucket sort distributes elements across a fixed number of range-partitioned buckets, sorts each bucket independently, then concatenates them into a sorted result. It exploits the distribution of input values rather than their exact integer form. When elements are drawn from a uniform distribution, each bucket receives roughly one element, making each bucket sort trivial and the total work linear. When the distribution is skewed, performance degrades — making distribution awareness the central tradeoff.

Step-by-Step Trace

Input: [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]

Ten elements, range [0, 1). Use 10 buckets of width 0.1.

Scatter — assign each element to bucket floor(x * 10)

bucket 0 [0.0, 0.1): []
bucket 1 [0.1, 0.2): [0.17, 0.12]
bucket 2 [0.2, 0.3): [0.26, 0.21, 0.23]
bucket 3 [0.3, 0.4): [0.39]
bucket 4 [0.4, 0.5): []
bucket 5 [0.5, 0.6): []
bucket 6 [0.6, 0.7): [0.68]
bucket 7 [0.7, 0.8): [0.78, 0.72]
bucket 8 [0.8, 0.9): []
bucket 9 [0.9, 1.0): [0.94]

Sort each non-empty bucket with insertion sort

bucket 1: [0.17, 0.12] → [0.12, 0.17]
bucket 2: [0.26, 0.21, 0.23] → [0.21, 0.23, 0.26]
bucket 7: [0.78, 0.72] → [0.72, 0.78]

Gather — concatenate buckets in order

[] ++ [0.12, 0.17] ++ [0.21, 0.23, 0.26] ++ [0.39] ++ []
  ++ [] ++ [] ++ [0.68] ++ [0.72, 0.78] ++ [] ++ [0.94]

result: [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]  ✓

Most buckets hold 0–3 elements, so each insertion sort is O(1)–O(9) — tiny. The total work is dominated by the scatter and gather passes, both O(n).

Naive Implementation

The standard formulation targets floats in [0, 1) with n buckets.

fn insertion_sort(bucket: &mut Vec<f64>) {
    let n = bucket.len();
    for i in 1..n {
        let key = bucket[i];
        let mut j = i;
        while j > 0 && bucket[j - 1] > key {
            bucket[j] = bucket[j - 1];
            j -= 1;
        }
        bucket[j] = key;
    }
}
 
fn bucket_sort_naive(data: &[f64]) -> Vec<f64> {
    let n = data.len();
    if n == 0 {
        return vec![];
    }
 
    // n buckets partitioning [0.0, 1.0)
    let mut buckets: Vec<Vec<f64>> = vec![Vec::new(); n];
 
    // Scatter: element x goes into bucket floor(x * n)
    for &x in data {
        let idx = (x * n as f64).floor() as usize;
        // Clamp to handle x == 1.0 edge case
        let idx = idx.min(n - 1);
        buckets[idx].push(x);
    }
 
    // Sort each bucket and gather
    let mut result = Vec::with_capacity(n);
    for bucket in &mut buckets {
        insertion_sort(bucket);
        result.extend_from_slice(bucket);
    }
 
    result
}
 
fn main() {
    let data = vec![0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68];
    let sorted = bucket_sort_naive(&data);
    println!("{:.2?}", sorted);
    // [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
}

This works cleanly for floats in [0, 1) but fails for arbitrary ranges or integer inputs without preprocessing.

Optimized Implementation

The optimized version handles arbitrary ranges by normalizing inputs, and uses a configurable number of buckets rather than always n. It also handles integer inputs by converting them to a floating-point ratio over the observed range.

fn insertion_sort_f64(bucket: &mut Vec<f64>) {
    for i in 1..bucket.len() {
        let key = bucket[i];
        let mut j = i;
        while j > 0 && bucket[j - 1] > key {
            bucket[j] = bucket[j - 1];
            j -= 1;
        }
        bucket[j] = key;
    }
}
 
fn bucket_sort(data: &[f64], num_buckets: usize) -> Vec<f64> {
    let n = data.len();
    if n == 0 || num_buckets == 0 {
        return data.to_vec();
    }
 
    // Find actual value range
    let min = data.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = data.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
 
    // All elements are identical — already sorted
    if (max - min).abs() < f64::EPSILON {
        return data.to_vec();
    }
 
    let range = max - min;
    let mut buckets: Vec<Vec<f64>> = vec![Vec::new(); num_buckets];
 
    // Scatter: normalize each element to [0, 1) then map to a bucket index
    for &x in data {
        let normalized = (x - min) / range;
        let idx = (normalized * num_buckets as f64).floor() as usize;
        // Clamp so that x == max lands in the last bucket
        let idx = idx.min(num_buckets - 1);
        buckets[idx].push(x);
    }
 
    // Sort each bucket and gather
    let mut result = Vec::with_capacity(n);
    for bucket in &mut buckets {
        insertion_sort_f64(bucket);
        result.extend_from_slice(bucket);
    }
 
    result
}
 
// Convenience wrapper for integer inputs
fn bucket_sort_ints(data: &[i64], num_buckets: usize) -> Vec<i64> {
    let floats: Vec<f64> = data.iter().map(|&x| x as f64).collect();
    let sorted = bucket_sort(&floats, num_buckets);
    sorted.iter().map(|&x| x as i64).collect()
}
 
fn main() {
    // Float input — arbitrary range
    let data = vec![0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68];
    let sorted = bucket_sort(&data, 10);
    println!("{:.2?}", sorted);
 
    // Integer input
    let ints = vec![34i64, 7, 23, 32, 5, 62];
    let sorted_ints = bucket_sort_ints(&ints, 6);
    println!("{:?}", sorted_ints); // [5, 7, 23, 32, 34, 62]
}

The normalization step (x - min) / range maps any range onto [0, 1), making the bucket index formula identical regardless of the original value domain. Choosing num_buckets close to n keeps the expected bucket size near 1.

Time Complexity

Case	Time	Explanation
Best	O(n)	All buckets receive exactly one element; no sorting needed inside each
Average	O(n + k)	k buckets, each with n/k elements; insertion sort on size-s list is O(s²), expected total is O(n) when uniform
Worst	O(n²)	All n elements fall into one bucket; insertion sort on that bucket is O(n²)

The average-case analysis assumes elements are drawn from a uniform distribution over the range. Under that assumption, the expected number of elements per bucket is n/k, and the expected total insertion-sort work across all buckets is k × O((n/k)²) = O(n²/k). Setting k = n gives O(n) expected time.

The worst case is not a degenerate input distribution — it is any input that sends all elements to the same bucket, such as all elements being identical. The normalization in the optimized version handles the all-equal edge case separately to avoid O(n²) in that situation.

Space Complexity

O(n + k): the k bucket structures hold all n elements at peak, and the output array is size n. When k equals n (the standard choice), total space is O(n). The bucket vectors themselves also carry some pointer overhead per bucket, which is O(k).