Implementing Bucket Sort for Floating Point Numbers in Python

Bucket sort is an efficient sorting algorithm that works well for floating point numbers within a specific range. It distributes elements into several buckets, sorts each bucket individually, and then concatenates the results. This method can significantly improve sorting performance for large datasets of floating point numbers.

Understanding Bucket Sort

Bucket sort assumes that the input consists of uniformly distributed floating point numbers within a known range, typically [0, 1). The algorithm involves three main steps:

  • Dividing the range into buckets
  • Distributing the elements into these buckets
  • Sorting each bucket individually

Implementing Bucket Sort in Python

Here’s a simple implementation of bucket sort for floating point numbers in Python:

def bucket_sort(arr):
    n = len(arr)
    if n == 0:
        return arr

    # Create n empty buckets
    buckets = [[] for _ in range(n)]

    # Insert elements into their respective buckets
    for num in arr:
        index = int(num * n)
        if index == n:
            index = n - 1
        buckets[index].append(num)

    # Sort individual buckets and concatenate
    sorted_array = []
    for bucket in buckets:
        sorted_bucket = sorted(bucket)
        sorted_array.extend(sorted_bucket)

    return sorted_array

# Example usage:
numbers = [0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51]
sorted_numbers = bucket_sort(numbers)
print(sorted_numbers)

Advantages of Bucket Sort

Bucket sort offers several benefits:

  • Efficient for uniformly distributed data
  • Can be faster than comparison-based sorts like quicksort or mergesort
  • Easy to implement and understand

Limitations and Considerations

Despite its advantages, bucket sort has some limitations:

  • Works best with known, uniform distributions
  • Performance degrades with non-uniform data
  • Requires additional space for buckets

Understanding these factors helps in deciding when to use bucket sort effectively.