Effective Use of Array#bsearch

This was the tip of the week in the April 1, 2021 Ruby Weekly Newsletter.

Binary search is a commonly taught algorithm for finding an element of a sorted array in O(log(n)) time where n is the size of the array. In the context of an array, it takes a sorted array, makes a comparison to a middle element, and then based on that comparison recurses in either the right or left ‘half’ of the array until it either finds an element where a comparison returns true, or has no more elements to look over.

Ruby has an Array#bsearch method to return an element selected by a binary search. Its functionality can be a little tricky to grasp at first. Take a look at this example:

["tip", "of", "the", "week"].sort.bsearch { _1 == "of" }
=> nil

Initially, this result may be counterintuitive. Let’s dig into why this happens, so we can effectively use Array#bsearch going forward!

There are assumptions Array#bsearch makes about the array it’s searching, and the condition on which it’s searching:

The array is sorted.
There are two types of comparisons we can make with the condition against elements of the array:
- If boolean: all false evaluations precede all true evaluations.
- If integer: all positive evaluations precede all zero evaluations which precede all negative evaluations.

It doesn’t check these assumptions, however, and can’t guarantee results when these assumptions aren’t met. So if we look again at our example from above, we see it meets the first assumption: the array is sorted. But on the second assumption, it has boolean results which do not all evaluate to false before true. Let’s take a closer look:

sorted_array = ["tip", "of", "the", "week"].sort
=> ["of", "the", "tip", "week"]

# This violates the second assumption because `false`
# evaluations don't always precede `true` evaluations
sorted_array.map { _1 == "of" }
=> [true, false, false, false]

# There are a few options for how we could fix this:

# If we reverse the array, it'll work
sorted_array.reverse.map { _1 == "of" }
=> [false, false, false, true]

# Tada!
sorted_array.reverse.bsearch { _1 == "of" }
=> "of"

# Or we can use the <=> operator which returns 1,0,-1
# Note we need "of" <=> _1 to hold the ordering property we seek
sorted_array.map { "of" <=> _1 }
=> [0, -1, -1, -1]

# Tada!
sorted_array.bsearch { "of" <=> _1 }
=> "of"

And we can see that as long as both of the assumptions hold, we can rely on Array#bsearch to efficiently and effectively search an array.

An added bonus, if we’re just looking for the index of an element selected by binary search, we can use Array#bsearch_index with the same block as above:

sorted_array.bsearch_index { "of" <=> _1 }
=> 0