The Binary Search Algorithm is fundamental in Computer Science. It is a very clever algorithm which reduces the time needed to search for items in large datasets dramatically compared to less efficient approaches.
It is important to note that in order to use binary search, your data must be sorted. Some people get mixed up with sorting algorithms and searching algorithms, grouping them together in their thinking, but it is well worth taking a moment to organise your “algorithm toolkit” a little and make sure that searching and sorting each have their own section. Just as a reminder, see these lists for some examples of each:
- Linear search
- Binary search
- Bubble sort
- Insertion sort
- Merge sort
- Quick sort
How Does the Binary Search Algorithm Work?
This article is about binary search and how to write it using Python. Before writing it in code or pseudocode, you need to understand the process thoroughly, and this is best done by doing plenty of examples by hand on paper or on a whiteboard. You can get a very good intuition for how the algorithm works by playing a guessing game with a friend.
- One player thinks of a number between 1 and 64, and the other player tries to guess what it is.
- For each guess, the player who knows the number will tell the other player if their guess is correct, or whether the number they thought of is higher of lower than the guess.
If you play this game a few times you will quickly discover that some strategies for guessing the number are more effective than others. For example, imagine the original number is
13, and you guess
50. You are told your guess is too high. Guessing
49 next would not be good choice. In order to optimize your chances of guessing the number in as few tries as possible, the best strategy is to guess halfway between the known upper and lower bounds for the actual value. So, if you know the number is between
64, you should guess
32. If you are told that is too high, you should guess
16, and so on. Each time you are halving the search space meaning you are guaranteed to reach the answer in relatively few steps. That is is essence of of how binary search works.
Pseudocode for Binary Search
If you are studying Computer Science for an exam, you may need to write pseudocode for the Binary Search Algorithm. Below is a version which uses syntax which is compatible with the pseudocode guide for the OCR exam board. Personally I consider this step to be pedagogically redundant, as explained in my article entitled We Need to Talk About Pseudocode, but I include it here as it may help some of you.
// Binary search algorithm Pseudocode (OCR)
haystack = [7, 7, 22, 37, 47, 55, 57, 57, 86, 91] // MUST be sorted
needle = int(input("Enter the number you are searching for: "))
length = haystack.length
lower_bound = 0
upper_bound = length - 1
found = False
while found == False and lower_bound <= upper_bound:
mid_point = (lower_bound + upper_bound) DIV 2
if haystack[mid_point] == needle:
print("You number has been found.")
found = True
elif haystack[mid_point] < needle:
lower_bound = mid_point + 1
upper_bound = mid_point - 1
if found == False:
print("Your number is not in the list.")
The algorithm uses an important technique called divide and conquer as mentioned in the video. At each stage half of the data set is discarded, and the algorithm is re-applied to the remaining smaller data set until the search item is found or the exit condition is met.
This exit condition is key for this algorithm, and understanding why it is what it is is a good sign that you understand the whole algorithm. In essence, we have a high pointer and a low pointer, and we check the item in the middle of these two pointers to see if it is our search item. If it is, great, we exit, otherwise we move either the high or the low pointer in such as way as to “pincer-in” on our value.
Binary Search in Python
Once you have written and understood the pseudocode, it’s time to write the algorithm in a real programming language, such as Python. You should always have a specific example in mind to test that your program behaves as expected. So, define a list (I often call this
haystack), and a search item (
needle), and see if you can get your program to find the needle in the haystack. And remember: the list must be sorted first!
Have a go now, and when you finish, or get stuck and need help, check out my version below.
Comments welcome below, and don’t forget to sign up to our mailing list for news and offers.