Published on

Binary Search Algorithm

Authors
  • avatar
    Name
    Simon Kurbiel
    Twitter

Introduction

The sequential search works on an unsorted array and runs in O(n)O(n) time. But if the array is sorted, it can be done much more efficiently in O(log2n)O(\log_2n) time though a divide and conquer algorithm known as binary-search.

How it works

The algorithm is very simple. Given a sorted array, array[0:n1]array[0:n-1], and a serach key, the algorithm starts comparing the search key against the middle element of the array, array[m]array[m]. Then it follows these steps.

  1. if KEY=array[m]KEY = array[m], simply return m.
  2. if KEY>array[m]KEY > array[m], recursively search through the right half othe array.
  3. if KEY<array[m]KEY < array[m], recursively search through the left half othe array.

As we can see, after one operation, the size is reduced to n2\frac {n}{2}. After a second operation, the size is reduced to n4\frac {n}{4} and so on. So in the worst case scenario, the algorithms will make log2n\log_2n comparisons. Here is the code below.

Algorithm

def binary_search(array:list,left:int, right:int, key:int)->int:
    if left >= right:
        return -1
    m = int((left+right)/2)
    if key == array[m]:
        return m
    elif key > array[m]:
        return binary_search(array, left+1, right, key)
    return binary_search(array,left,m-1, key )

Analysis for special case

when n=2kn=2^k,

we will combine both key comparisons as one single comparison. That is because they are between the same pair of elements. Under the hood(assembly), the machine will probably do some optimization to count them as a single operation, thus we will consider them to be one. In that case

f(n)={1,n=11+f(n2)n2\begin{align*} f(n)= \begin{cases} 1, & \quad n=1 \\ 1+f( \frac {n}{2}) & \quad n≥2 \end{cases} \end{align*}

Solving this by repeated substitutions, we will obtain:

f(n)=1+f(n2)f(n)=1+1+f(n4)f(n)=1+1+1+f(n8)f(n)=1+1+1+1+f(n16)f(n)=4+f(n24)f(n)=k+f(n2k)f(n)=k+1f(n)=log2n+1\begin{align*} &f(n) = 1+f(\frac{n}{2}) \\ &f(n) = 1+1+f(\frac{n}{4}) \\ &f(n) = 1+1+1+f(\frac{n}{8}) \\ &f(n) = 1+1+1+1+f(\frac{n}{16}) \\ &f(n) = 4+f(\frac{n}{2^4}) \\ &f(n) = k+f(\frac{n}{2^k}) \\ &f(n) = k+1 \\ &f(n) = \log_2n+1 \\ \end{align*}

Analysis for general n

for the general case, the size of the recursive call is at most n2\lfloor \frac {n}{2} \rfloor. So,

f(n)={1,n=11+f(n2)n2\begin{align*} f(n)= \begin{cases} 1, & \quad n=1 \\ 1+f( \lfloor \frac {n}{2} \rfloor) & \quad n≥2 \end{cases} \end{align*}

We will prove by induction that the solution is f(n)=log2n+1f(n) = \lfloor \log_2n \rfloor +1

  1. base case, n=1 from the recurrence, f(1)=1 and the claimed solution is f(n)=log21+1=1n=1 \text{ from the recurrence, }f(1)=1 \text{ and the claimed solution is }f(n) = \lfloor \log_21 \rfloor+1=1 , so the basic is correct.

  2. induction step. Any nn can be expressed as follows, for some integer kk.

    2kn<2k+12^k \le n < 2^{k+1}

    This means that log2n=k \lfloor \log_2n \rfloor = k. Also,

    2k1n2<2k 2^{k-1} \le \lfloor \frac {n}{2} \rfloor < 2^k

    Thus, log2n2=k1 \lfloor \log_2 \lfloor \frac {n}{2} \rfloor \rfloor = k-1. To prove the claimed solution for any n2n \ge 2, suppose the solution is correct for all smaller values. That is,

    f(m)=logm+1,m<n f(m) = \lfloor \log m \rfloor +1, \quad \forall m < n

    In particular, for m=n2m = \lfloor \frac {n}{2} \rfloor ,

    f(n2)=log2n2+1=(k1)+1=k=log2n f(\lfloor \frac {n}{2} \rfloor) = \lfloor \log_2 \lfloor \frac {n}{2} \rfloor \rfloor +1 = (k-1)+1 = k = \lfloor \log_2n \rfloor

    Then,

    f(n)=f(n2)+1=k+1=log2n+1 f(n) = f(\lfloor \frac {n}{2} \rfloor) + 1 = k+1 = \lfloor \log_2n \rfloor + 1