Binary Search Algorithm

Introduction

The sequential search works on an unsorted array and runs in $O(n)$ time. But if the array is sorted, it can be done much more efficiently in $O(\log_2n)$ time though a divide and conquer algorithm known as binary-search.

How it works

How it works

The algorithm is very simple. Given a sorted array, $array[0:n-1]$ , and a serach key, the algorithm starts comparing the search key against the middle element of the array, $array[m]$ . Then it follows these steps.

if $KEY = array[m]$ , simply return m.
if $KEY > array[m]$ , recursively search through the right half othe array.
if $KEY < array[m]$ , recursively search through the left half othe array.

As we can see, after one operation, the size is reduced to $\frac {n}{2}$ . After a second operation, the size is reduced to $\frac {n}{4}$ and so on. So in the worst case scenario, the algorithms will make $\log_2n$ comparisons. Here is the code below.

Algorithm

def binary_search(array:list,left:int, right:int, key:int)->int:
    if left >= right:
        return -1
    m = int((left+right)/2)
    if key == array[m]:
        return m
    elif key > array[m]:
        return binary_search(array, left+1, right, key)
    return binary_search(array,left,m-1, key )

Analysis for special case

when $n=2^k$ ,

we will combine both key comparisons as one single comparison. That is because they are between the same pair of elements. Under the hood(assembly), the machine will probably do some optimization to count them as a single operation, thus we will consider them to be one. In that case

\begin{align*} f(n)= \begin{cases} 1, & \quad n=1 \\ 1+f( \frac {n}{2}) & \quad n≥2 \end{cases} \end{align*}

Solving this by repeated substitutions, we will obtain:

\begin{align*} &f(n) = 1+f(\frac{n}{2}) \\ &f(n) = 1+1+f(\frac{n}{4}) \\ &f(n) = 1+1+1+f(\frac{n}{8}) \\ &f(n) = 1+1+1+1+f(\frac{n}{16}) \\ &f(n) = 4+f(\frac{n}{2^4}) \\ &f(n) = k+f(\frac{n}{2^k}) \\ &f(n) = k+1 \\ &f(n) = \log_2n+1 \\ \end{align*}

Analysis for general n

for the general case, the size of the recursive call is at most $\lfloor \frac {n}{2} \rfloor$ . So,

\begin{align*} f(n)= \begin{cases} 1, & \quad n=1 \\ 1+f( \lfloor \frac {n}{2} \rfloor) & \quad n≥2 \end{cases} \end{align*}

We will prove by induction that the solution is $f(n) = \lfloor \log_2n \rfloor +1$

base case, $n=1 \text{ from the recurrence, }f(1)=1 \text{ and the claimed solution is }f(n) = \lfloor \log_21 \rfloor+1=1$ , so the basic is correct.
induction step. Any $n$ can be expressed as follows, for some integer $k$ .
$2^k \le n < 2^{k+1}$
This means that $\lfloor \log_2n \rfloor = k$ . Also,
$2^{k-1} \le \lfloor \frac {n}{2} \rfloor < 2^k$
Thus, $\lfloor \log_2 \lfloor \frac {n}{2} \rfloor \rfloor = k-1$ . To prove the claimed solution for any $n \ge 2$ , suppose the solution is correct for all smaller values. That is,
$f(m) = \lfloor \log m \rfloor +1, \quad \forall m < n$
In particular, for $m = \lfloor \frac {n}{2} \rfloor$ ,
$f(\lfloor \frac {n}{2} \rfloor) = \lfloor \log_2 \lfloor \frac {n}{2} \rfloor \rfloor +1 = (k-1)+1 = k = \lfloor \log_2n \rfloor$
Then,
$f(n) = f(\lfloor \frac {n}{2} \rfloor) + 1 = k+1 = \lfloor \log_2n \rfloor + 1$