Analysis Of Algorithms

Authors
  • Simon Kurbiel

Introduction

This is the introduction to data structures and algorithms. Most of the concepts here can be applied to any classical algorithm. We will focus mainly on the analysis and proof portions. For the programming parts, I have decided to write them in Python, though I might add C/Go versions as well.

Worst-case and Average-case Analysis

Linear Search Algorithm

Let us take a simple algorithm such as linear search. We have a key $k$, and we would like to find its index in an array. Here is the algorithm:

def linear_search(arr, key):
    # Scan the array from left to right until the key is found.
    for i in range(len(arr)):
        if arr[i] == key:
            return i
    # The key is not in the array.
    return -1
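
A quick usage check (the array and key below are just illustrative):

print(linear_search([4, 8, 15, 16, 23, 42], 16))  # 3
print(linear_search([4, 8, 15, 16, 23, 42], 7))   # -1, key not present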

Worst case

The worst case is mostly what we are looking for, not only because it is easier to analyze, but also because it is a good reflection of overall performance. In this case, the worst case is obviously $n$ iterations: the key is either the last item or not in the array at all. Because each iteration of the for loop takes some constant time $C$, the worst-case time of the algorithm is

$$T(n) \le Cn + D$$

The constant $D$ represents the maximum amount of time for all statements that are executed only once, independent of the variable $n$.

Average case

Calculating the average case is a bit more involved, so we will not dwell on it; the worst case is what is most often sought after.
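
Still, as a rough sketch (assuming the key is present and equally likely to be at any of the $n$ positions), the expected number of comparisons is

$$\frac{1}{n}\sum_{i=1}^{n} i = \frac{n+1}{2}$$

which is still linear in $n$.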

Finding Max & Min of an Array

Here is a Python program to find the max and min of an array:

def max_min(array: list) -> tuple:
    # Start with the first element as both the current maximum and minimum.
    maximum: int = array[0]
    minimum: int = array[0]

    # Compare every remaining element against the current maximum and minimum.
    for i in range(1, len(array)):
        if array[i] > maximum:
            maximum = array[i]
        elif array[i] < minimum:
            minimum = array[i]

    return minimum, maximum
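
A quick usage check (the sample list is just illustrative):

print(max_min([3, 7, 1, 9, 4]))  # (1, 9): the minimum and maximum of the list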

Worst case

In this case, the worst-case scenario is very simple to spot. The for loop always executes $n-1$ times, since it skips the first index. The worst case occurs when the array is in descending order: the first comparison fails at every iteration, so both comparisons are executed, which makes our algorithm perform $2(n-1)$ comparisons.
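
To see the $2(n-1)$ figure concretely, here is a small sketch that counts the comparisons on a descending array (a variant of the function above with an explicit counter):

def max_min_comparisons(array: list) -> int:
    # Count the element comparisons made by the max/min loop above.
    comparisons = 0
    maximum = minimum = array[0]
    for i in range(1, len(array)):
        comparisons += 1                    # array[i] > maximum
        if array[i] > maximum:
            maximum = array[i]
        else:
            comparisons += 1                # array[i] < minimum
            if array[i] < minimum:
                minimum = array[i]
    return comparisons

n = 10
print(max_min_comparisons(list(range(n, 0, -1))))  # 18 == 2 * (n - 1)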

Asymptotic Complexity

Intro

When we study the running time of algorithms, our focus is on the performance when the input size gets very large, because for smaller inputs the running time will be very small anyway.

Suppose there are two algorithms for a problem of size $n$, with these running times:

$$T_1(n) = 10n, \qquad T_2(n) = 2n^2$$

Which one of these is faster? Let's tabulate the results for some values of $n$:

| $n$     | $10n$     | $2n^2$             |
| ------- | --------- | ------------------ |
| 1       | 10        | 2                  |
| 2       | 20        | 8                  |
| 5       | 50        | 50                 |
| 10      | 100       | 200                |
| 100     | 1,000     | 20,000             |
| 1,000   | 10,000    | 2,000,000          |
| 10,000  | 100,000   | 200,000,000        |
| 100,000 | 1,000,000 | $2 \times 10^{10}$ |

We can observe the following things

  • for $n < 5$, $T_1$ is larger
  • at $n = 5$, they are equal
  • for $n > 5$, $T_2$ grows faster

Since $T_2$ grows faster than $T_1$ for large inputs $n$, we say that $T_2$ is asymptotically larger than $T_1$. Generally, when comparing algorithms, we drop constant coefficients and lower-order terms, since they are not a good indicator of growth, as seen above.
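
A small sketch that reproduces the comparison above in code (the search range of 1 to 20 is an arbitrary choice):

def T1(n: int) -> int:
    return 10 * n

def T2(n: int) -> int:
    return 2 * n ** 2

# Find the first n at which the quadratic running time exceeds the linear one.
crossover = next(n for n in range(1, 21) if T2(n) > T1(n))
print(crossover)  # 6: from n = 6 onward, T2 is larger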

Upper Bound

Suppose there are positive constants $C$ and $n_0$ such that

$$T(n) \le C \cdot f(n), \quad \forall n \ge n_0$$

Then we say $T(n)$ is $O(f(n))$. The $O(\cdot)$ is read as "big oh of".

Example

Prove the following function is $O(n^2)$:

$$T(n) = 5n^2 + 10n + 100$$

Solution: Intuitively, when $n$ gets large, the total value will be approximately $5n^2$ because the remaining terms are negligible. Now we have to prove that $T(n)$ is $O(n^2)$ by finding positive constants $C$ and $n_0$ such that $T(n) \le Cn^2,\ \forall n \ge n_0$.

$$\begin{aligned} T(n) &= 5n^2 + 10n + 100 \\ &\le 5n^2 + 10n(n) + 100(n^2), \quad n \ge 1 \\ &\le 115n^2, \quad n \ge 1 \end{aligned}$$
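
A quick numeric spot-check of the bound, using the constants derived above (this only samples a finite range of $n$, so it is evidence rather than a proof):

def T(n: int) -> int:
    return 5 * n ** 2 + 10 * n + 100

C, n0 = 115, 1  # constants from the proof above

# Check T(n) <= C * n^2 for a sample of sizes n >= n0.
print(all(T(n) <= C * n ** 2 for n in range(n0, 10_000)))  # True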

Example

Prove the following polynomial is $O(n^4)$:

$$T(n) = 5n^4 - 10n^3 + 20n^2 - 50n + 100$$

Proof: First we drop the negative terms, which can only increase the value:

$$\begin{aligned} T(n) &\le 5n^4 + 20n^2 + 100, \quad n \ge 10,\ (n/10) \ge 1 \\ &\le 5n^4 + 20n^2(n/10)^2 + 100(n/10)^4 \\ &\le 5.21n^4, \quad n \ge 10 \end{aligned}$$

Lower Bound

The lower bound $\Omega(\cdot)$ is symmetrical to $O(\cdot)$.

Suppose there are positive constants $C$ and $n_0$ such that

$$T(n) \ge C \cdot f(n), \quad \forall n \ge n_0$$

Then we say $T(n)$ is $\Omega(f(n))$.

Example

Prove the following polynomial is $\Omega(n^4)$:

$$T(n) = 5n^4 - 10n^3 + 20n^2 - 50n + 100$$

Proof: We must show that $T(n) \ge Cn^4,\ \forall n \ge n_0$ for some positive constants $C, n_0$. Here we need to pick $n_0$ carefully so that the constant $C$ comes out positive.

$$\begin{aligned} T(n) &= 5n^4 - 10n^3 + 20n^2 - 50n + 100 \\ &\ge 5n^4 - 10n^3 - 50n \\ &\ge 5n^4 - 10n^3(n/100) - 50n(n/100)^3, \quad n \ge 100 \\ &\ge 4.89995n^4 \end{aligned}$$

Tight Bound

$\Theta(\cdot)$ denotes a tight bound.

Suppose there are positive constants $C_1, C_2, n_0$ such that

$$C_1 f(n) \le T(n) \le C_2 f(n), \quad \forall n \ge n_0$$

That is, $T(n)$ is both $O(f(n))$ and $\Omega(f(n))$.
Then we say that $T(n)$ is $\Theta(f(n))$.

Example

Prove the following summation is $\Theta(n^2)$ without using the arithmetic sum formula:

$$S(n) = 1 + 2 + 3 + \dots + n$$

  1. Prove $O(n^2)$:

     $$\begin{aligned} S(n) &= 1 + 2 + \dots + n \\ &\le n + n + \dots + n \\ &\le n^2 \end{aligned}$$

  2. Prove $\Omega(n^2)$:

     $$\begin{aligned} S(n) = \sum_{i=1}^{n} i &\ge \sum_{i=\lceil n/2 \rceil}^{n} i \\ &\ge \sum_{i=\lceil n/2 \rceil}^{n} \frac{n}{2} \\ &\ge \frac{n}{2} \cdot \frac{n}{2} = \frac{n^2}{4} \end{aligned}$$

Analysis

Nested Loop

Suppose we have an algorithm that looks like this:

c = 0
for i in range(n + 1):
    for j in range(i, 3 * n - 1):
        c = c + 1

For this problem, it is much easier if we simply do a loop analysis, using the sum rule and the product rule.

  • The inner loop body is executed at most $3n - 1$ times per pass, so each pass of the inner loop is $O(n)$.
  • The outer loop is executed $n + 1$ times, which is $O(n)$.
  • Therefore, by the product rule, the total running time is $O(n^2)$; a quick empirical check is sketched below.
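
A small sketch that counts the inner-loop body executions for a few sample sizes (just an empirical sanity check of the $O(n^2)$ estimate):

def count_ops(n: int) -> int:
    c = 0
    for i in range(n + 1):
        for j in range(i, 3 * n - 1):
            c = c + 1
    return c

for n in (10, 100, 1000):
    # The count grows roughly like 2.5 * n^2, consistent with O(n^2).
    print(n, count_ops(n))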

Insertion Sort Algorithm

def insertion_sort(array: list) -> None:
    # Grow a sorted prefix one element at a time.
    for i in range(1, len(array)):
        j: int = i
        # Swap the new element backwards until it reaches its place.
        while j > 0 and array[j] < array[j - 1]:
            array[j], array[j - 1] = array[j - 1], array[j]
            j = j - 1
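
A quick usage check (the sample list is just illustrative):

data = [5, 2, 9, 1, 7]
insertion_sort(data)
print(data)  # [1, 2, 5, 7, 9]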

Worst case

The worst-case scenario for the insertion sort algorithm is when everything is in descending order. In this case, since the first element is skipped, the number of comparisons will be

$$f(n) = \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = \frac{n^2 - n}{2}$$

The result was obtained using the arithmetic sum formula. We conclude that the worst-case total time of the algorithm is $O(n^2)$.
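
As a sanity check, here is a small sketch that counts the swaps performed on a descending array (a slight variant of the function above that returns the swap count):

def insertion_sort_count(array: list) -> int:
    # Same algorithm as above, but also counts the number of swaps.
    swaps = 0
    for i in range(1, len(array)):
        j = i
        while j > 0 and array[j] < array[j - 1]:
            array[j], array[j - 1] = array[j - 1], array[j]
            swaps += 1
            j = j - 1
    return swaps

n = 100
print(insertion_sort_count(list(range(n, 0, -1))))  # 4950 == 100 * 99 / 2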