Analysis Of Algorithms

Authors
  • Simon Kurbiel

Introduction

This is the introduction to data structures and algorithms. Most of the concepts here can be applied to any classical algorithm. We will focus mainly on the analysis and proof portions. For the programming parts, I have decided to write them in Python, though I might add C/Go versions as well.

Worst-case and Average-case Analysis

Linear Search Algorithm

Let us take a simple algorithm such as linear search. We have a key $k$, and we would like to find its index in an array. Here is the algorithm:

def linear_search(arr, key):
    # Scan the array from left to right until the key is found.
    for i in range(len(arr)):
        if arr[i] == key:
            return i
    # The key is not in the array.
    return -1
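
A quick usage check (the array and key below are just illustrative):

print(linear_search([4, 8, 15, 16, 23, 42], 16))  # 3
print(linear_search([4, 8, 15, 16, 23, 42], 7))   # -1, key not present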

Worst case

The worst case is mostly what we are looking for, not only because it is easier to analyze, but also because it is a good reflection of overall performance. In this case, the worst case is obviously $n$ iterations: the key is either the last item or not in the array at all. Because each iteration of the for loop takes some constant time $C$, the worst-case time of the algorithm is

$$T(n) \le Cn + D$$

The constant $D$ represents the maximum amount of time for all statements that are executed only once, independent of the variable $n$.

Average case

Calculating the average case is a bit more involved, so we will not dwell on it; the worst case is what is most often sought after.
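
Still, as a rough sketch (assuming the key is present and equally likely to be at any of the $n$ positions), the expected number of comparisons is

$$\frac{1}{n}\sum_{i=1}^{n} i = \frac{n+1}{2}$$

which is still linear in $n$.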

Finding Max & Min of an Array

Here is a Python program to find the max and min of an array:

def max_min(array: list) -> tuple:
    # Start with the first element as both the current maximum and minimum.
    maximum: int = array[0]
    minimum: int = array[0]

    # Compare every remaining element against the current maximum and minimum.
    for i in range(1, len(array)):
        if array[i] > maximum:
            maximum = array[i]
        elif array[i] < minimum:
            minimum = array[i]

    return minimum, maximum
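
A quick usage check (the sample list is just illustrative):

print(max_min([3, 7, 1, 9, 4]))  # (1, 9): the minimum and maximum of the list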

Worst case

In this case, the worst-case scenario is very simple to spot. The for loop always executes $n-1$ times, since it skips the first index. The worst case occurs when the array is in descending order: the first comparison fails at every iteration, so both comparisons are executed, which makes our algorithm perform $2(n-1)$ comparisons.
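
To see the $2(n-1)$ figure concretely, here is a small sketch that counts the comparisons on a descending array (a variant of the function above with an explicit counter):

def max_min_comparisons(array: list) -> int:
    # Count the element comparisons made by the max/min loop above.
    comparisons = 0
    maximum = minimum = array[0]
    for i in range(1, len(array)):
        comparisons += 1                    # array[i] > maximum
        if array[i] > maximum:
            maximum = array[i]
        else:
            comparisons += 1                # array[i] < minimum
            if array[i] < minimum:
                minimum = array[i]
    return comparisons

n = 10
print(max_min_comparisons(list(range(n, 0, -1))))  # 18 == 2 * (n - 1)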

Asymptotic Complexity

Intro

When we study the running time of algorithms, our focus is on the performance when the input size gets very large, because for smaller inputs the running time will be very small anyway.

Suppose there are two algorithms for a problem of size $n$, with these running times:

$$T_1(n) = 10n, \qquad T_2(n) = 2n^2$$

Which one of these is faster? Let's tabulate the results for some values of $n$:

| $n$     | $10n$     | $2n^2$             |
| ------- | --------- | ------------------ |
| 1       | 10        | 2                  |
| 2       | 20        | 8                  |
| 5       | 50        | 50                 |
| 10      | 100       | 200                |
| 100     | 1,000     | 20,000             |
| 1,000   | 10,000    | 2,000,000          |
| 10,000  | 100,000   | 200,000,000        |
| 100,000 | 1,000,000 | $2 \times 10^{10}$ |

We can observe the following things

  • for $n < 5$, $T_1$ is larger
  • at $n = 5$, they are equal
  • for $n > 5$, $T_2$ grows faster

Since $T_2$ grows faster than $T_1$ for large inputs $n$, we say that $T_2$ is asymptotically larger than $T_1$. Generally, when comparing algorithms, we drop constant coefficients and lower-order terms, since they are not a good indicator of growth, as seen above.
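
A small sketch that reproduces the comparison above in code (the search range of 1 to 20 is an arbitrary choice):

def T1(n: int) -> int:
    return 10 * n

def T2(n: int) -> int:
    return 2 * n ** 2

# Find the first n at which the quadratic running time exceeds the linear one.
crossover = next(n for n in range(1, 21) if T2(n) > T1(n))
print(crossover)  # 6: from n = 6 onward, T2 is larger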

Upper Bound

Suppose there are positive constants $C$ and $n_0$ such that

$$T(n) \le C \cdot f(n), \quad \forall n \ge n_0$$

Then we say $T(n)$ is $O(f(n))$. The $O(\cdot)$ is read as "big oh of".

Example

Prove the following function is $O(n^2)$:

$$T(n) = 5n^2 + 10n + 100$$

Solution: Intuitively, when $n$ gets large, the total value will be approximately $5n^2$ because the remaining terms are negligible. Now we have to prove that $T(n)$ is $O(n^2)$ by finding positive constants $C$ and $n_0$ such that $T(n) \le Cn^2,\ \forall n \ge n_0$.

$$\begin{aligned} T(n) &= 5n^2 + 10n + 100 \\ &\le 5n^2 + 10n(n) + 100(n^2), \quad n \ge 1 \\ &\le 115n^2, \quad n \ge 1 \end{aligned}$$
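
A quick numeric spot-check of the bound, using the constants derived above (this only samples a finite range of $n$, so it is evidence rather than a proof):

def T(n: int) -> int:
    return 5 * n ** 2 + 10 * n + 100

C, n0 = 115, 1  # constants from the proof above

# Check T(n) <= C * n^2 for a sample of sizes n >= n0.
print(all(T(n) <= C * n ** 2 for n in range(n0, 10_000)))  # True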

Example

Prove the following polynomial is $O(n^4)$:

$$T(n) = 5n^4 - 10n^3 + 20n^2 - 50n + 100$$

Proof: First we drop the negative terms, which can only increase the value:

$$\begin{aligned} T(n) &\le 5n^4 + 20n^2 + 100, \quad n \ge 10,\ (n/10) \ge 1 \\ &\le 5n^4 + 20n^2(n/10)^2 + 100(n/10)^4 \\ &\le 5.21n^4, \quad n \ge 10 \end{aligned}$$

Lower Bound

The lower bound $\Omega(\cdot)$ is symmetrical to $O(\cdot)$.

Suppose there are positive constants $C$ and $n_0$ such that

$$T(n) \ge C \cdot f(n), \quad \forall n \ge n_0$$

Then we say $T(n)$ is $\Omega(f(n))$.

Example

Prove the following polynomial is $\Omega(n^4)$:

$$T(n) = 5n^4 - 10n^3 + 20n^2 - 50n + 100$$

Proof: We must show that $T(n) \ge Cn^4,\ \forall n \ge n_0$ for some positive constants $C, n_0$. Here we need to pick $n_0$ carefully so that the constant $C$ comes out positive.

$$\begin{aligned} T(n) &= 5n^4 - 10n^3 + 20n^2 - 50n + 100 \\ &\ge 5n^4 - 10n^3 - 50n \\ &\ge 5n^4 - 10n^3(n/100) - 50n(n/100)^3, \quad n \ge 100 \\ &\ge 4.89995n^4 \end{aligned}$$

Tight Bound

$\Theta(\cdot)$ denotes a tight bound.

Suppose there are positive constants $C_1, C_2, n_0$ such that

$$C_1 f(n) \le T(n) \le C_2 f(n), \quad \forall n \ge n_0$$

That is, $T(n)$ is both $O(f(n))$ and $\Omega(f(n))$.
Then we say that $T(n)$ is $\Theta(f(n))$.

Example

Prove the following summation is $\Theta(n^2)$ without using the arithmetic sum formula:

$$S(n) = 1 + 2 + 3 + \dots + n$$

  1. Prove $O(n^2)$:

     $$\begin{aligned} S(n) &= 1 + 2 + \dots + n \\ &\le n + n + \dots + n \\ &\le n^2 \end{aligned}$$

  2. Prove $\Omega(n^2)$:

     $$\begin{aligned} S(n) = \sum_{i=1}^{n} i &\ge \sum_{i=\lceil n/2 \rceil}^{n} i \\ &\ge \sum_{i=\lceil n/2 \rceil}^{n} \frac{n}{2} \\ &\ge \frac{n}{2} \cdot \frac{n}{2} = \frac{n^2}{4} \end{aligned}$$

Analysis

Nested Loop

Suppose we have an algorithm that looks like this:

c = 0
for i in range(n + 1):
    for j in range(i, 3 * n - 1):
        c = c + 1

For this problem, it is much easier if we simply do a loop analysis, using the sum rule and the product rule.

  • The inner loop body is executed at most $3n - 1$ times per pass, so each pass of the inner loop is $O(n)$.
  • The outer loop is executed $n + 1$ times, which is $O(n)$.
  • Therefore, by the product rule, the total running time is $O(n^2)$; a quick empirical check is sketched below.
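
A small sketch that counts the inner-loop body executions for a few sample sizes (just an empirical sanity check of the $O(n^2)$ estimate):

def count_ops(n: int) -> int:
    c = 0
    for i in range(n + 1):
        for j in range(i, 3 * n - 1):
            c = c + 1
    return c

for n in (10, 100, 1000):
    # The count grows roughly like 2.5 * n^2, consistent with O(n^2).
    print(n, count_ops(n))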

Insertion Sort Algorithm

def insertion_sort(array: list) -> None:
    # Grow a sorted prefix one element at a time.
    for i in range(1, len(array)):
        j: int = i
        # Swap the new element backwards until it reaches its place.
        while j > 0 and array[j] < array[j - 1]:
            array[j], array[j - 1] = array[j - 1], array[j]
            j = j - 1
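
A quick usage check (the sample list is just illustrative):

data = [5, 2, 9, 1, 7]
insertion_sort(data)
print(data)  # [1, 2, 5, 7, 9]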

Worst case

The worst-case scenario for the insertion sort algorithm is when everything is in descending order. In this case, since the first element is skipped, the number of comparisons will be

$$f(n) = \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = \frac{n^2 - n}{2}$$

The result was obtained using the arithmetic sum formula. We conclude that the worst-case total time of the algorithm is $O(n^2)$.
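
As a sanity check, here is a small sketch that counts the swaps performed on a descending array (a slight variant of the function above that returns the swap count):

def insertion_sort_count(array: list) -> int:
    # Same algorithm as above, but also counts the number of swaps.
    swaps = 0
    for i in range(1, len(array)):
        j = i
        while j > 0 and array[j] < array[j - 1]:
            array[j], array[j - 1] = array[j - 1], array[j]
            swaps += 1
            j = j - 1
    return swaps

n = 100
print(insertion_sort_count(list(range(n, 0, -1))))  # 4950 == 100 * 99 / 2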