02 Analysis

Report
COSC 3100
Fundamentals of Analysis of
Algorithm Efficiency
Instructor: Tanvir
Outline (Ch 2)
• Input size, Orders of growth,
Worst-case, Best-case, Average-case (Sec 2.1)
• Asymptotic notations and Basic efficiency
classes (Sec 2.2)
• Analysis of Nonrecursive Algorithms (Sec 2.3)
• Analysis of Recursive Algoithms (Sec 2.4)
• Example: Computing n-th Fibonacci
Number (Sec 2.5)
• Empirical Analysis, Algorithm Visualization
(Sec 2.6, 2.7)
Analysis of Algoithms
• An investigation of an algorithm’s
efficiency with respect to two
resources:
– Running time (time efficiency or time
complexity)
– Memory space (space efficiency or
space complexity)
Analysis Framework
• Measuring an input’s size
• Measuring Running Time
• Orders of Growth (of algorithm’s
efficiency function)
• Worst-case, Best-case, and Averagecase efficiency
Measuring Input Size
• Efficiency is defined as a function of
input size
• Input size depends on the problem
– Example 1: what is the input size for
sorting n numbers?
– Example 2: what is the input size for
evaluating   =    + −1  −1 +⋯+ 1 
+ 0
– Example 3: what is the input size for
multiplying two n×n matrices?
Units of measuring running
time
• Measure running time in milliseconds,
seconds, etc.
– Depends on which computer
• Count the number of times each operation
is executed
– Difficult and unnecessary
• Count the number of times an algorithm’s
“basic operation” is executed
Measure running time in
terms of # of basic
operations
• Basic operation: the operation that
contributes the most to the total
running time of an algorithm
• Usually the most time consuming
operation in the algorithm’s
innermost loop
Input size and basic
operation examples
Problem
Measure of
input size
Basic
operation
Search for a key in
a list of n items
# of items in the
list
Key comparison
Add two n×n
matrices
Dimensions of the
matrices, n
Addition
Polynomial
evaluation
Order of the
polynomial
Multiplication
Theoretical Analysis of Time
Efficiency
• Count the number of times the algorithm’s
basic operation is executed on inputs of
size n: C(n)
Input size
T(n) ≈ cop × C(n)
Running time
Execution time for
basic operation

Ignore cop,
Focus on
orders of
growth
# of times basic op.
is executed
If C(n) =   −  , how much longer it runs

if the input size is doubled?
Orders of Growth
• Why do we care about the order of growth
of an algorithm’s efficiency function, i.e.,
the total number of basic operations?
Euclid’s
Consecutive
Integer
Checking
gcd(60, 24)
3
13
gcd(31415, 14142)
10
14142
gcd(218922995834555169026,
135301852344706746049)
97
> 1020
How fast efficiency function grows
as n gets larger and larger…
Orders of Growth (contd.)
n
lgn
n
n×lgn
n2
n3
2n
n!
10
3.3
10
3.3×10
102
103
103
3.6×106
102
6.6
102
6.6×102
104
106
1.3×1030
9.3×10157
103
10
103
10×103
106
109
104
13
104
13×104
108
1012
105
17
105
17×105
1010
1015
106
20
106
20×106
1012
1018
Orders of Growth (contd.)
• Plots of growth…
• Consider only the leading term
• Ignore the constant coefficients
Worst, Best, Average Cases
• Efficiency depends on input size n
• For some algorithms, efficiency
depends on the type of input
• Example: Sequential Search
– Given a list of n elements and a search
key k, find if k is in the list
– Scan list, compare elements with k until
either found a match (success), or list is
exhausted (failure)
Sequential Search Algorithm
ALGORITHM SequentialSearch(A[0..n-1], k)
//Input: A[0..n-1] and k
//Output: Index of first match or -1 if no match is
//found
i <- 0
while i < n and A[i] ≠ k do
i <- i+1
if i < n
return i //A[i] = k
else
return -1
Different cases
• Worst case efficiency
– Efficiency (# of times the basic op. will be
executed) for the worst case input of size n
– Runs longest among all possible inputs of size n
• Best case efficiency
– Runs fastest among all possible inputs of size n
• Average case efficiency
– Efficiency for a typical/random input of size n
– NOT the average of worst and best cases
– How do we find average case efficiency?
Average Case of Sequential
Search
• Two assumptions:
– Probability of successful search is p (0 ≤
p ≤ 1)
– Search key can be at any index with
equal probability (uniform distribution)
Cavg(n) = Expected # of comparisons =
Expected # of comparisons for success +
Expected # of comparisons if k is not in the list
Summary of Analysis
Framework
• Time and space efficiencies are functions of input
size
• Time efficiency is # of times basic operation is
executed
• Space efficiency is # of extra memory units
consumed
• Efficiencies of some algorithms depend on type of
input: requiring worst, best, average case analysis
• Focus is on order of growth of running time (or
extra memory units) as input size goes to infinity
Asymptotic Growth Rate
• 3 notations used to compare orders
of growth of an algorithm’s basic
operation count
– O(g(n)): Set of functions that grow no
faster than g(n)
– Ω(g(n)): Set of functions that grow at
least as fast as g(n)
– Θ(g(n)): Set of functions that grow at
the same rate as g(n)
O(big oh)-Notation
c × g(n)
t(n)
Doesn’t
matter
n
n0
t(n) є O(g(n))
O-Notation (contd.)
• Definition: A function t(n) is said to
be in O(g(n)), denoted t(n) є O(g(n)),
if t(n) is bounded above by some
positive constant multiple of g(n) for
sufficiently large n.
• If we can find +ve constants c and n0
such that:
t(n) ≤ c × g(n) for all n ≥ n0
O-Notation (contd.)
• Is 100n+5 є O(n2) ?
• Is 2n+1 є O(2n) ? Is 22n є O(2n) ?
• Is
1
n(n-1)
2
є O(n2) ?
Ω(big omega)-Notation
t(n)
c × g(n)
Doesn’t
matter
n
n0
t(n) є Ω(g(n))
Ω-Notation (contd.)
• Definition: A function t(n) is said to
be in Ω(g(n)) denoted t(n) є Ω(g(n)),
if t(n) is bounded below by some
positive constant multiple of g(n) for
all sufficiently large n.
• If we can find +ve constants c and n0
such that
t(n) ≥ c × g(n) for all n ≥ n0
Ω-Notation (contd.)
• Is n3 є Ω(n2) ?
• Is 100n+5 є Ω(n2) ?
• Is
1
n(n-1)
2
є Ω(n2) ?
Θ(big theta)-Notation
c1 × g(n)
t(n)
c2 × g(n)
Doesn’t
matter
n
n0
t(n) є Θ(g(n))
Θ-Notation (contd.)
• Definition: A function t(n) is said to
be in Θ(g(n)) denoted t(n) є Θ(g(n)),
if t(n) is bounded both above and
below by some positive constant
multiples of g(n) for all sufficiently
large n.
• If we can find +ve constants c1, c2,
and n0 such that
c2×g(n) ≤ t(n) ≤ c1×g(n) for all n ≥ n0
Θ-Notation (contd.)
• Is
1
n(n-1)
2
є Θ(n2) ?
• Is n2+sin(n) є Θ(n2) ?
• Is an2+bn+c є Θ(n2) for a > 0?
• Is (n+a)b Θ(nb) for b > 0 ?
O, Θ, and Ω
(≥) Ω(g(n)): functions that
grow at least as fast as g(n)
g(n)
(=) Θ(g(n)): functions that
grow at the same rate as g(n)
(≤) O(g(n)): functions that
grow no faster that g(n)
Theorem
• If t1(n) є O(g1(n)) and t2(n) є O(g2(n)), then
t1(n) + t2(n) є O(max{g1(n),g2(n)})
• Analogous assertions are true for Ω and Θ
notations.
• Implication: if sorting makes no more than
n2 comparisons and then binary search
makes no more than log2n comparisons,
then efficiency is O(max{n2, log2n}) = O(n2)
Theorem (contd.)
• If t1(n) є O(g1(n)) and t2(n) є O(g2(n)),
then
t1(n) + t2(n) є O(max{g1(n),g2(n)})
t1(n) ≤ c1g1(n) for n ≥ n01 and t2(n) ≤ c2g2(n) for
n ≥ n02
t1(n) + t2(n) ≤ c1g1(n) + c2g2(n) for n ≥ max {
n01, n02 }
t1(n) + t2(n) ≤ max{c1, c2} g1(n) + max{c1, c2}
g2(n) for n ≥ max { n01, n02 }
t1(n) + t2(n) ≤ 2max{c1, c2} max{g1(n), g2(n) },
c
for n ≥ max{n01, n02}
n0
Using Limits for Comparing
Orders of Growth
0 implies that t(n) grows slower than g(n)
•
()
lim
→∞ ()
=
c implies that t(n) grows at the same order
as g(n)
∞ implies that t(n) grows faster than g(n)
1. First two cases (0 and c) means t(n) є O(g(n))
2. Last two cases (c and ∞) means t(n) є Ω(g(n))
3. The second case (c) means t(n) є Θ(g(n))
Taking Limits
• L’Hôpital’s rule: if limits of both t(n)
and g(n) are ∞ as n goes to ∞,
()
()
′()
lim
= lim
= lim 
→∞ ()
→∞ ′()
→∞ ()

• Stirling’s formula: For large n, we
have n! ≈ 2
 

Stirling’s Formula
• n! = n (n-1) (n-2) … 1
• ln(n!) = =1 ln()
•

ln
1
1
•

…
  =   −  + 1
• Trapezoidal rule:
•
n! ≈

ln
1
  ≈
 

11 2 3 4
n
ln
1
+ln
2
+
2
…+ln −1 +2ln()
1
Add ln() to both sides,
2
1
  −  + 1 + ln() ≈ =1 ln()
2
• Exponentiate both sides, n! ≈ C nne-nn1/2
=C 
 

Taking Limits (contd.)
• Compare orders of growth of 22n and
3n
• Compare orders of growth of lgn and

• Compare orders of growth of n! and
2n
Summary of How to Establish
Order of Growth of Basic
Operation Count
• Method 1: Using Limits
• Method 2: Using Theorem
• Method 3: Using definitions of O-,
Ω-, and Θ-notations.
Basic Efficiency Classes
Class
Name
Comments
1
constant
May be in best cases
lgn
logarithmic
Halving problem size at
each iteration
n
linear
Scan a list of size n
n×lgn
linearithmic
Divide and conquer
algorithms, e.g.,
mergesort
n2
quadratic
Two embedded loops,
e.g., selection sort
n3
cubic
Three embedded loops,
e.g., matrix
multiplication
2n
exponential
All subsets of nelements set
n!
factorial
All permutations of an nelements set
Important Summation Formulas
•

= 1
•

=1 
•
=−+1
(+1)
1 2
=
≈ 2
2
(+1)(2+1)

2
=1  =
6
•



=0
•



=1
=
+1 −1
(
−1
=
≈
1 3

3
≠ 1)
 +1 −1


−1


 = 

=


=
 ±  =
=

 ±
=

=
Analysis of Nonrecursive
Algorithms
ALGORITHM MaxElement(A[0..n-1])
//Determines largest element
maxval <- A[0]
Input size: n
for i <- 1 to n-1 do
Basic operation: > or <if A[i] > maxval
maxval <- A[i]
return maxval
C(n) = −
=  =  −  −  + 
=  −  є Θ(n)
General Plan for Analysis of
Nonrecursive algorithms
• Decide on a parameter indicating an input’s
size
• Identify the basic operation
• Does C(n) depends only on n or does it
also depend on type of input?
• Set up sum expressing the # of times
basic operation is executed.
• Find closed form for the sum or at least
establish its order of growth
Analysis of Nonrecursive
(contd.)
ALGORITHM UniqueElements(A[0..n-1])
//Determines whether all elements are
//distinct
for i <- 0 to n-2 do
for j <- i+1 to n-1 do
if A[i] = A[j]
return false
return true
Input size: n
Basic operation: A[i] = A[j]
Does C(n) depend on type of input?
UniqueElements (contd.)
for i <- 0 to n-2 do
for j <- i+1 to n-1 do
if A[i] = A[j]
return false
return true
Cworst(n) =
−2
=0
−1
=+1 1
=
−1 
2
≈
1 2

2
∈ Θ 2
Why Cworst(n) ∈   is better than
saying Cworst(n) ∈   ?
Analysis of Nonrecursive (contd.)
ALGORITHM MatrixMultiplication(A[0..n-1,
0..n-1], B[0..n-1, 0..n-1])
//Output: C = AB
T(n) ≈ cmM(n)+caA(n)
for i <- 0 to n-1 do
3+c n3
=
c
n
m
a
for j <- 0 to n-1 do
3
=
(c
+c
)n
m
a
C[i, j] = 0.0
for k <- 0 to n-1 do
C[i, j] = C[i, j] + A[i, k]×B[k, j]
return C
Input size: n
Basic operation: ×
− −
M(n) = −
= = = 
Analysis of Nonrecursive (contd.)
ALGORITHM Binary(n)
// Output: # of digits in binary
// representation of n
count <- 1
Input size: n
Basic operation:
while n > 1 do
C(n) = ?
count <- count+1
n <-

2
return count
Each iteration halves n
Let m be such that 2m ≤ n < 2m+1
Take lg: m ≤ lg(n) < m+1, so
m = () and m+1 = () +1 є Θ(lg(n))
Analysis of Nonrecursive (contd.)
ALGORITHM Mystery(n)
S <- 0
for i <- 1 to n do
S <- S + i×i
What does this algorithm compute?
return S
What is the basic operation?
How many times is the basic operation executed?
What’s its efficiency class?
Can you improve it further, or
Can you prove that no improvement
is possible?
Analysis of Nonrecursive (contd.)
ALGORITHM Secret(A[0..n-1])
r <- A[0]
What does this algorithm compute?
s <- A[0]
What is the basic operation?
for i <- 1 to n-1 do
if A[i] < r
How many times is the
basic operation executed?
r <- A[i]
if A[i] > s
What’s its efficiency class?
s <- A[i]
return s-r
Can you improve it further,
Or Could you prove that
no improvement is possible?
Analysis of Recursive
Algorithms
ALGORITHM F(n)
// Output: n!
if n = 0
return 1
else
return n×F(n-1)
Input size: n
Basic operation: ×
M(n) = M(n-1) + 1 for n > 0
to compute
F(n-1)
to multiply
n and F(n-1)
Analysis of Recursive (contd.)
M(n) = M(n-1) + 1
M(0) = 0
Recurrence relation
By Backward substitution
M(n) = M(n-1) + 1 = M(n-2)+ 1 + 1 = … …
= M(n-n) + 1 + … + 1 = 0 + n = n
n 1’s
We could compute it nonrecursively, saves
function call overhead
General Plan for Recursive
• Decide on input size parameter
• Identify the basic operation
• Does C(n) depends also on input
type?
• Set up a recurrence relation
• Solve the recurrence or, at least
establish the order of growth of its
solution
Analysis of Recursive
(contd.): Tower of Hanoi
Tower of Hanoi (contd.)
M(n) = M(n-1) + 1 + M(n-1)
M(1) = 1
1
Tower of Hanoi (contd.)
ALGORITHM ToH(n, s, m, d)
M(1) = 1
if n = 1
print “move from s to d”
else
M(n-1)
ToH(n-1, s, d, m)
print “move from s to d”
M(n-1)
ToH(n-1, m, s, d)
M(1)
Tower of Hanoi (contd.)
M(n) = 2M(n-1) + 1 for n > 1
M(1) = 1
M(n) = 2M(n-1) + 1 [Backward substitution…]
= 2[ 2M(n-2)+1] + 1 = 22M(n-2)+2+1
= 22[2M(n-3)+1]+2+1 = 23M(n-3)+ 22+2+1
………
= 2n-1M(n-(n-1))+2n-2+2n-3+……+22+2+1
= 2n-1+ 2n-2+2n-3+……+22+2+1 =
M(n) є Θ(2n)
−1 
=0 2
= 2n-1
Be careful! Recursive algorithm may look simple
But might easily be exponential in complexity.
Tower of Hanoi (contd.)
• Recursion tree (# of function calls)
n
n-1
n-1
n-2
n-2
n-2
…
…
2
1
n-2
2
2
1
1
…
1
1
C(n) =
− 
= 
2
1
= 2n-1
1
1
Analysis of Recursive (contd.)
ALGORITHM BinRec(n)
//Output: # of binary digits in n’s
//binary representation
if n = 1
Input size: n = 2k
Basic operation: +
return 1
else
return BinRec(  2 )+1
A(n) = A(2k) = A(2k-1) + 1 = [A(2k-2) + 1] + 1
= … = A(2k-k)+k = A(1) + k = k = lg(n)
A(n) є Θ(lg(n))
Analysis of Recursive (contd.)
• Maximum how many slices of pizza
can a person obtain by making n
straight cuts with a pizza knife?
L0 = 1
L2 = 4
L1 = 2
1
1,
L3 = 7
Bonus problem in HW2
2
2,
difference
3
4,
7, …
Ln = Ln-1 + n for n > 0
L0 = 1
EXAMPLE: nth Fibonacci
Number
• 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …
F(n) = F(n-1) + F(n-2) for n > 1
F(0) = 0, F(1) = 1
Let’s try backward substitution…
F(n) = F(n-2)+F(n-3)+F(n-3)+F(n-4)
= F(n-2)+2F(n-3)+F(n-4)
= F(n-3)+3F(n-4)+3F(n-5)+F(n-6) …
nth Fibonacci (contd.)
ax(n)+bx(n-1)+cx(n-2) = 0
Homogeneous linear second-order
recurrence with constant coefficients
Guess: x(n) = rn
arn+brn-1+crn-2 = 0
Characteristic equation
ar2+br+c = 0
THEOREM: If r1, r2 are two real distinct
roots of characteristic equation then
x(n) = αr1n+βr2n where α, β are arbitrary
constants
nth Fibonacci (contd.)
F(n)-F(n-1)-F(n-2) = 0
r2-r-1
= 0 => r1,2 =
F(n) = α
1+ 5
2
F(0) = α
1+ 5
2
F(1) = α
1+ 5
2
F(n) =

0
1
1 1+ 5
5
2
+β
1± 1−4 −1
2

1− 5
2
+β
1− 5
2
+β
1− 5
2

−
0
1
=
1± 5
2
= 0 => α+β=0
=1
1 1− 5
5
2

=
1
5
where  = 1 + 5
 − 
2 ≈ 1.61803
and  = -1 
nth Fibonacci (contd.)
• F(n) =
1
5
  −   where  ≈ 1.61803
and  ≈ -0.61803
A(n) = A(n-1)+A(n-2)+1
A(0) = 0, A(1) = 0
• F(n) є Θ(φn)
–[A(n-1)+1]-[A(n-2)+1]=0
ALGORITHM F(n) [A(n)+1]
Let B(n) = A(n)+1
if n ≤ 1
B(n) –B(n-1)-B(n-2)=0
B(0)=1, B(1)=1
return n
Notice, B(n) = F(n+1)
else
A(n) = B(n)-1=F(n+1)-1
1
= 5 +1 − +1 -1
return F(n-1)+F(n-2)
Basic op: +
A(n) є Θ(φn)
nth Fibonacci (contd.)
F(5)
F(3)
F(4)
F(3)
F(2)
F(1)
F(2)
F(1)
F(1)
F(1)
F(2)
F(0)
F(1)
F(0)
F(0)
Only n-1 additions, Θ(n)!
ALGORITHM Fib(n)
F[0] <- 0
F[1] <- 1
for i <- 2 to n do
F[i] <- F[i-1]+F[i-2]
return F[n]
ALGORITHM Fib(n)
f <- 0
fnext <- 1
for i <- 2 to n do
tmp <- fnext
fnext <- fnext+f
f <- tmp
return fnext
Empirical Analysis
• Mathematical Analysis
– Advantages
• Machine and input independence
– Disadvantages
• Average case analysis is hard
• Empirical Analysis
– Advantages
• Applicable to any algorithm
– Disadvantages
• Machine and input dependency
Empirical Analysis (contd.)
• General Plan
– Understand Experiment’s purpose:
• What is the efficiency class?
• Compare two algorithms for same problem
– Efficiency metric: Operation count vs. time
unit
– Characteristic of input sample (range, size,
etc.)
– Write a program
– Generate sample inputs
– Run on sample inputs and record data
– Analyze data
Empirical Analysis (contd.)
• Insert counters into the program to
count C(n)
• Or, time the program, (tfinish-tstart), in
Java System.currentTimeMillis()
– Typically not very accurate, may vary
– Remedy: make several measurements
and take mean or median
– May give 0 time, remedy: run in a loop
and take mean
Empirical Analysis (contd.)
• Increase sample size systematically, like
1000, 2000, 3000, …, 10000
– Impact is easier to analyze
• Better is to generate random sizes within
desired range
– Because, odd sizes may affect
• Pseudorandom number generators
– In Java Math.random() gives uniform random
number in [0, 1)
– If you need number in [min, max]:
• min + (int)( Math.random() * ( (max-min) + 1 ) )
Empirical Analysis (contd.)
• Pseudorandom in [min, max]
– Math.random() * (max-min) gives [0, max-min)
– min + Math.random() * (max-min) gives [min,
max)
– (int)[Math.random() * (max-min+1)] gives [0,
max-min]
– min+ (int)[Math.random() * (max-min+1)] gives
[min, max]
• To get [5, 10]
– (int)Math.random()*(10-5+1) gives [0, 10-5] or
[0, 5]
– 5+ (int)Math.random()*(10-5+1) gives [5, 10]
Empirical Analysis (contd.)
• Profiling: find out most time-consuming
portion, Eclipse and Netbeans have it
• Tabulate data
n
M(n)
g(n)
M(n)/g(n)
• If M(n)/g(n) converges to a +ve constant,
we know M(n) є Θ(g(n))
• Or look at M(2n)/M(n); e.g., if M(n) є
Θ(lgn) then M(2n)/M(n) will be low
Empirical Analysis (contd.)
• Scatterplot
count/time
count/time
M(n) ≈ can
lgM(n) ≈ nlga + lgc
n
n
lgn
n
count/time
nlgn or n2
n
Algorithm Visualization
(contd.)
• Sequence of still pictures or
animation
• Sorting Out Sorting on YouTube
– Nice animation of 9 sorting algorithms
Summary
• Time and Space Efficiency
• C(n): Count of # of times the basic operation is
executed for input of size n
• C(n) may depend on type of input and then we
need worst, average, and best case analysis
• Usually order of growth (O, Ω, Θ) is all that
matters: logarithmic, linear, linearithmic,
quadratic, cubic, and exponential
• Input size, basic op., worst case?, sum or
recurrence, solve it…
• We run on computers for empirical analysis
Homework 2
• DUE ON FEBRUARY 07 (TUESDAY)
IN CLASS

similar documents