### 概述 - 中山大学信息科学与技术学院

```Algorithms Design and its
Applications
Introduction

• Email: [email protected]
• Office: 信息学院楼526B
• Course homepage:
x.html
• BBS:
http://bbs.sysu.edu.cn/bbsdoc?board=ACMICPC
• Office Hours:

• 1. 平时作业：在线编程练习+书面作业
http://soj.me/
• 2. 研究型的小Project ：选一个问题，编程

• 3. 期末考试：ACM 方式机试, 自动测评
FAQ
• Q: 这门课是为了培养参加ACM / ICPC比赛的

• A: No.
Some important topics not included in ICPC
• NP-Completeness Proofs
• Randomized and Approximation Algorithms
• Quantum Algorithms

Why take this course?
You want to be a computer
scientist?
a mundane programmer?
Or a great leader and thinker?
Original Thinking
– Given today’s prices of pork, grain, sawdust, …
– Given constraints on what constitutes a hotdog.
– Make the cheapest hotdog.
• Um? Tell me what to code.
With more sophisticated software engineering systems,
the demand for mundane programmers will diminish.
• I learned this great algorithm that will work.
Soon all known algorithms
will be available in libraries.
• I can develop a new algorithm for you.
Great thinkers
will always be needed.
Course Content
• A list of algorithms.
– Learn their code.
– Trace them until you are convinced that they
work.
class InsertionSortAlgorithm extends
SortAlgorithm {
– Implement them.
void sort(int a[]) throws Exception {
for (int i = 1; i < a.length; i++) {
int j = i;
int B = a[i];
while ((j > 0) && (a[j-1] > B)) {
a[j] = a[j-1];
j--; }
a[j] = B;
}
}
Course Content
• A survey of algorithmic design techniques.
• Abstract thinking.
• How to develop new algorithms for any problem
that may arise.
Study:
• Many experienced programmers were asked to
code up binary search.
Study:
• Many experienced programmers were asked to
code up binary search.
80% got it wrong
What did they lack?
What did they lack?
• Formal proof methods?
What did they lack?
• Formal proof methods?
Yes, likely
Industry is starting to
realize that formal
methods are important.
But even without formal
methods …. ?
What did they lack?
• Fundamental understanding of the
algorithmic design techniques.
• Abstract thinking.
Course Content
Notations, analogies, and abstractions
for developing,
and describing algorithms
so correctness is transparent
A survey of fundamental
ideas and algorithmic
design techniques
For example . . .
Some Math
Time Complexity
t(n) = Q(n2)
Logs and Exps
2a × 2b = 2a+b
2log n = n
Classifying Functions
f(i) = nQ(n)
Time
Logic Quantifiers
g "b Loves(b,g)
"b g Loves(b,g)
Input Size
∑i=1 f(i).
Recurrence Relations
T(n) = a T(n/b) + f(n)
Iterative Algorithms
Loop Invariants
<preCond>
codeA
loop
<loop-invariant>
exit when <exit Cond>
codeB
codeC
<postCond>
Code
0
5 km
i-1
i
9 km
One step
at a time
Relay Race
Recursive Algorithms
Recursive Back Tracking
Greedy Algorithms
Dynamic Programing
4
5
3
2
2
8
9
5
8
4
4
6
2
3
5
7
5
6
1
3
2
5
4
8
Graph Search Algorithms
Network Flows
Reduction
=
Rudich www.discretemath.com
Useful Learning Techniques
You are expected to read the lecture notes before
the lecture.
This will facilitate more productive discussion
during class.
assignments
& tests.
Be Creative
• Why is it done this way and not that way?
Help me know what people
are not understanding
Slow down the slides
We do have a lot of material
for at least 10sec.
What is algorithm？
Informally, an algorithm is any well – defined
computational procedure that takes some
value, or set of values, as input and produces
some value, or set of values, as output .
-- Introduction to Algorithms
Sorting Problem
Input: A sequence of n keys a1, a2, …, an.
Output: The permutation of the input sequence
such that a1’≤a2’≤…≤an’.
• An instance (实例):
Inpute: <31, 41, 59, 26, 41, 58>
Output: <26, 31, 41, 41, 58, 59>
• 如果对每个输入的实例，算法都能正确的

• 不正确的算法对某些输入实例可能根本不

• 不正确的算法只要错误率可控有时可能是

Knuth said
Knuth (1968, 1973) has given a list of five properties that are widely
accepted as requirements for an algorithm:
Finiteness: "An algorithm must always terminate after a finite number of
steps ... a very finite number, a reasonable number"
Definiteness: "Each step of an algorithm must be precisely defined; the
actions to be carried out must be rigorously and unambiguously specified
for each case"
Input: "...quantities which are given to it initially before the algorithm
begins. These inputs are taken from specified sets of objects"
Output: "...quantities which have a specified relation to the inputs"
Effectiveness: "... all of the operations to be performed in the algorithm
must be sufficiently basic that they can in principle be done exactly and
in a finite length of time by a man using paper and pencil"
Finally …
The word algorithm does not have a generally
accepted definition.
From wikipedia
Analysis of algorithms
• The theoretical study of computer-program
performance and resource usage.
• What’s more important than performance?
• modularity
• correctness
• maintainability • functionality
• robustness
• user-friendliness
• programmer time • simplicity
• extensibility
• reliability

• Human Genome Project
• Internet
• Electronic commerce security
• Other manufacturing and commercial
setting
• Etc.

n=1000万

http://www.bilibili.tv/video/av362069/

Fibonacci number
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …,
Leonardo of Pisa (Fibonacci)
1170-1250
Fibonacci number
• An exponential algorithm
int fib1(n)
{
if ( n==0 ) return 0;
if ( n==1 ) return 1;
return fib1(n-1) + fib1(n-2);
}
How much time does it take?
Let T(n) be the number of computer steps needed to
compute fib1(n).
T(n)<=2 for n<=1
T(n)=T(n-1) + T(n-2) + 3 for n>1
(two recursive invocations of fib1, plus three
computer steps to check on the value of n and a final
Therefore, T(n) >= Fn.
The running time of the algorithm grows as fast as the
Fibonacci numbers!
T(n) is exponential in n.
How much time does it take?
To compute F200, the fib1 algorithm executes
T(200)>=F200>=2138 elementary computer
steps.
On a computer which clocks 40 trillion steps
per second, fib1(200) would take at least 292
seconds.
How much time does it take?
Moore’s law: computer speeds doubles roughly every
18 months. Computers get roughly 1.6 times faster
each year. It helps?
The running time of fib1(n) is proportional to 20.694n
= (1.6)n, so it takes 1.6 times longer to compute
Fn+1 than Fn.
So if we reasonably compute F100 ith this year’s
technology, then next year we will manage F101. And
the year after, F102. And so on.
Can we do better?
• Why fib1 is so slow?
Can we do better?
• An polynomial algorithm
int fib2(n)
{
if ( n==0 ) return 0;
int f[n]; // create an array f[0…n]
f[0] = 0, f[1] = 1;
for (int i=2; i<=n; i++)
f[i] = f(i-1) + f(i-2);
return f[n];
}
How long does fib2 take?
• The inner loop consists of a single computer
step and it executed n-1 times. Therefore the
number of computers steps used by fib2 is
linear in n.
• It is now perfectly reasonable to compute F200
or even F200000.
Can we do even better?
Write the equations F1 = F1 and F2 = F0+F1 in matrix notation:
So, in order to compute Fn, it suffices to raise this 2*2 matrix,
call it X, to the nth power.
Two 2*2 matrices can be multiplied using 4 additions and 8
multiplications. How many matrix multiplications suffice for
computing Xn?
Efficient Exponentiation
long pow( long x, int n)
{
if( n == 0 ) return 1;
if( n == 1 ) return x;
if( n%2==0 )
return pow(x*x,n/2);
else
return pow(x*x,n/2) * x;
}
O(log n) matrix multiplications suffice for computing Xn?
Can we do even better?
Thus the number of arithmetic operations
needed by our matrix-based algorithm, call it
fib3, is just O(logn), as compared to O(n) for
fib2.
Sicily 1863.
Elegant fibonacci numbers again
• Description
Fibonacci numbers are nice simple integers. All of you are
familiar with it, aren’t you?
The Fibonacci sequence <F[n]> are defined by the recurrence:
F[0]=0;
F[1]=1;
F[n]=F[n-1]+F[n-2], for n>1
You know that the value of F[n] increases rapidly when the n
becomes larger. So for each testcase, output the value F[n]
mod m will be ok.
Sicily 1863.
Elegant fibonacci numbers again
• Input
The first line of the input is a positive integer.It is the number of the test
cases followed. Each test case contains two integers n (0<=n<2^32) and m
(0<m<10000). There may be one or several spaces between the two
integers.
• Output
The output of the program should consist of one line of output for each test
case. The output of each test case only contains one integer which equals to
the value F[n] mod m. No any redundant spaces are needed.
Sample Input
2
1 1000
2 100
Sample Output
1
1
Sicily 1863.
Elegant fibonacci numbers again
The RAM Model of Computation
• 用基本指令条数来精确度量一个算法的效

– 从理论上分析及其繁琐，从实际中估计也颇为

– 不同的计算机采用的指令集、缓存策略等技术

• 寻找一种简便易行的，与具体机器无关的

The RAM Model of Computation
• Random Access Machine (RAM)
– Each simple operation (+, *, -, =, if, call) takes
exactly one time step.
– Loops and subroutines are not considered simple
operations. Instead, they are the composition of
many single-step operations.
– Each memory access takes exactly one time step.
Further, we have as much memory as we need. The
RAM model takes no notice of whether an item is in
cache or on the disk.
The Big-O Notation
Definition：
f (n)，g (n) are positive function about natural
number. If there is a constant c >0，and natural
number n 0，When n>n 0，there is f (n) <c g (n), so
f (n)=O ( g (n))
If there is a constant c >0，and natural number n 0，
When n>n 0，there is f (n) >c g (n), so
f (n)= Ω(g (n))
if f (n)=O( g (n)), f (n)= Ω( g (n)) , then
f (n)= Θ( g (n))
The Big Oh Notation
Working with the Big Oh
Working with the Big Oh
• Multiplying Functions
O(cf(n)) = O(f(n))
Ω(cf(n)) = Ω(f(n))
Θ(cf(n)) = Θ(f(n))
O(f(n)) * O(g(n)) = O(f(n) * g(n))
Ω(f(n)) * Ω(g(n)) = Ω(f(n) * g(n))
Θ(f(n)) * Θ(g(n)) = Θ(f(n) * g(n))
Working with the Big Oh
• Transitive Experience
If f(n) = O(g(n)) and g(n) = O(h(n)), then f(n)=O(h(n)).
Grow Rates
Grow Rates
Popular：
O(1) < O (logn) <O(n) < O(nlogn) <O(n2)<O(n3)
O(2n) < O(n!) < O(nn)
Some rules：
O( f )+O( g )=O(max( f , g ))
O( f )+O( g )=O( f+g )
O( f )O( g )=O( f *g )
O( c f (n)=O( f (n))
•Popular: O(1), O(log2 n), O(n), O(nlog2 n) , O(n2), O(2n), O(n!)
2n
T(n)
5n2
2000
100n
20
200log2n
10
20
n
Best, Worst and Average-Case
Complexity

• 计算算法运行过程所占用的存储量作为问题规模

• 通常以简单变量的存储空间为单位
• 算法的存储空间包括：
1）程序本身、常量和变量等所需空间
2）程序运行中动态申请的空间和递归调用所

• 通常考虑除输入和程序以外的空间. 如果相对于

selectSort.
Running Time Calculations
• A Simple Example
• General Rules
• Solutions for the maximum Subsequence Sum Problem
A Simple Example
A simple program to calculate 13+23+..+n3.
int sum(int n)
{
int partialSum;
1 partialSum = 0;
2 for (int i=0; i<=n; i++)
3
partialSum += i*i*i;
4 return partialSum;
}
General Rules
Rule 1-For loops.
The running time of a for loop is at most
the running time of the statements inside
the for loop times the number of
iterations.
Rule 2-Nested loops.
Analyze these inside out. The total
running time of a statement inside a group
of nested loops is the running time of the
statement multiplied by the product of the
size of all the loops.
General Rules
Rule 3-Consecutive Statements.
for (i=0; i<n; i++)
a[i] = 0;
for (i=0; i<n; i++)
for (j=0; j<n; j++)
a[i] += a[j] + i + j;
General Rules
Rule 4-If/Else.
For the fragment
if (condition)
S1
else
S2
the running time of an if/else statement
is never more than the running time of the
test plus the larger of the running times
of S1 and S2.
The Maximum Subsequence Sum
Problem
• 给一串整数a[1…n]，求出它和最大的子序

a[i]+a[i+1]+…+a[j]最大
• 介绍四个算法并分析时间复杂度
–
–
–
–

max := a[1];
for i:=1 to n do
for j:=i to n do begin
sum := 0;
for k:=i to j do
sum := sum + a[k];
if sum > max then max := sum;
end;
• 思想：枚举所有的i和j，计算a[i]+…+a[j]，选择最大的
• 时间复杂度如何分析？
– 三层循环
– 内层操作次数取决于i, j
– 中层操作次数取决于i

•
•
•
•

1
– 先计算里层求和号，得  ( n  i  1)( n  i  2)
2
n
n
i 1
ji
n
i 1
– 再加起来？好复杂…
– 直接算法需要利用公式12+…+n2=O(n3)
– 一般地，有1k+…+nk=O(nk+1). 证明: 数学归纳法

n
n
i 1
ji
• 总次数为：  ( j  i  1)
– 完全的计算太麻烦
– 能不能不动笔就知道渐进时间复杂度呢？
• 何必非要计算出详细的公式再化简呢？
– 里层求和计算出的结果是O((n-i)2)
– 12+22+…+n2=O(n3)
– 每步都化简！但是要保留外层需要的变量
• 结论：算法一时间复杂度为O(n3)
• 经验：一般只能支持 n<=200

• 思路
–
–
–
–

s[0] := 0;
for i:=1 to n do
s[i] := s[i–1] + a[i];

• 有了s[i]，怎么快速求a[i]+…+a[j]呢？
a[i]+…+a[j]
= (a[1] + … + a[j]) – (a[1] + … + a[i-1])
=s[j] – s[i-1]

max := a[1];
for i:=1 to n do
for j:=i to n do
if s[j] – s[i-1] > max then
max := s[j] – s[i-1];

• 用一种完全不同的思路
• 最大子序列的位置有三种可能
– 完全处于序列的左半，即j<=n/2
– 完全处于序列的右半，即i>=n/2
– 跨越序列中间，即i<n/2<j
• 用递归的思路解决！
– 设max(i, j)为序列a[i…j]的最大子序列, 那么…

• 用递归的思路解决！
– 设max(i, j)为序列a[i…j]的最大子序列
– 设mid = (i + j)/2，即区间a[i…j]的中点
• 最大的第一种序列为max(i, mid)
• 最大的第二种序列为max(mid+1, j)
• 问题：最大的第三种序列为？？？
– 既然跨越中点，把序列a[i…j]划分为两部分
• a[i…mid]：最大序列用扫描法在n/2时间内找到
– 一共只有mid-1=n/2种可能的序列，一一比较即可
• a[mid+1…j]：同理
• 一共花费时间为j-i+1

//返回数组在区间[x,y)中的最大连续和
int maxsum(int *A, int x, int y)
{
int i, m, v, L, R, max;
if (y-x == 1) return A[x]; //只有一个元素，直接返回
m = x + (y – x) / 2; // 分治第一步：划分成[x,m)和[m,y)
//分治第二步，递归求解
max = maxsum(A, x, m) >? maxsum(A, m, y);
//分治第三步：合并(1),求分界点开始向左的最大连续和；
v = 0; L = A[m-1];
for(i=m-1;i>=x;i--) L >?= v += A[i];
//分治第三步：合并[2],求分界点开始向右的最大连续和；
v = 0; R = A[m];
for(i=m;i<y;i++) R >?= v += A[i];
return max >? (L+R);
}

• 如何分析时间复杂度呢？
– 我们没有直接得到具体的T(n)的式子
– 但是容易得到递推关系T(n) = 2T(n/2) + n
• 设时间复杂度的精确式子是T(n)
• 第一、二种序列的计算时间是T(n/2)，因为序列长度

• 第三种序列的计算时间是n
– 由递归式四, 得T(n)=O(nlogn)

• 算法二的实质是求出i<=j,让s[j]-s[i-1]最大
– 对于给定的j，能否直接找到在j之前的最小s值呢？
– 从小到大扫描j
• j=1时，只有一个合法的i，即i=1, s[1-1]=0
• 如果s[j]变大，则最小的s值和上次一样
• 如果s[j]再创新低，应该让s[j]作为今后的最优s值
min_s := 0;
For j :=1 to n do begin
if s[j] < min_s then min_s := s[j];
if s[j] – min_s > max then
max := s[j] – min_s;
end;

O(n3)

O(n2)

O(nlogn)

O(n)

Why study algorithms and
performance?
• Algorithms help us to understand scalability.
• Performance often draws the line between
what is feasible and what is impossible.
• Algorithmic mathematics provides a language