Algorithm 1 - Sorting & Complexity

- 3월 04, 2024

Algorithm : finite set of instructions that accomplish some task.
basic requirements : input(0 or more), output(1 or more), Definiteness(unambiguous), finiteness(terminate after steps), effectiveness

Sorting algorithm : reorder elements in the list into certain order.
max/min, uniqueness, count, search 등 여러 부분에 응용될 수 있기에 매우 실전적인 문제.

input : sequence of n numbers
output : permutations of inputs(sorted)

1. Bubble sort : repeatedly swap adjacent elements.
n개의 element에 대해 n번의 iteration이 필요하며 각 iteration마다 최대/최소 인 element의 자리가 맞춰진다.
각 스테이지마다 n-1, n-2 .... 등등 점점 줄어드는 횟수의 비교가 필요하고, 이를 모두 더하면 n(n-1)/2가 되므로 O(n2)가 된다.

2. Selection sort : repeatedly select smallest element and move it to sorted part.
unsorted portion에서 가장 작은 것을 비교하며 찾아서 sorted portion의 가장 오른쪽에 놓는 방식이 된다. (작은 것을 찾아 앞으로 놓을 경우)
마찬가지로 n개에 대해 n 번의 iteration이 필요하고, 각 iteration에서 나머지 element와의 비교가 필요하다.

Evaluate Algorithm performance
running time을 기준으로 잡으면 실행하는 기계에 따라 다른 시간이 걸릴 수 있으므로 올바르지 않다. 따라서 time complexity를 기준으로 알고리즘의 효율성을 비교한다.

- Time complexity : 실행되는 operation, primitive instruction의 line 수를 생각. N개의 element input에 대해 몇 번의 계산이 수행되는가? (number of primitive instructions)

- Space complexity : input n에 대해 알고리즘이 function을 실행하는데 필요한 공간의 크기.
Asymptotic Analysis : input size에 기반하여 algorithm의 performance를 측정하는 방법.

input size를 늘려갈 때 알고리즘이 필요로 하는 time/space가 증가하는 속도를 비교하여 알고리즘을 평가할 수 있는 방법으로 특정 문제에 대해 서로 다른 알고리즘의 효율을 평가할 때 유용한 방법이다.

Asymptotic Notation : 증가하는 input size에 대해 time complexity(수행되는 연산 수)가 얼마나 빠르게 증가하는지 비교하는 방법.
이를 통해 서로 다른 input size에 대해 얼마나 많은 연산이 수행되는지 알고리즘 별로 비교할 수 있다.

1. Big- Oh notation : Less than or Equal to
O(g(n)) = {f(n) : 0<=f(n)<= c*g(n) for all n>=n0} 로 표현한다. 이는 어떤 n 이상에서 f(n)이 g(n)의 상수 배를 한 것 보다 항상 작거나 같을 때 이를 O(g(n)) = f(n)으로 표현한다.

이런 빅오 관계를 표현하려면 두 함수에 대해 대소 관계를 만족하는 계수 c와 정의역 n0를 결정하면 된다. (c>0)

그러나 이를 통해 나타낸 upper bound는 범위가 너무 넓기에 의미가 없는 경우가 많다. (asymptotic tight)

2. Little-Oh notation : much smaller than
위의 기준에서 0<=f(n) < c*g(n)으로 상한에 '='이 포함되지 않는다. 이 때, f(n)이 g(n)보다 asymptotically small이라 할 수 있다. (not asymptotically tight)
이 경우, c의 범위가 더 넓어져 위와 달리 some positive constant가 아니라 all positive constant가 된다.

3. Big-Omega notation: greater than or equal to
위와 마찬가지로 이번에는 상한을 표현하는 경우.
0<=c*g(n)<=f(n) : f(n) is greater than or equal to g(n). -> function g(n) is asymptotic lower bound of f(n)
그러나 위의 BIg-O와 동일하게 not tight enough

4. Little-Omega notation : much larger than
앞서와 동일하게 '='이 빠지고 0<= c*g(n) < f(n) 이 되며, 이는 f(n)이 c*g(n)보다 much larger 이라는 것을 나타낸다. -> f(n) is asymptotically larger than g(n)
이것도 모든 positive constant에 대해 만족한다. (18쪽에 오타는 and/and/but)

5. Big-theta notation : 위의 두 조합을 합친 것.
어떤 함수 g(n)의 c1배 보다는 항상 크고, c2배 보다는 항상 작은 함수 f(n)이 존재할 때 이를 나타낸다. (상한과 하한을 결정)
0<=c1 * g(n) <= f(n) <= c2 * g(n) : g(n)이 asymptotic bound가 된다.

따라서 f(n)의 upper bound O(g(n)), lower bound W(g(n)) < - > f(n) = Theta(g(n)) : tight bound 이다. (필요충분조건)

그러나 결국 가장 높은 차수만 남기고 그보다 낮은 차수를 날림으로써 O(n)을 구할 수 있다. (반복은 n, 나머지는 constant, nested iteration은 nest 개수만큼 n 제곱)

이 블로그 검색

gaenyangi's space

Algorithm 1 - Sorting & Complexity

댓글

댓글 쓰기

이 블로그의 인기 게시물

IIKH Class from Timothy Budd's introduction to OOP

Compiler 9 - Efficient Code generation

Software Engineering 10 - V&V, SOLID principle