The complexity of algorithms is given by a function. When we are dealing with classic nonrecursive problems, we usually come across polynomials.
TL;DR: @Isac's answer shows the necessary results in a very elegant way.
TL;DR 2: an excellent answer, with proper mathematical rigor, on the membership of a function in its order of complexity.
As stated in the comments, big O notation represents sets. So when we write n^2 + n = O(n^2), we are saying that the function n^2 + n belongs to the set O(n^2). When we say that O(n^2) = O(n^3), we are saying that the set O(n^2) is contained in O(n^3); note that this "equality" is not symmetric, since it is not traditional equality: even though O(n^2) = O(n^3) is true, O(n^3) = O(n^2) is false.
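In set terms, a sketch of the usual definition (not spelled out in the text above, but standard):

f(x) belongs to O(g(x)) if, and only if, there exist constants c > 0 and n0 such that f(n) <= c * g(n) for every n >= n0.

For example, n^2 + n <= 2n^2 for every n >= 1, so n^2 + n belongs to O(n^2) with c = 2 and n0 = 1.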
Okay, now how do we define the membership of a function in a set described by big O notation? By dividing one function by the other and taking the limit at infinity (literally at infinity; I will come back to this point later).
You may have noticed that big O notation is always accompanied by a function, right? Like O(n log n), O(n), O(n^2), O(e^n), O(n^3 + n). The function within the parentheses represents the set's behavior; I had not found a formal name for it, so we could call it the set's signature, but the proper name for this function is limiting function (information courtesy of @LINQ).
To determine whether a function f(x) belongs to the set O(g(x)), we need to know whether f(x) dominates the limiting function of the set, g(x); if f(x) dominates g(x), then f(x) does not belong to O(g(x)), but to its complementary set omega(g(x)).
Notation omega: with big O notation we define which functions have maximum asymptotic behavior equal to the set's limiting function; with omega notation we define which functions have minimum asymptotic behavior equal to the set's limiting function. For example, insertion sort runs in omega(n) (input already ordered or only slightly disordered), but merge sort runs in omega(n log n).
Notation theta: if f(x) = O(g(x)) and f(x) = omega(g(x)), then f(x) = theta(g(x)); theta notation establishes the minimum and maximum asymptotic behaviors at the same time. For example, bubble sort without the early-exit flag runs in theta(n^2).
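A quick sanity check of that bubble sort claim (my own arithmetic, not from the original text): the flagless version always performs n(n - 1)/2 comparisons; since n(n - 1)/2 <= n^2 for every n >= 1, it is O(n^2), and since n(n - 1)/2 >= n^2/4 for every n >= 2, it is also omega(n^2); therefore it is theta(n^2).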
So now we need to define the domination operation between functions. I will call this function domina(f, g); it takes two functions and returns one of three possible results:

dominada, if f is dominated by g;
co-dominantes, if it is not possible to define which one dominates the other;
dominante, if f dominates g.

The function domina(f, g) is described in terms of the limit of the division mentioned earlier:
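A sketch of that definition, written out from the limit-of-the-division idea above:

domina(f, g) = dominada,       if lim(n -> inf, f(n)/g(n)) = 0
               co-dominantes,  if lim(n -> inf, f(n)/g(n)) = c, with 0 < c < inf
               dominante,      if lim(n -> inf, f(n)/g(n)) = inf

So f(x) belongs to O(g(x)) exactly when domina(f, g) returns dominada or co-dominantes; when it returns dominante, f(x) belongs to omega(g(x)).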
Okay, now we know how to determine when a function f(x) belongs to a set O(g(x)), but this is still not enough to prove that a given algorithm runs in that order of complexity. For that, we need to know, more or less, how many operations the algorithm needs to finish its job.
The number of operations depends on the machine running the algorithm. For example, an x86 has, as a unit operation, the addition of two 16-bit numbers n and m returning another 16-bit number, performing the addition in O(1); a simple Turing machine using unary notation cannot add in a constant number of operations, needing O(n + m) operations to complete the computation.
So, yes, soma(n, m) = O(1) on x86 processors with n and m of a fixed 16-bit size, but soma(n, m) = O(n + m) on a Turing machine. (Note: on the Turing machine we are not bounding the values of n or m.)
Normally, basic arithmetic operations (such as addition, multiplication and division) on numbers of reasonable magnitude are performed on fixed-width storage, and we consider them to have weight O(1).
Small informal demonstration: let b be the fixed word size; b is therefore a constant, O(1). n is limited to b digits, so it has size O(b). Addition of binary numbers is done by examining each bit of n and of m a single time, together with an auxiliary carry variable, so the Turing machine passes over all the bits only once, which gives O(tamanho(n) + tamanho(m)); as each size is bounded by O(b), O(tamanho(n) + tamanho(m)) = O(O(b) + O(b)) = O(b + b) = O(2b) = O(b) = O(1).
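To make the demonstration concrete, here is a sketch of that bit-by-bit walk in JavaScript (somaBits, the LSB-first bit arrays, and the parameter b are hypothetical illustrations of mine, not part of the original answer):

function somaBits(n, m, b) {
    // n and m are arrays of b bits each, least significant bit first
    const resultado = new Array(b).fill(0);
    let carry = 0; // the auxiliary variable mentioned in the text
    for (let k = 0; k < b; k++) { // each bit is visited exactly once: O(b)
        const s = n[k] + m[k] + carry;
        resultado[k] = s % 2;
        carry = s > 1 ? 1 : 0;
    }
    return resultado; // final carry discarded, as in fixed-width hardware
}

Since b is fixed, the loop always runs a constant number of times, hence O(1).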
In nonrecursive cases (whether directly or indirectly), simply inspect the loops to see how many operations are done. Let's look at your case:
function somaMatriz(matA, size) {
    let soma = 0;
    for (let i = 0; i < size; i++) {
        for (let j = 0; j < size; j++) {
            soma += matA[i][j];
        }
    }
    return soma;
}
We can split that function in two and keep the same asymptotic behavior:
function somaLinha(matA, size, i) {
    let soma = 0;
    for (let j = 0; j < size; j++) {
        soma += matA[i][j];
    }
    return soma;
}

function somaMatriz(matA, size) {
    let soma = 0;
    for (let i = 0; i < size; i++) {
        soma += somaLinha(matA, size, i);
    }
    return soma;
}
That means somaMatriz(matA, size) will do O(size * O(somaLinha(matA, size, i)) + 2) operations: size * O(somaLinha(matA, size, i)) because the outer loop repeats size times (i ranging over [0, size) is O(size)), performing the operation somaLinha(matA, size, i) each time. The 2 that appears is due to the return and the initialization, but we know that any growing function dominates a constant value, so this 2 ends up leaving the signature.
Very well; to determine the cost of somaLinha(matA, size, i), we can split it into two parts (similarly to what we just did with somaMatriz(matA, size)):
function operador_soma(a, b) {
    return a + b;
}

function somaLinha(matA, size, i) {
    let soma = 0;
    for (let j = 0; j < size; j++) {
        soma = operador_soma(soma, matA[i][j]);
    }
    return soma;
}
Similarly, somaLinha(matA, size, i) will do O(size) operations of operador_soma. operador_soma is just a simple addition of reasonable, fixed-width numbers, so it executes in O(1). Therefore, somaLinha(matA, size, i) = O(size * operador_soma) = O(size * O(1)) = O(size).
Substituting this result into the asymptotic behavior of somaMatriz, we get O(somaMatriz(matA, size)) = O(size * O(somaLinha(matA, size, i)) + 2) = O(size * O(size) + 2) = O(size * size) = O(size^2).
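If you want to see this growth empirically, a small instrumented sketch can count the additions (contaOperacoes is a hypothetical helper of mine, not part of the question's code):

function contaOperacoes(size) {
    const matA = Array.from({ length: size }, () => new Array(size).fill(1));
    let soma = 0;
    let operacoes = 0; // counts executions of the innermost addition
    for (let i = 0; i < size; i++) {
        for (let j = 0; j < size; j++) {
            soma += matA[i][j];
            operacoes++;
        }
    }
    return operacoes;
}

contaOperacoes(10) returns 100 and contaOperacoes(100) returns 10000: multiplying size by 10 multiplies the operation count by 100, as expected for O(size^2).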
When the loop does not have its size defined in a 100% obvious way, the analysis is a little more complicated.
Take as an example the function that computes the union of two ordered sets (based on the function from that answer):
input:
    A, ordered set of elements of type E
    B, ordered set of elements of type E
    cmp, sign function that compares two elements of E
returns:
    C, set of elements of type E resulting from the union of A and B
begin
    i <- 0 # index to iterate over A
    j <- 0 # index to iterate over B
    C <- []
    ultimo_elemento_adicionado <- null
    while i < A.tamanho && j < B.tamanho:
        s = cmp(A[i], B[j])
        if s == '0':
            # elements are equal; take one of them as the candidate element
            candidato <- A[i]
            i <- i + 1
            j <- j + 1
        else, if s == '-':
            # A[i] < B[j], so the next comparison will be between A[i + 1] and B[j]; A[i] is now the candidate
            candidato <- A[i]
            i <- i + 1
        else # trivial case where s == '+':
            # A[i] > B[j], so the next comparison will be between A[i] and B[j + 1]; B[j] is now the candidate
            candidato <- B[j]
            j <- j + 1
        # now let's see whether the candidate should be inserted into C: it needs to be distinct from the last element added, or be the first element added
        if ultimo_elemento_adicionado == null || cmp(candidato, ultimo_elemento_adicionado) != '0':
            ultimo_elemento_adicionado = candidato
            C.push(candidato)
    # if i or j goes past the size of A or B, respectively, there are no more comparisons to make
    return C
end
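For reference, a direct JavaScript transcription of this pseudocode (a sketch of mine; the name uniao and the camelCase variable are assumptions, and cmp is assumed to return '-', '0' or '+' as above):

function uniao(A, B, cmp) {
    let i = 0; // index to iterate over A
    let j = 0; // index to iterate over B
    const C = [];
    let ultimoElementoAdicionado = null;
    while (i < A.length && j < B.length) {
        const s = cmp(A[i], B[j]);
        let candidato;
        if (s === '0') {
            // elements are equal; take one of them as the candidate
            candidato = A[i];
            i++;
            j++;
        } else if (s === '-') {
            // A[i] < B[j]; A[i] is now the candidate
            candidato = A[i];
            i++;
        } else {
            // A[i] > B[j]; B[j] is now the candidate
            candidato = B[j];
            j++;
        }
        // insert only if it is the first element added or distinct from the last one
        if (ultimoElementoAdicionado === null || cmp(candidato, ultimoElementoAdicionado) !== '0') {
            ultimoElementoAdicionado = candidato;
            C.push(candidato);
        }
    }
    return C;
}

Like the pseudocode, it stops as soon as one of the sets is exhausted; for example, uniao([1, 2, 4], [2, 3], (a, b) => a < b ? '-' : a > b ? '+' : '0') returns [1, 2, 3].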
Assume that the operations of cmp are performed in constant time, O(1), and that appending an element at the end of an indexed set is also done in O(1).
The stopping condition is reached when i hits the size of A or when j hits the size of B. In each loop iteration, i or j (or both) are incremented. This means that, to stop via the condition i >= A.tamanho, A.tamanho increments of i are needed; meanwhile, to stop via the condition j >= B.tamanho, B.tamanho increments of j are needed. As each iteration guarantees at least one increment, the maximum number of iterations is (A.tamanho - 1) + (B.tamanho - 1) + 1, so this loop needs O(A.tamanho + B.tamanho) operations to run.
In the case of merge sort, we have a recursive function. Its analysis is even more complicated, so I will not worry about formalism here, but about conveying the general idea.
Reviewing merge sort:
mergeSort(E[] entradaDesordenada, E[] areaTrabalho, int ini, int end) {
    if (end - ini <= 1) {
        return; // 0 or 1 element: already ordered
    }
    int middle = (ini + end) / 2;
    // sort each half, swapping the roles of the two arrays
    mergeSort(areaTrabalho, entradaDesordenada, ini, middle);
    mergeSort(areaTrabalho, entradaDesordenada, middle, end);
    // join the two ordered halves
    merge(entradaDesordenada, areaTrabalho, ini, middle, end);
}
The idea of merge is to join two ordered sets into a third ordered set. The bulk of the algorithm is very similar to the union algorithm I presented earlier, including the same asymptotic behavior. Therefore, merge(A) = O(tamanho(A)); if we call tamanho(A) = n, the complexity of merge is O(n).
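As an illustration, a merge sketch in the same style as the union above (my own transcription, consistent with the call merge(entradaDesordenada, areaTrabalho, ini, middle, end); it assumes the second array holds the two ordered halves and the first receives the result, uses <= directly instead of a comparator for brevity, and, unlike the union, keeps duplicates):

function merge(destino, origem, ini, meio, fim) {
    let i = ini;  // walks the left half [ini, meio)
    let j = meio; // walks the right half [meio, fim)
    for (let k = ini; k < fim; k++) {
        // take from the left half while its head is the smaller one
        if (i < meio && (j >= fim || origem[i] <= origem[j])) {
            destino[k] = origem[i];
            i++;
        } else {
            destino[k] = origem[j];
            j++;
        }
    }
}

Each element of the slice is copied exactly once, hence the O(n) behavior.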
So all that remains is to define how many times merge will be called. The recursion of mergeSort is set up so that each recursive call receives only half of the elements. Hence the depth of the recursive calls is log2(n); as log2(n) = log(n) * log2(10), we can say the depth is O(log n). At each recursion level, mergeSort is called twice, each call with n/2 elements; so each level calls merge for n/2 elements 2 times.
The general idea can be visualized as a recursion tree, with one node for each call:
Note that at each level merge runs several times (one execution for each node of the tree). In total, the number of elements being merged at the same recursion depth is n. So, overall, O(recursion depth * n) operations are performed. As the depth is O(log n), the number of operations is O(n log n).
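Spelling out the count per level (my own arithmetic, consistent with the tree): level k has 2^k calls to merge, each over n/2^k elements, so 2^k * (n/2^k) = n operations per level; with O(log n) levels, that gives O(log n) * n = O(n log n) in total.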
What is the relevance of this in practice? Well, I once ran an experiment to answer a question about sorting algorithm performance; you can check the results in that answer. The difference between an algorithm with time complexity O(n log n) and one with O(n^2) reached up to two orders of magnitude in total time for an input with n = 100,000.
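To put rough numbers on it (raw operation counts of mine, ignoring the constant factors that shape measured times): for n = 100,000 we have n^2 = 10^10, while n * log2(n) is about 100,000 * 17, roughly 1.7 * 10^6.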
Big O notation represents sets. We use the equality sign out of habit; the correct thing would be to say n^2 + n belongs to O(n^2).
– Jefferson Quesado
@Jeffersonquesado I ended up finding the "belongs to" sign a little ugly when I put it there, so I switched to the equality sign xD
– Jéf Bueno
These are language vices. And honestly it looks much prettier with = and everyone understands it, so it's a forgivable vice.
– Jefferson Quesado
lim(n->inf, 5n2+3n)=lim(n->inf, n(5n+3))
3 is despicable against 5n therefore...=lim(n->inf, 5n2)=5*lim(n->inf, n2)
==> ...O(N2)– JJoao
5n^2 + 3n <= 5n^2 + 3n^2 => 5n^2 + 3n <= 8n^2, which would give a k of 8; so an^2 + bn <= an^2 + bn^2 => an^2 + bn <= (a + b)n^2, which implies that a k of a + b always works. Is that what you were trying to get at?
– Isac
@Isac I got confused; => really is an arrow, correct?
– Jefferson Quesado
@Jeffersonquesado yes, an implication arrow (implies that), to follow the equation along. Normally it would be the <=> of equivalence, but it came out that way.
– Isac
@Isac right, it's just that this same arrow confused me.
– Jefferson Quesado
@Isac That's it. Right along those lines.
– Jéf Bueno
@Isac, I showed membership in a big O set by function division. It is essentially the formalization of what you wrote. Not to mention that it lets you see that O(n) = O(n^2) but O(n^2) =/= O(n).
– Jefferson Quesado
@Jeffersonquesado yes, your answer is very good and covers what I said formally; I was just giving a possible short calculation for the k (which in the question is denoted m) in an equation of the type an^2 + bn.
– Isac
Jefferson gave a very complete answer. Actually, it's very simple: to prove it, just compute the limit. Why has no answer been accepted? Should I also try to explain, to see if he understands?
– RHER WOLF
@RHERWOLF Try to explain to see if who understands?
– Jéf Bueno
You. It says here that you haven't accepted any answer, so it hasn't been cleared up yet, has it? I have a more direct explanation in mind to post; it may be more appropriate.
– RHER WOLF
@RHERWOLF actually it means that I haven't decided which answer will be accepted yet and I'm open to new answers. Feel free.
– Jéf Bueno
@LINQ ok, the answers there are sufficient, quite complete.
– RHER WOLF