Going deeper into @Maniero's answer, on the matter of complexity.
Asymptotic complexity says a lot, but it is not everything. Asymptotic complexity depends on variables: it is expressed in terms of the input. So its value can be "fake" depending on which variable you measure against.
Let's take the following loop, which reads the elements of a cube:
elementos = d*d*d;
for (i = 0; i < elementos; i++) {
    x = i % d;        // column
    y = (i/d) % d;    // row (integer division)
    z = (i/d)/d;      // layer (integer division)
    imprime(cubo[x][y][z]);
}
So your algorithm runs in linear time, right? Well, it is linear in the variable elementos, but since what you receive is a cube, it is much more likely that your real concern is d^3.
Here, an algorithm that is quadratic in the number of elements (such as a simple sorting algorithm) would make life much more complicated for a program traversing the data cube, since in terms of the cube side it would be O(d^6).
What happened here is that the complexity was "falsified" to look linear. Some people end up doing similar things to reduce the "apparent complexity" of an operation.
In this case, @Maniero showed in his answer that for every element you traverse, you do O(1) operations. This means that over a set of n elements it is a linear algorithm. However, looking from the perspective of the data stored in an information cube, the size n is itself O(d^3).
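To make that explicit, here is a minimal sketch in PHP of the same traversal written with the cube side $d as the explicit variable ($cubo and its sample values are hypothetical). The work per element is still O(1), but the O(d^3) total cost is now visible in the loop structure itself:
$d = 3;
// build a small d×d×d cube with sample values (hypothetical data)
$cubo = [];
for ($x = 0; $x < $d; $x++)
    for ($y = 0; $y < $d; $y++)
        for ($z = 0; $z < $d; $z++)
            $cubo[$x][$y][$z] = $x * $d * $d + $y * $d + $z;

// same O(1) work per element as the single loop above,
// but the d*d*d iteration count is explicit
for ($x = 0; $x < $d; $x++)
    for ($y = 0; $y < $d; $y++)
        for ($z = 0; $z < $d; $z++)
            echo $cubo[$x][$y][$z], "\n";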
To get an idea of how looking at a different variable affects the stated complexity, look at this example:
add two numbers N and M (in an arbitrary base), each with n digits
The sum has complexity O(n), but also O(log N). (Source: https://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations)
This is because the number of digits needed to represent a number is the logarithm of that number in the base used, so both statements describe the same cost. It is much more honest and visible to say that the cost of the sum is linear in the number of digits.
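As a concrete illustration, here is a minimal sketch in PHP of schoolbook addition over digit arrays (the soma_digitos helper and its representation, least-significant digit first, are assumptions of this sketch). The loop runs once per digit, so the cost is O(n) in the number of digits, which is O(log N) in the magnitude of the numbers:
function soma_digitos(array $a, array $b, int $base = 10): array {
    $n = max(count($a), count($b));
    $resultado = [];
    $vai_um = 0; // carry
    for ($i = 0; $i < $n; $i++) {  // one step per digit: O(n)
        $s = ($a[$i] ?? 0) + ($b[$i] ?? 0) + $vai_um;
        $resultado[] = $s % $base;
        $vai_um = intdiv($s, $base);
    }
    if ($vai_um > 0) {
        $resultado[] = $vai_um;
    }
    return $resultado;
}

// 473 + 58 = 531; digits stored least-significant first
print_r(soma_digitos([3, 7, 4], [8, 5])); // digits 1, 3, 5 → 531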
You can make considerable gains by exploiting spatial locality. I don't know how your matrix was implemented, but if it was implemented as a jagged array (an array in which each element holds a vector, each of whose elements in turn holds a vector of numbers, each with its own individual size), it is expected not to be contiguous. If the data is not contiguous, it may be scattered randomly through memory, and that is not good.
Why is it not good? Because you get weaker guarantees about the spatial locality of your variables. As an optimization, the processor uses caches at several levels; usually the cache receives the most recently used data (temporal locality) and its neighbors (spatial locality), since it is very likely that a nearby memory region will be used next. And what do you possibly not have with jagged arrays? Nearby memory regions between the data.
To try to get a significant gain from this, we could iterate over the matrix as follows:
for ($i = 0; $i < 10; $i++) {
    for ($j = 0; $j < 20; $j++) {
        for ($p = 0; $p < 40; $p++) {
            echo $vaar[$i*20*40 + $j*40 + $p];
        }
    }
}
Or else:
for ($i = 0; $i < 10*20*40; $i++) {
    echo $vaar[$i];
}
What we do here is linearize the matrix. Asymptotically this iteration remains cubic, since it is impossible to visit all the elements of a cube any faster. However, we are now dealing with consecutive data, data that will be contiguous in memory.
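For reference, a minimal sketch of that linearization step, assuming the original 10×20×40 jagged array is called $matriz (a hypothetical name): copy it once into a flat array so that later traversals touch consecutive indices:
$ni = 10; $nj = 20; $np = 40;
$vaar = [];
for ($i = 0; $i < $ni; $i++)
    for ($j = 0; $j < $nj; $j++)
        for ($p = 0; $p < $np; $p++)
            // $matriz is the original jagged array (hypothetical name)
            $vaar[$i * $nj * $np + $j * $np + $p] = $matriz[$i][$j][$p];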
Since the data is contiguous, when fetching into the L1 cache the value at, say, position 10 of the vector, the processor can also bring into the cache the values between positions 5 and 14 (inclusive). Thus the number of accesses to main memory decreases, and with fewer accesses to main memory we run faster. Not to mention that the number of dereferences is lower.
In a jagged array, every lookup of a matrix index is a dereference, and a dereference means loading the variable and asking the processor to fetch the memory region it points to.
For example, to access $vaar[$i][$j][$p]:
- take the address indicated by $vaar (load variable into memory)
- take the value of $i (load variable into memory)
- advance $i slots from $vaar
- load the value at the address obtained by that sum (address dereference)
- take the value of $j (load variable into memory)
- advance $j slots from the previously loaded address
- load the value at the address obtained by this new sum (address dereference)
- take the value of $p (load variable into memory)
- advance $p slots from the loaded address
- load the value at the address obtained by that new sum (address dereference)
Since these variables are all in the same scope, it is very likely that, by temporal locality, loading them does not involve accessing main memory. Even so, for each distinct $i or $j it will be necessary to load the new inner vector being referenced, and those vectors are likely scattered through memory.
Compare with the simplest iteration over the linearized matrix, $vaar[$i]:
- take the address indicated by $vaar (load variable into memory)
- take the value of $i (load variable into memory)
- advance $i slots from $vaar
- load the value at the address obtained by that sum (address dereference)
Done: with only a single dereference, the value is now loaded. And since we are dealing with a contiguous block of data, the next number will be nearby and can therefore be operated on quickly.
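If you want to check the effect yourself, here is a minimal sketch of a micro-benchmark (the sample data is hypothetical; actual numbers vary by PHP version and hardware, and PHP arrays are hash tables internally, so the gain is smaller than in a language with raw contiguous arrays):
$ni = 10; $nj = 20; $np = 40;
// build a jagged array and its linearized copy with sample values
$jagged = [];
$flat = [];
for ($i = 0; $i < $ni; $i++)
    for ($j = 0; $j < $nj; $j++)
        for ($p = 0; $p < $np; $p++) {
            $jagged[$i][$j][$p] = $i + $j + $p;
            $flat[$i * $nj * $np + $j * $np + $p] = $i + $j + $p;
        }

$soma = 0;
$t0 = microtime(true);
for ($i = 0; $i < $ni; $i++)          // three dereferences per element
    for ($j = 0; $j < $nj; $j++)
        for ($p = 0; $p < $np; $p++)
            $soma += $jagged[$i][$j][$p];
$t1 = microtime(true);
for ($i = 0; $i < $ni * $nj * $np; $i++)  // one dereference per element
    $soma += $flat[$i];
$t2 = microtime(true);
printf("jagged: %.6fs, linear: %.6fs\n", $t1 - $t0, $t2 - $t1);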
If you are going to do something like this, remember:
- to access position ($i, $j, $p) of the cube, use the formula $vaar[$i*20*40 + $j*40 + $p] (a helper like the sketch after this list can hide the arithmetic);
- will your maintenance team find this code easy to read?
- if not, perhaps the performance gain does not compensate for the code's possible new "illegibility".
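On that readability point, a minimal sketch of a hypothetical helper (indice is an invented name; $vaar is the linearized matrix from above) that keeps the index arithmetic in one place:
// dimensions 10×20×40, as in the loops above (assumed)
function indice(int $i, int $j, int $p): int {
    return $i * 20 * 40 + $j * 40 + $p;
}

echo $vaar[indice(2, 5, 7)]; // same element as $vaar[2][5][7] in the jagged layout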
It depends on what you want: if you want to fetch one specific item, you could simplify a lot; if you want to display everything, there is nothing else to be done. Performance is always relative; usually, if you have no operation inside each loop other than displaying, it will be fast, even with 3 fors.
– Guilherme Nascimento
If your variable is three-dimensional and you want to print all the values, that is how it is done. But the loops are short; are you having some performance problem?
– bfavaretto
Actually I need to supplement my question, then, because I only gave a micro-scale example; in my real problem there are 4 fors, and each one does something with the values of the vectors.
– Julio Henrique
Well, the performance of this will depend on the size of the vector and on the operations you do with each item. But if you need to access all the items, you really do need to nest the fors; there is no other way.
– bfavaretto