12
When we are dealing with trees and graphs we find these terms. What do they mean and why are they important in using data structures of these types and algorithms that manipulate them? What I gain or lose when one or the other occurs?
12
When we are dealing with trees and graphs we find these terms. What do they mean and why are they important in using data structures of these types and algorithms that manipulate them? What I gain or lose when one or the other occurs?
8
Breadth first and Depth first are two similar but distinct search strategies. Normally you find the full term: BFS or DFS, where the S of the acronym means search.
Its meanings are:
Yes, there are other search alternatives. Djikstra’s algorithm, for example, is a priority search. I’m not going to go into sordid details about other types of graph search, I’ll just take that quote
Before going into more detail, an extremely important point to choose which algorithm to use: in decision problems. The search in width is guaranteed to find solution (if any) in a RE problem, the search in depth does not; the search in depth finds much faster a problem in class R (which is a subclass of RE).
Joint definition of class R:
The class RE consists of all the decision problems that exist a Turing machine can define, in finite time, whether the answer is "yes"; that is the class of tractable problems (for "yes" answer). In it, we have the parade problem and the problem of recognizing whether a word belongs to the language of an unrestricted grammar.
The class co-RE, however, is composed of the problems to which some Turing machine would answer "no" in finite time; this is the class of tractable problems (for "no" answer). An example of co-RE problem is the domino problem using Wang tiles periodically (more details).
The class R is composed of the problems that always have the answer, whether "yes" or "no", in a finite time (that is why it is the intersection between RE and co-RE); that is the class of decidable problems. A problem that is contained in R (and also contained in NP, a very small and restricted subset of R) is the Hamiltonian path problem. Your decision version looks like this: given a graph G
(with weighted edges) and a number k
, I can go through all the vertices of G
only once so that my path has weight less than or equal to k
?
When working with graphs (and, consequently, with trees), we need to find some condition. There are several types of conditions to be found, but usually they are divided into the following:
Many algorithms start from a single point, while others start from several points (Floyd-Warshall part of all vertices at the same time _(ツ)_/).
For those starting from some unique source, it needs to determine how to visit the neighboring vertices to then continue the search. So you need to go to the neighbors' neighbors and so on and so forth.
The search in width visits, at a time, all vertices of a certain depth (related). Already the in depth, no, goes until the end of a path to then try an alternative path.
One problem to visualize this is finding the exit in a maze.
Given a map, where
.
means free space,#
means insurmountable wall,s
the exit andi
its initial position, it is possible to get out of the maze?Example:
####### #...#.# #.#...# #.###.# #.#s#.# #i....# #######
First: is this a graph problem? Yes, it is. But here we do not explicitly define the set of vertices and the set of edges. We have, here, 19 vertices (all Tiles with: .
or i
or s
) and 18 untargeted edges.
So we need to find a vertex with a special property, starting from a definite origin. Below, you check the execution of the algorithm in depth taking into account the following:
I’ll mark points already visited on that particular descent with a X
. When it is necessary to backtracking, I will mark the points of the old path with B
. When I reach the target, I will mark with success with Y
:
Now, let’s see the problem being solved with a wide search. Here, we will avoid making any kind of walk over the path already covered (as we did in the biscuit in depth). I will place in the lower left corner of the animation which level of depth I am expanding:
If we were to use another search, SEARCH FOR THE* would meet and would be even better than the search in width. But she is a heuristics, not a purely systematic search.
Another interesting problem about graphs is the problem of zombie bridge.
The problem consists of the following statement:
A group of
N
people released a horde of zombies in the laboratory from the top of the mountains. To escape from this horde, they need to cross a bridge. There’s only one person with a flashlight and it’s too dark. So, it’s only possible to cross the bridge with one person holding the lantern and one person following, right behind. And there’s something else, each person has a different time to cross the bridge, and when two people cross at the same time, they need to walk at the slowest pace.What is the shortest possible time to cross all the people of the bridge?
The input will be provided in two lines. In the first, we have a number
N
with the amount of people. In the second line, will haveN
numbers with the time (in minutes) each person takes to cross the bridge.Example:
4 1 2 5 10
First of all, is this a graph problem? Well, oddly enough, it is a graph problem... but, as with the other problem, the graph is implicit.
Let’s take the example case: 4 people, times 1, 2, 5 and 10. We can cross in 19 minutes if we go through the graph in one of these ways:
For this specific case, it has a heuristic defining an upper limit:
This ensures that I guarantee that at least soma todos - menor + (quantidade - 1)*menor
minutes I can cross the bridge (in the example in question this value is 18 - 1 + (4-1)*1 = 17 + 3 = 20
). With that, I can do an in-depth search and, if by chance my search has passed a path of more than 20 minutes, I can abort that descent. So in this case, a deep search goes well.
By the way, I found 17 minutes for everyone’s crossing, and it seems to be the best case
Finally, I would like to quote game Nim (more details). This game (in the chosen variation) consists of the following:
There are 2 players, Alice and Bob, who play alternately. There is a stack with
n
sticks, each player needs to remove 1, 2 or 3 toothpicks on his turn. The player who removes the last toothpick loses.You must determine the winning strategy for Alice, where the input of the problem consists of a positive integer
n
and which player will start playing.
That, again, is a graph. Each vertex of the graph consists of 2 pieces of information:
This generates a bipartite acyclic directed graph whose size is limited only by n
maximum; as nothing has been described regarding n
apart from being integer and positive, this graph has the potential to be infinite.
So we have the following vertices that define victory:
1, Alice
; as Alice needs to draw at least one toothpick, and has only one toothpick to be removed, so Bob is the winner1, Bob
; analogous to the previous, with Alice being the winnerSo, how to discover the winning strategy? Doing a set of searches by depth. If the vertex represents Alice’s turn, then all it takes is one of the possible edges to win for Alice; in case the turn is Bob’s, all possible edges must return victory to Alice to be a winning strategy. The advantage of using deep search is that we can use memoisation: if I find out that k, J
is guaranteed victory for Alice, so we have to, if by chance I meet again k, J
, I already know the answer to that situation and then I don’t need to sail through it again.
We know that 1, Bob
is guaranteed victory for Alice. And the vertex 5, Bob
? Well, from here we have 3 possibilities: 4, Alice
, 3, Alice
and 2, Alice
. If, by any chance, all these options result in a guaranteed victory for Alice, then 5, Bob
is victory for Alice.
Let’s go to 2, Alice
first. An alternative is to remove 2 toothpicks, causing Alice to lose. However, there is another alternative: to remove 1 toothpick. As we remove a toothpick, we go to the vertex 1, Bob
, so Alice is the winner.
Now in 3, Alice
. We have 3 alternatives: 1, Bob
, 2, Bob
and 3, Bob
. 2, Bob
and 3, Bob
do not generate guaranteed victory for Alice, but 1, Bob
gera. Then, 3, Alice
is also victory.
Finally, we have 4, Alice
. From here, the alternatives are: 3, Bob
, 2, Bob
and 1, Bob
. We found in the previous navigation that 3, Bob
and 2, Bob
may generate victory for Bob, but 1, Bob
generates victory for Alice.
Therefore, how 2, Alice
, 3, Alice
and 4, Alice
guarantee victory for Alice, 5, Bob
also guarantees. And 6, Alice
? Well, of 6, Alice
we can go to 5, Bob
, that guarantees victory. And 6, Bob
? Of 6, Bob
we fell in 5, Alice
, 4, Alice
(already proven Moose winner) and 3, Alice
(already proved Alice winning). So, of 5, Alice
we can guarantee Alice’s victory?
For that, we must see 4, Bob
, 3, Bob
(already proven that Bob can win) and 2, Bob
(already proven that Bob can win). From 4, Bob
we can go to 1, Alice
, that guarantees Bob victory. Therefore, in 6, Bob
Bob can win.
And 7, Alice
? Well, of 7, Alice
we can go to 5, Bob
, that is guaranteed victory for Alice. And so we can proceed to any time, using the in-depth search memoizing the results.
The search in width, however, would not favor in the case of memoization. It would add in the row of vertices to be visited the same vertex several times, increasing the maximum memory load. Not to mention that, to catch all of Bob’s moves of a certain vertex k,Bob
, the search in width will not facilitate to get the answer after finishing the navigation.
To do an in-depth search, it is said that you use a stack; already, for the wide search, you say you use a queue. They just don’t tell you which objects are stacked or lined up.
In the case of the search in width, the vertices themselves line up. More or less like this:
def busca_largura(origem, estrutura_auxiliar):
fila = Fila()
fila.push(origem)
while not fila.empty():
sendo_visitado = fila.pop()
if not estrutura_auxiliar.ja_visitou(sendo_visitado):
estrutura_auxiliar.marcar_visitado(sendo_visitado)
[fila.push(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]
On the other hand, the search in depth, although it is said to use a stack, usually this stack is delegated to the recursive call of functions:
def busca_profundidade(sendo_visitado, estrutura_auxiliar):
estrutura_auxiliar.marcar_visitado(sendo_visitado)
[busca_profundidade(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]
If you want to implement this iteratively, well, it’s possible. A naive implementation would stack the vertices. It is almost equal to the search in width, only changes the data structure used:
def busca_profundidade_iterativa_naive(origem, estrutura_auxiliar):
pilha = Pilha()
pilha.push(origem)
while not pilha.empty():
sendo_visitado = pilha.pop()
if not estrutura_auxiliar.ja_visitou(sendo_visitado):
estrutura_auxiliar.marcar_visitado(sendo_visitado)
[pilha.push(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]
But to be honest, in this case you’re using more memory than the recursive function. That’s why I called this solution naive. The recursive solution keeps the node being visited in the stack and also an indicator of which child is being visited. So, here I would stack a pair of data: the vertex and the index of the next child to be visited. Starting from 0, always. It looks like this:
def busca_profundidade_iterativa(origem, estrutura_auxiliar):
pilha = Pilha()
pilha.push((origem, 0))
while not pilha.empty():
sendo_visitado, index_filho = pilha.pop()
# index_filho == 0 indica a primeira vez que se vem nesse vértice
pop_valido = False
if index_filho == 0 and not estrutura_auxiliar.ja_visitou(sendo_visitado):
estrutura_auxiliar.marcar_visitado(sendo_visitado)
pop_valido = True
if index_filho > 0:
pop_valido = True
if pop_valido:
vizinhos = sendo_visitado.vizinhos()
while index_filho < len(vizinhos):
proximo_vizinho = vizinhos[index_filho]
if not estrutura_auxiliar.ja_visitou(proximo_vizinho):
pilha.push((sendo_visitado, index_filho + 1))
pilha.push((proximo_vizinho, 0))
break
index_filho += 1
Each problem has a specific search that fits better. However, it is worth highlighting some points:
O(|av|*n)
, where |av|
is the average of edges per vertex and n
is the depth level being investigated; therefore, in cases of full graphs (or simply not sparse), BFS occupies more memoryNormally, the choice of the search in width or depth depends on the domain of the problem, including some cases where the choice of a search can generate a better use of memory or allow optimizations (such as memoization or the existence of some cutting heuristic).
Browser other questions tagged algorithm terminology graph tree
You are not signed in. Login or sign up in order to post.
There are two very good articles on Wikipedia: Depth first and Breadth first. What changes between the two approaches is how the search for a certain value in the tree is done. Depth being a query for depth, taking a "leg" from the tree and going to the end. Breadth would already be by the width of the tree (horizontal), consulting each parent node before going to children nodes.
– user3603