What’s Breadth First and Depth First?

Asked

Viewed 517 times

12

When we are dealing with trees and graphs we find these terms. What do they mean and why are they important in using data structures of these types and algorithms that manipulate them? What I gain or lose when one or the other occurs?

  • There are two very good articles on Wikipedia: Depth first and Breadth first. What changes between the two approaches is how the search for a certain value in the tree is done. Depth being a query for depth, taking a "leg" from the tree and going to the end. Breadth would already be by the width of the tree (horizontal), consulting each parent node before going to children nodes.

1 answer

8


Breadth first and Depth first are two similar but distinct search strategies. Normally you find the full term: BFS or DFS, where the S of the acronym means search.

Its meanings are:

  • breadth first search: first search for width
  • Depth first search: search for depth first

Yes, there are other search alternatives. Djikstra’s algorithm, for example, is a priority search. I’m not going to go into sordid details about other types of graph search, I’ll just take that quote


Before going into more detail, an extremely important point to choose which algorithm to use: in decision problems. The search in width is guaranteed to find solution (if any) in a RE problem, the search in depth does not; the search in depth finds much faster a problem in class R (which is a subclass of RE).

Joint definition of class R:

classe R, imagem cortesia do Anderson Carlos Woss

The class RE consists of all the decision problems that exist a Turing machine can define, in finite time, whether the answer is "yes"; that is the class of tractable problems (for "yes" answer). In it, we have the parade problem and the problem of recognizing whether a word belongs to the language of an unrestricted grammar.

The class co-RE, however, is composed of the problems to which some Turing machine would answer "no" in finite time; this is the class of tractable problems (for "no" answer). An example of co-RE problem is the domino problem using Wang tiles periodically (more details).

The class R is composed of the problems that always have the answer, whether "yes" or "no", in a finite time (that is why it is the intersection between RE and co-RE); that is the class of decidable problems. A problem that is contained in R (and also contained in NP, a very small and restricted subset of R) is the Hamiltonian path problem. Your decision version looks like this: given a graph G (with weighted edges) and a number k, I can go through all the vertices of G only once so that my path has weight less than or equal to k?


What is a search in a graph?

When working with graphs (and, consequently, with trees), we need to find some condition. There are several types of conditions to be found, but usually they are divided into the following:

  • find a vertex with a certain property
  • find a path (set of edges) with some particular property
  • find a single edge with a particular property

Many algorithms start from a single point, while others start from several points (Floyd-Warshall part of all vertices at the same time _(ツ)_/).

For those starting from some unique source, it needs to determine how to visit the neighboring vertices to then continue the search. So you need to go to the neighbors' neighbors and so on and so forth.

The search in width visits, at a time, all vertices of a certain depth (related). Already the in depth, no, goes until the end of a path to then try an alternative path.

One problem to visualize this is finding the exit in a maze.

Given a map, where . means free space, # means insurmountable wall, s the exit and i its initial position, it is possible to get out of the maze?

Example:

#######
#...#.#
#.#...#
#.###.#
#.#s#.#
#i....#
#######

First: is this a graph problem? Yes, it is. But here we do not explicitly define the set of vertices and the set of edges. We have, here, 19 vertices (all Tiles with: . or i or s) and 18 untargeted edges.

So we need to find a vertex with a special property, starting from a definite origin. Below, you check the execution of the algorithm in depth taking into account the following:

  1. while building a path, I do not go back (we already visited will not be revisited in a single descent)
  2. If there’s more than one way to go, first I go left, then up, then right, and last way down
  3. if by chance I hit a path that has already been traveled, then that means I need to do backtrack and I can abort that descent

I’ll mark points already visited on that particular descent with a X. When it is necessary to backtracking, I will mark the points of the old path with B. When I reach the target, I will mark with success with Y:

percorrendo em profundidade

Now, let’s see the problem being solved with a wide search. Here, we will avoid making any kind of walk over the path already covered (as we did in the biscuit in depth). I will place in the lower left corner of the animation which level of depth I am expanding:

percorrendo em largura

If we were to use another search, SEARCH FOR THE* would meet and would be even better than the search in width. But she is a heuristics, not a purely systematic search.

Another interesting problem about graphs is the problem of zombie bridge.

The problem consists of the following statement:

A group of N people released a horde of zombies in the laboratory from the top of the mountains. To escape from this horde, they need to cross a bridge. There’s only one person with a flashlight and it’s too dark. So, it’s only possible to cross the bridge with one person holding the lantern and one person following, right behind. And there’s something else, each person has a different time to cross the bridge, and when two people cross at the same time, they need to walk at the slowest pace.

What is the shortest possible time to cross all the people of the bridge?

The input will be provided in two lines. In the first, we have a number N with the amount of people. In the second line, will have N numbers with the time (in minutes) each person takes to cross the bridge.

Example:

4
1 2 5 10

First of all, is this a graph problem? Well, oddly enough, it is a graph problem... but, as with the other problem, the graph is implicit.

Let’s take the example case: 4 people, times 1, 2, 5 and 10. We can cross in 19 minutes if we go through the graph in one of these ways:

grafo bonitão do problema da ponte de zumbis

For this specific case, it has a heuristic defining an upper limit:

  1. the fastest person ever accompanies all other people
  2. if you still have someone left to cross, as fast as you go back to the bridge alone

This ensures that I guarantee that at least soma todos - menor + (quantidade - 1)*menor minutes I can cross the bridge (in the example in question this value is 18 - 1 + (4-1)*1 = 17 + 3 = 20). With that, I can do an in-depth search and, if by chance my search has passed a path of more than 20 minutes, I can abort that descent. So in this case, a deep search goes well.

By the way, I found 17 minutes for everyone’s crossing, and it seems to be the best case

Finally, I would like to quote game Nim (more details). This game (in the chosen variation) consists of the following:

There are 2 players, Alice and Bob, who play alternately. There is a stack with n sticks, each player needs to remove 1, 2 or 3 toothpicks on his turn. The player who removes the last toothpick loses.

You must determine the winning strategy for Alice, where the input of the problem consists of a positive integer n and which player will start playing.

That, again, is a graph. Each vertex of the graph consists of 2 pieces of information:

  1. number of toothpicks in the stack
  2. the turn of who should remove the toothpicks

This generates a bipartite acyclic directed graph whose size is limited only by n maximum; as nothing has been described regarding n apart from being integer and positive, this graph has the potential to be infinite.

So we have the following vertices that define victory:

  • 1, Alice; as Alice needs to draw at least one toothpick, and has only one toothpick to be removed, so Bob is the winner
  • 1, Bob; analogous to the previous, with Alice being the winner

So, how to discover the winning strategy? Doing a set of searches by depth. If the vertex represents Alice’s turn, then all it takes is one of the possible edges to win for Alice; in case the turn is Bob’s, all possible edges must return victory to Alice to be a winning strategy. The advantage of using deep search is that we can use memoisation: if I find out that k, J is guaranteed victory for Alice, so we have to, if by chance I meet again k, J, I already know the answer to that situation and then I don’t need to sail through it again.

We know that 1, Bob is guaranteed victory for Alice. And the vertex 5, Bob? Well, from here we have 3 possibilities: 4, Alice, 3, Alice and 2, Alice. If, by any chance, all these options result in a guaranteed victory for Alice, then 5, Bob is victory for Alice.

Let’s go to 2, Alice first. An alternative is to remove 2 toothpicks, causing Alice to lose. However, there is another alternative: to remove 1 toothpick. As we remove a toothpick, we go to the vertex 1, Bob, so Alice is the winner.

Now in 3, Alice. We have 3 alternatives: 1, Bob, 2, Bob and 3, Bob. 2, Bob and 3, Bob do not generate guaranteed victory for Alice, but 1, Bob gera. Then, 3, Alice is also victory.

Finally, we have 4, Alice. From here, the alternatives are: 3, Bob, 2, Bob and 1, Bob. We found in the previous navigation that 3, Bob and 2, Bob may generate victory for Bob, but 1, Bob generates victory for Alice.

Therefore, how 2, Alice, 3, Alice and 4, Alice guarantee victory for Alice, 5, Bob also guarantees. And 6, Alice? Well, of 6, Alice we can go to 5, Bob, that guarantees victory. And 6, Bob? Of 6, Bob we fell in 5, Alice, 4, Alice (already proven Moose winner) and 3, Alice (already proved Alice winning). So, of 5, Alice we can guarantee Alice’s victory?

For that, we must see 4, Bob, 3, Bob (already proven that Bob can win) and 2, Bob (already proven that Bob can win). From 4, Bob we can go to 1, Alice, that guarantees Bob victory. Therefore, in 6, Bob Bob can win.

And 7, Alice? Well, of 7, Alice we can go to 5, Bob, that is guaranteed victory for Alice. And so we can proceed to any time, using the in-depth search memoizing the results.

The search in width, however, would not favor in the case of memoization. It would add in the row of vertices to be visited the same vertex several times, increasing the maximum memory load. Not to mention that, to catch all of Bob’s moves of a certain vertex k,Bob, the search in width will not facilitate to get the answer after finishing the navigation.


To do an in-depth search, it is said that you use a stack; already, for the wide search, you say you use a queue. They just don’t tell you which objects are stacked or lined up.

In the case of the search in width, the vertices themselves line up. More or less like this:

def busca_largura(origem, estrutura_auxiliar):
  fila = Fila()
  fila.push(origem)

  while not fila.empty():
    sendo_visitado = fila.pop()
    if not estrutura_auxiliar.ja_visitou(sendo_visitado):
      estrutura_auxiliar.marcar_visitado(sendo_visitado)
      [fila.push(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]

On the other hand, the search in depth, although it is said to use a stack, usually this stack is delegated to the recursive call of functions:

def busca_profundidade(sendo_visitado, estrutura_auxiliar):
  estrutura_auxiliar.marcar_visitado(sendo_visitado)
  [busca_profundidade(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]

If you want to implement this iteratively, well, it’s possible. A naive implementation would stack the vertices. It is almost equal to the search in width, only changes the data structure used:

def busca_profundidade_iterativa_naive(origem, estrutura_auxiliar):
  pilha = Pilha()
  pilha.push(origem)

  while not pilha.empty():
    sendo_visitado = pilha.pop()
    if not estrutura_auxiliar.ja_visitou(sendo_visitado):
      estrutura_auxiliar.marcar_visitado(sendo_visitado)
      [pilha.push(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]

But to be honest, in this case you’re using more memory than the recursive function. That’s why I called this solution naive. The recursive solution keeps the node being visited in the stack and also an indicator of which child is being visited. So, here I would stack a pair of data: the vertex and the index of the next child to be visited. Starting from 0, always. It looks like this:

def busca_profundidade_iterativa(origem, estrutura_auxiliar):
  pilha = Pilha()
  pilha.push((origem, 0))

  while not pilha.empty():
    sendo_visitado, index_filho = pilha.pop()
    # index_filho == 0 indica a primeira vez que se vem nesse vértice

    pop_valido = False
    if index_filho == 0 and not estrutura_auxiliar.ja_visitou(sendo_visitado):
      estrutura_auxiliar.marcar_visitado(sendo_visitado)
      pop_valido = True
    if index_filho > 0:
      pop_valido = True

    if pop_valido:
      vizinhos = sendo_visitado.vizinhos()
      while index_filho < len(vizinhos):
        proximo_vizinho = vizinhos[index_filho]
        if not estrutura_auxiliar.ja_visitou(proximo_vizinho):
          pilha.push((sendo_visitado, index_filho + 1))
          pilha.push((proximo_vizinho, 0))
          break
        index_filho  += 1

On the use of resources

Each problem has a specific search that fits better. However, it is worth highlighting some points:

  1. Deep search allows to make memoization in some problems (see Nim problem/bridge of zombies)
  2. the maximum DFS stack size is the graph depth
  3. the maximum length of the BFS queue is O(|av|*n), where |av| is the average of edges per vertex and n is the depth level being investigated; therefore, in cases of full graphs (or simply not sparse), BFS occupies more memory
    In Dags, the same vertex may belong to several depths of the graph; for example, in the labyrinth, the output has two depths: 4 and 16; this is one of the factors that explode the length of the queue in a BFS
  4. in infinite graphs (even Dags), a deep search may not return the answer, as in RE problems
  5. if the answer is in a vertex of finite depth, even if the graph is infinite, the search in width at all times will find the answer when exploring the minor of its depths

Normally, the choice of the search in width or depth depends on the domain of the problem, including some cases where the choice of a search can generate a better use of memory or allow optimizations (such as memoization or the existence of some cutting heuristic).

Browser other questions tagged

You are not signed in. Login or sign up in order to post.