How to calculate the percentage of items performed in a data tree in the list format?

Asked

Viewed 1,819 times

0

There is the following list:

[
    [0, False],
    [1, True],
    [1, False],
    [2, False],
    [2, True],
    [2, False],
    [3, False],
    [3, True],
    [2, False],
    [1, False],
    [2, False]
]

This list forms the following tree:

árvore de dados

As you can see, the list respects the order that relates to the branching of each item, so in an order like (example) 1, 2, 3, 2, 1 says the first occurrence 2 belongs to the first 1, and the 3 belongs to the first 2, the second occurrence of 2 belongs to the first 1, and the last 1 has no branch.

I could not solve this problem, I thought I could use recursiveness, but the list items are sorted, each next item in the list can relate to the previous ones. Well, I don’t know how to fix it. In this example I quoted the calculated result gave 45,83% of the tree solved.

Explanation of the Tree of the Question:

The level 3 branch is 50% resolved (one True and one False), it belongs to a level 2 branch, so this branch has a value of 50%. These four level 2 branches have a total of (0% + 100% + 50% + 0%) which divided by 4 is 37.5%. Then there is at level 1 a branch of 100% another of 37.5% and another of 0%, which divides by three of 45.83%

How to solve this in Python?

  • What it means to be "solved"?

  • @Juanlopes solved is when is True

  • No fraction of denominator 11 (the number of nodes in the tree) rounds to 45.83%. Could you explain the result, please?

  • @Juanlopes I think the explanation is very clear. The level 3 branch is 50% resolved (a True and another False), it belongs to a branch of level 2, so this branch has a value of 50%. These four level 2 branches have a total of (0% + 100% + 50% + 0%) which divided by 4 is 37.5%. Then there is at level 1 a branch of 100% another of 37.5% and another of 0%, which divides by three of 45.83%.

  • It had not been clear. The interpretation I had initially given was "what percentage of us is solved?" , not "what percentage of subtrees is solved recursively?". But now I get it. I just think it should be in the body of the question.

  • It is interesting to note that, judging by the semantics, each node of the tree would be better represented by a tuple than by a list.

Show 1 more comment

2 answers

1

Python is a cool language because it doesn’t usually come between "you and the problem" - in this case, I think it took me longer to understand how you came to this 45.83% than to think of a way to solve the issue.

Well, it is necessary to represent to the computer each element of the list with the properties it has - in this case, each element has a list of other dependent elements and a % of how much is solved. If the element has "children" this percentage of how much is solved is equal to the list of children - otherwise, it is given by the value in your list.

This representation of the list is a minimal way of giving all the necessary information, but it doesn’t help to solve the problem - we need a Node class that has the two properties I mentioned: coefficients of how complete it is, and list of children - and from your given list we created this tree - we can have the refinement of using Python’s "Property" developer for the coefficient of how much is complete - this makes the reading of a coefficient of completeness actually the call of a function that calculates the value in real time.

class Node(object):
    def __init__(self, completeness=0):
        self._completeness =  float(completeness)
        self.children = []

    def add_child(self, child):
        self.children.append(child)

    @property
    def completeness(self):
        if self._completeness or not self.children:
            return self._completeness
        return sum(child.completeness for child in self.children) / len(self.children)

(If you’re not familiar with the syntax I used in the last line of Generator Expression, it’s worth checking out the documentation: https://www.python.org/dev/peps/pep-0289 - that code also takes advantage of the numerical value of the "True" in Python to be "1" for historical reasons - if it were not so, it would be necessary one more "if")

Good -this is the class as described above - only for it to describe the problem it is necessary to create a tree with nodes of this class from its entry list - this can be done with another function - it’s complicated, but note from the comments that it’s exactly how we would have to proceed if we were inserting this tree "on paper", drawing each node as we process it from the input list:

def build_tree(source):
    # transforma a lista num iterador:
    # isso permite que tomemos o primeiro elemento com a função "next"
    # e guardemos os demais elementos para serem consumidos pelo "for" abaixo.
    source = iter(source)
    # consome o primeiro nó na lista, criando a raiz da árvore)
    level, completeness = next(source)
    root = Node(completeness)
    # lista temporária que permite subir para os nós superiores
    # quando cada ramo for preenchido:
    tree_stack = [(level, root)]
    for level, completeness in source:
        new_node = Node(completeness)

        # se estamos de volta num nó pelo menos um nível acima do anterior -  remover
        # elementos da lista temporária - o nó que ficar por último
        # nessa lista será o último nó inserido com nível acima do
        # atual (portanto, o pai do atual)
        while tree_stack[-1][0] > level:
            tree_stack.pop()
        previous_level = tree_stack[-1][0]

        if level == previous_level:
            # o mesmo nível do último nó inserido -
            # inserir o novo nó como irmão do último 
            tree_stack[-2][1].add_child(new_node)
            # remover o irmão do nó atual da lista temporária - 
            # de forma que o penultimo elemento seja o sempre o 
            # pai de outros nós no nível atual
            tree_stack.pop()
        elif level > previous_level:
            #colocar o novo nó como filho do último nó criado
            tree_stack[-1][1].add_child(new_node)
        tree_stack.append((level, new_node))
    return root

And, so that the program runs with its sample data, when it is invoked as "stand alone", we can add this part - (using Python’s own string formatting to display the value of the completeness coefficient as a percentage):

if __name__ == "__main__":
    data = [
    [0, False],
    [1, True],
    [1, False],
    [2, False],
    [2, True],
    [2, False],
    [3, False],
    [3, True],
    [2, False],
    [1, False],
    [2, False]
]
    tree = build_tree(data)
    print ("A porcentagem de completude da árvore é {:.02%}".format(tree.completeness)
  • Despite solving my problem, I found Juan’s solution simpler, I do not know which performance is best, anyway thank you!

  • It’s not about performance - this is the solution to mount the tree in Python and make operations on it. the other answer gives a way to do the calculation using the tree representation the way it comes to you, without having to assemble a generic data structure. If it is a marathon problem, or a school problem - the other answer is "desired" - because it arrived in a good algorithm to do the manipulations to reach the desired number. If you’re part of a real-world problem - the approach of mounting real objects and having a maintainable system is more important.

  • This is not a question of college, I understand your point of view, however the problem in question is only to calculate the percentage of the tree and print the value, after that the data are not reused, so Juan’s solution was more interesting in my case. Anyway it was worth!

  • 1

    yes, if you only need that number and nothing else, his answer should be to accept.

1


You can navigate the tree implicitly by calculating the total solved by the children of each node, thus:

def complete(tree, i, level):
    first = i
    siblings = []
    while i < len(tree) and tree[i][0] == level:
        count, value = complete(tree, i+1, level+1)
        siblings.append(max(int(tree[i][1]), value))
        i += count + 1

    return (i - first, float(sum(siblings))/len(siblings or [0]))


L = [
    [0, False],
    [1, True],
    [1, False],
    [2, False],
    [2, True],
    [2, False],
    [3, False],
    [3, True],
    [2, False],
    [1, False],
    [2, False]
]

print complete(L, 0, 0)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.