Count occurrences in a list according to prefixes

Viewed 5,471 times

8

Let’s say I have a list

['rato', 'roeu', 'rolha', 'rainha', 'rei', 'russia']

and another list with prefixes

['ro', 'ra', 'r']

How do I count how many words in the first list start with each prefix?

4 answers

7

An abominable functional one-liner with sum and map:

sum(map(lambda x: 1 if x.startswith(tuple(pref)) else 0, words))

An abominable one-liner with reduce:

from functools import reduce  # reduce is a builtin on Python 2; Python 3 needs this import
reduce(lambda x, y: x + 1 if y.startswith(tuple(pref)) else x, words, 0)

:)


Update:

To meet the OP's requirements, an even more abominable one-liner:

list(map(lambda p: reduce(lambda c, w: c + 1 if w.startswith(p) else c, words, 0), pref))

A less contrived example:

def countPrefix(words, prefix):
    # number of words that start with the given prefix
    return len([1 for w in words if w.startswith(prefix)])

[countPrefix(words, p) for p in pref]

Result:

[2, 2, 6]
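
If it helps to see which count belongs to which prefix, the same idea can be sketched with a dictionary. This is only an illustration: the function name count_by_prefix is made up here and does not appear in the answer above.

```python
words = ['rato', 'roeu', 'rolha', 'rainha', 'rei', 'russia']
pref = ['ro', 'ra', 'r']

def count_by_prefix(words, prefixes):
    # Hypothetical helper: maps each prefix to the number of words starting with it.
    return {p: sum(1 for w in words if w.startswith(p)) for p in prefixes}

counts = count_by_prefix(words, pref)
print(counts)  # ro -> 2, ra -> 2, r -> 6
```

Keeping the prefix as the key avoids relying on the order of the result list.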
  • This only gives me the number of occurrences of one of the prefixes; in this case it only gives me 6. I need the number of occurrences of each of the prefixes.

  • I can't get the less contrived example to work. I don't know how you did it.

  • What is your question? I am declaring a function that counts how many words in a list start with a given prefix. Then a list comprehension applies that function to each of the declared prefixes. To see the result you can, for example, store it in a variable and print it (see the example).

  • ah ok, I get it, thank you!

6

I could do it this way:

>>> palavras = ['rato', 'roeu', 'rolha', 'rainha', 'rei', 'russia']
>>> prefixos = ['ro', 'ra', 'r']
>>> len(list(filter(None, [p if p.startswith(tuple(prefixos)) else None for p in palavras])))
6

*This counts how many words start with at least one of the prefixes in the list.
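
The count above relies on str.startswith accepting a tuple of prefixes and returning True when any of them matches. A minimal sketch of the same count without the intermediate list follows; the helper name matching_any is an assumption made for illustration.

```python
palavras = ['rato', 'roeu', 'rolha', 'rainha', 'rei', 'russia']
prefixos = ['ro', 'ra', 'r']

def matching_any(words, prefixes):
    # Hypothetical helper: counts words that start with at least one prefix.
    # str.startswith accepts a tuple and matches if any element is a prefix.
    prefix_tuple = tuple(prefixes)
    return sum(1 for w in words if w.startswith(prefix_tuple))

total = matching_any(palavras, prefixos)
print(total)
```

Each word is counted at most once here, no matter how many prefixes it matches.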

  • 1

    I noticed that you can avoid the filter by simply using [p for p in palavras if p.startswith(tuple(prefixos))].

6

Assuming that there is no "hierarchy" between prefixes (an example of a hierarchy: every word that begins with ro also begins with r), a simple and direct way is to use itertools.product. It pairs each element of the first list with each element of the second list. Then just keep the pairs in which the second element is a prefix of the first, and count them:

>>> import itertools
>>> palavras = ['rato', 'roeu', 'rolha', 'rainha', 'rei', 'russia']
>>> prefixos = ['ro', 'ra', 'r']
>>> len([palavra for palavra,prefixo in itertools.product(palavras, prefixos) if palavra.startswith(prefixo)])
10

Note that the first 4 words have been counted twice (because they begin with ro or ra as well as with r). For a solution that counts each word only once, see for example the answers by Orion and Anthony Accioly (Dherik's answer does both, and also counts occurrences per prefix).
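
To make the double counting visible, here is a rough sketch that tallies the matching pairs per prefix with collections.Counter and compares the pair total with the number of distinct matching words. The variable names per_prefix, total_pairs and unique_words are illustrative assumptions, not part of the original answer.

```python
import itertools
from collections import Counter

palavras = ['rato', 'roeu', 'rolha', 'rainha', 'rei', 'russia']
prefixos = ['ro', 'ra', 'r']

# One tally entry per (word, prefix) pair in which the word starts with the prefix.
per_prefix = Counter(
    prefixo
    for palavra, prefixo in itertools.product(palavras, prefixos)
    if palavra.startswith(prefixo)
)

# Total number of matching pairs: a word is counted once per prefix it matches.
total_pairs = sum(per_prefix.values())

# Number of words that match at least one prefix: each word is counted once.
unique_words = sum(
    1 for palavra in palavras
    if any(palavra.startswith(p) for p in prefixos)
)

print(per_prefix['ro'], per_prefix['ra'], per_prefix['r'])
print(total_pairs, unique_words)
```

The gap between the two totals is exactly the number of extra matches caused by overlapping prefixes.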

  • I need it to give me the occurrences of each prefix separately; I apologize for not specifying that clearly. In this case it should give 2, 2 and 6.

5


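words = ['rato', 'roeu', 'rolha', 'rainha', 'rei', 'russia']
pref = ['ro', 'ra', 'r']

contTotal = 0  # total number of matches across all prefixes
for p in pref:
    cont = 0  # matches for the current prefix only
    for w in words:
        if w.startswith(p):
            cont += 1
    contTotal += cont
    # print() with a single argument works on both Python 2 and Python 3
    print(p + ' appears ' + str(cont) + ' times as a prefix in the words')
print('The total number of times is ' + str(contTotal))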
  • There’s probably a way, using less code, to do this too.

  • This code of yours does not check correctly, as it is not only picking up the prefix, but any occurrence within the word.

  • 1

    True, my fault :P, I’ll adjust

  • Your counter is also being reset; you should initialize it outside the loop.

  • 1

    I did it this way because it was not clear whether you wanted the total count or the number of occurrences per prefix. I left both forms in :)

  • Works perfectly :) thanks, although the contTotal was not necessary in this case.
