The algorithm should receive a string, count how many times each word occurs, and return a list of tuples with the most frequent words and their counts. The problem is that when one word is the beginning of another, the shorter word gets counted too many times. For example, with "but" and "Butter", it counts "but" 3 times and "Butter" 2 times in: "Betty bought a bit of Butter but the Butter was Bitter"
I also want the result sorted first by number of occurrences, most frequent first, and, when two words occur the same number of times, alphabetically. For example, "Falling" and "down" both appear 4 times, so the output should list "down" first and then "Falling" for: "London bridge is Falling down Falling down Falling down London bridge is Falling down my fair lady"
    def count_words(s, n):
        """Return the n most frequent words in s as (word, count) tuples."""
        top_n = []
        itens = n
        words = s.split()
        pref = words
        for p in pref:
            cont = 0
            for w in words:
                # This is where the overcount comes from: startswith() also
                # matches longer words that begin with p ("but" matches
                # "butter"); an exact comparison, w == p, would not.
                if w.startswith(p):
                    cont += 1
            if (p, cont) not in top_n:
                top_n.append((p, cont))
        # Sorts by count only; ties keep their order of first appearance.
        top_n.sort(key=lambda t: t[1], reverse=True)
        #from operator import itemgetter
        #sorted(top_n, key = itemgetter(1), reverse = True)
        while len(top_n) > itens:
            del top_n[len(top_n) - 1]
        return top_n


    def test_run():
        print(count_words("cat bat mat cat bat cat", 3))
        print(count_words("betty bought a bit of butter but the butter was bitter", 3))
        print(count_words("london bridge is falling down falling down falling down london bridge is falling down my fair lady", 5))


    if __name__ == '__main__':
        test_run()
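The two symptoms described in the question can be reproduced in isolation. The following is only a sketch: the helper names `count_prefix`, `count_exact` and `rank` are hypothetical and not part of the original code, and lower-casing the word in the sort key is an assumption made so that "down" sorts before "Falling".

```python
def count_prefix(words, p):
    # Mirrors the inner loop of the question: counts every word that begins with p.
    return sum(1 for w in words if w.startswith(p))


def count_exact(words, p):
    # Counts only the words that are identical to p.
    return sum(1 for w in words if w == p)


def rank(pairs):
    # Highest count first; equal counts fall back to alphabetical order.
    # Lower-casing is an assumption about the desired tie-break.
    return sorted(pairs, key=lambda t: (-t[1], t[0].lower()))


words = "betty bought a bit of butter but the butter was bitter".split()
print(count_prefix(words, "but"))  # 3: "butter", "but", "butter"
print(count_exact(words, "but"))   # 1
print(rank([("Falling", 4), ("down", 4), ("London", 2)]))
# [('down', 4), ('Falling', 4), ('London', 2)]
```

Negating the count in the key lets a single ascending sort handle both criteria, so `reverse=True` is not needed and the alphabetical order is not flipped.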
Take a look at collections.Counter as well. – jsbueno
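Following that suggestion, a minimal sketch of `count_words` built on `collections.Counter` might look like the one below. It assumes words are separated by whitespace and that ties should be broken alphabetically without regard to case; neither assumption is confirmed by the question. `Counter.most_common(n)` is not used here because it does not apply an alphabetical tie-break.

```python
from collections import Counter


def count_words(s, n):
    # Counter tallies whole words, so "but" and "butter" are counted separately.
    counts = Counter(s.split())
    # Descending count first, then alphabetical order
    # (case-insensitive tie-break is an assumption).
    ranked = sorted(counts.items(), key=lambda t: (-t[1], t[0].lower()))
    return ranked[:n]


print(count_words("cat bat mat cat bat cat", 3))
# [('cat', 3), ('bat', 2), ('mat', 1)]
print(count_words("betty bought a bit of butter but the butter was bitter", 3))
# [('butter', 2), ('a', 1), ('betty', 1)]
print(count_words("london bridge is falling down falling down falling down "
                  "london bridge is falling down my fair lady", 5))
# [('down', 4), ('falling', 4), ('bridge', 2), ('is', 2), ('london', 2)]
```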