I have a text file that is preprocessed to normalize capitalization (the sample below is all lowercase), and that part works correctly.
olá meu nome é meu nome pois eu olá
é meu nome walt não disney
olá
Then I have this function, which should calculate the frequency of each word (and does so correctly). After that, it should sort the list dataFreq
and calculate the probability that a given word appears in the text, that is: frequenciaPalavra / totalPalavras
def countWordExact(dataClean):
    count = {}
    dataFreq = []
    global total  # assumes `total` is initialised to 0 at module level
    for line in dataClean.splitlines():
        for word in line.split(" "):
            # count how many times each word occurs
            if word in count:
                count[word] += 1
            else:
                count[word] = 1
            total += 1
    dataFreq.append(count)
    freq = []
    # probabilities are computed here but never stored in the returned data
    for indice in sorted(count, key=count.get):
        #print(count[indice])
        freq.append((count[indice])/total)
    #print(freq)
    return dataFreq
My question is: how do I sort the dictionary (and consequently the list) and store in it the values resulting from the frequency calculation above? For example:
[{'olá': 0.12, 'meu': 0.12, 'nome': 0.132, 'é': 0.12321, 'pois': 0.56, 'eu': 0.65, 'walt': 0.7, 'não': 0.7, 'disney': 0.5}]
(the values above are only illustrative; they are not the correct probabilities)
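For reference, here is a minimal sketch of one way the desired structure could be built. It is not the code from the question: the function name `word_probabilities` is hypothetical, it uses `collections.Counter` instead of the manual counting loop, and it assumes that "sorted" means most frequent word first. It relies on dictionaries preserving insertion order (Python 3.7+).

```python
from collections import Counter


def word_probabilities(text):
    """Return {word: count / total_words}, most frequent words first.

    Hypothetical helper for illustration; not the asker's function.
    """
    words = text.split()  # split on any whitespace, including newlines
    total = len(words)
    counts = Counter(words)
    # most_common() yields (word, count) pairs by descending count;
    # building a dict from them keeps that order.
    return {word: n / total for word, n in counts.most_common()}


# The sample text from the question (16 words in total).
sample = "olá meu nome é meu nome pois eu olá\né meu nome walt não disney\nolá"
probs = word_probabilities(sample)
print([probs])  # same shape as the example: a list holding one dict
```

With the sample text, 'olá', 'meu' and 'nome' each occur 3 times out of 16 words, so each gets 3/16 = 0.1875. Returning the dictionary, rather than only printing it, also keeps the values available for later use.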
Thanks, that optimized my code. But how do I save the result of that last print?
– Walt057
@Walt057 I edited the answer by building a dictionary of probabilities
– Woss