Even the "a" is showing up, and I don't know why!
The code as it stands takes every word of lista
and checks whether it exists in the text. The word does not have to exist on its own; it only has to appear somewhere inside the text, and that is why the "a"
appears:
texto = "Hoje é sábado, vamos sair pois o dia está bonito. Até mais tarde."
# the 'a' is here--^---
Python's in operator, applied to a string as it is here, checks whether the text contains the given characters anywhere, not whether they appear as a separate word.
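To make the difference concrete, here is a minimal sketch using the same text. The variable names substring_match and word_match are only illustrative:

```python
texto = "Hoje é sábado, vamos sair pois o dia está bonito. Até mais tarde."

# On a string, `in` looks for a substring anywhere in the text
substring_match = "a" in texto          # True: "sábado" contains an "a"

# On a list, `in` compares against whole elements
word_match = "a" in texto.split(" ")    # False: no element is exactly "a"

print(substring_match, word_match)
```

The second form is the one the rest of this answer relies on.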
To reach your goal, just reverse the logic of the for loop:
go through the text word by word and check whether each word exists in the list. This not only solves the problem of the "a"
but also guarantees the order you want:
lista = ["dia", "noite", "tarde", "é", "está", "bonito", "o", "a", "muito", "feio"]
texto = "Hoje é sábado, vamos sair pois o dia está bonito. Até mais tarde."
frase = []
for palavras in texto.split(' '):  # now iterate over the text, split on ' ' to get words
    if palavras in lista:  # for each word, check whether it exists in the list
        frase.append(palavras)
print(' '.join(frase))
See the example in Ideone
Note that splitting on spaces leaves punctuation such as "." and "," attached to the words, producing tokens like "bonito." or "tarde.", which the code will not find in the list.
You can get around this problem in many ways. One of the simplest is to remove these punctuation marks before analyzing:
texto2 = texto.replace('.', '').replace(',', '')
See on Ideone how it looks with this pre-processing
You can even do something more generic: create a list of characters to remove and strip them with a custom function:
def retirar(texto, careteres):
    for c in careteres:
        texto = texto.replace(c, '')
    return texto
And now use this function over the original text:
texto2 = retirar(texto, ".,")
See also this example in Ideone
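As a variation on the same idea, and not something the original examples use, the standard library can do the removal in one pass with str.translate. This is a minimal sketch that assumes you want to drop every ASCII punctuation character (string.punctuation), not only "." and ","; the name tabela is just illustrative:

```python
import string

lista = ["dia", "noite", "tarde", "é", "está", "bonito", "o", "a", "muito", "feio"]
texto = "Hoje é sábado, vamos sair pois o dia está bonito. Até mais tarde."

# Translation table that deletes every ASCII punctuation character
tabela = str.maketrans("", "", string.punctuation)
texto2 = texto.translate(tabela)

# split() with no argument splits on any run of whitespace
frase = [palavra for palavra in texto2.split() if palavra in lista]
print(" ".join(frase))  # é o dia está bonito tarde
```

Note that "Até" is still not matched, because the comparison is case-sensitive and the word is not in the list anyway.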
You need to remove the punctuation from your input; otherwise, in this case, "bonito" will not be included: https://repl.it/Mp7s . Right? Is that what you want?
– Miguel
@Miguel Thanks, I thought there would be a way to extract "o dia está bonito", which is a sequence of words from the list.
– pitanga
Pitanga... Good exercise (; https://repl.it/Mp7s/4 . This way we find all sequences (of more than one word) in a text
– Miguel
Wow!!! I’m speechless @Miguel I’m going to comment on all this code you made. And then I’m going to try to redo it myself. That’s wonderful, thank you!
– pitanga
You’re welcome... good luck
– Miguel
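For readers who cannot open the repl.it link, the sequence search discussed in the comments could be sketched roughly as follows. This is not Miguel's code, only one possible approach under the same assumptions as the answer (punctuation limited to "." and ","); the function name sequencias and its parameters are hypothetical:

```python
from itertools import groupby

lista = ["dia", "noite", "tarde", "é", "está", "bonito", "o", "a", "muito", "feio"]
texto = "Hoje é sábado, vamos sair pois o dia está bonito. Até mais tarde."

def sequencias(texto, lista, caracteres=".,"):
    """Return runs of two or more consecutive words that are all in lista."""
    for c in caracteres:
        texto = texto.replace(c, "")
    resultado = []
    # groupby batches consecutive words by whether they are in the list
    for na_lista, grupo in groupby(texto.split(), key=lambda p: p in lista):
        grupo = list(grupo)
        if na_lista and len(grupo) > 1:
            resultado.append(" ".join(grupo))
    return resultado

print(sequencias(texto, lista))  # ['o dia está bonito']
```

Single matching words such as "é" and "tarde" are skipped because only runs longer than one word are kept.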