Since you are practicing Python, I will post my answer to contribute to your studies.
Negative indexes
The first observation I have is that in your function palindromo you "slice" the string in the following ways:
palavra[len(palavra) - 1] # gets the last letter of the string
palavra[1:len(palavra) - 1] # copy of the string without the first and last characters
Because strings in Python are sequences, you can use negative indexes to access the sequence from its end. That is, you can get the last letter of a string by using the index -1, see:
palavra = "abcde"
palavra[-1] # "e"
The same concept can be used when slicing a sequence (see slicing):
palavra = "abcde"
palavra[1:-1] # "bcd"
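As a minimal sketch of how these two expressions could fit together in a recursive palindrome check: the function body below is an assumption, since your original palindromo is not reproduced here and may be structured differently.

```python
def palindromo(palavra):
    # Hypothetical recursive version; the original function may differ.
    # A string with zero or one character is a palindrome.
    if len(palavra) <= 1:
        return True
    # Compare the first and last letters using a negative index.
    if palavra[0] != palavra[-1]:
        return False
    # Recurse on the copy without the first and last characters.
    return palindromo(palavra[1:-1])


print(palindromo("arara"))  # True
print(palindromo("abcde"))  # False
```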
Reading a file
The second observation is a recommendation: when reading a file, use with, because then the file object takes care of closing itself, even if an exception is raised while reading. The change to your code is minimal, see:
Before:
palavras = open(nomeArquivo, "r")
for line in palavras:
    # process the text
# the file was never released back to the operating system (palavras.close())
return palavra
After:
with open(nomeArquivo, "r", encoding="utf8") as arquivo:
    for line in arquivo:
        # process the text
# arquivo.close() was executed automatically
return palavra
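To see that the file really is closed even when an exception interrupts the reading, here is a small self-contained sketch. The temporary file and the names nome_arquivo, arquivo_ref and fechado are only there for illustration.

```python
import os
import tempfile

# Create a small temporary file so the example does not depend on your data.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf8"
) as tmp:
    tmp.write("uma linha\n")
    nome_arquivo = tmp.name

arquivo_ref = None
try:
    with open(nome_arquivo, "r", encoding="utf8") as arquivo:
        arquivo_ref = arquivo  # keep a reference to inspect it afterwards
        raise ValueError("simulated error while processing the text")
except ValueError:
    pass

# The with block closed the file despite the exception.
fechado = arquivo_ref.closed
print(fechado)  # True

os.remove(nome_arquivo)  # clean up the temporary file
```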
Return all words from a file
My third observation is that in your function obtemPalavras you create the variable palavra inside the for, modify its content, and do not store it anywhere.
This way the second iteration of the loop overwrites the value of palavra computed in the first, the third overwrites the value from the second, and so on. In the end you return only the result of the last iteration of the for.
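The overwriting can be reproduced in isolation. In this sketch the list linhas is a stand-in for the lines of your file (an assumption made so the example runs without a file on disk):

```python
# Stand-in for the lines that would come from the file.
linhas = ["Uma frase\n", "Outra frase\n", "Olá mundo\n"]

for line in linhas:
    # Each iteration rebinds `palavra`, discarding the previous value.
    palavra = line.replace("\n", "").lower().split()

# Only the words from the last line survive the loop.
print(palavra)  # ['olá', 'mundo']
```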
You could save the result of each loop in a list and return it at the end. Your code would look like this:
palavras = []
for line in arquivo:
    palavra = line.replace("\n", "").lower().split()
    palavras += palavra
return palavras
The code above breaks the line into words using the method str.split(), which returns a list of strings. That list is then appended to palavras, which stores our results. Adding two lists concatenates them. The code below illustrates this:
lista = "1 2 3 4 5".split()
# ['1', '2', '3', '4', '5']
outra_lista = ['6', '7', '8']
# returns a new list with the two lists concatenated
terceira_lista = lista + outra_lista
# ['1', '2', '3', '4', '5', '6', '7', '8']
# Concatenates the two lists and stores the result in `lista`
lista = lista + outra_lista
# or
lista += outra_lista
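One detail worth knowing about the two forms: for lists, += extends the existing object in place, while lista = lista + ... builds a new list and rebinds the name. For the loop in obtemPalavras the result is the same, but the difference shows up when another name refers to the same list. A small sketch, where the name apelido is purely illustrative:

```python
lista = ["1", "2"]
apelido = lista  # a second name for the same list object

# In place: the existing object is extended, so `apelido` sees the change.
lista += ["3"]
print(apelido)  # ['1', '2', '3']

# New object: `lista` is rebound to a fresh list, `apelido` stays as it was.
lista = lista + ["4"]
print(lista)    # ['1', '2', '3', '4']
print(apelido)  # ['1', '2', '3']
```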
Final code:
def obtemPalavras(nomeArquivo):
    palavras = []
    with open(nomeArquivo, "r") as arquivo:
        for line in arquivo:
            palavras += line.replace("\n", "").lower().split()
    return palavras
Possible memory problem
If you build a list of all the words and your file is very large, your program can use a lot of resources to keep everything in memory.
If you use generators, you can return one word at a time, on demand.
Modifying the previous example to use a generator would give:
def obtemPalavras(nomeArquivo):
    with open(nomeArquivo, "r") as arquivo:
        for line in arquivo:
            for palavra in line.replace("\n", "").lower().split():
                yield palavra
Improving the example using yield from:
def obtemPalavras(nomeArquivo):
    with open(nomeArquivo, "r") as arquivo:
        for line in arquivo:
            yield from line.replace("\n", "").lower().split()
Now the function obtemPalavras reads our file and returns one word at a time; as we ask for more words, it reads further into the file to fetch them. This way we can open a large file without fear of exhausting the computer's memory.
For example, imagine the file frases.txt contains the following content:
Uma frase com cinco palavras
Outra frase
Olá mundo
We can go through all the words like this:
print("Palavras em 'frases.txt':")
for palavra in obtemPalavras("frases.txt"):
    print("-", palavra)
The result would be:
Palavras em 'frases.txt':
- uma
- frase
- com
- cinco
- palavras
- outra
- frase
- olá
- mundo
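The on-demand behaviour can be observed by asking the generator for only a few words. The sketch below writes the sample content to a temporary file (an assumption, so that it runs anywhere) and uses itertools.islice to take the first three words; the names caminho, gerador and primeiras are illustrative.

```python
import os
import tempfile
from itertools import islice


def obtemPalavras(nomeArquivo):
    # Same generator as above, with an explicit encoding for the sample text.
    with open(nomeArquivo, "r", encoding="utf8") as arquivo:
        for line in arquivo:
            yield from line.replace("\n", "").lower().split()


# Temporary file holding the sample content.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf8"
) as tmp:
    tmp.write("Uma frase com cinco palavras\nOutra frase\nOlá mundo\n")
    caminho = tmp.name

gerador = obtemPalavras(caminho)
# Only the first three words are requested; the generator is not run further.
primeiras = list(islice(gerador, 3))
print(primeiras)  # ['uma', 'frase', 'com']

gerador.close()     # ends the generator early, which also closes the file
os.remove(caminho)  # clean up the temporary file
```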
String handling
You do not need to remove the newline character with line.replace("\n", ""), because when you call str.split() without any argument, the separator is any run of whitespace, and since \n is whitespace it is also dropped from the resulting list.
"1 2 3 4 5 \n \n".split()
# ['1', '2', '3', '4', '5']
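For contrast, here is a short sketch of what happens when a separator is passed explicitly: with split(" ") only the single space is treated as a separator, so empty strings and the "\n" show up in the result. The variable names are illustrative.

```python
linha = "1 2  3 \n"

# No argument: any run of whitespace separates, nothing empty is returned.
sem_argumento = linha.split()
print(sem_argumento)  # ['1', '2', '3']

# Explicit " ": every single space separates, so the double space yields
# an empty string and the newline is kept as its own item.
com_espaco = linha.split(" ")
print(com_espaco)  # ['1', '2', '', '3', '\n']
```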
You are also converting your characters to lowercase with str.lower() so that your algorithm is case-insensitive. For reference, you could use the method str.casefold() for the same purpose.
This method also converts to lowercase, but it handles special characters that str.lower() leaves unchanged (such as the German character ß, which casefold() converts to "ss").
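A quick sketch of the difference, using two spellings of the same German word (the variable names are illustrative):

```python
palavra_a = "Straße"
palavra_b = "STRASSE"

# lower() keeps the ß, so the two spellings do not match.
print(palavra_a.lower())  # straße
print(palavra_a.lower() == palavra_b.lower())  # False

# casefold() converts ß to "ss", so the comparison succeeds.
print(palavra_a.casefold())  # strasse
print(palavra_a.casefold() == palavra_b.casefold())  # True
```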
Conclusion
Your final code would look like this:
def is_palindrome(string):
    return string == string[::-1]

def get_words(filename):
    with open(filename, 'r', encoding='utf8') as arquivo:
        for linha in arquivo:
            yield from linha.casefold().split()

for palavra in get_words('frases.txt'):
    if is_palindrome(palavra):
        print(palavra)
Output:
ama
a
anna
é
o
asa
o
socorrammesubinoonibusemmarrocos
bob
Code running on Repl.it
I hope this has been helpful.