Remove specific characters from strings in Python

1

My question is the following: in this code excerpt I remove the specified characters with replace():

lista = [["de carlos,"],["des.dd carlossd,"],["Peixe, texto!"]]
lista_separados = ['.',',',':','"','?','!',';']



for i, j in enumerate(lista):
   lista[i] = j[0].replace(',','').replace('!','').replace('.','')

print (lista)

Output:

['de carlos', 'des dd carlossd', 'Peixe texto']

In this example I was able to delete the specified characters, but does anyone know another way to accomplish this?

2 answers

4

You can use regular expressions.

There may be an even better way to do it; someone with more experience on the subject may know one.

I'll show two options. The first keeps only ASCII letters, digits, and spaces, after first turning each period into a space.

import re

lista = [["de carlos,"],["des.dd carlossd,"],["Peixe, texto!"]]
for i, j in enumerate(lista):
    # turn periods into spaces, then drop anything that is not an ASCII letter, digit, or space
    lista[i] = re.sub(r'[^a-zA-Z0-9 ]', '', re.sub(r'\.', ' ', j[0]))

print (lista)
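
If the characters to remove are already collected in lista_separados, as in the question, the character class can be built from that list instead of being written by hand. This is a minimal sketch, assuming the goal is to delete exactly those characters and nothing else. The names padrao and limpa are mine, not from the question. Unlike the snippet above, it deletes the period instead of turning it into a space.

```python
import re

lista = [["de carlos,"], ["des.dd carlossd,"], ["Peixe, texto!"]]
lista_separados = ['.', ',', ':', '"', '?', '!', ';']

# re.escape keeps characters such as '.' and '?' literal inside the class
padrao = re.compile('[' + re.escape(''.join(lista_separados)) + ']')

# one cleaned string per sub-list (this flattens the list of lists)
limpa = [padrao.sub('', sub[0]) for sub in lista]
print(limpa)  # ['de carlos', 'desdd carlossd', 'Peixe texto']
```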

The second option is for the case where you want to keep a full stop that ends a sentence, while replacing a . in the middle of a word with a space and removing every other special character.

import re

lista = [["Esse, ponto! vai permanecer, porque e um ponto final. Agora esses.pontos.serao.substituidos.por.espacos.porque.esta no@ meio¨&*() das #palavras"]]

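for i, j in enumerate(lista):
    # replace a period that is directly followed by a word character, then drop other special characters
    lista[i] = re.sub(r'[^a-zA-Z0-9 .]', '', re.sub(r'\.\b', ' ', j[0]))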

print (lista)
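
To see what the inner substitution does on its own: \.\b matches a period only when a word character (letter, digit, or underscore) follows it immediately, so a period followed by a space or by the end of the string is left alone. Here is a small sketch with made-up sample strings (amostras and resultado are illustrative names):

```python
import re

# sample strings invented for illustration
amostras = ["fim. Novo", "a.b.c", "fim.", "3.14"]

# only periods glued to a following word character become spaces
resultado = [re.sub(r'\.\b', ' ', s) for s in amostras]
print(resultado)  # ['fim. Novo', 'a b c', 'fim.', '3 14']
```

As the last sample shows, decimal numbers are split as well, which may or may not be what you want.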

1


Another way to do it is with str.translate:

lista = [["de carlos,"],["des.dd carlossd,"],["Peixe, texto!"]]
lista_separados = ['.',',',':','"','?','!',';']

trans = {ord(i): '' for i in lista_separados} # map each character's code point to its replacement, in this case nothing
for idx, val in enumerate(lista):
   lista[idx][0] = val[0].translate(trans)
print(lista) # [['de carlos'], ['desdd carlossd'], ['Peixe texto']]

I don't know why you have a list of lists. You may need it, but if not, you can flatten it into a one-dimensional list:

lista = [["de carlos,"],["des.dd carlossd,"],["Peixe, texto!"]]
lista_separados = ['.',',',':','"','?','!',';']

trans = {ord(i): '' for i in lista_separados}
lista = [j.translate(trans) for i in lista for j in i]
print(lista) # ['de carlos', 'desdd carlossd', 'Peixe texto']

Note that you can also pass None instead of an empty string: {ord(i): None for ... }
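
The same kind of table can also be built with str.maketrans, whose third argument lists characters to be mapped to None (that is, deleted). This is a minimal sketch reusing the question's data and keeping the list of lists. The name tabela is mine.

```python
lista = [["de carlos,"], ["des.dd carlossd,"], ["Peixe, texto!"]]
lista_separados = ['.', ',', ':', '"', '?', '!', ';']

# third argument of str.maketrans: characters to delete
tabela = str.maketrans('', '', ''.join(lista_separados))

# rebuild each sub-list so the nested structure is preserved
lista = [[texto.translate(tabela) for texto in sub] for sub in lista]
print(lista)  # [['de carlos'], ['desdd carlossd'], ['Peixe texto']]
```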

  • I have a list of lists because each sub-list holds the content of one file[i].txt from a document base with several .txt files, you see? Nice explanation! I did not know this...

  • @Williamhenrique in that case the first alternative is the appropriate one. You're welcome, I'm glad I could help.
