How to remove unwanted characters from a list of strings?

Viewed 2,738 times

1

I'm new to Python and couldn't find an answer to my question. I get a list of texts from the database and turn it into a list of strings as below:

textosPuros = df['texto']
# print(textosPuros)

textoMinusculo = textosPuros.str.lower().str.split(' ')

# print(textoMinusculo)

textoLimpo = [item for item in textoMinusculo if item not in ['\n', '\t', '/', '.', '-', '(', ')']]

To clean the strings so that I can work with them in normalized form, I added the last line, but unwanted characters remain:

[['\testá', 'tossindo', 'noite.mãe', 'fez', 'inalação', 'com', 'berotec', 'essa', 'noite,com', 'melhora', '.está', 'usando', 'o', 'piemonte', 'há', '2', 'cuidou', '', ',', '----', 'com', '', 'febre.é', 'muito', 'ansioso', 'e', 'agitado.\nex.f:beg', 'corado,com', 'taquipnéia', 'leve', 'afebril', 's/sinais', 'meníngeos', 'otosc:nl', 'cavum:hiperemia', 'pulmões:esc+sibilos', 'abdome', 'nl\t'],['mais strings','episódio\n\nap\n-','\t0000000000\t'],['outra lista','menopausea.\n\nexames']]

How do I remove these unwanted characters, such as:

\t \n : . , - _

  • Good afternoon, friend. See if this helps you: https://stackoverflow.com/questions/29251378/replace-n-in-a-string-in-python-2-7

  • @Anderson, it's a list, not a simple string.

  • @Henriquemendes if it is a list of strings, just iterate over the list and apply the given solutions to each item. It changes practically nothing.

  • Whenever I tried that, it raised an error: AttributeError: 'list' object has no attribute 'replace'

  • I need to return this value to the list.

1 answer

1


This should work (here lista is a single list of strings):

novo = []
for x in lista:
    item = x
    for y in ['\n', '\t', '/', '.', '-', '(', ')']:
        item = item.replace(y, "")
    novo.append(item)

or

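# Python 3: str.translate takes a table; the third maketrans argument lists the characters to delete
tabela = str.maketrans("", "", "\n\t/.-()")
novo = []
for x in lista:
    novo.append(x.translate(tabela))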
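
Since the data in the question is a list of lists, and deleting a character such as '.' would glue neighbouring words together ('noite.mãe' becomes 'noitemãe'), another option is to split on the unwanted characters instead of deleting them. The following is only a minimal sketch: the character set in UNWANTED and the helper name clean_tokens are assumptions made for illustration, and the sample rows are shortened from the output shown in the question.

```python
import re

# Characters treated as separators. This set is an assumption based on the
# question (whitespace plus : . , - _ / ( )); adjust it to your data.
UNWANTED = re.compile(r"[\s:.,\-_/()]+")

def clean_tokens(tokens):
    """Split each token on unwanted characters and drop the empty pieces."""
    cleaned = []
    for token in tokens:
        cleaned.extend(piece for piece in UNWANTED.split(token) if piece)
    return cleaned

# Sample rows shortened from the output in the question.
rows = [
    ['\testá', 'tossindo', 'noite.mãe', '', ',', '----', 'nl\t'],
    ['mais strings', 'episódio\n\nap\n-', '\t0000000000\t'],
]

# Apply the cleaning to every inner list.
novo = [clean_tokens(row) for row in rows]
print(novo)
```

Note that with this character set a token such as 's/sinais' is also split in two, so remove '/' from the pattern if that is not wanted. If the rows are still in a pandas Series of lists, something like textoMinusculo.apply(clean_tokens) should give the same result per row.
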
  • The first one solved most of it, yes! Thank you very much!
