Clean up a string in Python (remove escape characters)


1

I came across some strings with content like the following example:

"++++++//texto+++!!!+++//texto++++"

I’m trying to find a way to clean up the string, but I’m not succeeding. Could someone help me?

  • 1

    What does "clear the sentence" mean? What would be the expected/desired result in this case?

  • I meant removing characters like "+" and "/" from the middle of the string. I believe I can do this with regular expressions, but I’m lost on how to do it.

  • 1

    If there are only a few characters to remove, you can use minha_string.replace('+', ''), for example. https://docs.python.org/3/library/stdtypes.html#str.replace

  • Unfortunately there are more characters. I need something that tidies up the text. Let me explain better what I’m doing: I’m using Scrapy to extract data from a page, and the descriptions of some items come in the form I quoted. I want the description in a clean format so I can add it to the list of extracted items. For example, if the text comes in as "++++//jamanta+++de++//pedra+++", it should become "jamanta de pedra".

  • 1

    The title of your question is not quite accurate: what you want to remove is not necessarily "escape characters".
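The str.replace suggestion in the comments above can be extended to several characters at once. The following is only a sketch: the set of unwanted characters ("+", "/", "!") is an assumption taken from the examples in the question, and the names UNWANTED and strip_chars are made up for illustration. It uses str.translate to turn each unwanted character into a space and then collapses the resulting runs of whitespace.

```python
# Characters to strip. This set is an assumption based on the question's examples.
UNWANTED = "+/!"

def strip_chars(text, unwanted=UNWANTED):
    # Build a translation table that maps each unwanted character to a space.
    table = str.maketrans({ch: " " for ch in unwanted})
    # split() with no arguments drops empty pieces, so runs of spaces collapse.
    return " ".join(text.translate(table).split())

print(strip_chars("++++++//texto+++!!!+++//texto++++"))  # texto texto
```

This only works when the unwanted characters are known in advance. When they are not, a regular expression that keeps the letters is likely a better fit.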

1 answer

4


You can use a regex (regular expression) to find the words. Here is a short example:

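import re

test1 = "31teste123 regex==="
test2 = "++++//jamanta+++de+++//pedra++++"

def formatString(string):
    # Collect every run of ASCII letters, skipping digits and symbols
    formatedArray = re.findall(r'[a-zA-Z]+', string)
    # Join the words with single spaces and print the result
    print(" ".join(formatedArray))

formatString(test1)  # prints: teste regex
formatString(test2)  # prints: jamanta de pedra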

re.findall returns a list containing the words:

test1 = ['teste', 'regex']

test2 = ['jamanta', 'de', 'pedra']

Joined with spaces, the function prints "teste regex" and "jamanta de pedra".

You can test more things on this site: http://regexr.com/
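One caveat with the pattern [a-zA-Z]+ is that it only matches unaccented ASCII letters, so a Portuguese word such as "descrição" would be split apart. A possible variation, offered as a sketch rather than as part of the original answer, is the pattern [^\W\d_]+, which matches word characters other than digits and underscore and therefore keeps accented letters together. The names WORD_RE and clean_description are hypothetical, and so is the second sample string.

```python
import re

# Word characters minus digits and underscore: in practice, runs of letters,
# including accented ones (str patterns are Unicode-aware in Python 3).
WORD_RE = re.compile(r"[^\W\d_]+")

def clean_description(text):
    # Return the cleaned string instead of printing it, so callers can store it.
    return " ".join(WORD_RE.findall(text))

print(clean_description("++++//jamanta+++de+++//pedra++++"))  # jamanta de pedra
print(clean_description("++descrição//do+++item!!"))          # descrição do item
```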

  • 1

    To print it the way he wants, you can do this: print(" ".join(formatedArray)). The result is the words separated by single spaces.

  • It keeps giving NameError: name 'formatedArray' is not defined. Sorry if this is a dumb question...

  • @Miltonteixeira It runs fine here. Try adding this line: formatedArray = ""

  • @Miltonteixeira Add it before using formatedArray

  • s = '++++text+++text++++'; formatedArray = ""; s = s.replace("+", ""); print(s.join(formatedArray))

  • Like this? Sorry, I’m still trying to learn this.

  • @Miltonteixeira No, you are mixing the two solutions. My code by itself already removes all characters such as "+". Do as I did and use a function.

  • When I try to print the result of your code, it returns None... I believe I’m doing something wrong.

  • Actually, my mistake: it returns exactly what you said, but I need it to return "jamanta de pedra".

  • @Miltonteixeira The code has been edited, try it now.

  • Wow, Vinicius, thank you so much. I owe you a beer!
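The None mentioned in the comments above is most likely explained by the function printing its result without returning it: a Python function with no return statement returns None. The sketch below contrasts the two styles. The function names are hypothetical, and the plain dict only stands in for whatever item structure the scraper actually uses.

```python
import re

def format_string_printing(string):
    # Prints the cleaned text but has no return statement, so it returns None.
    print(" ".join(re.findall(r"[a-zA-Z]+", string)))

def format_string(string):
    # Returns the cleaned text so the caller can store or print it.
    return " ".join(re.findall(r"[a-zA-Z]+", string))

raw = "++++//jamanta+++de+++//pedra++++"

result = format_string_printing(raw)  # prints: jamanta de pedra
print(result)                         # prints: None

# A plain dict used as a stand-in for a scraped item (an assumption).
item = {"description": format_string(raw)}
print(item)  # {'description': 'jamanta de pedra'}
```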
