How to remove a specific character sequence from a Python string

1

I would like to know how I can remove a specific substring from a string. The substring is: \r\n.

The texts look like this:

'Let the bird of loudest lay\r\nOn the sole Arabian tree\r\nHerald sad and trumpet be,\r\nTo whose sound chaste wings obey.\r\n\r\nBut thou shrieking harbinger,\r\nFoul precurrer of the fiend,\r\nAugur of the fever\'s end,\r\nTo this troop come thou not near.\r\n\r\nFrom this session interdict\r\nEvery fowl of tyrant wing,\r\nSave the eagle, feather\'d king;\r\nKeep the obsequy so strict.\r\n\r\nLet the priest in surplice white,\r\nThat defunctive music can,\r\nBe the death-divining swan,\r\nLest the requiem lack his right.\r\n\r\nAnd thou treble-dated crow,\r\nThat thy sable gender mak\'st\r\nWith the breath thou giv\'st and tak\'st,\r\n\'Mongst our mourners shalt thou go.\r\n\r\nHere the anthem doth commence:\r\nLove and constancy is dead;\r\nPhoenix and the Turtle fled\r\nIn a mutual flame from hence.\r\n\r\nSo they lov\'d, as love in twain\r\nHad the essence but in one;\r\nTwo distincts, division none:\r\nNumber there in love was slain.\r\n\r\nHearts remote, yet not asunder;\r\nDistance and no space was seen\r\n\'Twixt this Turtle and his queen:\r\nBut in them it were a wonder.\r\n\r\nSo between them love did shine\r\nThat the Turtle saw his right\r\nFlaming in the Phoenix\' sight:\r\nEither was the other\'s mine.\r\n\r\nProperty was thus appalled\r\nThat the self was not the same;\r\nSingle nature\'s double name\r\nNeither two nor one was called.'

I tried this:

remove = "\r\n"

for i in range(0, len(remove)):
    new_poetry = poetry[0].replace(remove[i], " ")

But it only removes them partially.

2 answers

1

You can use the replace method, but pass '\r\n' all at once (not one character at a time, as you did):

texto = 'Let the bird of loudest lay\r\nOn the sole Arabian tree\r\nHerald sad and trumpet be,\r\nTo whose sound chaste wings obey.\r\n\r\nBut thou shrieking harbinger,\r\nFoul precurrer of the fiend,\r\nAugur of the fever\'s end,\r\nTo this troop come thou not near.\r\n\r\nFrom this session interdict\r\nEvery fowl of tyrant wing,\r\nSave the eagle, feather\'d king;\r\nKeep the obsequy so strict.\r\n\r\nLet the priest in surplice white,\r\nThat defunctive music can,\r\nBe the death-divining swan,\r\nLest the requiem lack his right.\r\n\r\nAnd thou treble-dated crow,\r\nThat thy sable gender mak\'st\r\nWith the breath thou giv\'st and tak\'st,\r\n\'Mongst our mourners shalt thou go.\r\n\r\nHere the anthem doth commence:\r\nLove and constancy is dead;\r\nPhoenix and the Turtle fled\r\nIn a mutual flame from hence.\r\n\r\nSo they lov\'d, as love in twain\r\nHad the essence but in one;\r\nTwo distincts, division none:\r\nNumber there in love was slain.\r\n\r\nHearts remote, yet not asunder;\r\nDistance and no space was seen\r\n\'Twixt this Turtle and his queen:\r\nBut in them it were a wonder.\r\n\r\nSo between them love did shine\r\nThat the Turtle saw his right\r\nFlaming in the Phoenix\' sight:\r\nEither was the other\'s mine.\r\n\r\nProperty was thus appalled\r\nThat the self was not the same;\r\nSingle nature\'s double name\r\nNeither two nor one was called.'
novo_texto = texto.replace('\r\n', '')
print(novo_texto)

In this case, it replaces every occurrence of '\r\n' with '' (the empty string); that is, the result is the original string with all \r\n removed.


Another alternative is to use regular expressions, via the re module and its sub method:

import re

texto = 'Let the bird of loudest lay\r\nOn the sole Arabian tree\r\nHerald sad and trumpet be,\r\nTo whose sound chaste wings obey.\r\n\r\nBut thou shrieking harbinger,\r\nFoul precurrer of the fiend,\r\nAugur of the fever\'s end,\r\nTo this troop come thou not near.\r\n\r\nFrom this session interdict\r\nEvery fowl of tyrant wing,\r\nSave the eagle, feather\'d king;\r\nKeep the obsequy so strict.\r\n\r\nLet the priest in surplice white,\r\nThat defunctive music can,\r\nBe the death-divining swan,\r\nLest the requiem lack his right.\r\n\r\nAnd thou treble-dated crow,\r\nThat thy sable gender mak\'st\r\nWith the breath thou giv\'st and tak\'st,\r\n\'Mongst our mourners shalt thou go.\r\n\r\nHere the anthem doth commence:\r\nLove and constancy is dead;\r\nPhoenix and the Turtle fled\r\nIn a mutual flame from hence.\r\n\r\nSo they lov\'d, as love in twain\r\nHad the essence but in one;\r\nTwo distincts, division none:\r\nNumber there in love was slain.\r\n\r\nHearts remote, yet not asunder;\r\nDistance and no space was seen\r\n\'Twixt this Turtle and his queen:\r\nBut in them it were a wonder.\r\n\r\nSo between them love did shine\r\nThat the Turtle saw his right\r\nFlaming in the Phoenix\' sight:\r\nEither was the other\'s mine.\r\n\r\nProperty was thus appalled\r\nThat the self was not the same;\r\nSingle nature\'s double name\r\nNeither two nor one was called.'
r = re.compile(r'\r\n')
novo_texto = r.sub('', texto)
print(novo_texto)

The result is the same: all occurrences of \r\n are removed from the string.


In your code you are swapping the characters for a space (" "). If that’s what you want, you can use texto.replace('\r\n', ' ') or r.sub(' ', texto) (note that there is now a space between the quotes). Remember that in this case, the sequence \r\n (that is, these two characters, whenever they appear exactly in this order) will be replaced by a single space.
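To make the difference between the two replacement strings concrete, here is a minimal sketch on a short made-up string (the names sample, removed and spaced are only for illustration and do not come from the question):

```python
# Hypothetical two-line sample; not the poem from the question.
sample = 'first line\r\nsecond line'

removed = sample.replace('\r\n', '')   # line break deleted, words get glued together
spaced = sample.replace('\r\n', ' ')   # line break becomes a single space

print(removed)  # first linesecond line
print(spaced)   # first line second line
```

For text like the poem, the version with a space is probably the one you want, since otherwise the last word of each line runs into the first word of the next.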


Just to explain why your code didn’t work:

remove = "\r\n"    
for i in range(0, len(remove)):
    new_poetry = poetry[0].replace(remove[i], " ")

This for loop calls replace twice: once for \r and once for \n. The problem is that replace returns a new string and leaves the original (poetry[0]) unchanged.

So in the first iteration you replace every \r with a space and store the result in new_poetry. In the second iteration you replace every \n with a space, but the replacement is again applied to the original string (poetry[0]), which still contains the \r. As a result, new_poetry ends up with the \n replaced but not the \r: the value from the first iteration, which had only the \r replaced, is overwritten in the second.

If you use \r\n as a whole, as in the solutions above, the substitution happens only where a \r is immediately followed by a \n (and both are replaced at once by a single space if you pass ' ' to the substitution methods, or removed if you pass '').
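The overwrite can be seen in a small sketch. The list poetry below is a hypothetical stand-in for the data in the question, and broken and fixed are illustrative names; the second loop is one possible way to repair the original approach, not the only one:

```python
# Hypothetical stand-in for poetry[0] from the question.
poetry = ['Let the bird\r\nOn the tree']
remove = "\r\n"

# Original loop: every iteration starts again from poetry[0],
# so only the last replacement (the one for \n) survives.
for ch in remove:
    broken = poetry[0].replace(ch, " ")

# Variant that feeds each result into the next iteration.
fixed = poetry[0]
for ch in remove:
    fixed = fixed.replace(ch, " ")

print(repr(broken))  # 'Let the bird\r On the tree'
print(repr(fixed))   # 'Let the bird  On the tree'
```

Note that the repaired loop turns each \r\n pair into two spaces, because \r and \n are replaced separately.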


The solutions above only replace \r\n (that is, exactly these two characters, in this order). If you also want to replace a \r or \n that appears on its own, you can change it to:

import re

texto = 'Let the bird of loudest lay\r\nOn the sole Arabian tree\rHerald sad and trumpet be,\nTo whose sound chaste wings obey.'
r = re.compile(r'\r\n?|\n')
novo_texto = r.sub(' ', texto)
print(novo_texto)

Now the regex has alternation (the character |, which means or), with two options:

  • \r\n?: a \r followed by an optional \n (the ? makes the \n optional), or
  • \n: the \n character itself

Thus, the regex matches \r\n, or just \r (because the \n after it is optional), or (|) just a \n. Whatever matches is replaced by the string you pass to sub; in the example above, I used a space (' ').
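As a rough comparison, the sketch below runs this pattern and a simpler character class on a made-up string containing all three line-ending styles. The names sample, alternation and char_class are illustrative only:

```python
import re

# Hypothetical sample mixing \r\n, a lone \r and a lone \n.
sample = 'a\r\nb\rc\nd'

# \r\n is matched as one unit, so it yields a single space.
alternation = re.sub(r'\r\n?|\n', ' ', sample)

# A character class matches \r and \n separately, so \r\n yields two spaces.
char_class = re.sub(r'[\r\n]', ' ', sample)

print(repr(alternation))  # 'a b c d'
print(repr(char_class))   # 'a  b c d'
```

This is why the alternation is written with \r\n? first: it lets the pair be consumed together instead of character by character.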

0

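The strip() method removes leading and trailing whitespace, so you can split the text into lines and join the stripped lines back together with a space: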

new_poetry = ""
for line in poetry[0].splitlines():
    new_poetry += line.strip() + " "
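The loop above leaves one extra space at the end of the result. A possible variation, sketched here with a hypothetical stand-in for poetry[0], uses str.join so that no trailing space is added and blank lines are skipped:

```python
# Hypothetical stand-in for poetry[0] from the question.
poetry = ['Let the bird  \r\nOn the tree\r\n\r\nHerald sad']

# Strip each line, drop the empty ones, and join with single spaces.
new_poetry = " ".join(
    line.strip() for line in poetry[0].splitlines() if line.strip()
)

print(repr(new_poetry))  # 'Let the bird On the tree Herald sad'
```

Keep in mind that splitlines() also splits on other line boundaries besides \r\n, \r and \n (for example \v and \f), which may or may not be what you want.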
