Remove a specific suffix from the end of array elements with regex

4

GOAL

I am trying to remove the substring ":PM" from every element of the array using regex, but I get an error.

DETAILS

The array will always contain ":PM", and it is only this part that I wish to remove from the array elements.

SCRIPT

import re

array = ['SOLDADO1:PM','SOLDADO2','SOLDADO3:PM','SOLDADO4','SOLDADO5:PM']

regex = r"^...-(.*)"
re.match(regex, array)

for linha in array:
    print(linha)

OUTPUT

Traceback (most recent call last):
  File "regex.py", line 6, in <module>
    re.match(regex, array)
  File "C:\Python27\lib\re.py", line 141, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or buffer

2 answers

5


If the substring ":PM" is always at the end of the string, just do:

import re

array = ['SOLDADO1:PM','SOLDADO2','SOLDADO3:PM','SOLDADO4','SOLDADO5:PM']
regex = re.compile(r':PM$')
for texto in array:
    print(regex.sub('', texto))

I use the anchor $ to match the end of the string.

Then, for each element of the array, I use the sub method to replace the matched part with the empty string (''), which is the same as removing it.

Output:

SOLDADO1
SOLDADO2
SOLDADO3
SOLDADO4
SOLDADO5

If you want to build another list containing the strings without the suffix, just do:

outra_lista = [ regex.sub('', texto) for texto in array ]

The result is the list ['SOLDADO1', 'SOLDADO2', 'SOLDADO3', 'SOLDADO4', 'SOLDADO5'].
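
To illustrate what the $ anchor contributes, here is a small sketch comparing the anchored pattern with an unanchored one. The sample string is hypothetical (it is not part of the question's data) and was chosen only because it contains ":PM" both in the middle and at the end.

```python
import re

anchored = re.compile(r':PM$')    # matches ":PM" only at the end of the string
unanchored = re.compile(r':PM')   # matches ":PM" anywhere in the string

# Hypothetical sample value, not from the question's data
sample = 'SOLDADO:PM:EXTRA:PM'

anchored_result = anchored.sub('', sample)
unanchored_result = unanchored.sub('', sample)

print(anchored_result)    # SOLDADO:PM:EXTRA
print(unanchored_result)  # SOLDADO:EXTRA
```

Without the anchor, every occurrence is removed, which may or may not be what you want if ":PM" can appear elsewhere in a value.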


About your regex, ^...-(.*): it didn't work because it consists of the following:

  • ^: string start marker
  • ...-: 3 characters followed by a hyphen
  • (.*): zero or more characters

In other words, it is far from what you need. Also, the match method takes a string, not a list. That's why we loop over the list and apply the regex to each element.
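
Both points can be checked with a short sketch that reuses the question's pattern. The input 'ABC-SOLDADO1' is made up for illustration: it is a string the original regex does match, unlike any element of the array. The exception message differs between Python versions, so the sketch only checks the exception type.

```python
import re

regex = r"^...-(.*)"
array = ['SOLDADO1:PM', 'SOLDADO2', 'SOLDADO3:PM', 'SOLDADO4', 'SOLDADO5:PM']

# match() must be called once per string, never on the list itself
matches = [re.match(regex, texto) for texto in array]
print(matches)  # [None, None, None, None, None]

# Made-up input: 3 characters followed by a hyphen, so the pattern matches
m = re.match(regex, 'ABC-SOLDADO1')
print(m.group(1))  # SOLDADO1

# Passing the list itself raises TypeError
try:
    re.match(regex, array)
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```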

  • 1

    Wow, thank you very much. Your explanation was perfectly clear, congratulations!

2

I know the other answer has already been accepted, but I think this is worth adding for anyone reading this question in the future:

Do not use a regex for this. The operation is too simple to justify one; the code is simpler and works better with the endswith method and string slicing:

outra_lista = [texto[:-3] if texto.endswith(':PM') else texto for texto in array]
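
The slice [:-3] works because ":PM" has exactly three characters. As a sketch, the same idea can be wrapped in a helper that derives the slice length from the suffix. The name remover_sufixo is hypothetical and does not come from either answer. On Python 3.9 or later, the built-in str.removesuffix does the same job.

```python
def remover_sufixo(texto, sufixo):
    """Return texto without sufixo at the end, or texto unchanged."""
    # The empty-suffix check avoids texto[:-0], which would return ''
    if sufixo and texto.endswith(sufixo):
        return texto[:-len(sufixo)]
    return texto

array = ['SOLDADO1:PM', 'SOLDADO2', 'SOLDADO3:PM', 'SOLDADO4', 'SOLDADO5:PM']
outra_lista = [remover_sufixo(texto, ':PM') for texto in array]
print(outra_lista)
# ['SOLDADO1', 'SOLDADO2', 'SOLDADO3', 'SOLDADO4', 'SOLDADO5']

# Python 3.9+ only: compare against the built-in equivalent
if hasattr(str, 'removesuffix'):
    print([texto.removesuffix(':PM') for texto in array] == outra_lista)
```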
