The .split() function is breaking the string output in Python 2; how can I fix it?


Hello, I'm new to Python and my question may be obvious, but I would like to better understand how strings work. A function I wrote gives a wrong result, and I do not understand how the string behaves in this case. For each string in the "dados" list, the function is meant to remove the blanks and everything from a specific character ("#") onward, using .split(), .strip() and .index() together with slicing (s[...:]). Only the first part of each element, without spaces, should remain, and that is the value the function returns.

Here is the code:

arquivo = "treee; hpot #NomesEspecificos"
tipo = "30;40;10 #Numeros Especificos"
arq = "tritrem; bitrem; rodotrem #VehicleNamesIn"

dados = [tipol,arq,arquivo,tipo]

def separador(s):
    s = ''.join(s.split(' '))
    tail = s[s.index('#'):]
    head = s.strip(tail)
    return head

dados2=[separador(s) for s in dados] 
print(dados2)

After execution, this code returns:

['2168', 'tritrem;bitrem;rodotr', 'treee;hpot', '30;40;10']

But it should return:

['2168', 'tritrem;bitrem;rodotrem', 'treee;hpot', '30;40;10']

I have already changed how the strings are built to get the result shown above, but if you run the code with the variable "arquivo" changed as follows:

arquivo = "treee;hpot #CraneNameInput"

It produces the same error seen in the output for the "arq" variable:

['2168', 'tritrem;bitrem;rodotr', ';hpo', '30;40;10']

I would like to understand the reason for this deviation and, if possible, learn how to fix it. I confess that I have tried a lot and have not solved it yet. Thanks a lot.

  • Thanks for the suggestion!!

1 answer


Your code uses these methods in unconventional ways, and there are shorter and more consistent ways to do what you want.

The problem is that you are using the string method .strip to remove the suffix, but strip does not take the order of the characters into account: it simply removes, from both ends of the string, every character that appears in its argument, regardless of order. So when the suffix includes the characters "e" and "m", as in your example, these are stripped from the end of the first part along with the rest.
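To see this behavior in isolation, here is a small sketch using the "arq" string from the question with the spaces already removed. The names stripped and head are only illustrative.

```python
s = "tritrem;bitrem;rodotrem#VehicleNamesIn"
tail = s[s.index('#'):]   # '#VehicleNamesIn'

# strip() treats its argument as a set of characters, not as a suffix,
# so it keeps eating 'm' and 'e' from the right until it reaches 'r'.
stripped = s.strip(tail)
print(stripped)           # tritrem;bitrem;rodotr

# Slicing up to the '#' removes exactly the suffix and nothing else.
head = s[:s.index('#')]
print(head)               # tritrem;bitrem;rodotrem
```

The same thing happens on the left side of the string whenever the first characters also appear in the suffix, which is why the changed "arquivo" value loses its leading "treee".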

In fact, the usual method for separating "before and after a given character" is precisely .split(). And the crazy part is that you are already using this method, only for a purpose it is not meant for. The normal way to remove all occurrences of a given character from a string is the .replace method, not a .join on top of a .split. (The way you did it is less efficient and harder to read, but the result is as expected.)
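As a quick illustration of both points, again with the "arq" string from the question (the variable names here are made up for the example):

```python
s = "tritrem; bitrem; rodotrem #VehicleNamesIn"

# Two ways to drop every space; both give the same string.
no_spaces_join = ''.join(s.split(' '))
no_spaces_replace = s.replace(' ', '')
print(no_spaces_join == no_spaces_replace)  # True

# split('#') returns the pieces before and after the separator.
parts = no_spaces_replace.split('#')
print(parts)  # ['tritrem;bitrem;rodotrem', 'VehicleNamesIn']
```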

Therefore, your function can be simply:

def separador(s):
    s = s.replace(' ', '')
    return s.split('#')[0]
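
A quick check of this function against the strings from the question, including the changed "arquivo" value. The first element of the original list is left out here because its definition is not shown in the question.

```python
def separador(s):
    # Remove every space, then keep only what comes before the first '#'.
    s = s.replace(' ', '')
    return s.split('#')[0]

arquivo = "treee;hpot #CraneNameInput"
tipo = "30;40;10 #Numeros Especificos"
arq = "tritrem; bitrem; rodotrem #VehicleNamesIn"

dados2 = [separador(s) for s in [arq, arquivo, tipo]]
print(dados2)  # ['tritrem;bitrem;rodotrem', 'treee;hpot', '30;40;10']
```

Note that if a string has no "#" at all, s.split('#')[0] just returns the whole string without spaces, whereas the original version with .index('#') would raise a ValueError.
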
  • Great explanation, I’m starting in Python and your answer contributed a lot, I’m already correcting all my code. Thank you for the attention!

  • You might want to take a look at the documentation of the methods available for strings: https://docs.python.org/3/library/stdtypes.html#string-methods (note: there are a lot of them, from legacy calls created decades ago to relatively new things like .casefold, so nobody knows them all by heart, nor would that make sense, but it is worth a read to know what already exists)
