Remove a repeated substring from every value in a DataFrame column (Python)

Viewed 39 times

0

I have a DataFrame in which one column holds many names, and every name ends with the same suffix and file extension:

  • "NAME A_SOJA_2020.xlsx"
  • "NAME B_SOJA_2020.XLSX"

I would like a function or piece of code that always removes the "_SOJA_2020.xlsx" chunk and keeps only the name in the DataFrame. So far I have been doing it like this:

df3['Nome da Seguradora'] = df3['Nome da Seguradora'].apply(lambda x: str(x).replace('Essor_Soja_2020.xls','Essor'))

However, repeating this for every name in the DataFrame does not seem like the best solution. If anyone can help me find a more efficient approach, I would appreciate it.

2 answers

1

You can simply replace the unwanted chunk with an empty string:

df3['Nome da Seguradora'] = df3['Nome da Seguradora'].apply(lambda x: str(x).replace('_Soja_2020.xls',''))
  • Wow, how inattentive of me... Thank you

1

I haven't benchmarked it, but in general built-in pandas methods are preferable to apply. Using only replace, it looks like this:

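# Assign the result back rather than relying on inplace=True on a column selection; the dot is escaped because regex=True
df3['Nome da Seguradora'] = df3['Nome da Seguradora'].replace({r'_Soja_2020\.xls': ''}, regex=True)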
  • Good point, it even performed better. Thank you
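
Note that the file names listed in the question mix upper and lower case ("_SOJA_", ".XLSX") and end in ".xlsx", so a case-sensitive replacement of '_Soja_2020.xls' might miss them or leave a trailing "x". A minimal sketch of a case-insensitive variant, assuming the suffix is always "_SOJA_2020" followed by ".xls" or ".xlsx"; the sample data and the name `suffix_pattern` are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data mirroring the names in the question.
df3 = pd.DataFrame({
    "Nome da Seguradora": [
        "NAME A_SOJA_2020.xlsx",
        "NAME B_SOJA_2020.XLSX",
        "Essor_Soja_2020.xls",
    ]
})

# Match "_soja_2020.xls" or "_soja_2020.xlsx" at the end of the value.
# The dot is escaped so it matches a literal "."; "x?" makes the final
# "x" optional; "$" anchors the match to the end of the string.
suffix_pattern = r"_soja_2020\.xlsx?$"

# Vectorized string replacement; case=False ignores upper/lower case.
df3["Nome da Seguradora"] = (
    df3["Nome da Seguradora"]
    .astype(str)
    .str.replace(suffix_pattern, "", case=False, regex=True)
)

print(df3["Nome da Seguradora"].tolist())
```

If the year or the crop name varies between files, the pattern would need to be adjusted accordingly.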
