How to select codes with different sizes in pandas?


In Python 3, with pandas, I have this dataframe with several codes in the columns "CPF_CNPJ_doador" and "CPF_CNPJ_doador_originario":

import pandas as pd
cand_doacoes = pd.read_csv("doacoes_csv.csv", sep=';', encoding='latin_1', decimal=',')

cand_doacoes.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 427489 entries, 0 to 427488
Data columns (total 12 columns):
UF                                427489 non-null object
Partido                           427489 non-null object
Cargo                             427489 non-null object
Nome_candidato                    427489 non-null object
CPF_candidato                     427489 non-null int64
CPF_CNPJ_doador                   426681 non-null float64
Nome_doador                       427489 non-null object
Nome_doador_Receita               427489 non-null object
Valor                             427489 non-null float64
CPF_CNPJ_doador_originario        427489 non-null object
Nome_doador_originario            427489 non-null object
Nome_doador_originario_Receita    427489 non-null object
dtypes: float64(2), int64(1), object(9)
memory usage: 39.1+ MB

The codes in the columns "CPF_CNPJ_doador" and "CPF_CNPJ_doador_originario" are always integers with different lengths: 14 digits, 13 digits, 11 digits or 10 digits.

I need to create a dataframe with only the 14- and 13-digit codes. Does anyone know how I can select, in the "cand_doacoes" dataframe, only the rows with 14- and 13-digit codes in the "CPF_CNPJ_doador" column? Is it necessary to convert the values to strings first?

1 answer


Hello,

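According to the dataframe information above, the "CPF_CNPJ_doador" column is float64, so you do not need to start from strings. It is most likely stored as float rather than int because it has missing values (426681 non-null out of 427489 rows), which means each value has to be converted to an integer before its digits are counted; otherwise str(x) would include a trailing ".0".

You can filter the dataframe with a boolean mask, keeping only the rows that interest you. To build the mask, apply a function that computes the number of digits of each code in the desired column.

Here I use apply with an anonymous lambda function that does the calculation for each value and skips the missing ones.

cand_doacoes[cand_doacoes['CPF_CNPJ_doador'].apply(lambda x: pd.notna(x) and len(str(int(x))) in (13, 14))]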
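To check the idea on something small, here is a minimal sketch on a made-up frame. The sample values and the names mask_apply, mask_range and doadores_cnpj are invented for illustration. Because the column is float64, str(x) on a raw value would include a trailing ".0" and missing values would become "nan", so the row-wise version converts to int before counting digits. The range-based version is an alternative I am suggesting, assuming the codes are non-negative whole numbers; it avoids strings entirely.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame comes from doacoes_csv.csv.
cand_doacoes = pd.DataFrame({
    "Nome_doador": ["A", "B", "C", "D", "E"],
    "CPF_CNPJ_doador": [
        12345678000195.0,  # 14 digits
        1234567000195.0,   # 13 digits
        12345678901.0,     # 11 digits
        1234567890.0,      # 10 digits
        np.nan,            # missing value
    ],
})

col = cand_doacoes["CPF_CNPJ_doador"]

# Row-wise version: skip NaN, drop the ".0" with int(), then count digits.
mask_apply = col.apply(lambda x: pd.notna(x) and len(str(int(x))) in (13, 14))

# Vectorized version: a 13- or 14-digit integer lies between 10**12 and
# 10**14 - 1 (both inclusive). Comparisons with NaN evaluate to False.
mask_range = col.between(10**12, 10**14 - 1)

# Keep only the rows whose code has 13 or 14 digits.
doadores_cnpj = cand_doacoes[mask_apply]
print(doadores_cnpj)
```

Both masks select the same rows here. On a frame with more than 400,000 rows, the range comparison should be noticeably faster than apply, since it does not call a Python function per row.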
