Printing the smallest number in a pandas table

-1

Hello, I have a CSV file with 1000 lines and 10 columns. One of the columns holds people's ages; the minimum male age is 0 and the maximum is 96, but when I print them the minimum comes out as a dash "-".

For example, printing the min and the max gives: "- até 96". But I need it to print: "0 até 96".

The code is below:

import pandas as pd

contas = pd.read_csv('contas.csv', sep=';')
Feminino = contas.loc[contas['sexo'] == 'F']
Masculino = contas.loc[contas['sexo'] == 'M']
print('''Faixa etária Feminina: {} até {}'''.format(min(Feminino['idade']),max(Feminino['idade'])))
print('''Faixa etária Masculina: {} até {}'''.format(min(Masculino['idade']),max(Masculino['idade'])))

cont_F = 0
cont_M = 0
for i in contas['sexo']:
    if i == 'F':
        cont_F +=1
    elif i == 'M':
        cont_M += 1
print('Feminino: {} Masculino: {}'.format(cont_F,cont_M))

[image]

I just need the spot circled in red to show the value 0.

  • Have you tried Masculino['idade'].min()? I ask because min(Series) may return NaN as the lowest value, whereas Series.min() ignores NaN.

  • I already did, but in the table it is not NaN; it really is the value 0 (numeric).

  • Have you tried using an f-string? Something like print(f"De {Masculino['idade'].min()} até {Masculino['idade'].max()}")

  • I already did, and the dash "-" is still there. It prints: "De - até 96"

  • In that case, only with the same data can we try to figure out what is going on.

  • matric;sexo;idade;atend;tuss;servico;plano;vl_unit;vl_ref;qtde
    14993;M;0;06/12/2017;20201010;RENAL TRANSPLANT CLINICAL FOLLOW-UP AT THE RECIPIENT’S ADMISSION PERIOD;1077;210.00;202.01;1
    10258;M;27;14/03/2016;31602037;GENERAL OR CONDUCTIVE ANESTHESIA FOR CARRYING OUT NEUROLYTIC BLOCK;1145;492.99;316.02;1
    BILAT;1046;192.24;160.20;22
    "Try loading this data by creating a CSV file; there are two male ages, one 27 and the other 0."

1 answer

0


Using the data posted in the comment.

Importing the library

>>> import pandas as pd

Loading the file

>>> contas = pd.read_csv('dados.csv', delimiter=";")

Filtering the dataframe

>>> Masculino = contas.loc[contas['sexo'] == 'M']

Checking the filter

>>> Masculino
   matric sexo  idade       atend  ...  plano vl_unit  vl_ref  qtde
0   14993    M      0  06/12/2017  ...   1077  210.00  202.01     1
1   10258    M     27  14/03/2016  ...   1145  492.99  316.02     1

[2 rows x 10 columns]

Printing the result

>>> print('''Faixa etária Masculina: {} até {}'''.format(min(Masculino['idade']),max(Masculino['idade'])))
Faixa etária Masculina: 0 até 27

However, if there is a hyphen in the data, it produces the result you showed.

>>> import pandas as pd

>>> contas = pd.read_csv('dados-com-hifen.csv', delimiter=";")

>>> contas
   matric sexo idade       atend  ...  plano vl_unit  vl_ref  qtde
0   14993    M     -  06/12/2017  ...   1077  210.00  202.01     1
1   10258    M    27  14/03/2016  ...   1145  492.99  316.02     1

[2 rows x 10 columns]

>>> Masculino = contas.loc[contas['sexo'] == 'M']

>>> Masculino
   matric sexo idade       atend  ...  plano vl_unit  vl_ref  qtde
0   14993    M     -  06/12/2017  ...   1077  210.00  202.01     1
1   10258    M    27  14/03/2016  ...   1145  492.99  316.02     1

[2 rows x 10 columns]

>>> print('''Faixa etária Masculina: {} até {}'''.format(min(Masculino['idade']),max(Masculino['idade'])))

Faixa etária Masculina: - até 27
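
A likely explanation for this output, sketched under the assumption that the whole column was read as strings: a single "-" prevents pandas from parsing idade as integers, and the built-in min() then compares strings character by character, where "-" sorts before any digit. The inline CSV below is hypothetical sample data, not the original file.

```python
import io
import pandas as pd

# Hypothetical two-row sample; the first row has "-" where a number should be.
csv_text = "matric;sexo;idade\n14993;M;-\n10258;M;27\n"
contas = pd.read_csv(io.StringIO(csv_text), sep=";")

idades = contas["idade"]

# The column is no longer numeric, so every value is a Python string.
print(pd.api.types.is_numeric_dtype(idades))

# min()/max() on strings compare lexicographically: "-" comes before "2".
menor = min(idades)
maior = max(idades)
print("Faixa etária Masculina: {} até {}".format(menor, maior))
```

Note that on a string column max() is also lexicographic, so a value such as "96" would be reported as larger than "100".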

As I said in the comments, it must be something in your data.

You can double-check with the command below

>>> contas.loc[contas['idade'] == '-']

   matric sexo idade       atend  ...  plano vl_unit  vl_ref  qtde
0   14993    M     -  06/12/2017  ...   1077   210.0  202.01     1

[1 rows x 10 columns]

or

>>> contas.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 10 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   matric   2 non-null      int64
 1   sexo     2 non-null      object
 2   idade    2 non-null      object    <--- Note that it was detected as object rather than int64, as would be expected.
 3   atend    2 non-null      object
 4   tuss     2 non-null      int64
 5   servico  2 non-null      object
 6   plano    2 non-null      int64
 7   vl_unit  2 non-null      float64
 8   vl_ref   2 non-null      float64
 9   qtde     2 non-null      int64
dtypes: float64(2), int64(4), object(4)
memory usage: 288.0+ bytes
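
For a larger file, one way to locate every non-numeric entry at once (not only "-") is pd.to_numeric with errors="coerce", which turns values it cannot parse into NaN. This is a minimal sketch on hypothetical inline data; whether a "-" should be treated as missing, as 0, or corrected by hand is an assumption to settle against the real file.

```python
import io
import pandas as pd

# Hypothetical sample data; row 0 carries the stray "-".
csv_text = "matric;sexo;idade\n14993;M;-\n10258;M;27\n10300;F;31\n"
contas = pd.read_csv(io.StringIO(csv_text), sep=";")

# Values that cannot be parsed as numbers become NaN instead of raising.
idade_num = pd.to_numeric(contas["idade"], errors="coerce")

# Rows whose age could not be parsed, with their index labels.
invalidas = contas[idade_num.isna()]
print(invalidas)

# Series.min() and Series.max() skip NaN by default.
masc = idade_num[contas["sexo"] == "M"]
faixa = (masc.min(), masc.max())
print("Faixa etária Masculina: {} até {}".format(*faixa))
```

An alternative, if the dash is known to mean "missing", is to pass na_values="-" to pd.read_csv so the column is parsed as numeric from the start.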
  • Sorry for wasting your time. The file has 999 lines, and one of them did indeed have a dash; I had to go through them one by one until I spotted it. Now it works correctly, thank you very much!!! :)

  • The command contas.loc[contas['idade'] == '-'] would have given you the row index...
