What does "Empty DataFrame" mean?


I'm working on an assignment for college. My teacher asked us to filter a table loaded from a .csv file using the .query() method. The original data looks like this:

Funcionario Escolaridade Genero  Idade  Meses  Salario
          1     Fundam.2      M     37      0    16790
          2     Superior      M     33     14    15460
          3     Fundam.1      F     39     57     4690
          4     Superior      F     56     36    18760
          5     Fundam.1      F     48      1    10580
          6     Fundam.2      F     50     84    18030
          7     Fundam.2      F     60     21    15040
          8     Fundam.2      F     66     40     6200
         10     Fundam.2      F     27     68     4470
         11     Fundam.2      F     45     94     2900
         12     Fundam.1      M     57     99     6590
         13     Fundam.2      M     23      7    15460
         14     Fundam.2      F     23     30    11330
         15     Superior      M     59     91     6580
         16     Fundam.2      O     28     60    18260
         17     Fundam.1      F     71     75     4700
         18     Fundam.2      F     49     33     3840
         19     Fundam.1      M     43     48     2320
         20     Fundam.2      M     57     23     7250

I wrote a script that filters with .query("Idade == 47" or "Meses > 44"). Then my teacher asked us to display the columns in this order: "Idade", "Meses", "Funcionario", "Genero". I wrote it like this:

import pandas as pd

dn = pd.read_csv("fake-file14.csv", sep = ",")

x = dn.query("Idade == 47" or "Meses > 44")

df = x[['Idade', 'Meses', 'Funcionario', 'Genero']]

print(df)

This appeared:

Empty DataFrame
Columns: [Idade, Meses, Funcionario, Genero]
Index: []

What does this mean? Why did it happen? I have tried many times...

Note: My teacher wants the output to look like this:

    Idade  Meses  Funcionario Genero
2      39     57            3      F
5      50     84            6      F
9      27     68           10      F
10     45     94           11      F
11     57     99           12      M
14     59     91           15      M
15     28     60           16      O
16     71     75           17      F
18     43     48           19      M

1 answer

It seems to me that the problem is the query:

x = dn.query("Idade == 47" or "Meses > 44")  

This applies Python's or to the two strings first and passes only the result to query. If you print the same expression, this is what you get:

print( "Idade == 47" or "Meses > 44" )
> Idade == 47  # output

Since there appears to be no record with Idade == 47, the result is an empty DataFrame.

It should be written like this:
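To see why only the first string survives, here is a small sketch in plain Python (no pandas needed; the variable names are just for illustration). The or operator returns its first truthy operand, and any non-empty string is truthy, so the second string is never even looked at:

```python
# `or` returns the first truthy operand; a non-empty string is truthy.
wrong = "Idade == 47" or "Meses > 44"
print(wrong)

# Only an empty (falsy) first operand lets the second one through.
fallback = "" or "Meses > 44"
print(fallback)

# With `or` inside the quotes, it stays part of the expression
# that query receives and evaluates per row.
right = "Idade == 47 or Meses > 44"
print(right)
```

So the or has to live inside the query string, where pandas interprets it, and not between two Python strings.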

x = dn.query("Idade == 47 or Meses > 44") 

See it working on ideone and on repl.it.
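If you want to try it without the CSV file, the following is a minimal sketch that assumes only a handful of rows copied from the table in the question, read from an in-memory string through io.StringIO (that part is my assumption; the original code reads "fake-file14.csv"):

```python
import io

import pandas as pd

# A few rows copied from the question's table; the real file has more.
csv_text = """Funcionario,Escolaridade,Genero,Idade,Meses,Salario
1,Fundam.2,M,37,0,16790
2,Superior,M,33,14,15460
3,Fundam.1,F,39,57,4690
6,Fundam.2,F,50,84,18030
13,Fundam.2,M,23,7,15460
"""
dn = pd.read_csv(io.StringIO(csv_text))

# The whole condition, including `or`, goes inside a single string.
x = dn.query("Idade == 47 or Meses > 44")

# Select the columns in the order the teacher asked for.
df = x[["Idade", "Meses", "Funcionario", "Genero"]]
print(df)
```

With this sample, only the rows for Funcionario 3 and 6 pass the filter, because they are the only ones with Meses > 44.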


Another way to do this search:

x = dn[  (dn['Idade'] == 47)   |   (dn['Meses'] > 44)   ]  
#               ↑              ↑          ↑
#         boolean mask 1       |    boolean mask 2
#                              |
#                   bitwise 'OR' operator
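As a rough check that the two styles select the same rows, here is a sketch on a tiny made-up DataFrame (the values are invented purely for illustration). Note that the parentheses around each comparison are needed in the one-line form, because | binds more tightly than == and >:

```python
import pandas as pd

# Tiny made-up frame, only to compare the two filtering styles.
dn = pd.DataFrame({"Idade": [47, 39, 23], "Meses": [10, 57, 7]})

# Each comparison yields a boolean Series with one True/False per row.
mask_idade = dn["Idade"] == 47
mask_meses = dn["Meses"] > 44

# `|` combines the two masks element by element.
by_mask = dn[mask_idade | mask_meses]
by_query = dn.query("Idade == 47 or Meses > 44")

# Both approaches should keep the same rows.
print(by_mask.equals(by_query))
```

Which one to use is mostly a matter of taste: query is shorter to read, while the mask form works with any column name, including ones that are not valid Python identifiers.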

Reference: pandas.DataFrame.query
