Pandas DataFrame.loc[] does not find the record

Good morning!

I am trying to manipulate a DataFrame that originates from a DRE report (an accounting income statement). I would like the index to be the account code, which I have already managed to do. However, DataFrame.loc[] does not find the record. My code is below:

import pandas as pd
import csv
from pandas import DataFrame

dre = pd.read_csv('/home/andre/Documentos/ambev_dre3.csv', names=['Conta', 'Descrição', '2017', '2016', '2015'], dtype={'Conta': str})
dre = dre.set_index('Conta')
dre

[Image: DRE]

However, dre.loc['3.02'] raises an error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_key(self, key, axis)
   1789                 if not ax.contains(key):
-> 1790                     error()
   1791             except TypeError as e:

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in error()
   1784                                .format(key=key,
-> 1785                                        axis=self.obj._get_axis_name(axis)))
   1786 

KeyError: 'the label [3.02] is not in the [index]'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-61-8aa3f3ce8015> in <module>()
----> 1 dre.loc['3.02']

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1476 
   1477             maybe_callable = com._apply_if_callable(key, self.obj)
-> 1478             return self._getitem_axis(maybe_callable, axis=axis)
   1479 
   1480     def _is_scalar_access(self, key):

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1909 
   1910         # fall thru to straight lookup
-> 1911         self._validate_key(key, axis)
   1912         return self._get_label(key, axis=axis)
   1913 

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_key(self, key, axis)
   1796                 raise
   1797             except:
-> 1798                 error()
   1799 
   1800     def _is_scalar_access(self, key):

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in error()
   1783                 raise KeyError(u"the label [{key}] is not in the [{axis}]"
   1784                                .format(key=key,
-> 1785                                        axis=self.obj._get_axis_name(axis)))
   1786 
   1787             try:

KeyError: 'the label [3.02] is not in the [index]'

My mistake is probably very basic, since I'm a beginner, but I've been trying for hours to manipulate this data!

Thank you for your attention!

  • It doesn't happen here. As you can see in my answer, I can't reproduce the error. Could it be something in the csv file you're using? Try the csv file I included in the answer and see if it works.

  • It was indeed a basic problem. I was checking the csv in a Calc spreadsheet, which kept me from noticing that the values in the Conta column were saved with extra spaces (for example, "3.04" padded with whitespace instead of exactly "3.04"). I fixed the file in gedit and the DataFrame now behaves as expected. Thank you very much!
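
A spreadsheet view hides padding like this, but printing the index labels with repr() makes it visible. The snippet below is only a sketch: the CSV text is made up for illustration (the real file is not shown in the question), and the stray leading space on the second line is an assumption about what such a file might look like.

```python
import io
import pandas as pd

# Hypothetical CSV content; the second account code has a stray leading space.
csv_text = "3.01,Receita de Venda,478\n 3.02,Custo dos bens,-123\n"

dre = pd.read_csv(
    io.StringIO(csv_text),
    names=["Conta", "Descrição", "2017"],
    dtype={"Conta": str},
)
dre = dre.set_index("Conta")

# repr() wraps each label in quotes, so surrounding whitespace becomes visible.
labels = [repr(label) for label in dre.index]
print(labels)

# Collect the labels that change when stripped, i.e. the padded ones.
padded = [label for label in dre.index if label != label.strip()]
print(padded)
```

Any label listed in padded will not match a lookup such as dre.loc['3.02'], because label lookups compare the full string, spaces included.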

1 answer

The problem is that .set_index() does not modify the DataFrame in place; instead, it returns a new DataFrame with the changed index.

Change

dre.set_index('Conta')

To

dre = dre.set_index('Conta')
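
A minimal sketch of the difference, using a small made-up DataFrame (the values are only illustrative, not taken from the real report):

```python
import pandas as pd

df = pd.DataFrame({"Conta": ["3.01", "3.02"], "2017": [478, -123]})

# Calling set_index without keeping the result leaves df unchanged.
df.set_index("Conta")
print(df.index.tolist())  # still the default integer index

# Assigning the returned DataFrame is what keeps the new index.
indexed = df.set_index("Conta")
print(indexed.index.tolist())
print(indexed.loc["3.02", "2017"])
```

Only the assigned result, indexed, can be looked up by account code with .loc.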

For example, I created this file teste.csv to test:

3.01,Receita de Venda,478,355,666
3.02,Custo dos bens,-123,34234,773
3.03,Resultado bruto,456,545,234

I ran this code:

import pandas as pd
dre = pd.read_csv('teste.csv', 
    names=['Conta', 'Descrição', '2017', '2016', '2015'],
    dtype={'Conta':str})
dre = dre.set_index('Conta')
print(dre.loc['3.02'])

The result is as expected:

Descrição    Custo dos bens
2017                   -123
2016                  34234
2015                    773
Name: 3.02, dtype: object

EDIT: Maybe the cause is whitespace in the Conta field inside your csv file. Try removing it by placing the line below immediately before the set_index call:

dre['Conta'] = dre['Conta'].str.strip()
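
To illustrate, here is a small self-contained sketch. The CSV text is invented for the example and assumes the account codes carry stray spaces, which may or may not match your file:

```python
import io
import pandas as pd

# Hypothetical CSV content with padded account codes (trailing and leading space).
csv_text = "3.01 ,Receita de Venda,478\n 3.02,Custo dos bens,-123\n"

dre = pd.read_csv(
    io.StringIO(csv_text),
    names=["Conta", "Descrição", "2017"],
    dtype={"Conta": str},
)

# Without stripping, the padded label does not match '3.02'.
try:
    dre.set_index("Conta").loc["3.02"]
    found_before = True
except KeyError:
    found_before = False
print(found_before)

# Strip the whitespace first, then set the index.
dre["Conta"] = dre["Conta"].str.strip()
dre = dre.set_index("Conta")
row = dre.loc["3.02"]
print(row["Descrição"])
```

After the strip, the index holds the clean codes and the lookup succeeds.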

  • I understood your point and made the change. However, the error persists.

  • @Andregraes I can't reproduce the problem; it works here. I edited my answer.

  • @Andregraes if you still get an error after the change, please edit the question and add the new error message with the complete traceback.

  • @Andregraes I edited the answer with one more possible fix; please take a look.
