Error while opening a .csv file with Python/pandas


I am new to the language and I am using Python 3 in a Jupyter Notebook inside Anaconda. I followed the steps below, but the last one raises an error I can't decipher. Please help me.

setting the working directory

import os
print(os.getcwd())

checking the files


importing libraries

import pandas as pd
import numpy as np

loading data frame

socio = pd.read_csv( 'caracterizacao_socioeconomica.csv', sep=',', header=0)
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-77-9b2e3fbc6ca4> in <module>
      1 #carregando primeiro dataframe
----> 2 socio = pd.read_csv( 'caracterizacao_socioeconomica.csv', sep=',', header=0, encoding='UTF-8')

~\anaconda3\lib\site-packages\pandas\io\ in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    686     )
--> 688     return _read(filepath_or_buffer, kwds)

~\anaconda3\lib\site-packages\pandas\io\ in _read(filepath_or_buffer, kwds)
    453     # Create the parser.
--> 454     parser = TextFileReader(fp_or_buf, **kwds)
    456     if chunksize or iterator:

~\anaconda3\lib\site-packages\pandas\io\ in __init__(self, f, engine, **kwds)
    946             self.options["has_index_names"] = kwds["has_index_names"]
--> 948         self._make_engine(self.engine)
    950     def close(self):

~\anaconda3\lib\site-packages\pandas\io\ in _make_engine(self, engine)
   1178     def _make_engine(self, engine="c"):
   1179         if engine == "c":
-> 1180             self._engine = CParserWrapper(self.f, **self.options)
   1181         else:
   1182             if engine == "python":

~\anaconda3\lib\site-packages\pandas\io\ in __init__(self, src, **kwds)
   2008         kwds["usecols"] = self.usecols
-> 2010         self._reader = parsers.TextReader(src, **kwds)
   2011         self.unnamed_cols = self._reader.unnamed_cols

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._get_header()

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

pandas\_libs\parsers.pyx in pandas._libs.parsers.raise_parser_error()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf3 in position 1: invalid continuation byte

It looks like you are trying to read a file as UTF-8 that was not actually encoded in UTF-8.

The file being loaded is probably not encoded in UTF-8, which is the encoding pandas assumes when you do not specify one.
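To see why byte 0xf3 trips the UTF-8 decoder, here is a minimal sketch. The sample bytes are an assumption chosen for illustration (the word "Código" saved as ISO-8859-1), not the actual contents of your file:

```python
# Hypothetical sample: "Código" as it would be stored in an ISO-8859-1 file.
# In that encoding, the single byte 0xf3 is the letter "ó".
raw = b"C\xf3digo"

failed_at = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # In UTF-8, 0xf3 announces a multi-byte sequence, but the next byte
    # ("d") is not a valid continuation byte, so decoding stops here.
    failed_at = exc.start
    print(f"UTF-8 failed at position {failed_at}: {exc.reason}")

# The same bytes decode cleanly once the right encoding is named.
text = raw.decode("iso-8859-1")
print(text)
```

If your file starts with an accented word like this one, that would match the "position 1" in the traceback.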

Try specifying a different charset in your call, something like:

socio = pd.read_csv( 'caracterizacao_socioeconomica.csv', sep=',', header=0, encoding = "ISO-8859-1")


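socio = pd.read_csv( 'caracterizacao_socioeconomica.csv', sep=',', header=0, encoding='utf8')

or another charset, such as encoding='latin1', encoding='iso-8859-1' or encoding='cp1252'. Since the traceback shows that UTF-8 has already failed for this file, one of the single-byte encodings is the more likely fix. In Python, 'latin1' and 'iso-8859-1' are two names for the same codec.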
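If you do not know the encoding in advance, one option is to test a few candidates before calling pandas. The sketch below is only one way to do it: the helper name `first_working_encoding` and the candidate list are my own assumptions, and it reads the whole file into memory, so it suits small to medium files.

```python
from pathlib import Path


def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the file without error.

    Hypothetical helper, not part of pandas. "latin-1" goes last because it
    accepts every possible byte and therefore never raises.
    """
    raw = Path(path).read_bytes()
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes, try the next
        return enc
    raise ValueError(f"none of {candidates} could decode {path}")


# Usage with the file from the question (requires pandas):
# import pandas as pd
# enc = first_working_encoding('caracterizacao_socioeconomica.csv')
# socio = pd.read_csv('caracterizacao_socioeconomica.csv', sep=',', header=0, encoding=enc)
```

Decoding without an error does not prove the encoding is the right one, so it is still worth checking that the accented characters look correct in the resulting DataFrame.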

  • Thanks Alexandre, it worked with encoding iso-8859-1. The strange thing is that all the other files from the same dataset opened without it, and this same file even opened in R with UTF-8. The other 10 files in the same directory came out garbled when opened with this encoding... but the problem is solved.

