Problem reading a Unicode file in Python

Guys, I have this script:

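import sys

search = sys.argv[1]  # text to look for, passed on the command line
# No encoding is given here, so Python falls back to the platform default
ref_arquivo = open('C:/Zabbix/RelatorioErros.txt', 'r').readlines()[11:]  # skip the first 11 lines
for line in ref_arquivo:
    if search in line:
        # Print fixed-width columns from each matching line
        print(line[30:66], line[66:77], line[92:99], line[100:110])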

It works only on UTF-8 files. When I run it on a Windows machine, reading RelatorioErros.txt does not work because the txt file is saved as Unicode. What should I do?

  • Are you sure this is the error? Could you post the error message here?

  • The open function has an encoding parameter that lets you define which encoding is used when reading the file; by default it is UTF-8.

  • @Woss The default is not always UTF-8. From the docs: "In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding." The assumption that it's UTF-8 by default has bitten me too!

  • @Pedrovonhertwigbatista Good point.
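
To see which encoding a given machine would pick when encoding is omitted, the call quoted from the docs in the comment above can be run directly. This is only a quick sketch; the helper name default_text_encoding is made up for illustration.

```python
import locale


def default_text_encoding():
    """Return the locale encoding that text-mode open() falls back to,
    per the documentation quoted in the comment above."""
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    # On many Windows setups this prints a legacy code page rather than UTF-8.
    print(default_text_encoding())
```

If the printed value is not the encoding the file was saved in, passing encoding explicitly to open is the safer route.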

1 answer

If the file really is in UTF-8, just state this explicitly when opening the file on Windows. Otherwise Python will use the system default encoding, which in this case is latin-1, and the contents of the file will get corrupted in memory (each character outside the ASCII range, including all accented ones, will turn into 2 or more unrelated characters).

In that case, just do:

ref_arquivo = open('C:/Zabbix/RelatorioErros.txt', 'r', encoding="utf-8").readlines()[11:]

(The other part involving accented characters, reading sys.argv[1], should be handled automatically in Python 3: it converts from the encoding used in the terminal to text.)
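
If the file was saved as "Unicode" from a Windows editor, it may actually be UTF-16 rather than UTF-8; that is an assumption, since the question does not say. A rough sketch of one way to handle both cases is to look at the byte order mark (BOM) at the start of the file before choosing the encoding. The helper names guess_encoding and read_lines are hypothetical, and a file without a BOM simply falls back to UTF-8 here.

```python
import codecs


def guess_encoding(path, fallback="utf-8"):
    """Guess a text file's encoding from its BOM; use `fallback` if there is none."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Check UTF-32 first: its little-endian BOM begins with the UTF-16 LE BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decodes UTF-8 and drops the BOM
    return fallback


def read_lines(path, skip=11):
    """Read the file with the guessed encoding and drop the first `skip` lines."""
    with open(path, "r", encoding=guess_encoding(path)) as f:
        return f.readlines()[skip:]
```

With this, ref_arquivo = read_lines('C:/Zabbix/RelatorioErros.txt') would stand in for the open(...).readlines()[11:] line in the script.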
