Problems with accentuation - Python

Question

Asked 8 years, 8 months ago

Viewed 3,202 times

Hi, I'm having problems with accented characters in Python.

I put this in the code: # -*- coding: UTF-8 -*- but the accents are still not displayed correctly in cmd.

A screenshot follows for better understanding.

Thanks in advance!

Try @Miguel's solution. If that doesn't work, an alternative is to save the file from your code editor or IDE with the encoding set to ANSI, ISO-8859-1, or WIN1252 (instead of the UTF-8 shown in the status bar of your screenshot). This is usually an option under "Save as...", or it can be reached by double-clicking or right-clicking the appropriate spot on the status bar, depending on the editor.

– Bacco

2016/11/20 at 00:48

1 answer

by jsbueno • **30,668** points · Answer 1 · 2016-11-23T11:44:46+00:00

When you have a little time, read this here. The title might scare you a bit, but it's the best introduction to accents and special characters I've ever seen.

That said, what happens is that until about 30 years ago, computers were limited to displaying a maximum of 256 characters at a time, because each character was stored in a single byte. It's easy to see that, with so many languages and writing systems in the world, this doesn't even begin to meet our communication needs.

As a stopgap, each country adopted a different table of 256 characters, preserving a common core of codes between 0 and 127 (this is called "ASCII") and creating new mappings for the codes from 128 to 255.
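To make the shared core and the diverging upper half concrete, here is a minimal sketch. The helper name `decode_with` and the choice of `cp850` and `cp1251` as extra tables are assumptions made for illustration; the answer itself does not name them here.

```python
def decode_with(raw, codecs):
    """Decode the same bytes with each codec and return {codec: text}."""
    return {codec: raw.decode(codec) for codec in codecs}

# 0x41 lies in the shared ASCII range; 0xE7 lies in the 128-255 range
raw = bytes([0x41, 0xE7])
views = decode_with(raw, ["latin-1", "cp850", "cp1251"])

for codec, text in views.items():
    # ascii() keeps the output printable on any terminal
    print(codec, ascii(text))
```

The first byte comes out as "A" under every table, while the second byte becomes a different character in each one.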

In fact, the different tables were not simply one per country: several tables appeared at different moments in history in several countries. Eventually the Unicode Consortium was established. It standardizes all these different tables, giving each one a name, and also defines encoding standards that support more than 256 simultaneous characters, for example "utf-8".
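As a small illustration of an encoding that goes beyond 256 characters, the sketch below mixes Latin, Greek and Cyrillic letters in one string. The sample text is an assumption chosen for the example; it is written with escapes so the snippet does not depend on how the file is saved.

```python
# "maçã Ω я": Latin, Greek and Cyrillic letters in the same string
text = "ma\u00e7\u00e3 \u03a9 \u044f"

# UTF-8 can represent all of them, using more than one byte where needed
utf8_bytes = text.encode("utf-8")
print(len(text), "characters ->", len(utf8_bytes), "bytes in UTF-8")

# A single 256-entry table such as latin-1 cannot hold all of them at once
try:
    text.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False
print("fits in latin-1:", fits_latin1)
```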

In the case of Windows you have an even bigger problem, because programs in the normal Windows environment use one encoding (latin1 for Windows in Portuguese), while programs running in CMD use a different one (cp852). As a result, a character that appears as 'È' in a programming editor may appear as a completely different character when printed in CMD.
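The mismatch can be reproduced on any system by encoding with one table and decoding with the other. This is only a sketch: the function name `as_seen_in_cmd` is hypothetical, and the two default encodings are the ones named in the paragraph above, which may differ from what a given Windows installation actually uses.

```python
def as_seen_in_cmd(text, editor_encoding="latin-1", console_encoding="cp852"):
    """Encode text the way the editor saved it, then decode it the way the console reads it."""
    raw = text.encode(editor_encoding)
    return raw.decode(console_encoding, errors="replace")

original = "ma\u00e7\u00e3"  # "maçã"
garbled = as_seen_in_cmd(original)

# ascii() keeps the output printable on any terminal
print(ascii(original), "->", ascii(garbled))
```

The unaccented letters survive because they sit in the shared ASCII range; only the accented ones change.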

The Python language, as of version 3, greatly improves the approach and makes correct programming simpler. In particular, it automatically treats all text in the code as "Unicode text", which is independent of encoding (you still have to keep your editor's encoding the same as the one declared in the first line of the Python code), and it automatically checks the encoding of the terminal when it reaches a print or any other form of output. So your È character will show up correctly in CMD. I strongly recommend using Python 3 if you are learning or starting a new project: text encoding is the most important change between the versions. (From your screenshot, I assume you are using Python 2, precisely because of the characters that appear.)
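A short Python 3 sketch of what "independent of encoding" means in practice: the same `str` turns into different bytes depending on the codec, and `sys.stdout.encoding` reports the encoding that print will target. The variable names are illustrative only.

```python
import sys

# "maçã" as a str: Unicode code points, with no encoding attached
text = "ma\u00e7\u00e3"

# The same str becomes different bytes depending on the codec chosen
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")
print(as_utf8, as_latin1)

# Decoding each byte sequence with its own codec gives back the same str
assert as_utf8.decode("utf-8") == as_latin1.decode("latin-1") == text

# The encoding that print() targets when writing to the terminal
print(sys.stdout.encoding)
```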

For Python 2, do the following:

Configure your editor, through its menus, to actually save the file as UTF-8, in addition to having the encoding declaration on the first line of your program.
Prefix all your strings with the letter u, as in: a = u"maçã". This makes them objects of type "unicode" rather than sequences of bytes. (This is the standard behavior in Python 3.)
In each print, encode your text to the terminal's encoding by calling the .encode(sys.stdout.encoding) method on the text. (Import the sys module into your program.) This behavior is also standard in Python 3.

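Example in Python 2:

# coding: utf-8
import sys

# Encoding of the terminal; it is None when output is piped, so fall back to UTF-8
coding = sys.stdout.encoding or "utf-8"
# The u prefix makes this a unicode object rather than a byte string
a = u"eu tenho uma maçã"
print a.encode(coding)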

Example in Python 3 (provided your editor is set to UTF-8):

print("Eu tenho uma maçã")

(The encoding declaration is not even needed in the .py file itself when the file is saved as UTF-8.)
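For cases where the text has to be encoded by hand, as in the third step of the Python 2 recipe, the sketch below shows one defensive way to do it, written for Python 3. The function name `encode_for_terminal` and the UTF-8 fallback are assumptions for illustration; they are not part of the original answer.

```python
import sys

def encode_for_terminal(text, stream=None, fallback="utf-8"):
    """Encode text for the given stream, replacing characters it cannot show."""
    stream = sys.stdout if stream is None else stream
    # The stream may have no encoding attribute, or it may be None
    encoding = getattr(stream, "encoding", None) or fallback
    # errors="replace" substitutes "?" instead of raising UnicodeEncodeError
    return text.encode(encoding, errors="replace")

print(encode_for_terminal("eu tenho uma ma\u00e7\u00e3"))
```

Replacing unencodable characters loses information, but it keeps the program from crashing when the terminal's table has no slot for a given character.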