What is the most appropriate character encoding (collation) for a MySQL database that will store Portuguese-language data?
Both work: latin1_swedish_ci or utf8_general_ci.
To change the CHARSET and COLLATION of an existing database:
ALTER DATABASE `sua_base` CHARSET = Latin1 COLLATE = latin1_swedish_ci;
or
ALTER DATABASE `sua_base` CHARSET = UTF8 COLLATE = utf8_general_ci;
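Keep in mind that ALTER DATABASE only changes the defaults applied to tables created afterwards; existing tables keep the character set and collation they already have. A minimal sketch for checking the current defaults and converting one existing table, assuming a hypothetical table named `clientes` inside `sua_base`:

```sql
-- Show the database's current default character set and collation.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'sua_base';

-- Convert an existing table and its text columns.
-- `clientes` is a hypothetical table name used only for illustration.
ALTER TABLE `sua_base`.`clientes`
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```

Back up the data before converting, since CONVERT TO rewrites the stored text columns.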
Explanation
CHARSET and COLLATE are different things. In MySQL, each CHARSET has several collations, each with its own characteristics.
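To see which collations a given character set offers, the SHOW COLLATION statement can be filtered by character set. A short sketch, using latin1 as the example:

```sql
-- List every collation that belongs to the latin1 character set.
-- The row marked "Yes" in the Default column is the charset's default collation.
SHOW COLLATION WHERE Charset = 'latin1';

-- The same listing, filtered by collation name prefix instead.
SHOW COLLATION LIKE 'latin1%';
```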
latin1_general_ci: Does not distinguish between upper- and lowercase letters. A search for "test" also returns records such as "Test" or "TEST".
latin1_general_cs: Distinguishes upper- and lowercase letters. A search for "test" returns only "test"; "Test" and "TEST" are not returned.
latin1_swedish_ci: Does not distinguish upper- from lowercase letters, nor accented characters or the cedilla from their plain forms. For example, a record containing the word "Intuição" is returned by a search for "intúicao".
(edited in 2019)
UTF-8 is the universal standard, all the more so in Brazil, where it is the standard "de facto and de jure".
Thus the first option (with distinction) is utf8_swedish_ci,
ALTER DATABASE `sua_base` CHARSET = UTF8 COLLATE = utf8_swedish_ci;
and the second (without distinction) is utf8_general_ci.
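The collation behaviours described in this answer can be checked directly by comparing string literals under an explicit collation. The sketch below assumes the client connection sends UTF-8 text, which is why each literal is first converted to the target character set; each query returns 1 when the two strings compare as equal and 0 otherwise.

```sql
-- Case handling in latin1: case-insensitive vs. case-sensitive collation.
SELECT CONVERT('Test' USING latin1) COLLATE latin1_general_ci
     = CONVERT('test' USING latin1) AS general_ci_match;
SELECT CONVERT('Test' USING latin1) COLLATE latin1_general_cs
     = CONVERT('test' USING latin1) AS general_cs_match;

-- Accent and cedilla handling under latin1_swedish_ci.
SELECT CONVERT('Intuição' USING latin1) COLLATE latin1_swedish_ci
     = CONVERT('intúicao' USING latin1) AS swedish_ci_match;

-- The same comparison under the two utf8 collations mentioned above.
SELECT CONVERT('Intuição' USING utf8) COLLATE utf8_general_ci
     = CONVERT('intúicao' USING utf8) AS utf8_general_ci_match;
SELECT CONVERT('Intuição' USING utf8) COLLATE utf8_swedish_ci
     = CONVERT('intúicao' USING utf8) AS utf8_swedish_ci_match;
```

Running these against your own server is a quick way to confirm which collation matches the search behaviour you want before altering the database.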
Is there any reason to prefer Latin1 over UTF-8? I would strongly recommend using the Unicode standard, even if the OP only anticipates Portuguese-language characters (it makes interoperability easier later), provided of course that it meets what was asked (by the way: there is a utf8_swedish_ci). – mgibsonbr
Indeed, both work; I also prefer UTF-8. – Felipe Douradinho