26
Every time I create a database, I always come across these definitions and I don’t really know the difference.
I understand what it is charset, but not collation.
26
Every time I create a database, I always come across these definitions and I don’t really know the difference.
I understand what it is charset, but not collation.
22
The Charset defines the character set that you will use to store data in text form, where each character receives a number. A very common example is the 256-character ASCII table. If you indicate that your system’s charset is ASCII, this means that your system will only recognize the 256 characters contained in it. Therefore sometimes we see texts without accentuation on the internet, with a question mark instead of the character, because the page’s charset does not contain the character that should be displayed. Currently, the standard charset is UTF-8, which allows up to 1,114,112 characters, which allows you to display characters in any language, including Chinese and Japanese.
Collation is the term used to define the set of rules that the server will use for sorting and comparing texts, that is, how the operators =, >, <, order by, etc. For example, depending on the configured Collation, the system will sort the character 'ö' between the characters ːo' and ːp', using another collation, this character can be ordered at another position. So it can give conflict when making queries that relate tables with different collations. In addition, the collation also defines whether the system will differentiate accented characters or whether it will be case sensitive, for example the collation Latin1_general_ci_as defines that the system should treat characters as case insensitive (IC) and acentue sensitive (AS). Examples:
Even if you change the collation of the database the previously created objects will not have the collation changed, for that you will have to recreate the object.
This subject is very complex, I tried to summarize to the maximum, I hope it helps.
18
Charset is the set of symbols and encodes
Let’s assume you have the letters "A", "B", "a" and "b". For each letter shall be associated with a number: "A" = 0, "B" = 1, "a" = 2 and "b" = 3. A letter "A" is a symbol and 0 is the encounter to "A". The combination of symbol + encounter is a charset.
Already the collation is a set of character comparison rules.
I have withdrawn this reply from this excellent reference.
It’s much clearer now, thank you!
I even gave an improved answer, just to be clear.
Related: http://answall.com/a/21358/4751 (What are collations for)
Excellent response
Browser other questions tagged database character-encoding collation
You are not signed in. Login or sign up in order to post.
It was a good summary, has more information than the current answer accepted. Won my +1.
– Wallace Maxters
Your reply was excellent
– Fabio William Conceição