What is considered when ordering a varchar field? Being filled in by numbers

Asked

Viewed 319 times

5

I have a table with the birthday of the month, for a reason of use here we should store this as varchar. When using it it is necessary to sort by date from the most recent to the oldest, the same used this command:

SELECT * FROM m_aniversariantes order by dia;

And we got this result:

inserir a descrição da imagem aqui

To solve you have to make this other select:

SELECT * FROM m_aniversariantes order by CAST(dia AS INT);

Who returned the way we needed.

My question is, what the first SELECT considered to order the list?

  • Alphanumeric ordering.

  • I don’t know for sure, and I can’t verify what I’m going to say, but I think it would be like an array of character positions, and in sorting consider each order, dia[0] then sorts the first all in the array at the first position and compares, and then goes to the next. Now for the order of that first value of the array, I believe it is by some kind of table like ASCII

  • I don’t know if I’m talking nonsense, but it seems he listed the data as if all the fields were letters (mr. obvious). This happens a lot in excel, where you have to list increasingly and the same thing happens. If you set the field to "numbers", it lists the numbers correctly. I believe it is the same thing there, you can notice that the query respects the decimal order (you can see that it jumps from 18 to 20 because 2 is greater than 1, and from 29 to 3, because 3 is greater than 2), and when you gave one (day as INT) it turned that varchar (letter mode) into int (number mode).

2 answers

4


The data were sorted as text (alphanumeric ordering) upwards, as is the type of data in the column. It is the known alphabetical order.

There are algorithms that optimize the sorting process, but for practical purposes it is done by comparing the characters from the leftmost character to the corresponding position character in the other evaluated records. Then the next character is evaluated, and this cycle goes successively by all characters (use as reference the highest content).

Thus, in everyday common lists, the algorithm causes Adriana appear in call lists before Ana, for example. Even if the second has only 3 letters, the 2nd character of Ana (n >> U+006E) is greater* than the second character of Adriana (d >> U+0064). If on this list (and if it were possible to register someone like this) someone had called 1berto, he would appear before Adriana, inclusive.

* Based on the character codes of Unicode Table. There may be variations in each bank, depending on how accented characters are treated, the dice encoding, case and lower case handling and so on.

In this ordering, common numbers and special characters always precede the letters in the matter of precedence, as you can see in the table ASCII or UNICODE.

This causes a counterintuitive effect when we are dealing only with numbers, as is your case, because 489 will always appear before 75 (because 4 (U+0034) is less than 7 (U+0037)), for example, and so on.

I hope I’ve helped.

  • 1

    Thank you so much for sharing and it makes sense even the way it was ordained. Grateful friend for the explanation.

2

He considers as an alphabetical reading, taking the first value, as "1" and "2" comes before "3" he considers them first, since it is a text field and it does not matter the total value but character by character.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.