What’s the difference between varchar and nvarchar?

Viewed 19,395 times

22

  • What’s the difference between the varchar and nvarchar data types?
  • Does nvarchar exist in every SQL database?
  • Is there any significant difference in performance between the two?
  • Is there a criterion for choosing between them?

I also saw that the same pairing exists for char and nchar. Does the same apply to them?

2 answers

23


It has to do with character encoding. NVARCHAR is a multibyte type for storing Unicode text.

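As far as I know, the NVARCHAR spelling is not part of the SQL standard, which names the equivalent type NATIONAL CHARACTER VARYING, so support varies between databases. The question is tagged MySQL, which does accept NVARCHAR as a synonym for VARCHAR with its predefined national character set.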

There is an intrinsic difference in performance, since the encoding used by NVARCHAR has several disadvantages, starting with the extra space it takes per character.

I use VARCHAR until NVARCHAR is actually needed. For most of my problems VARCHAR is enough. Latin1 serves me very well, and I gain space and performance. Some people adopt NVARCHAR by default, "just in case"; I don’t like that kind of attitude.

The same applies to CHAR and NCHAR. The difference is that they have a fixed length in characters, while the others are variable-length types. Do not confuse this with the size in bytes, which can still vary for NCHAR.

With NVARCHAR the number of bytes can differ from the number of characters. Depending on the encoding and collation used, it can be twice the size or can depend on the content. With VARCHAR the number of bytes is the same as the number of characters, plus the control overhead (currently 24 bytes), of course. So in the same space NVARCHAR can store fewer characters.
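To make the bytes-versus-characters point concrete, here is a minimal Python sketch. It is only an analogy: Latin-1 stands in for a single-byte VARCHAR code page and UTF-16LE for a two-byte NVARCHAR encoding, both of which are assumptions for illustration and not a statement about how any particular database stores data. The helper `byte_length` is hypothetical.

```python
# Compare the character count of a string with its byte count under
# three encodings. The mapping of encodings to column types is an
# assumption made for illustration only.
text = "a\u00e7\u00e3o"  # "ação", 4 characters


def byte_length(value: str, encoding: str) -> int:
    """Return how many bytes `value` occupies in `encoding`."""
    return len(value.encode(encoding))


latin1_bytes = byte_length(text, "latin-1")   # one byte per character
utf16_bytes = byte_length(text, "utf-16-le")  # two bytes per character here
utf8_bytes = byte_length(text, "utf-8")       # depends on the content

print(len(text), latin1_bytes, utf16_bytes, utf8_bytes)  # 4 4 8 6
```

The UTF-8 figure is there to show the "can depend on the content" case: the two accented letters take two bytes each while the plain ones take one.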

  • 2

    I suggest that this answer be updated and/or corrected. The answer in English (https://stackoverflow.com/questions/144283/what-is-the-difference-between-varchar-and-nvarchar) to this same question raised other important points that need to be taken into consideration, regardless of whether it is MySQL or SQL Server: 1) Operating systems today already work with Unicode, so if I use varchar, I pay an overhead to "convert" to Unicode both when reading and when saving; 2) Disk space is less costly than codepage/character problems; 3) etc. (see the answer, it’s quite interesting).

  • 1

    I didn’t see anything wrong or outdated in this answer, so I don’t know what I could change. Some of those propositions are just opinions, so I stick to mine, though I respect the different ones. The answer pointed to on the English Stack Overflow is simplistic and mistaken, despite all the votes, and above all it does not consider my context, which is what matters most to me. For other contexts people should analyze what fits them best; using one standard solution for everyone is a mistake.

11

The varchar data type handles non-Unicode characters, while nvarchar works with Unicode characters.

What you have to take into account is the amount of space each data type uses.

VARCHAR stores the declared length plus 2 bytes. For example, a VARCHAR(10) field stores a maximum of 10 bytes + 2 bytes. The two extra bytes exist precisely because it is a variable-length data type.

NVARCHAR takes up twice the space plus the 2 control bytes. So, in the same example, an NVARCHAR(10) field takes up to 20 bytes + 2 bytes.

This will make a big difference to your storage and should be taken into account.
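The arithmetic above can be sketched in a few lines of Python. The function `max_storage_bytes` is a hypothetical helper, and the constants simply restate the figures given in this answer (1 or 2 bytes per character, plus 2 bytes of overhead); real engines may differ by version, collation, and row format.

```python
# Hypothetical helper that reproduces the storage figures quoted in
# this answer. The constants are taken from the answer's text, not
# measured from a database.
BYTES_PER_CHAR = {"VARCHAR": 1, "NVARCHAR": 2}
LENGTH_OVERHEAD = 2  # bytes used to track the variable length


def max_storage_bytes(type_name: str, declared_length: int) -> int:
    """Maximum bytes for a column declared as type_name(declared_length)."""
    return declared_length * BYTES_PER_CHAR[type_name.upper()] + LENGTH_OVERHEAD


print(max_storage_bytes("VARCHAR", 10))   # 12
print(max_storage_bytes("NVARCHAR", 10))  # 22
```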

Source


Roughly speaking, in the CHAR and VARCHAR world each character occupies 1 byte. A byte is a set of 8 bits, and considering every on/off position of those bits we get 256 combinations (2^8). This means that one byte can represent 256 different values. For the English alphabet this is more than enough; for the Latin alphabet it is also more than enough.

The problem begins when we consider Arabic, Asian, Greek alphabets, and so on. If we take all possible letters and characters into account, we exceed the 256 combinations that 1 byte can represent. NVARCHAR and NCHAR arose for these situations. In these data types each character occupies 2 bytes. If one byte can express 256 combinations (2^8), two bytes can express 65,536 combinations (2^16). With that many combinations it is possible to represent almost any character in common use; only the storage cost gets higher.
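The counting argument can be checked with a short Python sketch. UTF-16LE is used here as an assumed stand-in for a two-byte national character encoding, and `utf16_size` is a hypothetical helper. The last line shows a nuance worth knowing: a code point above 0xFFFF does not fit in a single 16-bit unit and takes 4 bytes in UTF-16.

```python
# Number of distinct values that fit in one and in two bytes.
single_byte_values = 2 ** 8   # 256
double_byte_values = 2 ** 16  # 65536


def utf16_size(char: str) -> int:
    """Bytes needed to encode `char` as UTF-16 (little endian, no BOM)."""
    return len(char.encode("utf-16-le"))


omega = "\u03a9"      # Greek capital omega, outside Latin-1
emoji = "\U0001F600"  # code point above 0xFFFF

print(single_byte_values, double_byte_values)  # 256 65536
print(utf16_size(omega), utf16_size(emoji))    # 2 4
```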

If you use the CHAR and VARCHAR types and try to store certain characters, the set of available characters is restricted to the collation you have chosen. If you try to store a character that is not covered by that collation, it will be converted to some approximate one. If you choose NCHAR and NVARCHAR, this limitation does not occur.
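A rough Python analogy of that conversion, assuming Latin-1 as the single-byte code page: the hypothetical helper `store_in_code_page` round-trips a string through the code page, and anything the code page cannot represent comes back as "?". Python always substitutes "?", whereas a database may pick a closer-looking character, so treat this as a sketch of the data loss and not of any engine's exact behavior.

```python
def store_in_code_page(value: str, code_page: str = "latin-1") -> str:
    """Round-trip `value` through a single-byte code page.

    Characters the code page cannot represent are replaced with '?',
    which mimics the lossy conversion described above.
    """
    return value.encode(code_page, errors="replace").decode(code_page)


print(store_in_code_page("caf\u00e9"))   # café   (é exists in Latin-1)
print(store_in_code_page("\u03a9mega"))  # ?mega  (Ω does not)
```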

Source

  • Now it’s better. I’m not a fan of copying content, except in part and from places that explicitly say they are publicly licensed. In general, where nothing says you may copy, the ideal is not to copy; in this case nothing clearly says you cannot, either. Anyway, writing your own text is always better than copying from other people.

  • 2

    That is true and I agree with you, but the intention is, within what is allowed, to enrich the community’s knowledge.
