When to choose a wide string or a normal string?

When should I use a wide string (std::wstring) and when a normal string (std::string)?

1 answer

Hard question to answer.

There is a tendency to use std::string on Linux and other platforms, which favor encodings that either guarantee 1 byte per character or build each character from a sequence of bytes whose length is defined by the encoding itself; the most common example of the latter is UTF-8.
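For instance, a minimal sketch of what that means in practice: in UTF-8 an accented letter occupies more than one byte, and std::string reports its size in bytes, not characters.

    #include <iostream>
    #include <string>

    int main() {
        // "ação" in UTF-8: 'a' and 'o' take 1 byte each,
        // 'ç' (0xC3 0xA7) and 'ã' (0xC3 0xA3) take 2 bytes each.
        std::string s = "a\xC3\xA7\xC3\xA3o";
        std::cout << s.size() << '\n';  // prints 6 (bytes), though there are 4 characters
    }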

On Windows you usually use std::wstring, whose characters are guaranteed to be wider than 1 byte (hence "wide"); UTF-16 and UTF-32 are the encodings most commonly used for it.
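A sketch of the wide side (note that the exact width of wchar_t depends on the platform and compiler, so check your own toolchain):

    #include <iostream>
    #include <string>

    int main() {
        std::wstring w = L"ação";  // each of these characters fits in one wide code unit
        std::cout << w.size() << '\n';         // prints 4: one code unit per character here
        std::cout << sizeof(wchar_t) << '\n';  // typically 2 on Windows (UTF-16), 4 on Linux (UTF-32)
    }

The guarantee is weaker than it looks, though: on Windows a character outside the Basic Multilingual Plane still takes two UTF-16 code units.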

Even on these platforms it is often best to use std::string. If direct interaction with the operating system is limited, the overall gain can outweigh the cost, even if a conversion is eventually needed at that boundary. But it is really hard to get this trade-off right.
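A common pattern for that boundary, sketched below for Windows only (the helper name utf8_to_wide is my own, and it assumes C++17 for the non-const data() and well-formed UTF-8 input): keep UTF-8 in std::string throughout the program and convert to UTF-16 only right before calling the API.

    #include <string>
    #include <windows.h>  // Windows-only: MultiByteToWideChar

    // Convert an internal UTF-8 string to the UTF-16 wstring that Windows
    // APIs expect; the rest of the program keeps using std::string.
    std::wstring utf8_to_wide(const std::string& utf8) {
        if (utf8.empty()) return std::wstring();
        // First call computes the required length, second call converts.
        int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                      static_cast<int>(utf8.size()), nullptr, 0);
        std::wstring wide(static_cast<std::size_t>(len), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                            static_cast<int>(utf8.size()), wide.data(), len);
        return wide;
    }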

People often use a library that abstracts this away, which is not always ideal.

Here is a point of controversy not really specific to C++. Whenever possible I try to use encodings with a guaranteed size, primarily ASCII/Latin-1 or something similar (it almost always suffices); failing that, UCS-2 (which is almost the same as UTF-16); and finally UCS-4/UTF-32 (I have never needed it, but today there are applications where it may be necessary, thanks to yet another mistake committed by the entity that defines these standards). I only use UTF-8 and UTF-16 when I need to "talk" to external applications or resources whose encoding I cannot control.

See What are the main differences between Unicode, UTF, ASCII, ANSI?.

  • It is worth noting that, unfortunately, std::basic_string (C++) strings are not Unicode-aware: they treat the data only as bytes (which can be thought of as code units), not as characters. To work with such encodings properly you therefore need a library that understands them (such as ICU); see the sketch below.
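To illustrate the commenter's point, here is a minimal sketch (assuming well-formed UTF-8 input; for anything serious, reach for a library such as ICU): even counting code points is something you have to do yourself, because std::string only sees the bytes.

    #include <cstddef>
    #include <string>

    // Count Unicode code points in a UTF-8 encoded std::string by skipping
    // continuation bytes (those of the form 10xxxxxx).
    std::size_t count_code_points(const std::string& utf8) {
        std::size_t n = 0;
        for (unsigned char byte : utf8)
            if ((byte & 0xC0) != 0x80)  // not a continuation byte => a new code point starts
                ++n;
        return n;
    }

And even this only counts code points, not user-perceived characters (grapheme clusters); that is exactly the kind of work a library like ICU does correctly.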
