It depends on how the renderer interprets the string. It may treat each byte of the string as one character code, or it may rely on an encoding to decide how many bytes make up each character code.
Basically, a value of type string in Lua holds a sequence of bytes. Those bytes can be read as a sequence of character codes, but the language itself attaches no meaning to them.
Depending on where you paste your code to run it, some characters may be encoded automatically. For example, accented characters have codes greater than 127, so UTF-8, a widely used character encoding, uses two or more bytes to represent each of them.
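To make the byte view concrete, here is a small sketch. It writes the letter "é" with decimal escapes, so the result does not depend on how the source file is saved. The variable names are only illustrative.

```lua
-- "é" as its two UTF-8 bytes (0xC3 0xA9).
local e_utf8 = "\195\169"
-- The same letter in a single-byte encoding such as Latin-1 / Windows-1252.
local e_latin1 = "\233"

print(#e_utf8)                     --> 2
print(string.byte(e_utf8, 1, -1))  --> 195 169
print(#e_latin1)                   --> 1
-- Lua compares raw bytes, so the two spellings of "é" are different strings.
print(e_utf8 == e_latin1)          --> false
```

Lua's `#` and `string.byte` always count and read bytes, never characters.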
If you want to control this for a given character without changing the encoding of your source file, you can have Lua generate the byte for you.
In Lua 5.3 (the hex escape below is accepted from 5.2 onward):
local texto = '\xXX';
Or, in earlier versions:
local texto = string.char(byte);
where XX/byte is the character code, between 0 and 255. In the first example the code is written as two hexadecimal digits.
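As a filled-in version of the two forms above, here is a minimal sketch using code 233 (0xE9), which is "é" in Latin-1 / Windows-1252. Whether the renderer actually shows it as "é" depends on the encoding it expects; the variable names are illustrative.

```lua
local from_hex  = "\xE9"            -- hex escape (Lua 5.2 or later)
local from_char = string.char(233)  -- works in every Lua version
local from_dec  = "\233"            -- decimal escape, also works everywhere

print(from_hex == from_char, from_char == from_dec)  --> true true
print(#from_char)                                     --> 1

-- string.char accepts several codes at once: "Olá" as Latin-1 bytes.
local word = string.char(79, 108, 225)
print(#word)                                          --> 3
```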
To get the code of a specific character from its bytes, you need to know whether the text is encoded and, if so, with which encoding. In a single-byte encoding the code is simply the byte itself, but in a multi-byte encoding one character can span several bytes. Be aware that certain encodings allow codes larger than 255.
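To illustrate why the code is not always the first byte, here is a rough sketch of how the first character of a UTF-8 string could be decoded by hand. `first_codepoint` is a hypothetical helper, not part of Lua, and it assumes well-formed input: it does not validate continuation bytes or reject overlong forms.

```lua
-- Hypothetical helper: decode the first UTF-8 character of s.
-- Returns the code point and the number of bytes it occupies.
local function first_codepoint(s)
  local b1 = string.byte(s, 1)
  if b1 < 0x80 then
    return b1, 1                  -- plain ASCII: the byte is the code
  end
  local len, cp
  if b1 >= 0xF0 then
    len, cp = 4, b1 - 0xF0        -- lead byte 11110xxx
  elseif b1 >= 0xE0 then
    len, cp = 3, b1 - 0xE0        -- lead byte 1110xxxx
  elseif b1 >= 0xC0 then
    len, cp = 2, b1 - 0xC0        -- lead byte 110xxxxx
  else
    error("string starts with a continuation byte")
  end
  for i = 2, len do
    -- each continuation byte (10xxxxxx) contributes 6 more bits
    cp = cp * 64 + (string.byte(s, i) - 0x80)
  end
  return cp, len
end

print(first_codepoint("A"))             --> 65 1
print(first_codepoint("\195\169"))      --> 233 2
print(first_codepoint("\226\130\172"))  --> 8364 3
```

The last line is the euro sign (U+20AC), an example of a code larger than 255.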
As explained above, the Lua source you run has an encoding of its own, so a character typed directly in the code is encoded along with the rest of that source. If that encoding is UTF-8, it is best to use a library to manipulate it. Unfortunately it is hard to add links on mobile, so here is an example using the utf8 library from Lua 5.3:
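-- byte position where the first character of caractere starts
local offsetI = utf8.offset(caractere, 1);
-- code point of the character that starts at that position
local code = utf8.codepoint(caractere, offsetI);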
This is only an approximation. You can now test the text renderer by specifying each character as bytes. If that does not work either, the renderer may use a different encoding, or it may not be able to render these characters at all yet.
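One way to run that test is to list every character with its code point and the byte escapes that spell it, then paste the escapes into a string literal. This is a sketch that assumes the utf8 library (Lua 5.3 or later); `to_escapes` is a hypothetical helper, not a library function.

```lua
-- Hypothetical helper: turn every byte of s into a decimal escape such as \195.
local function to_escapes(s)
  -- the outer parentheses drop gsub's second result (the match count)
  return (s:gsub(".", function(c)
    return string.format("\\%03d", string.byte(c))
  end))
end

-- "a" followed by "é", built as UTF-8 so the file's own encoding does not matter.
local texto = utf8.char(0x61, 0xE9)

for pos, code in utf8.codes(texto) do
  -- byte position, code point, and the escapes for that one character
  print(pos, code, to_escapes(utf8.char(code)))
end
--> 1   97    \097
--> 2   233   \195\169
```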
The biggest problem is making sure the code editor uses the same code page as the output. For example, if the file is saved as UTF-8, it will only display correctly in the Windows console if you run
chcp 65001
first. If you are on Windows, try saving as Windows-1252; that should work normally. Check the encoding/code page option in your editor's "Save as" dialog.
– Bacco
@Bacco Hmm... the question has not mentioned any file so far, but if that is the case, the script may be encoded differently from what the Roblox text renderer expects.
– Klaider
If he is testing the script with console output, it depends on the console. If he is viewing it in Roblox itself, in theory it should already be configured correctly. The problem in the case I mentioned is if he wrote the script in Roblox and is testing it on an output other than Roblox itself. We can only tell with more details about the problem (a screenshot or a better description).
– Bacco
@Theprohands Or it may be the reverse: he is editing outside and trying to run the script on Roblox, who knows. I just wanted to comment based on what I can assume; it could be a lot of different things (and of course, it might be none of this, haha). The fact is that the encoding of the input does not match the output; it remains to find out why.
– Bacco
@Bacco Or the character simply has no glyph.
– Klaider
@Theprohands It may be missing from the font used, actually. Usually, when what shows up is <?>, it is a Unicode decoding problem (ANSI being treated as Unicode and producing an error), but a font with missing characters could indeed leave holes in the text.
– Bacco
@Bacco Correct. I don't really know much about these encodings, ANSI, ASCII, ... I don't understand why they were created :v.
– Klaider
They must be the standard encodings on some OSes?
– Klaider
@Theprohands Initially, encodings were designed to take up only 1 byte. But that did not cover many languages. Then 2-byte encodings were made. Then people realized they needed more and more. That is how Unicode came about, which may take several bytes per character but used a lot of space. Then the UTF encodings came along to spend extra bytes only on less common characters. It is too much detail to cover here; I just wanted to give a starting point.
– Bacco
@Bacco Understood. UTF-16 uses at least 2 bytes to represent a character code. There is also UTF-32, and there must be a UTF-24 too.
– Klaider
@Bacco must have up to UTF-160.
– Klaider