Is there a way to use accents in strings in Lua?

Asked 9 years, 1 month ago

Viewed 963 times

2

I’m looking for a way to use accented characters in strings in Lua. I’ve tried this:

texto='Pão,está,cabeçalho,'

I ask because in the engine I am using (ROBLOX Studio), the game only writes the text up to the first accented or other special character. When I check the value of Text (the property whose value is written on the screen), the accented letters have been replaced by signs similar to this: <?>

Does anyone know how to do this?

The biggest problem is usually that the code editor does not use the same code page as the output. For example, if the file is saved as UTF-8, it will only display correctly in the Windows console if you run chcp 65001 first. Try saving as Windows-1252 to see whether that is the issue on Windows; it should then work normally. Check the encoding/code page option in your editor's "Save as" dialog.

– Bacco

2017/01/26 at 10:53
@Bacco Hmm... the question hasn't mentioned using files yet, but if that is the case, the script may be saved in a different encoding from the one the Roblox text renderer expects.

– Klaider

2017/01/26 at 11:06
If he is testing the script with output to a console, it depends on the console. If he is viewing it in Roblox itself, in theory it should already be configured correctly. The problem in the case I mentioned is when the script was written in Roblox but is being tested on an output that is not Roblox itself. We can only tell with more details (a screenshot or a better description of the problem).

– Bacco

2017/01/26 at 11:08
@Theprohands Or it may be the reverse: editing outside Roblox and then trying to run the script in Roblox, who knows. I only commented based on what I can assume; it could be a lot of different things (and of course it might not be that at all, haha). The fact is that the input encoding does not match the output encoding; what remains is to find out why.

– Bacco

2017/01/26 at 11:09
@Bacco Or the font simply has no glyph for the character.

– Klaider

2017/01/26 at 13:08
@Theprohands It may indeed be missing from the font used. Usually, though, when what appears is <?>, it is a Unicode decoding problem (ANSI text being treated as Unicode and producing an error), but a font with missing characters could also leave holes in the text.

– Bacco

2017/01/26 at 13:09
@Bacco Correct. I don’t really know much about these encodings (ANSI, ASCII, ...). I don’t understand why they were created :v.

– Klaider

2017/01/26 at 13:13
They must be the standard encodings in some operating systems?

– Klaider

2017/01/26 at 13:15
@Theprohands Initially, encodings were designed to use only 1 byte per character. But that did not cover several languages, so 2-byte encodings were created. Then people realized they needed more and more. That led to Unicode, which may take several bytes per character, but that used a lot of space. Then came the UTF encodings, which try to spend extra bytes only on less common characters. There is too much detail to cover here; I commented just to give a starting point.

– Bacco

2017/01/26 at 13:19
@Bacco Understood. UTF-16 uses at least 2 bytes to represent a character code. There is also UTF-32, and there must be a UTF-24.

– Klaider

2017/01/26 at 14:17
@Bacco must have up to UTF-160.

– Klaider

2017/01/26 at 14:20

2 answers

by Klaider • **2,509** points · Answer 1 · 2017-01-26T10:45:32+00:00

This depends on how the renderer interprets the string. For example, the renderer may treat each byte of the string as one character code, or it may decode the bytes into character codes according to some encoding.

Basically, a value of type string in Lua is just a sequence of bytes. Those bytes may represent character codes, but the language itself does not interpret them or attach any encoding to them.
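As a small illustration of the point above, here is a sketch that inspects the bytes of a string. It assumes the text is UTF-8, and the bytes are written with decimal escapes so the result does not depend on how the file was saved. The variable names are only for illustration.

```lua
-- "Pão" written with decimal byte escapes; 195 163 is the UTF-8
-- encoding of "ã" (an assumption about the encoding in use).
local texto = "P\195\163o"

-- The # operator counts bytes, not visible characters:
-- 3 letters, but 4 bytes.
local length = #texto

-- string.byte returns the raw byte values in the given range.
local bytes = { string.byte(texto, 1, -1) }

print(length)                    -- 4
print(table.concat(bytes, " "))  -- 80 195 163 111
```

If the renderer expected one byte per character instead, it would see four "characters" here, two of which it may not be able to display.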

Depending on where your code is pasted or saved before it runs, some characters may be encoded automatically. For example, accented characters have a code greater than 127, so UTF-8, a widely used character encoding, represents each of them with 2 or more bytes.
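To make the byte layout concrete, here is a minimal sketch of how a character code can be turned into UTF-8 bytes. The helper name utf8_encode is hypothetical (it is not part of Lua or Roblox), and it uses only arithmetic so it should run on Lua 5.1 and later.

```lua
-- Hypothetical helper: encode one Unicode code point as a UTF-8 string.
local function utf8_encode(cp)
  local floor = math.floor
  if cp < 0x80 then
    -- Codes up to 127 fit in a single byte.
    return string.char(cp)
  elseif cp < 0x800 then
    -- Two bytes: 110xxxxx 10xxxxxx
    return string.char(0xC0 + floor(cp / 0x40), 0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    -- Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return string.char(0xE0 + floor(cp / 0x1000),
                       0x80 + floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  else
    -- Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return string.char(0xF0 + floor(cp / 0x40000),
                       0x80 + floor(cp / 0x1000) % 0x40,
                       0x80 + floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  end
end

-- "ã" has code 227 (0xE3) and becomes the two bytes 195 163.
local a_tilde = utf8_encode(0xE3)
print(string.byte(a_tilde, 1, -1))  -- 195 163
```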

If you want to control the exact bytes of a character in a string without changing the encoding of your source file, you can generate the character in Lua from its code.

If the Lua version is 5.2 or later (the \xXX escape was introduced in 5.2):

local texto = '\xXX';

Or, in earlier versions (this also works in later ones):

local texto = string.char(byte);

where the XX/byte component is the character code, between 0 and 255. In the first example the code is written as two hexadecimal digits.
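For instance, here is a sketch of the word "Pão" built with string.char under two different assumptions about the renderer. Which one displays correctly depends on the encoding the renderer actually uses, which the question does not say.

```lua
-- Assumption: the renderer decodes text as UTF-8, where "ã" is the
-- two bytes 195 163 (0xC3 0xA3).
local utf8Texto = "P" .. string.char(195, 163) .. "o"

-- Assumption: the renderer uses a single-byte code page such as
-- Windows-1252, where "ã" is the single byte 227 (0xE3).
local cp1252Texto = "P" .. string.char(227) .. "o"

print(#utf8Texto)    -- 4
print(#cp1252Texto)  -- 3
```

Assigning each variant to the text property in turn is one way to find out which encoding the renderer expects.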

To get the bytes for a specific character, you need to know whether an encoding is in use, and which one. In a single-byte encoding the character code is simply that one byte, but other encodings may represent the same character differently, using a varying number of bytes. Be aware that some encodings allow codes larger than 255.

As explained above, a character typed directly into the Lua source is encoded together with the rest of the file, in whatever encoding the file uses. If that encoding is UTF-8, it’s best to use a library to manipulate it. Unfortunately it is difficult to link things on mobile, so I will give an example using the utf8 library from Lua 5.3:

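-- Requires Lua 5.3 or later (utf8 library and \u{...} escape).
local caractere = '\u{E3}'; -- "ã", stored as its UTF-8 bytes

-- Byte position where the first character starts.
local offsetI = utf8.offset(caractere, 1);

-- Code point of the character at that position (227 here).
local code = utf8.codepoint(caractere, offsetI);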

This is only an approximation. You can now test the text renderer by specifying each character byte by byte. If that doesn’t work either, the renderer may use a different encoding, or it may not be able to render these characters yet.
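If the test suggests that the text is in a single-byte code page while the renderer expects UTF-8, a conversion along these lines might help. This is only a sketch: latin1_to_utf8 is a hypothetical helper, and it assumes ISO-8859-1 input (Windows-1252 differs for bytes 128 to 159, which are not handled here).

```lua
-- Hypothetical helper: convert an ISO-8859-1 (Latin-1) byte string
-- to UTF-8. Bytes below 128 are left unchanged.
local function latin1_to_utf8(s)
  -- The extra parentheses discard gsub's second result (the match count).
  return (s:gsub("[\128-\255]", function(c)
    local b = string.byte(c)
    -- Each byte from 128 to 255 becomes a two-byte UTF-8 sequence.
    return string.char(0xC0 + math.floor(b / 0x40), 0x80 + b % 0x40)
  end))
end

-- "Pão" in Latin-1 is the 3 bytes 80 227 111.
local convertido = latin1_to_utf8("P\227o")
print(string.byte(convertido, 1, -1))  -- 80 195 163 111
```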

by user73299 • 1 point · Answer 2 · 2017-04-15T04:24:18+00:00

-- https://gyazo.com/1c4ee2fc734a0d4aad22c7a2392da63a Kind of?

Texto = "Pão,está,cabeçalho,"
script.Parent.Text = Texto

If that’s what you wanted...