Should translation dictionaries use the original strings as keys?

Asked 10 years, 2 months ago

Viewed 161 times

13

I have built several multilingual systems and websites, and the tools I know tend to use the string in the application’s original language (usually English) as the key for its translations.

For example, in gettext:

msgid "My name is %s.\n"
msgstr "Meu nome é %s.\n"

So, with gettext, a hypothetical piece of code would be changed from

printf("My name is %s.\n", nome);

to

printf(_("My name is %s.\n"), nome);

gettext even has a tool that scans the code for calls to the _ function and generates a POT file in the format above, leaving only the translated text to be filled in.
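As a rough picture of what looking up a translation by its original string means, here is a minimal sketch in C. It is not gettext's implementation: the struct, the in-memory catalog, and the tr function are hypothetical stand-ins for the compiled catalog and for _(). The one behaviour borrowed from real gettext is that a string with no translation is returned unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory catalog; real gettext loads compiled catalog files. */
struct catalog_entry {
    const char *msgid;   /* original (English) string, used as the key */
    const char *msgstr;  /* translation */
};

static const struct catalog_entry catalog_pt[] = {
    { "My name is %s.\n", "Meu nome é %s.\n" },
};

/* Returns the translation stored under msgid, or msgid itself if none exists. */
const char *tr(const char *msgid)
{
    assert(msgid != NULL);
    size_t n = sizeof catalog_pt / sizeof catalog_pt[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(catalog_pt[i].msgid, msgid) == 0)
            return catalog_pt[i].msgstr;
    }
    return msgid;  /* no entry under this exact text: fall back to the original */
}
```

A call such as printf(tr("My name is %s.\n"), nome) would print the Portuguese text. Because the English text is the key, rewording it even slightly (say, ending it with "!") no longer matches the stored entry, and the untranslated original is shown until someone translates the new string.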

Tutorials for other tools usually encourage similar practices. However, this kind of solution has caused me many problems, here on Stack Overflow itself, for example. Every time a text is changed in the English original, a new pending translation appears in Transifex, with no reference to the previous version of the original or to its existing translation. This causes a lot of duplicated work, a lot of inconsistency, and potentially a lot of translated but unused text (because it has become outdated).

It seems to me that it makes more sense to use more meaningful names as keys to the dictionary, for example "CloseLinkText" or something like that. But it’s very rare to see someone recommend this, or even see software that uses this type of key.
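To make the alternative concrete, here is a minimal sketch in C of a dictionary keyed by symbolic names. Everything in it is hypothetical: the struct layout, the message_text function, the "GreetingText" key, and the "Close"/"Fechar" texts are invented for illustration. Only the key name "CloseLinkText" comes from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per message: a stable symbolic key plus the text in each language. */
struct message {
    const char *key;
    const char *en;  /* original text; can be reworded without touching the key */
    const char *pt;  /* translation; NULL when not translated yet */
};

static const struct message messages[] = {
    { "CloseLinkText", "Close",            "Fechar" },
    { "GreetingText",  "My name is %s.\n", "Meu nome é %s.\n" },
};

/* Returns the text for key in the given language, falling back to English. */
const char *message_text(const char *key, const char *lang)
{
    assert(key != NULL && lang != NULL);
    size_t n = sizeof messages / sizeof messages[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(messages[i].key, key) != 0)
            continue;
        if (strcmp(lang, "pt") == 0 && messages[i].pt != NULL)
            return messages[i].pt;
        return messages[i].en;
    }
    return key;  /* unknown key: return the key itself so the gap is visible */
}
```

With this layout, rewording the English text leaves the key and the existing translation in place. The cost is that the call site, for example message_text("CloseLinkText", "pt"), no longer shows the text that will be displayed.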

Questions

Is there any good reason for the widespread practice of using English strings as keys?
Is there some major disadvantage, which I am failing to see, in using keys that give a little more context about what is being translated?
In short, why do so many internationalized systems turn translation work into a nightmare? Is there no better alternative?

1

"It seems to me that it makes more sense to use more meaningful names [...]". What is more meaningful than the text itself in the original language? With the approach you suggest, you have just created an extra "column" (equivalent to one more language to maintain). Incidentally, remembering such "keys" when a string needs to be reused is much harder than remembering the original text (and if the counter-argument is automation with selection lists, well, why not display the original text?). :)

– Luiz Vieira

2015/05/18 at 21:58
Using the text to be translated as the key leads to the natural reuse of phrases that have already been translated. PO and related files also have room for extra context, which is commonly stored as comments. There is also a large number of tools that use these formats (for example Poedit, virtaal, Translate-Toolkit, omegat, etc.).

– JJoao

2015/05/18 at 22:09
1

@Luizvieira You are right that the text itself is self-explanatory; in general it is (though sometimes, when it is a single isolated word, the context is hard to understand). I think having an immutable identifier would avoid other problems, but since so few people do it, there must be some very big disadvantage. Your argument hasn’t convinced me yet :) As for reusing strings keyed by the text itself, that has its own problems too: for example, "read" would need one translation in certain contexts, a different one in others, and so on...

– bfavaretto

2015/05/18 at 22:10
It’s true, there are many problems. I even think that the worst part is not the translation but the localization itself (text size in buttons, for example). Anyway, my point is that using a key wouldn’t solve them. Your example is a very apt one: having different keys to indicate singular and plural is only necessary if the word actually occurs in both contexts. And the professional responsible for the localization will necessarily need to consider the context when translating (not only the original text, nor any key you come up with...) :)

– Luiz Vieira

2015/05/18 at 22:32
In other words: the problems mentioned are real, but I don’t know of any good arguments showing that using keys different from the original text makes the work process (both the translation itself and the reuse of text by programmers) effectively better. Anyway, that’s just my educated guess (so I’m just commenting, and I won’t dare answer...). :)

– Luiz Vieira

2015/05/18 at 22:38
@bfavaretto I was trying to find a way to answer, but I run into the same problem already mentioned here. You probably know more about this than I do; I could only add my own experience, which is probably no different from yours and would not help clarify the problem. It would be great to know more about this. Did you try asking in English? I won’t, because it is not easy to explain, it would probably take a lot of back-and-forth, and I don’t have the patience to put my "wonderful" English to use.

– Maniero

2015/08/19 at 12:53
I didn’t try in English, and to tell the truth I got a little discouraged after the comments of Luiz Vieira and the answer below. @bigown

– bfavaretto

2015/08/19 at 17:46

1 answer

by epx • **8,191** points · Answer 1 · 2015-05-20T19:45:31+00:00

If the language of those who are developing the program is English, or the developers involved know English well, it seems to me much more productive and clear to use the original message, "pretending" that other languages do not exist.

In the specific example you gave, you can see the format specifiers (such as %s), which is important: if the number of specifiers is greater than the number of additional printf() arguments, the program will break. This is a good reason.
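The specifier mismatch described above can also be caught mechanically. Here is a minimal sketch in C with hypothetical function names. It is a simplification: it counts each '%' that is not part of "%%" as one argument, so "*" widths and positional forms such as "%1$s" are not handled properly.

```c
#include <assert.h>
#include <stddef.h>

/* Counts printf-style conversions in fmt; "%%" is a literal percent sign. */
size_t count_conversions(const char *fmt)
{
    assert(fmt != NULL);
    size_t count = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        if (p[1] == '%') {
            p++;       /* skip the second character of the escaped "%%" */
        } else if (p[1] != '\0') {
            count++;   /* a conversion such as %s or %d */
        }
    }
    return count;
}

/* Returns nonzero when original and translation expect the same number of arguments. */
int same_conversion_count(const char *msgid, const char *msgstr)
{
    return count_conversions(msgid) == count_conversions(msgstr);
}
```

A build step could run such a check over every msgid/msgstr pair and reject a translation that expects more arguments than the original supplies.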

Localization/internationalization is always a nightmare. The biggest problems are the space the translated message will occupy in the user interface, and cultural or jargon differences that the translator may be unaware of and that you cannot even verify if you don’t know at least a little of the target language.

I don’t think POT is the worst scheme; it neither causes nor solves the major localization problems mentioned above. If the original message is changed, at least the change is explicit (since it can be detected automatically in the build).

On Android, where it is customary to use labels (R.id.) instead of the original strings, the opposite problem occurs: someone changes the original message in English and the translations remain valid, but they may no longer suit the space the new message occupies in the user interface, or the new message may be completely different, in which case keeping the old translations is a disadvantage.
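One way a label-based scheme could surface this silently outdated translation is to record, next to each translation, the English text it was made from. The sketch below is a hypothetical illustration in C, not something Android provides. The struct and function names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A translation stored under a stable label, plus the source text it was based on. */
struct labeled_translation {
    const char *label;            /* stable key */
    const char *translated_from;  /* English text the translator actually saw */
    const char *translation;
};

/* Returns nonzero when the English text has changed since the translation was made. */
int translation_is_stale(const struct labeled_translation *t,
                         const char *current_source)
{
    assert(t != NULL && current_source != NULL);
    return strcmp(t->translated_from, current_source) != 0;
}
```

A stale entry can then be flagged for review while the old translation stays available as a starting point, instead of either disappearing or lingering unnoticed.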

I recognize that labels have an advantage over the original English string when the same message is used in different contexts (menu and title, for example) and the translation may need to be less wordy in the menu because less space is available.
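The menu-versus-title case can be handled by making the context part of the key. The PO format has a msgctxt field for this purpose. The C sketch below only illustrates the idea with a hypothetical in-memory table. The tr_ctx function, the context names, and the example texts are all invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The same original string may carry a different translation per context. */
struct ctx_entry {
    const char *context;
    const char *msgid;
    const char *msgstr;
};

static const struct ctx_entry ctx_catalog_pt[] = {
    { "menu",  "Save file as...", "Salvar como..." },         /* shorter: less room */
    { "title", "Save file as...", "Salvar arquivo como..." },
};

/* Looks up msgid within context; falls back to msgid when no entry matches. */
const char *tr_ctx(const char *context, const char *msgid)
{
    assert(context != NULL && msgid != NULL);
    size_t n = sizeof ctx_catalog_pt / sizeof ctx_catalog_pt[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(ctx_catalog_pt[i].context, context) == 0 &&
            strcmp(ctx_catalog_pt[i].msgid, msgid) == 0)
            return ctx_catalog_pt[i].msgstr;
    }
    return msgid;
}
```

This keeps the original text visible at the call site while still letting the translator give the menu a shorter wording than the title.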