This document can be used according to the conditions of the Creative Commons Attribution-NoDerivatives 4.0 International license.
Internationalization
This post is about internationalization of servers written in Go, or more specifically about localizing their error messages. This matters because a service that spreads data around the world should be able to complain in all languages.
There are quite a few i18n solutions for Go, and they are sophisticated too.
So I found myself asking: What package should I use? How do I use it? Which issues are addressed? And which are not? I tried to find out but eventually gave up.
A different path
Let's change perspective. Let's build an example and try to find a solution step by step.
Assume the server assembles the following error message somewhere in its code:
Cannot write to file /tmp/tempfile: 987654321 bytes needed, 123456789 bytes available
Maybe a German-speaking user would prefer:
Verfügbar sind noch 118MB. Datei /tmp/tempfile benötigt aber 942MB. Schreiben nicht möglich.
Three problems can be identified:
- There are fixed components ("Cannot write to file", "bytes needed", "bytes available") to be translated.
- Variable components "/tmp/tempfile", "987654321", and "123456789" have to be inserted; the order of appearance may differ.
- Numbers are formatted differently (different countries, different languages). In our example the user prefers "118MB" to "123456789 bytes", which could equally be true for an English speaker.
A solution can be quite straightforward:
- The fixed components form a search string to a table of translations. (Here we present them conveniently JSON formatted.) A search function then takes a string and a language code (e.g. "de") and returns the proper translation as a string:
[ "Cannot write to file : bytes needed, bytes available": [ {"de": "Verfügbar sind noch . Datei benötigt aber . Schreiben nicht möglich."} ] ]
- We will delegate variable substitution to the text/template package. So we add template expressions with variable names to the search string and the translations:
[ "Cannot write to file {{.File}}: {{.Need}} needed, {{.Avail}} available": [ {"de": "Verfügbar sind noch {{.Avail}}. Datei {{.File}} benötigt aber {{.Need}}. Schreiben nicht möglich."} ] ]
- Formatting numbers will also be achieved via the text/template package. We use a function from the package humanize (let "bytes" be the template's name for "humanize.Bytes"):
[ "Cannot write to file {{.File}}: {{.Need}} needed, {{.Avail}} available": [ {"de": "Verfügbar sind noch {{bytes .Avail}}. Datei {{.File}} benötigt aber {{bytes .Need}}. Schreiben nicht möglich."} ] ]
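The three steps can be tried out in a few lines of Go. This is only a minimal sketch, not the code that stringl10n generates: the names translations, translate, substitute and roughMB are made up for illustration, the table is a plain Go map instead of a JSON file, and roughMB is a hypothetical stand-in for humanize.Bytes that simply rounds to units of 1024*1024 bytes so that the numbers match the German sentence from above.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// translations maps a search string to its translations, keyed by language code.
// (Assumption: a plain map stands in for the JSON formatted table.)
var translations = map[string]map[string]string{
	"Cannot write to file {{.File}}: {{.Need}} needed, {{.Avail}} available": {
		"de": "Verfügbar sind noch {{bytes .Avail}}. Datei {{.File}} benötigt aber {{bytes .Need}}. Schreiben nicht möglich.",
	},
}

// translate returns the translation of key for lang,
// or key itself if the table has no entry.
func translate(key, lang string) string {
	if t, ok := translations[key][lang]; ok {
		return t
	}
	return key
}

// roughMB is a hypothetical stand-in for humanize.Bytes.
// It rounds n to the nearest multiple of 1024*1024 bytes.
func roughMB(n uint64) string {
	return fmt.Sprintf("%dMB", (n+(1<<19))>>20)
}

// substitute parses text as a template and fills in data.
// The function roughMB is known to the template as "bytes".
func substitute(text string, data interface{}) (string, error) {
	funcs := template.FuncMap{"bytes": roughMB}
	tmpl, err := template.New("msg").Funcs(funcs).Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	key := "Cannot write to file {{.File}}: {{.Need}} needed, {{.Avail}} available"
	data := map[string]interface{}{
		"File":  "/tmp/tempfile",
		"Need":  uint64(987654321),
		"Avail": uint64(123456789),
	}
	for _, lang := range []string{"en", "de"} {
		msg, err := substitute(translate(key, lang), data)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(msg)
	}
}
```

For "en" there is no table entry, so the search string itself serves as the template. For "de" the translation is used, and the variables appear in a different order than in the English text.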
Of course there are useful functions in the standard library's package strings too. Replace can make a decimal point become a decimal comma. And TrimRight can eliminate trailing zeros.
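A small sketch of that idea follows. The helper name germanDecimal and the choice of three decimal places are assumptions for illustration only. Note that zeros and the decimal point are trimmed in two separate calls, because a single TrimRight with the cutset "0." would also eat the zeros of a whole number like 100.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// germanDecimal is a hypothetical helper: it formats f with three decimals,
// drops trailing zeros, and replaces the decimal point by a decimal comma.
func germanDecimal(f float64) string {
	s := strconv.FormatFloat(f, 'f', 3, 64) // always contains a decimal point
	s = strings.TrimRight(s, "0")            // "117.700" becomes "117.7"
	s = strings.TrimRight(s, ".")            // "942." becomes "942"
	return strings.Replace(s, ".", ",", 1)   // "117.7" becomes "117,7"
}

func main() {
	fmt.Println(germanDecimal(117.7))
	fmt.Println(germanDecimal(942))
}
```

Such a function could be made available to the templates in the same way as "bytes".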
stringl10n and stringl10nextract
The command stringl10n follows that path, and it uses text/template too. All necessary information comes via a JSON formatted file. stringl10n generates Go code, with all the translation data included, and provides two functions: l10nTranslate and l10nSubstitute.
The programmer can call these at the proper place.
Of course it's the programmer's task to identify the strings which need translation, and to provide these, the translations, and info about variables and used functions via the JSON formatted text file.
A first step towards machine assistance is the command stringl10nextract. It extracts all string literals from Go source files in a given directory and writes them to a text file, JSON formatted in a way suitable for the stringl10n command. There's still much editing to do ... or new tools to program.
What's next
Further above I mentioned a "proper place" to call the stringl10n-generated functions. That will not be where the error messages are created, because then it would be necessary to pass the language code to every single function in the server package. A global variable won't do either, because a server usually serves requests from several users simultaneously.
To solve the problem I had help from an "extended error type" which lives in package mist. This type keeps message string and possible variables apart. But that's another story ...