Description
To avoid reinventing the wheel, it's better to use the ICU message format for i18n/l10n.
Steps:
- Fix the buggy ini package
- Clean up all translation strings
- Introduce ICU message parser
- Convert legacy plural-related strings to ICU format
- Translate on Crowdin https://support.crowdin.com/icu-message-syntax/
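As a rough illustration of the conversion step, here is a minimal sketch. It assumes a legacy plural string exists as a singular/plural pair that uses %d for the count; that layout and the helper name toICUPlural are assumptions for illustration, not Gitea's actual code. It only shows the target ICU shape for a language with "one" and "other" forms, and it ignores ICU apostrophe and brace escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// toICUPlural is a hypothetical helper: it wraps a singular and a plural
// legacy string into a single ICU plural message for the argument named arg.
func toICUPlural(arg, one, other string) string {
	// In an ICU plural message, "#" stands for the formatted number.
	one = strings.ReplaceAll(one, "%d", "#")
	other = strings.ReplaceAll(other, "%d", "#")
	return fmt.Sprintf("{%s, plural, one {%s} other {%s}}", arg, one, other)
}

func main() {
	fmt.Println(toICUPlural("count", "%d pull request", "%d pull requests"))
}
```

Languages with more plural categories would still need their extra forms (few, many, and so on) written by translators; the conversion can only produce the skeleton.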
Below is the outdated description. The old idea was to use a customized message format (a simple syntax similar to ICU messages, but not supported by Crowdin, so Crowdin can't help check for mistakes).
The official package's design seems clear and will resolve Gitea's i18n/l10n problems fundamentally.
https://pkg.go.dev/golang.org/x/text/message
https://pkg.go.dev/golang.org/x/text/feature/plural
https://github.com/unicode-org/cldr/blob/main/common/supplemental/ordinals.xml
https://github.com/unicode-org/cldr/blob/main/common/supplemental/plurals.xml
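To make the CLDR plural categories concrete, here is a minimal sketch that hand-transcribes the cardinal rules for three languages. It assumes non-negative integer counts only (no fractions), and the function name pluralCategory is hypothetical; real code would rely on the data behind golang.org/x/text/feature/plural rather than a hand-written switch.

```go
package main

import "fmt"

// pluralCategory returns the CLDR cardinal plural category for a
// non-negative integer n. Only en, lv and ar are covered; any other
// language falls back to "other".
func pluralCategory(lang string, n int) string {
	switch lang {
	case "en":
		if n == 1 {
			return "one"
		}
	case "lv":
		if n%10 == 0 || (n%100 >= 11 && n%100 <= 19) {
			return "zero"
		}
		if n%10 == 1 && n%100 != 11 {
			return "one"
		}
	case "ar":
		switch {
		case n == 0:
			return "zero"
		case n == 1:
			return "one"
		case n == 2:
			return "two"
		case n%100 >= 3 && n%100 <= 10:
			return "few"
		case n%100 >= 11 && n%100 <= 99:
			return "many"
		}
	}
	return "other"
}

func main() {
	for _, n := range []int{0, 1, 2, 11, 21, 103} {
		fmt.Println(n, pluralCategory("en", n), pluralCategory("lv", n), pluralCategory("ar", n))
	}
}
```

The point of the sketch is that the number of forms a translator must supply differs per language: two for English, three for Latvian, six for Arabic.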
I think a translator-friendly syntax is very important because there really are a lot of broken translations; if we make the system more complex, there will be more errors.
The syntax should also be designed for the frontend (JS/Vue).
As the first step, we should refactor the locale package to make it stable, see the problems
A brief idea about how to maintain the translation strings:
<!-- 1: other --> {%d $[text]}
<!-- 2: one,other --> {%d $[text,texts]}
<!-- 3: zero,one,other --> {%d $zero[0,1,o]}
<!-- 3: one,two,other --> {%d $two[1,2,o]}
<!-- 3: one,few,other --> {%d $few[1,f,o]}
<!-- 3: one,many,other --> {%d $many[1,m,o]}
<!-- 4: one,two,few,other --> {%d $two-few[1,2,f,o]}
<!-- 4: one,two,many,other --> {%d $two-many[1,2,m,o]}
<!-- 4: one,few,many,other --> {%d $few-many[1,f,m,o]}
<!-- 5: one,two,few,many,other --> {%d $[1,2,f,m,o]}
<!-- 6: zero,one,two,few,many,other --> {%d $[0,1,2,f,m,o]}
Then use the syntax to support different languages:
en: msg = there are {%d $[pull request, pull requests]}
lv: msg = there are {%d $zero[for 0 pull request, pull request, pull requests]}
ar: msg = there are {%d $[for 0, for 1, for 2, few, many, other]}
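A minimal sketch of how the proposed syntax could be evaluated, assuming the tag-to-category table above and assuming the plural category for the count has already been determined elsewhere (it is passed in as a string). The names pluralBlock, categoriesFor and expand are hypothetical, and only the first {%d $...} block in a message is handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pluralBlock matches the proposed {%d $tag[form, form, ...]} construct.
var pluralBlock = regexp.MustCompile(`\{%d \$([a-z-]*)\[([^\]]*)\]\}`)

// categoriesFor maps a tag and a form count to the ordered plural
// categories, following the table in the proposal.
func categoriesFor(tag string, numForms int) ([]string, error) {
	var cats []string
	switch {
	case tag == "" && numForms == 1:
		cats = []string{"other"}
	case tag == "" && numForms == 2:
		cats = []string{"one", "other"}
	case tag == "" && numForms == 5:
		cats = []string{"one", "two", "few", "many", "other"}
	case tag == "" && numForms == 6:
		cats = []string{"zero", "one", "two", "few", "many", "other"}
	case tag == "zero":
		cats = []string{"zero", "one", "other"}
	case tag != "":
		// e.g. "two-few" means one, two, few, other.
		cats = append([]string{"one"}, strings.Split(tag, "-")...)
		cats = append(cats, "other")
	}
	if len(cats) != numForms {
		return nil, fmt.Errorf("tag %q does not accept %d forms", tag, numForms)
	}
	return cats, nil
}

// expand replaces the first plural block in msg with "<n> <form>",
// picking the form that corresponds to the given plural category.
func expand(msg string, n int, category string) (string, error) {
	m := pluralBlock.FindStringSubmatchIndex(msg)
	if m == nil {
		return msg, nil
	}
	forms := strings.Split(msg[m[4]:m[5]], ",")
	cats, err := categoriesFor(msg[m[2]:m[3]], len(forms))
	if err != nil {
		return "", err
	}
	chosen := forms[len(forms)-1] // "other" is always the last form
	for i, c := range cats {
		if c == category {
			chosen = forms[i]
			break
		}
	}
	return msg[:m[0]] + fmt.Sprintf("%d %s", n, strings.TrimSpace(chosen)) + msg[m[1]:], nil
}

func main() {
	out, err := expand("there are {%d $[pull request, pull requests]}", 3, "other")
	fmt.Println(out, err)
}
```

Because the form count is checked against the tag, a translation with a missing or extra form is rejected at load time instead of producing a wrong string at runtime.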
Another possible approach, define all concepts ahead:
en: NumPR = {%d $[pull request, pull requests]}
lv: NumPR = {%d $zero[for 0 pull request, pull request, pull requests]}
ar: NumPR = {%d $[for 0, for 1, for 2, few, many, other]}
Then the NumPR could be reused:
en: msg = there are {$NumPR}
lv: msg = there are {$NumPR}
ar: msg = there are {$NumPR}
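The reuse step can be sketched as a simple substitution pass that runs before plural expansion. This assumes concept names are plain identifiers and that unknown references are left untouched; conceptRef and resolveConcepts are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// conceptRef matches a reference such as {$NumPR}.
var conceptRef = regexp.MustCompile(`\{\$([A-Za-z][A-Za-z0-9]*)\}`)

// resolveConcepts replaces each {$Name} reference in msg with the
// definition stored under Name; unknown names are kept as written.
func resolveConcepts(msg string, concepts map[string]string) string {
	return conceptRef.ReplaceAllStringFunc(msg, func(ref string) string {
		name := conceptRef.FindStringSubmatch(ref)[1]
		if def, ok := concepts[name]; ok {
			return def
		}
		return ref
	})
}

func main() {
	en := map[string]string{"NumPR": "{%d $[pull request, pull requests]}"}
	fmt.Println(resolveConcepts("there are {$NumPR}", en))
}
```

With this split, only the concept definitions differ per language in plural handling, while the messages that use them stay structurally identical.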
If we only need to support one %d, the syntax might be simplified, e.g.:
en: msg = there are %d $[pull request, pull requests]
lv: msg = there are %d $zero[for 0 pull request, pull request, pull requests]
ar: msg = there are %d $[for 0, for 1, for 2, few, many, other]
Activity
Title changed from "[Proposal] Use golang's x/text package for i18n & l10n" to "[Proposal] Use ICU message for i18n & l10n"
translation is problematic #24074
lunny commented on Apr 28, 2023
Is there any tool to convert the ini format to the ICU format? Or should we create one?
wxiaoguang commented on Apr 28, 2023
I don't understand what you mean.
ICU is just a message format, so there is no need to convert.
lunny commented on Apr 28, 2023
Maybe we should use a format other than ini files?
wxiaoguang commented on Apr 28, 2023
Why?
silverwind commented on Apr 28, 2023
YAML may be OK as it requires less escaping than INI. But one also needs to be aware of its pitfalls, like `no` becoming the boolean `false`, because it is a typed language, which INI isn't.
wxiaoguang commented on Apr 28, 2023
At the moment I don't see any real benefit that YAML would bring.
Actually, we do not need much "escaping" with INI; there are just some legacy bugs.
The only "escaping" requirements are:
- `#`
- `"`
I think INI still wins.
silverwind commented on Jun 2, 2023
Found another use case where `{placeholder}` syntax would have been really useful: https://github.com/go-gitea/gitea/pull/25050/files#r1214691116
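For illustration, named placeholders can be sketched with the standard library alone. This assumes placeholders are written as {name} and that all values are already strings; fillPlaceholders is a hypothetical helper, not an existing Gitea function.

```go
package main

import (
	"fmt"
	"strings"
)

// fillPlaceholders replaces every {name} in msg with args[name].
// Placeholders without a matching argument are left as written.
func fillPlaceholders(msg string, args map[string]string) string {
	pairs := make([]string, 0, len(args)*2)
	for name, value := range args {
		pairs = append(pairs, "{"+name+"}", value)
	}
	return strings.NewReplacer(pairs...).Replace(msg)
}

func main() {
	msg := "{user} opened {count} pull requests"
	fmt.Println(fillPlaceholders(msg, map[string]string{"user": "alice", "count": "3"}))
}
```

Unlike positional %s and %d verbs, named placeholders let translators reorder arguments freely, and a translation that drops or misspells one is easy to detect by comparing placeholder names against the source string.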