Design Principle: Computational vs. Manual #60

Closed
stasm opened this issue Mar 2, 2020 · 5 comments

Labels
design Design document or issues related to design functions Issue pertains to the default function set requirements Issues related with MF requirements list

Comments

@stasm
Collaborator

stasm commented Mar 2, 2020

A dedicated issue for discussing one of the design dimensions proposed in #50.

Computational vs. Manual

Do we want the runtime to have some capacity to transform translations, e.g. by providing a method to automatically turn text to title-case? Or do we want localizers to provide all possible variants of translations to ensure all edge-cases are handled manually?

Related Issues: #35, #36, #38

@stasm stasm mentioned this issue Mar 2, 2020
@Fleker

Fleker commented Mar 2, 2020

I would think that the number of computational transforms should be minimized in favor of giving translators the discretion to apply the best text style. While this may lead to some additional verbosity, embedding such transforms in the runtime may have unexpected consequences if they are overused.

@DavidFatDavidF
Collaborator

DavidFatDavidF commented Jul 27, 2020

When this was discussed in the meeting on 15th June, it felt to me like the computational end of this axis would feature-creep this effort into an attempt at universal rule-based machine translation.
Take title casing: I am not sure that any language other than English has it, and even in English it is style-guide dependent, so it doesn't seem to be a good candidate for a computational feature in the new message format.
On the other hand, rendering factoids such as dates, currency amounts, quantities with units of measure, etc. seems like a good candidate for runtime formatting. But even here there are caveats in morphologically rich languages (such as Slavic languages) that could be hard to handle "computationally":

"This has to be done by [date]." - [date] replaced at runtime with e.g.- 16th June -> Czech: 16. června
"The day was [date]." - [date] replaced at runtime with e.g.- 16th June -> Czech: 16. červen

So in the above case the translator should be given an option to provide a formatting hint such as
[date-adverb] or
[date-nominativ]
in their translation.
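
As a minimal sketch of how such a hint could be honored at runtime, the snippet below keeps two month-name tables for Czech and picks one based on the hint in the placeholder. The function names `formatCzechDate` and `fillDate` are hypothetical, and the hint names are taken directly from the `[date-adverb]` and `[date-nominativ]` placeholders above; this is an illustration under those assumptions, not a proposed API.

```typescript
// Hint names copied from the placeholders in the comment above.
type DateForm = "adverb" | "nominativ";

// Czech month names: nominative (standalone) and genitive (used after a day number).
const CZECH_MONTHS: Record<DateForm, readonly string[]> = {
  nominativ: ["leden", "únor", "březen", "duben", "květen", "červen",
    "červenec", "srpen", "září", "říjen", "listopad", "prosinec"],
  adverb: ["ledna", "února", "března", "dubna", "května", "června",
    "července", "srpna", "září", "října", "listopadu", "prosince"],
};

// Hypothetical helper: renders day and month in the form the translator asked for.
function formatCzechDate(date: Date, form: DateForm): string {
  const month = CZECH_MONTHS[form][date.getUTCMonth()];
  return `${date.getUTCDate()}. ${month}`;
}

// Hypothetical helper: replaces [date-adverb] / [date-nominativ] in a translation.
function fillDate(translation: string, date: Date): string {
  return translation.replace(
    /\[date-(adverb|nominativ)\]/g,
    (_match, form: DateForm) => formatCzechDate(date, form),
  );
}

const june16 = new Date(Date.UTC(2020, 5, 16));
console.log(fillDate("[date-adverb]", june16));    // 16. června
console.log(fillDate("[date-nominativ]", june16)); // 16. červen
```

The translator stays in control of which form appears; the runtime only needs the data for each form.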

One of the most common failures in marketing email campaigns localized from English into Slavic languages is to start the message with something like

Hello [user], ---> wrong Czech: Pavel
Hello [first_name], ---> wrong Czech: Petra
Hello [full_name], ---> wrong Czech: David Filip

as these languages use a specific vocative form, and simply replacing the name placeholders above with a name stored in a CRM will more often than not produce a grammatically wrong (even insulting) form of address.
In the current state of the art, this is best solved by using a neutral salutation that doesn't require the name.
Czech:
Dobrý den,

The name variable would have to be marked as canDelete="yes" during Extraction (and the builder/compiler would have to be happy with dropping the name).
See
https://galaglobal.github.io/TAPICC/T1/WG3/rs01/XLIFF-EM-BP-V1.0-rs01.xhtml#Hints or
http://docs.oasis-open.org/xliff/xliff-core/v2.1/os/xliff-core-v2.1-os.html#editinghints

If we went for "computational", the formatter would have to know how vocative forms are created in the target language to be able to interpret
Hello [user-vocative], ---> correct Czech: Pavle
Hello [first_name-vocative], ---> correct Czech: Petro
Hello [full_name-vocative], ---> correct Czech: Davide Filipe
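
To make the trade-off concrete, here is a rough sketch that combines both approaches: a vocative lookup where the form is known, and the neutral greeting (the name dropped, as with `canDelete="yes"`) where it is not. The table `VOCATIVE` and the function `greet` are hypothetical, the three entries are the examples given above, and the greeting "Dobrý den, <name>," is an assumption for illustration. A real formatter would need actual declension rules or data rather than a hand-written table.

```typescript
// Hypothetical lookup table; entries are the examples from the comment above.
const VOCATIVE: ReadonlyMap<string, string> = new Map([
  ["Pavel", "Pavle"],
  ["Petra", "Petro"],
  ["David Filip", "Davide Filipe"],
]);

// Neutral Czech salutation that does not need the name at all.
const NEUTRAL_GREETING = "Dobrý den,";

// Uses the vocative when it is known; otherwise drops the name entirely
// instead of inserting a grammatically wrong nominative form.
function greet(name: string): string {
  const vocative = VOCATIVE.get(name);
  return vocative === undefined
    ? NEUTRAL_GREETING
    : `Dobrý den, ${vocative},`;
}

console.log(greet("Pavel"));   // Dobrý den, Pavle,
console.log(greet("Zdeněk"));  // Dobrý den,
```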

@DavidFatDavidF DavidFatDavidF added the requirements Issues related with MF requirements list label Jul 27, 2020
@zbraniecki
Member

But even here there are caveats in morphologically rich languages (such as Slavic languages) that could be hard to handle "computationally":

ICU supports a parameter that defines where in the sentence a given formatted string will appear. What ECMA402 does now is "stand-alone", but we do plan to add displayContext - tc39/ecma402#355
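
For illustration only, the sketch below uses the existing `Intl.DateTimeFormat` API, not the planned `displayContext` option, to show that a formatter can already choose between month-name forms depending on what it is asked to format. In an environment with Czech locale data, the day-plus-month format should come out with the genitive month name and the month-only format with the standalone (nominative) one; exact spacing may vary with the locale data version, so no literal output is shown.

```typescript
const date = new Date(Date.UTC(2020, 5, 16));

// Day and month together: the month name is used in its "format" form.
const withDay = new Intl.DateTimeFormat("cs", {
  day: "numeric",
  month: "long",
  timeZone: "UTC",
}).format(date);

// Month on its own: the month name is used in its "stand-alone" form.
const monthOnly = new Intl.DateTimeFormat("cs", {
  month: "long",
  timeZone: "UTC",
}).format(date);

console.log(withDay);
console.log(monthOnly);
```

What this cannot express is the case from the earlier comment, where a full date appears in a sentence that calls for a different grammatical case; that is the gap a context parameter or a translator-supplied hint would have to fill.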

If we went for "computational", the formatter would have to have knowledge of vocative forms creation in the target language to be able to interpret

I'm in favor of supporting this. Basically, we should be able to provide the user name in its declined form, and the localization should use it where appropriate.

@mihnita mihnita added the design Design document or issues related to design label Sep 24, 2020
@aphillips aphillips added the functions Issue pertains to the default function set label Aug 19, 2023
@aphillips
Member

I suspect that the underlying request here is for some standardized built-in functions for the registry.

I also think that displayContext might be related to the @annotation feature in #450 #426, since display context might be externally supplied at runtime (contextually, one might say 😉), but a message might wish to override it for a given formatter, e.g.:

{The date in a sentence is {$now :datetime}}
{The date as standalone is: {$now :datetime \@displayContext=standalone}}
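
A minimal sketch of the override rule described here, assuming the runtime supplies a default display context and a message may optionally set its own. The type `DisplayContext`, the value `"middleOfSentence"`, and the function `resolveDisplayContext` are hypothetical names for illustration; only `standalone` appears in the example above.

```typescript
// Hypothetical set of display context values; only "standalone" comes from the thread.
type DisplayContext = "standalone" | "middleOfSentence";

// The message-level annotation wins when present; otherwise the
// runtime-supplied context applies.
function resolveDisplayContext(
  runtimeDefault: DisplayContext,
  messageOverride?: DisplayContext,
): DisplayContext {
  return messageOverride ?? runtimeDefault;
}

// First message above: no annotation, so the runtime context is used.
const inSentence = resolveDisplayContext("middleOfSentence");
// Second message above: the annotation overrides the runtime context.
const asStandalone = resolveDisplayContext("middleOfSentence", "standalone");

console.log(inSentence);   // middleOfSentence
console.log(asStandalone); // standalone
```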

@aphillips
Member

I'm closing this issue in favor of having specific issues raised against the default registry.


6 participants