Skip to content

Implement :currency function in the default registry #915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Nov 14, 2024

Conversation

aphillips
Copy link
Member

@aphillips aphillips commented Oct 26, 2024

Initial implementation of the :currency function.

This function will be standard in 2.0.

Note behavior differences from :number and :integer.

:unit is in #922.

Initial implementation of the `:currency` function.

This function will be `optional` in 2.0.

Not behavior differences from `:number` and `:integer`.

`:unit` to follow.
@aphillips aphillips requested a review from eemeli October 26, 2024 22:36
@aphillips aphillips requested a review from eemeli October 27, 2024 18:45
@aphillips aphillips added functions Issue pertains to the default function set Agenda+ Requested for upcoming teleconference LDML46.1 MF2.0 Draft Candidate labels Oct 27, 2024
Co-authored-by: Eemeli Aro <[email protected]>
- Remove `ordinal` selector
- Remove `currencyDisplay=auto`
- Add `currencyDisplay=none`
- Add documentation links for `currencyDisplay`
- Add a note about implementation aliasing
- Fix example to use `hideIfWhole` instead of invalid `none` option value
@aphillips
Copy link
Member Author

We agreed to take this in the 2024-11-04 call, provided the last changes were incorporated. This will fast-track, assuming I get positive reviews.

@aphillips aphillips added the fast-track Editorial change permitted to use fast-track merge rules label Nov 4, 2024
spec/registry.md Outdated
Comment on lines 379 to 381
The special _option_ _value_ `hideIfWhole` is used to display values without
fraction digits when the number of fraction digits is zero,
or based on the currency when the number of fraction digits for the currency is non-zero.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be left out. This is novel currency formatting behaviour, and we should not be introducing it here via message formatting. Is this already somehow supported in ICU4J and ICU4C? If this makes sense for formatting currencies, wouldn't this also make sense for many other number formatting situations?

I do not object to a value like this in principle, but I object to it being introduced here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is supported by ICU.

The use cases for currencies are more common than for other number values (with the possible exception of units). Normally one is trying to get consistent widths (or one doesn't care). With currency values you generally want a specific number of fractional digits or no fraction digits. Mutation of the value to produce this is undesirable.

However, I agree that this is "special sauce" and shouldn't be required here. Should we introduce optional values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is important for currencies.

As we discussed in the meeting, we should signal that for some of the options it is ok to use defaults if the underlying system doesn't support the choice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we allow for implementations to support option values beyond the ones we explicitly define, the easiest way of making this optional is to leave it out for now. This would also more clearly allow for an implementation that did not support "hide if whole" fraction display to emit an error, rather than be required to fail silently.

This value is supported by ICU.

Could you give me a pointer to how/where this is supported? I could not find this myself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add the separate option using the Intl spelling. Do you have a list of other options to add?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing the following, which could well be added in a separate PR:

  • roundingPriority
    • auto (default)
    • morePrecision
    • lessPrecision
  • roundingIncrement
    • 1 (default), 2, 5, 10, 20, 25, 50, 100, 200, 250, 500, 1000, 2000, 2500, and 5000
  • roundingMode
    • ceil
    • floor
    • expand
    • trunc
    • halfCeil
    • halfFloor
    • halfExpand (default)
    • halfTrunc
    • halfEven

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't do this right this second, but will add these to this PR. Should they be optional or standard? I'm leaning towards standard.

Almost all of our work between now and 46.1 appears to be fixing up the registry/function set, plus some housekeeping.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should they be optional or standard?

I'm fine either way.

Almost all of our work between now and 46.1 appears to be fixing up the registry/function set, plus some housekeeping.

Good!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added. I made them standard.

@mihnita @catamorphism Please note this (proposed) change.

spec/registry.md Outdated
Comment on lines 427 to 434
- `currencyDisplay` (this option's values are derived from those in ICU [NumberFormatter.UnitWidth](https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/number/NumberFormatter.UnitWidth.html))
- `narrow`
- `short` (default)
- `full`
- `iso`
- `formal`
- `variant`
- `none` (this is called `hidden` in ICU)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I continue to object to using a different source for this one single function option, when all the other options for all the other functions are using TC39 Intl as their source. Using this set of option values unnecessarily breaks from our otherwise well established patterns.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this in yesterday's call (which you weren't able to attend). The challenge here is that currency formatting has been greatly improved in CLDR (which is reflected by the changes in ICU) after Intl.NumberFormat was created. Those changes have not yet materialized in TC39, but really are needed. FWIW, there is continuing evolution here. To wit, there are a number of currencies with more than one symbol (some of which are formal and some of which are variant). These symbols need to be selectable and, when selected, can result in different formatting patterns.

The poster child for this is the Turkish Lira, which has a newfangled symbol, which displays on the left thusly: ₺123.456,79 , but which is very often rendered using the older ASCII symbol "TL":

image

The discussion yesterday recognized that there are more options here than in Intl, but that implementations are only required to accept all these keywords--not to provide different results based on them. Intl provides code, symbol, narrowSymbol, and name. We could reasonably rename thusly:

Proposed Above Intl Equivalent Suggested Comments
narrow narrowSymbol narrow No need to be extra verbose
short symbol symbol (default) symbol is consistent with user expectations
full name name ICU's name is actually FULL_NAME. Can be misleading, e.g. JPY renders as 123,457円
iso code code code might be better because some codes are not ISO4217
formal n/a formal Variations such as TRD => NT$
variant n/a variant Variations such as the TRY one illustrated above
none n/a none Frequently used for columns of currency values

Would that rename work for you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rename looks good.

Side note: we should look at having documentation of the equivalent options for Intl and ICU (maybe others). Perhaps on the site rather than the spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the rename just now (in bbc7091) but am looking for @eemeli's feedback

Includes a note explaining this.
spec/registry.md Outdated
Comment on lines 379 to 381
The special _option_ _value_ `hideIfWhole` is used to display values without
fraction digits when the number of fraction digits is zero,
or based on the currency when the number of fraction digits for the currency is non-zero.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we allow for implementations to support option values beyond the ones we explicitly define, the easiest way of making this optional is to leave it out for now. This would also more clearly allow for an implementation that did not support "hide if whole" fraction display to emit an error, rather than be required to fail silently.

This value is supported by ICU.

Could you give me a pointer to how/where this is supported? I could not find this myself.

@aphillips aphillips requested a review from eemeli November 6, 2024 15:29
@macchiati
Copy link
Member

macchiati commented Nov 6, 2024

So to recap, https://cldr-smoke.unicode.org/spec/main/ldml/tr35.html#UnicodeCurrencyIdentifier defines the validity precisely, but is squishy about the well-formedness.

  • The valid values are those name attribute values in the type elements of key name="cu" in bcp47/currency.xml.

As to well-formedness, instead of

Codes consisting of 3 ASCII letters that are or have been valid ...

It should say:

Well-formed codes are of the form [A-Z]{3}.
The valid codes are or have been valid ...

(The second sentence is a description of how the valid values were derived, not the definition.)

Let me see about getting the TC to add this to the PR I already have for spec changes in 46.1

Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My blocking objections to this PR have now been resolved. I would still prefer for the last three currencyDisplay option values to be left out, but I can live with them being included.

spec/registry.md Outdated
Comment on lines 451 to 453
- `formal`
- `variant`
- `none` (this is called `hidden` in ICU)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A somewhat dormant issue exists for adding at least "formal" to JS: tc39/ecma402#643

I'm not really sure about "variant", or what it's a variant of. Looking at the data, it looks like these are for the most part abbreviations, stored as <symbol alt="variant">. Would therefore "variantSymbol" maybe make more sense? That also more clearly implies what the fallback behaviour might be if a variant is not available (as is most often the case).

I have no opinions about "none" or "hidden", as neither is available in JS and neither term is used in any of the existing JS formatters.

Overall, I would prefer to leave out these option values, but I don't strongly object to them either.

Ping @sffc and @ryzokuken about this, in case you have thoughts on these. As is, we'll need to special-case the handling for this one option in the JS spec, and the values chosen here may end up influencing the corresponding ones for Intl.NumberFormat.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variant symbols are often "abbreviation-like", but they really are alternate symbols. The recent fad of creating custom currency symbols unique to a specific currency to some degree corresponds with these symbols. The relationship with formal is kind of unclear. It's a bit clunky, TBH (you can't tell if a currency has a variant and specifying variant in the pattern depends on the currency of the value to some degree).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the fallback behaviour for formatting currencies with no variant documented somewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the fallback for each element with alt=... is documented. It is normally the unmarked element (ie, without the alt=...), but in some documented cases (notably locale display names) it may be constructed.

Copy link
Member Author

@aphillips aphillips Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that variant is restricted (currently) to currency symbols.

The usability problem is that when a user writes the source message, she's not going to know about the need for variant. At Amazon, our currency formatter had enums for the presentation format ("in sentence", "standalone", "compact", "iso code", and such) that correspond to specific user interface use cases. Various locale/currency/site configurations then produced the correct symbol/pattern in each of these. This was helpful because it let us do one thing with (e.g.) TRY in the Türkiye site and different thing elsewhere without the developer (or translator) having to know whether a variant existed and, further, whether it was desired or not. Often "what to do" will depend entirely on which currency and which presentation case is wanted ☹️

The enums here are taken from NumberFormatter (and Intl.NumberFormat), which are procedural APIs for getting this effect. They are wrong for MessageFormat, I think, because MF is supposed to hide procedural gorp like that. As currently speced, the developer has to want the variant symbol all the time (unlikely) or the translator has to put it there (possibly at odds with the display use case) for specific locales (and not generally knowing the currency). You don't want to have to ever write:

.local $curCode = {$amount :get field=currency}
.match $curCode
TRY {{You have {$amount :currency currencyDisplay=variant} left in your account.}}
UAH {{You have {$amount :currency currencyDisplay=variant} left in your account.}}
NTD {{You have {$amount :currency currencyDisplay=formal} left in your account.}}
*   {{You have {$mount :currency} left in your account.}}

Co-authored-by: Eemeli Aro <[email protected]>
@aphillips aphillips removed the Agenda+ Requested for upcoming teleconference label Nov 7, 2024
- change `none` to `never`
- remove `variant`
@aphillips aphillips requested a review from eemeli November 12, 2024 22:54
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the latest changes, this looks good, provided that changes in #925 will allow an implementation to emit some signal about unsupported option values.

Comment on lines +409 to +426
The _option_ `trailingZeroDisplay` has a value `stripIfInteger` that is useful
for displaying currencies with their fraction digits removed when the fraction
part of the _operand_ is zero.
This is sometimes used in _messages_ to make the displayed value omit the fraction part
automatically.
> For example, this _message_:
> ```
> The special price is {$price :currency trailingZeroDisplay=stripIfInteger}.
> ```
> When used with the value `5.00 USD` in the `en-US` locale displays as:
> ```
> The special price is $5.
> ```
> But like this when when value is `5.01 USD`:
> ```
> The special price is $5.01.
> ```

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: This seems a bit superfluous as we don't explain other options like this, but I don't object to including it.

Suggested change
The _option_ `trailingZeroDisplay` has a value `stripIfInteger` that is useful
for displaying currencies with their fraction digits removed when the fraction
part of the _operand_ is zero.
This is sometimes used in _messages_ to make the displayed value omit the fraction part
automatically.
> For example, this _message_:
> ```
> The special price is {$price :currency trailingZeroDisplay=stripIfInteger}.
> ```
> When used with the value `5.00 USD` in the `en-US` locale displays as:
> ```
> The special price is $5.
> ```
> But like this when when value is `5.01 USD`:
> ```
> The special price is $5.01.
> ```

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could put this in the user guide. I included it here because it might otherwise be mysterious.

- `morePrecision`
- `lessPrecision`
- `roundingIncrement`
- 1 (default), 2, 5, 10, 20, 25, 50, 100, 200, 250, 500, 1000, 2000, 2500, and 5000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the future (not first release)

I know that this follows https://402.ecma-international.org/#:~:text=not%20the%20exponent.-,%5B%5BRoundingIncrement%5D%5D,-is%20an%20integer.

That list is limited to powers of 10 from 1 to 4.

evenly divides 10, 100, 1000, or 10000 into tenths, fifths, quarters, or halves

Given that currency values (due to hyperinflation) can range much higher than that, that seems like an arbitrary limit, and it should just be "a power of 10", and just limited by the implementation datatypes (as usual).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I closed my eyes while importing this.

> ```
> would be formatted with the resolved options
> `{ currencySign: 'accounting', trailingZeroDisplay: 'stripIfInteger', currency: 'USD' }`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question.

It appears reasonable to allow composition of currency with numeric values, eg:

.input {$n :number }
{{{$n :currency currencySign=accounting}}}

or

.input {$n :integer }
{{{$n :currency currencySign=accounting}}}

whereby any options on the numeric functions are ignored for the currency formatting.

Not a blocker, since this can be done later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is reasonable, but not using the messages in your examples, since $n (if just a number) doesn't have a currency code associated with it. Note that Measure and CurrencyAmount in ICU4J are not subclasses of Number. ICU might choose to treat either as numbers, of course. I'm just pointing out that tuple-types might not work out-of-the-box. In addition, the implementation might resolve the value to a non-currency type:

.input {$c :currency}
.local $discounted = {$c :math subtract=2}
{{{$discounted} might only be a Number...}}

The reverse is also potentially true as well: one could use the :number or :integer formatter or selector against a :currency annotated value, although I can't fathom why you'd want to. Bad Operand might be a blessing in this case.

Note that the options would not be ignored in cases where they overlap (which they do quite significantly and on purpose). The primary differences are the handling of fractional digits, display of the currency symbol, and the handling of sign/currencySign (which might want to be harmonized??).

Mainly, this behavior is up to the implementation, since currency types or tuples might not be on the number/integer menu of types.


.input {$n :number useGrouping=never numberingSystem=Latn}
.local $c = {$n :currency currencyCode=INR}
{{This prints in rupees without lakhs (or other groupings) and with Latin digits: {$c}}}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I fat-fingered. I had copied the example, and meant to delete the currencySign option, and instead deleted the currencyCode option!

I had also meant just composition one way, from a number to a currency amount, as you have it:

.input {$n :number useGrouping=never numberingSystem=Latn}
.local $c = {$n :currency currencyCode=INR}
{{This prints in rupees without lakhs (or other groupings) and with Latin digits: {$c}}}

I hadn't meant from a currency amount to a number.

.input {$c :currency}
.local $n = {$c :number useGrouping=never numberingSystem=Latn}
{{This prints just the number part of a currency amount: {$n}}}


#### Selection

The _function_ `:currency` performs selection as described in [Number Selection](#number-selection) below.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While considering #933, I realised that plural selection on currency values depends on being able to introspect the fraction digits which are by default locale-dependent. The ICU FormattedNumber might do that automatically, but in other APIs the operation can get a bit trickier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not locale-dependent. It's currency dependent. This is why fraction digits are different in this function!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@aphillips
Copy link
Member Author

This has two approvals and no apparent outstanding comments. Merging.

@aphillips aphillips merged commit 528109e into main Nov 14, 2024
1 check passed
@aphillips aphillips deleted the aphillips-currency branch November 14, 2024 17:14
@macchiati
Copy link
Member

I was just reviewing the registry document, and noticed that we are a bit inconsistent about when we use valid vs well-formed (and whether we mention either of them).

for example:

valid Unicode Number System Identifier (default is locale-specific)
well-formed Unicode Currency Identifier
valid identifier per BCP175
...

But the following note applies to other cases of "growing" identifiers (timezones, etc)

The currency is only required to be well-formed rather than checked for validity. This allows new currency codes to be defined (there are many recent examples of this occuring). It also avoids requiring implementations to check currency codes for validity, although implementations are permitted to emit Bad Option or Bad Operand for invalid codes.

I jotted down principles in #935 for a discussion on a more consistent approach

I don't want to derail our push to the end, so we could postpone this consistency work.

@aphillips
Copy link
Member Author

Understood. Let's get it right: let's make specific changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fast-track Editorial change permitted to use fast-track merge rules functions Issue pertains to the default function set LDML46.1 MF2.0 Draft Candidate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants