Skip to content

Add a JSON test suite #604

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 22, 2024
Merged

Add a JSON test suite #604

merged 5 commits into from
Jan 22, 2024

Conversation

eemeli
Copy link
Collaborator

@eemeli eemeli commented Jan 16, 2024

Closes #109
CC @mradbourne

As discussed on yesterday's call, we should bootstrap our official test suite with the one I've been putting together for the JS messageformat implementation, as it's the most comprehensive one currently available.

For the avoidance of any doubt, I am the original author of these tests and I hereby relicense my work included in this PR under the Unicode License.

I've here moved the tests originally added by @grhoten that were in test/ to test/xml/, as I wasn't sure whether we wanted to keep them as well.

@eemeli eemeli added requirements Issues related with MF requirements list LDML45 LDML45 Release (Tech Preview) labels Jan 16, 2024
@aphillips aphillips added the test-suite Issue pertains to tests label Jan 16, 2024
Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Some general observations follow.

@@ -0,0 +1,28 @@
The files in this directory were originally copied from the [messageformat project](https://github.com/messageformat/messageformat/tree/11c95dab2b25db8454e49ff4daadb817e1d5b770/packages/mf2-messageformat/src/__fixtures)
and are here relicensed by their original author (Eemeli Aro) under the Unicode License.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is kosher or not. I will check with legal about the best way to handle this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, happy to do whatever including assigning copyright. I just figure that it's best done explicitly here, given that the tests were originally Apache-2.0 licensed under the messageformat, which is an OpenJS Foundation project.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on our discussion with Anne (legal), as sole author you can commit this file, relying on the CLA to cover it.

@@ -0,0 +1,140 @@
[
{ "src": "hello", "exp": "hello" },
{ "src": "hello {world}", "exp": "hello world" },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments would be useful in this file. It took me a second to remember that world is an unquoted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like valuable input to our further decision-making on the format for these. Before we start applying any such changes, we should firm up the ultimate shape these will take.

Comment on lines 32 to 41
"src": "{$one} and {$two}",
"params": { "one": 1.3, "two": 4.2 },
"exp": "1.3 and 4.2"
},
{
"src": "{$one} et {$two}",
"locale": "fr",
"params": { "one": 1.3, "two": 4.2 },
"exp": "1,3 et 4,2"
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test something other than the :string formatter. It might be a good idea to move them with others of the same stripe or segregate them.

{ "src": ".local $foo = {bar} {{bar {$foo}}}", "exp": "bar bar" },
{ "src": ".local $foo = {|bar|} {{bar {$foo}}}", "exp": "bar bar" },
{
"src": ".local $foo = {|bar|} {{bar {$foo}}}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to have some whitespace tests with assignments and selectors. I don't see them elsewhere in this file?

]
},
{
"src": "{#tag a:foo=|foo| b:bar=$bar}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing standalone tags?

export type TestMessage = {
/** The MF2 syntax source. */
src: string;
/** The locale to use for formatting. Defaults to 'en-US'. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider defaulting to und? The root locale is US English-like but also somewhat invariant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alas, JS doesn't support an und locale.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works in Chrome. It also works in FF, except that the locale's name data isn't present.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only looks like it works, though. If you try this, for instance, you'll get your default locale (probably en-US) rather than und:

new Intl.NumberFormat('und').resolvedOptions().locale

/** The locale to use for formatting. Defaults to 'en-US'. */
locale?: string;
/** Parameters to pass in to the formatter for resolving external variables. */
params?: Record<string, string | number | null | undefined>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include date type here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, because this is JSON data which doesn't inherently support dates.

{ "src": "hello {-4.20 :integer}", "exp": "hello -4" },
{ "src": "hello {0.42e+1 :integer}", "exp": "hello 4" },
{
"src": ".match {$foo :integer} one {{one}} * {{other}}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when writing plural tests, I often find it a good idea to build out the variant matrix. Yes, en-US (or und for that matter) won't touch the two or few or whatever keyword. But it can help catch errors if one of the other keywords triggers incorrectly. It also allows the test itself to be recycled for other locales.

"exp": "=1.2"
}
],
"number": [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move :number ahead of :integer. Note that you don't test any of :integer's options, e.g. minimumIntegerDigits, signDisplay, et al.

Co-authored-by: Addison Phillips <[email protected]>
@eemeli
Copy link
Collaborator Author

eemeli commented Jan 16, 2024

@aphillips I would very much prefer to not touch the data files at all in this PR, even though as you've pointed out there are a few cases where the tests are e.g. incorrectly categorised, and definitely incomplete.

As we squash our commits, that will allow this one PR to be the place where we adopt the messageformat test suite and for which we need to make sure that any copyright assignment or licensing is done right. Then subsequent PRs on top of this will be done wholly under the Unicode license.

Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per your comment, we can commit this without addressing my comments.

It would make sense to do a more thorough job of it, though. Should we work on these files or something else?

@eemeli
Copy link
Collaborator Author

eemeli commented Jan 18, 2024

Should we work on these files or something else?

My recommendation would be to get these in, and then start iterating on them. Starting from a copy has the benefit of having at least one implementation for which all of these tests pass, which makes it easier to validate any changes.

@grhoten
Copy link
Member

grhoten commented Jan 20, 2024

I've here moved the tests originally added by @grhoten that were in test/ to test/xml/, as I wasn't sure whether we wanted to keep them as well.

I say delete them. They don't fit well into the current design of MF2, and the files can always be resurrected from the history if necessary.

Comment on lines 193 to 213
"ordinal": [
{
"src": ".match {$foo :ordinal} one {{st}} two {{nd}} few {{rd}} * {{th}}",
"params": { "foo": 1 },
"exp": "st"
},
{
"src": ".match {$foo :ordinal} one {{st}} two {{nd}} few {{rd}} * {{th}}",
"params": { "foo": 2 },
"exp": "nd"
},
{
"src": ".match {$foo :ordinal} one {{st}} two {{nd}} few {{rd}} * {{th}}",
"params": { "foo": 3 },
"exp": "rd"
},
{
"src": ".match {$foo :ordinal} one {{st}} two {{nd}} few {{rd}} * {{th}}",
"params": { "foo": 4 },
"exp": "th"
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have the ability to reference ICU RBNF functionality? ICU MessageFormat could reference them as the spellout and ordinal, which conflicts with this plural rule ordinal.

See here for reference: https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classMessageFormat.html

Please ignore duration from MessageFormat. It's deprecated, and you need a decent message formatting framework to do it right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing the point of this test, which is testing the ordinal selector (== selectordinal in MF1). @eemeli has used the English ordinal suffixes in the test for readability in a test that is certainly not ready for any locale other than en-US.

Note well that we are considering spellout, ordinal, and duration for inclusion into the default registry as formatters--in another set of issues and design documents. I recommended against including spellout and duration in the default registry. ordinal as a formatter is on the bubble (only ICU4J has it, not Intl or ICU4X) and there are obvious functionality problems with ordinals in the current implementation.

@eemeli
Copy link
Collaborator Author

eemeli commented Jan 22, 2024

@aphillips Applied the updates discussed on the call; this should now be good to merge.

@aphillips aphillips merged commit d06033b into main Jan 22, 2024
@aphillips aphillips deleted the json-tests branch January 22, 2024 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LDML45 LDML45 Release (Tech Preview) requirements Issues related with MF requirements list test-suite Issue pertains to tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Message Format - Test Suite
3 participants