Arabic Numeric Shaping Support #503

Closed
wants to merge 2 commits into from

Conversation

AhmedMustafa

Arabic and many other languages have classical shapes for digits (national digits) that differ from the conventional Western (European) digits. Arabic digits have the same semantic meaning as the European digits; the only difference is in the glyphs.
This module shapes the digits contained in any string from Arabic to European and vice versa.
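
As a rough illustration of the non-contextual case, here is a minimal sketch of converting digits in both directions inside an arbitrary string. The function names `toArabicDigits` and `toEuropeanDigits` are hypothetical and are not taken from this PR or from Globalize; the sketch only relies on the fact that European digits occupy U+0030 to U+0039 and Arabic-Indic digits occupy U+0660 to U+0669.

```javascript
// Hypothetical helpers for illustration; not the PR's actual code.
const EUROPEAN_ZERO = 0x0030; // '0'
const ARABIC_ZERO = 0x0660;   // Arabic-Indic digit zero

// Replace European digits (U+0030..U+0039) with Arabic-Indic digits.
function toArabicDigits(text) {
  return text.replace(/[0-9]/g, (d) =>
    String.fromCharCode(d.charCodeAt(0) - EUROPEAN_ZERO + ARABIC_ZERO));
}

// Replace Arabic-Indic digits (U+0660..U+0669) with European digits.
function toEuropeanDigits(text) {
  return text.replace(/[\u0660-\u0669]/g, (d) =>
    String.fromCharCode(d.charCodeAt(0) - ARABIC_ZERO + EUROPEAN_ZERO));
}

// Non-digit characters pass through untouched.
console.log(toArabicDigits("Room 123"));
console.log(toEuropeanDigits(toArabicDigits("Room 123"))); // Room 123
```

Both functions leave every non-digit character unchanged, which is what allows them to run over mixed text rather than over a number value.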

@rxaviers
Member

Hi @AhmedMustafa,

Thank you for taking your time to contribute to Globalize.

I've noticed your solution doesn't make use of CLDR nor reuse any Globalize methods (e.g., number formatter).

As far as I understand, your change converts digits from Latin to Arabic and vice versa in a hardcoded way (e.g., code). In Globalize, you can use formatNumber() to do this conversion, not only from Latin to Arabic, but between any non-algorithmic numbering systems specified in CLDR (http://unicode.org/repos/cldr/trunk/common/bcp47/number.xml). Can we reuse that in your code instead?

For example:

Globalize("en").formatNumber(123);
// > '123'

Globalize("ar").formatNumber(123);
// > '١٢٣'

Globalize("bn").formatNumber(123);
// > '১২৩'

Globalize("th-u-nu-thai").formatNumber(123);
// > '๑๒๓'

You may notice that other Globalize modules (for example, the date module) use the number module under the hood, so they can format the date digits in the right way:

Globalize("en").formatDate(new Date());
// > '8/26/2015'

Globalize("ar").formatDate(new Date());
// > '٢٦‏/٨‏/٢٠١٥'

Globalize("bn").formatDate(new Date());
// > '২৬/৮/২০১৫'

Globalize("th-u-nu-thai").formatDate(new Date());
// > '๒๖/๘/๒๐๑๕'

I am wondering if it's possible to extract the essence of your problem and come up with a more general-purpose solution that works for a wider range of scripts.

I'm copying @jquery/globalize team to give their input.

@AhmedMustafa
Author

Thanks Rafael for reviewing this PR.

Firstly, let me mention that the scope of my task is to convert numerals from European to Arabic & from Arabic to European only.
The conversion between any other numbering systems is outside the scope of my task.

Numeric shaping and numeric formatting are two different processes. Shaping deals with digit shapes (glyphs), while formatting deals with the format. One can take place after the other.

Because its main purpose is contextual digit shaping, shaper.shape() (unlike formatNumber) operates on any input string (digits and other characters), not only on numbers. shaper.shape() can also take Arabic numerals (U+0660 to U+0669) as input and convert them to European numerals (U+0030 to U+0039), which I think is not possible with formatNumber.

So formatNumber is not applicable here: we are not just converting European numerals to Arabic numerals, we also have to convert from Arabic to European.
And because the shaping is contextual, we must be aware of the surrounding characters, so we have to deal with characters as well as numbers. This is another thing that formatNumber cannot handle.
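
To make the idea of contextual shaping concrete, here is a minimal sketch under stated assumptions. The PR's actual algorithm is not shown in this thread, so the rule used below (a digit takes the shape of the nearest preceding strong letter, with a configurable default before any letter is seen) is an assumption, as are the name `shapeContextual` and the approximate Arabic letter ranges.

```javascript
// Hypothetical sketch; the rule and names are assumptions, not the PR's code.
const ARABIC_LETTER = /[\u0620-\u064A\u066E-\u06D3]/; // approximate range of Arabic letters
const LATIN_LETTER = /[A-Za-z]/;
const ANY_DIGIT = /[0-9\u0660-\u0669]/;

// Numeric value (0..9) of a European or Arabic-Indic digit character.
function digitValue(ch) {
  const code = ch.charCodeAt(0);
  return code >= 0x0660 ? code - 0x0660 : code - 0x0030;
}

// Shape each digit according to the most recent strong letter before it.
function shapeContextual(text, defaultContext = "european") {
  let context = defaultContext;
  let out = "";
  for (const ch of text) {
    if (ARABIC_LETTER.test(ch)) {
      context = "arabic";
    } else if (LATIN_LETTER.test(ch)) {
      context = "european";
    }
    if (ANY_DIGIT.test(ch)) {
      const zero = context === "arabic" ? 0x0660 : 0x0030;
      out += String.fromCharCode(zero + digitValue(ch));
    } else {
      out += ch; // letters, spaces, and punctuation are copied as-is
    }
  }
  return out;
}

// Digits after Latin text stay European; digits after Arabic text become Arabic-Indic.
console.log(shapeContextual("abc 123 \u0639\u0631\u0628\u064A 456"));
```

The sketch needs the whole string, including its letters, to decide how each digit is shaped, which is why a function that only receives a number value cannot reproduce this behavior.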

Finally, if we are going to support conversion to and from other systems, we have to work out the range of digits, strong characters, and weak characters in every system we support. I find this outside the scope of my task for now. So please accept this as a simple Arabic-European numeric shaping feature for the moment, to be extended in future updates.

@jzaefferer
Contributor

Currently this sounds like the shaper can and should be a standalone library, because it doesn't reuse anything from Globalize and tackles a separate use case with little or no overlap.

@rxaviers
Member

rxaviers commented Oct 8, 2015

@jzaefferer's conclusion is the same as mine. We're open to adding new features as long as there's a correlation; for example, anything based on the CLDR spec or Ecma-402 would be a good start.

Thanks for considering adding this feature to Globalize. You are welcome to stay involved. We'd love to see further contributions from you.

@rxaviers rxaviers closed this Oct 8, 2015
@tomerm

tomerm commented Oct 26, 2015

@rxaviers we can certainly provide shaping functionality for a wider range of scripts, not only for Arabic.
It is also clear that any formatting function from Globalize (or outside of it) will use the shaper in exactly the way you described earlier:
Globalize("ar").formatNumber(123);
Globalize("th-u-nu-thai").formatDate(new Date());
Globalize("he").messageFormatter("breadcrumb")([ "first123", "second345", "third678" ])

Namely, for scripts for which shaping makes sense and is defined in CLDR, the appropriate shaper will be invoked in the appropriate format functions.

Apologies for my ignorance. Does the fact that the ticket is closed mean a PR can still be submitted?

@tomerm

tomerm commented Oct 26, 2015

A related issue is "Message formatting dependent on base text direction (Bidi support)" #539.

@tomerm

tomerm commented Oct 26, 2015

Information about the numbering systems used by different languages and locales is available from CLDR: http://cldr.unicode.org/translation/numbering-systems

@tomerm

tomerm commented Oct 26, 2015

By the way, the Hebrew numerals discussed in "General purpose API for converting regular numerals to Hebrew ones" #537 belong to the same category.

type name="hebr" description="Hebrew numerals — algorithmic"
type name="arab" description="Arabic-Indic digits"
type name="arabext" description="Extended Arabic-Indic digits"
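
As an illustration of how a shaper could be driven by numbering system data rather than by hardcoded Arabic ranges, here is a minimal sketch. The table below is hand-written for this example and only mimics the kind of information CLDR provides (a digit set for numeric systems, none for algorithmic ones); the object shape, the `latn` entry, and the names `digitsFrom` and `convertDigits` are assumptions, not Globalize or CLDR APIs.

```javascript
// Hypothetical, hand-written table keyed by CLDR numbering system names.
// Build a 10-digit string starting at the code point of that system's zero.
function digitsFrom(zeroCodePoint) {
  return String.fromCharCode(
    ...Array.from({ length: 10 }, (_, i) => zeroCodePoint + i));
}

const numberingSystems = {
  latn: { type: "numeric", digits: digitsFrom(0x0030) },    // European digits
  arab: { type: "numeric", digits: digitsFrom(0x0660) },    // Arabic-Indic digits
  arabext: { type: "numeric", digits: digitsFrom(0x06F0) }, // Extended Arabic-Indic digits
  hebr: { type: "algorithmic" } // no digit set; needs rule-based formatting
};

// Convert digits digit-for-digit between two numeric systems.
function convertDigits(text, from, to) {
  const source = numberingSystems[from];
  const target = numberingSystems[to];
  if (!source || !target ||
      source.type !== "numeric" || target.type !== "numeric") {
    throw new Error("Only numeric systems are supported: " + from + " -> " + to);
  }
  return Array.from(text, (ch) => {
    const index = source.digits.indexOf(ch);
    return index === -1 ? ch : target.digits[index];
  }).join("");
}

console.log(convertDigits("\u0661\u0662\u0663", "arab", "latn")); // 123
```

An algorithmic system such as `hebr` has no one-to-one digit set, so a digit-for-digit mapping like this cannot cover it; the sketch rejects it explicitly instead of guessing.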

@rxaviers rxaviers reopened this Oct 26, 2015
@rxaviers
Member

Awesome. Reopened it. Feel free to force-push to the same branch as this original PR (ACGC:numericShaping3) or to submit a new one. Thanks.

@rxaviers
Member

Closing it in favor of #553.
