Arabic Numeric Shaping Support #503
Hi @AhmedMustafa, thank you for taking the time to contribute to Globalize. I've noticed that your solution doesn't make use of CLDR or reuse any Globalize methods (e.g., the number formatter). As far as I understand, your change converts digits from Latin to Arabic and vice versa in a hardcoded way (e.g., code). In Globalize, you can use formatNumber() to do this conversion, not only from Latin to Arabic, but between any non-algorithmic numbering systems specified in CLDR (http://unicode.org/repos/cldr/trunk/common/bcp47/number.xml). Can we reuse that in your code instead? For example:
Globalize("en").formatNumber(123);
// > '123'
Globalize("ar").formatNumber(123);
// > '١٢٣'
Globalize("bn").formatNumber(123);
// > '১২৩'
Globalize("th-u-nu-thai").formatNumber(123);
// > '๑๒๓'
You may notice that other Globalize modules (for example, the date module) use the number module under the hood, so they can format the digits of a date correctly:
Globalize("en").formatDate(new Date());
// > '8/26/2015'
Globalize("ar").formatDate(new Date());
// > '٢٦/٨/٢٠١٥'
Globalize("bn").formatDate(new Date());
// > '২৬/৮/২০১৫'
Globalize("th-u-nu-thai").formatDate(new Date());
// > '๒๖/๘/๒๐๑๕'
I am wondering if it's possible to extract the essence of your problem and come up with a more general-purpose solution that works for a wider range of scripts. I'm copying the @jquery/globalize team so they can give their input.
Thanks, Rafael, for reviewing this PR. First, let me mention that the scope of my task is only to convert numerals from European to Arabic and from Arabic to European. Numeric shaping and numeric formatting are two different processes: shaping deals with the shapes (glyphs) of digits, while formatting deals with the format, and one can take place after the other. Because its main purpose is contextual digit shaping, shaper.shape() (unlike formatNumber) operates on any input string containing both digits and other characters, not only on numbers. shaper.shape() can also take Arabic numerals (U+0660 to U+0669) as input and convert them to European numerals (U+0030 to U+0039), which I think is not possible with formatNumber. So formatNumber is not sufficient here: we need to convert not only from European numerals to Arabic numerals, but also from Arabic to European. Finally, if we are going to support conversion to and from other numbering systems, we have to determine the range of digits, strong characters, and weak characters in every system we support, and I find this outside the scope of my task for now. So please accept this as a simple Arabic-European numeric shaping feature for the moment, to be extended in future updates.
Currently this sounds like the shaper can and should be a standalone library, because it doesn't reuse anything from Globalize and tackles a separate use case with little or no overlap.
I have reached the same conclusion as @jzaefferer. We're open to adding new features as long as there's a correlation; for example, anything based on the CLDR spec or Ecma-402 would be a good start. Thanks for considering adding this feature to Globalize. You are welcome to stay involved. We'd love to see further contributions from you.
@rxaviers we can certainly provide shaping functionality for a wider range of scripts, not only for Arabic. Namely, for scripts for which shaping makes sense and is defined in CLDR, the appropriate shaper will be invoked in the appropriate format functions. Apologies for my ignorance: does the fact that the ticket is closed mean a PR can still be submitted?
Related issue: "Message formatting dependent on base text direction (Bidi support)" #539
Information about the numbering systems used by different languages and locales is available from CLDR: http://cldr.unicode.org/translation/numbering-systems
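To illustrate how numbering-system data of this kind could drive digit conversion between more than two scripts, here is a minimal sketch. It assumes a small hand-copied table of digit strings keyed by numbering-system code; the `DIGITS` table and the `convertDigits` helper are hypothetical names for illustration, not Globalize or CLDR APIs, and only non-algorithmic (positional decimal) systems are covered.

```javascript
// Assumption: digit strings copied by hand for illustration. A real
// implementation would load them from CLDR data instead of hardcoding them.
const DIGITS = {
  latn: "0123456789",
  arab: "٠١٢٣٤٥٦٧٨٩",
  beng: "০১২৩৪৫৬৭৮৯",
  thai: "๐๑๒๓๔๕๖๗๘๙",
};

// Hypothetical helper: replaces every digit of numbering system `from`
// found in `text` with the matching digit of system `to`.
// All other characters are passed through unchanged.
function convertDigits(text, from, to) {
  if (!(from in DIGITS) || !(to in DIGITS)) {
    throw new Error("Unknown numbering system: " + from + " or " + to);
  }
  // Array.from splits by code point, so the lookup also works for
  // digit sets outside the Basic Multilingual Plane.
  const source = Array.from(DIGITS[from]);
  const target = Array.from(DIGITS[to]);
  return Array.from(text, (ch) => {
    const index = source.indexOf(ch);
    return index === -1 ? ch : target[index];
  }).join("");
}

console.log(convertDigits("26/8/2015", "latn", "arab"));
console.log(convertDigits("٢٦/٨/٢٠١٥", "arab", "latn"));
```

This only swaps digit glyphs. It does not handle separators, algorithmic systems, or the contextual rules (strong and weak characters) discussed earlier in the thread.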
By the way, the Hebrew numerals discussed in "General purpose API for converting regular numerals to Hebrew ones" #537 belong to the same category: type name="hebr" description="Hebrew numerals — algorithmic"
Awesome. Reopened it. Feel free to force push to the same branch of this original PR.
Closing it in favor of #553.
Arabic and many other languages have classical shapes for digits (national digits) that are different from the conventional Western (European) digits. Arabic digits have the same semantic meaning as the European digits; the only difference is in the glyphs.
This module is used to shape the digits contained in any string from Arabic to European and vice versa.
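As a rough illustration of the glyph-only conversion described above, the sketch below maps between the two digit ranges by code-point offset. The function names `toArabicDigits` and `toEuropeanDigits` are hypothetical and are not the API of this PR; the sketch also ignores contextual shaping, where the choice of digit shape depends on the surrounding text.

```javascript
// Hypothetical helpers for illustration only, not the module's actual API.
const ARABIC_ZERO = 0x0660;   // ARABIC-INDIC DIGIT ZERO
const EUROPEAN_ZERO = 0x0030; // DIGIT ZERO

// Replace European digits (U+0030 to U+0039) with Arabic digits
// (U+0660 to U+0669); every other character is left untouched.
function toArabicDigits(text) {
  return text.replace(/[0-9]/g, (digit) =>
    String.fromCharCode(digit.charCodeAt(0) - EUROPEAN_ZERO + ARABIC_ZERO));
}

// Replace Arabic digits (U+0660 to U+0669) with European digits
// (U+0030 to U+0039); every other character is left untouched.
function toEuropeanDigits(text) {
  return text.replace(/[\u0660-\u0669]/g, (digit) =>
    String.fromCharCode(digit.charCodeAt(0) - ARABIC_ZERO + EUROPEAN_ZERO));
}

console.log(toArabicDigits("Room 42, floor 3"));
console.log(toEuropeanDigits("٤٢"));
```

Because both ranges are contiguous and ordered from zero to nine, a fixed offset is enough here; a table lookup would be needed for a more general solution.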