Skip to content

Commit 5ba25c1

Browse files
committed
Create S_native_querylocale_i() and use it
This new function differs from the already-existing plain querylocale_i() in that it returns in the platform's native format, instead of the internal=to-perl one. The internal one is used generally so that code doesn't have to cope with multiple possible formats. However, the format of the new locale in Perl_setlocale() is going to be in native format. We effectively translate it into our internal one at the input edge, and that is used thereafter. But until this commit, the translation back to native format at the output edge was incomplete. This mostly worked because native format differs from locale.c internal format in just two ways: One is the locale for LC_NUMERIC. perl keeps it generally in the C locale, except for brief intervals which higher level code specifies, when the real locale is swapped in. (Actually, this isn't quite true. If the real locale is indistinguishable from C as far as LC_NUMERIC goes, perl is happy to use it rather than C, so as to save swapping.) locale.c had the code in it to translate the internal format back to native, so it worked for this case. The other is LC_ALL when not all categories are set to the same locale. Windows and Linux use 'name=value;' pairs notation, while things derived from BSD (and others) use a positional notation in which only the values are given, and the system knows which category a given value is for from its position in the string. Perl worked fine for the name=value pairs notation, because that is the same as its internal one, so no translation got done, but until this commit, there were issues on positional platforms. This seldom got in the way since most people, if they set the locale at all, will just set LC_ALL to some single 'foo'. What this commit effectively does is change Perl_setlocale() to return the value in the native format which the libc functions are expecting. This differs from what it used to return only on platforms which use the positional notation and only for LC_ALL when not all categories are set to the same locale. The new function subsumes much of the work previously done in Perl_setlocale(), and it is able to simplify some of that work.
1 parent 8569281 commit 5ba25c1

File tree

5 files changed

+97
-89
lines changed

5 files changed

+97
-89
lines changed

embed.fnc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4409,6 +4409,8 @@ RS |unsigned int|get_category_index_helper \
44094409
|const line_t caller_line
44104410
Ri |const char *|mortalized_pv_copy \
44114411
|NULLOK const char * const pv
4412+
S |const char *|native_querylocale_i \
4413+
|const unsigned int cat_index
44124414
S |void |output_check_environment_warning \
44134415
|NULLOK const char * const language \
44144416
|NULLOK const char * const lc_all \

embed.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1301,6 +1301,7 @@
13011301
# define calculate_LC_ALL_string(a,b,c,d) S_calculate_LC_ALL_string(aTHX_ a,b,c,d)
13021302
# define get_category_index_helper(a,b,c) S_get_category_index_helper(aTHX_ a,b,c)
13031303
# define mortalized_pv_copy(a) S_mortalized_pv_copy(aTHX_ a)
1304+
# define native_querylocale_i(a) S_native_querylocale_i(aTHX_ a)
13041305
# define output_check_environment_warning(a,b,c) S_output_check_environment_warning(aTHX_ a,b,c)
13051306
# define save_to_buffer(a,b,c) S_save_to_buffer(aTHX_ a,b,c)
13061307
# define set_save_buffer_min_size(a,b,c) S_set_save_buffer_min_size(aTHX_ a,b,c)

locale.c

Lines changed: 89 additions & 89 deletions
Original file line numberDiff line numberDiff line change
@@ -195,7 +195,8 @@
195195
* dot. The macro query_nominal_locale_i() can be used to get the nominal
196196
* locale that an external caller would expect, for all categories except
197197
* LC_ALL. For that, you can use the function
198-
* S_calculate_LC_ALL_string().
198+
* S_calculate_LC_ALL_string(). Or S_native_querylocale_i() will operate
199+
* on any category.
199200
*
200201
* The underlying C API that this implements uses category numbers, hence the
201202
* code is structured to use '_r' at the API level to convert to indexes, which
@@ -4135,7 +4136,62 @@ S_win32_setlocale(pTHX_ int category, const char* locale)
41354136
}
41364137

41374138
# endif
4138-
#endif /* USE_LOCALE */
4139+
4140+
STATIC const char *
4141+
S_native_querylocale_i(pTHX_ const unsigned int cat_index)
4142+
{
4143+
/* Determine the current locale and return it in the form the platform's
4144+
* native locale handling understands. This is different only from our
4145+
* internal form for the LC_ALL category, as platforms differ in how they
4146+
* represent that.
4147+
*
4148+
* This is only called from Perl_setlocale(). As such it returns in
4149+
* PL_setlocale_buf */
4150+
4151+
# ifdef USE_LOCALE_NUMERIC
4152+
4153+
/* We have the LC_NUMERIC name saved, because we are normally switched into
4154+
* the C locale (or equivalent) for it. */
4155+
if (cat_index == LC_NUMERIC_INDEX_) {
4156+
4157+
/* We don't have to copy this return value, as it is a per-thread
4158+
* variable, and won't change until a future setlocale */
4159+
return PL_numeric_name;
4160+
}
4161+
4162+
# endif
4163+
# ifdef LC_ALL
4164+
4165+
if (cat_index != LC_ALL_INDEX_)
4166+
4167+
# endif
4168+
4169+
{
4170+
/* Here, not LC_ALL, and not LC_NUMERIC: the actual and native values
4171+
* match */
4172+
return save_to_buffer(querylocale_i(cat_index),
4173+
&PL_setlocale_buf, &PL_setlocale_bufsize);
4174+
}
4175+
4176+
/* Below, querying LC_ALL */
4177+
4178+
# ifdef LC_ALL
4179+
# ifdef USE_PL_CURLOCALES
4180+
# define LC_ALL_ARG PL_curlocales
4181+
# else
4182+
# define LC_ALL_ARG NULL /* Causes calculate_LC_ALL_string() to find the
4183+
locale using a querylocale function */
4184+
# endif
4185+
4186+
return calculate_LC_ALL_string(LC_ALL_ARG, EXTERNAL_FORMAT_FOR_QUERY,
4187+
WANT_PL_setlocale_buf,
4188+
__LINE__);
4189+
# undef LC_ALL_ARG
4190+
# endif /* has LC_ALL */
4191+
4192+
}
4193+
4194+
#endif /* USE_LOCALE */
41394195

41404196
/*
41414197
=for apidoc Perl_setlocale
@@ -4200,7 +4256,6 @@ Perl_setlocale(const int category, const char * locale)
42004256

42014257
#else
42024258

4203-
const char * retval;
42044259
dTHX;
42054260

42064261
DEBUG_L(PerlIO_printf(Perl_debug_log,
@@ -4233,88 +4288,18 @@ Perl_setlocale(const int category, const char * locale)
42334288
return NULL;
42344289
}
42354290

4291+
/* Get current locale */
4292+
const char * current_locale = native_querylocale_i(cat_index);
4293+
42364294
/* A NULL locale means only query what the current one is. */
42374295
if (locale == NULL) {
4296+
return current_locale;
4297+
}
42384298

4239-
# ifndef USE_LOCALE_NUMERIC
4240-
4241-
/* Without LC_NUMERIC, it's trivial; we just return the value */
4242-
return save_to_buffer(querylocale_i(cat_index),
4243-
&PL_setlocale_buf, &PL_setlocale_bufsize);
4244-
# else
4245-
4246-
/* We have the LC_NUMERIC name saved, because we are normally switched
4247-
* into the C locale (or equivalent) for it. */
4248-
if (category == LC_NUMERIC) {
4249-
DEBUG_L(PerlIO_printf(Perl_debug_log,
4250-
"Perl_setlocale(LC_NUMERIC, NULL) returning stashed '%s'\n",
4251-
PL_numeric_name));
4252-
4253-
/* We don't have to copy this return value, as it is a per-thread
4254-
* variable, and won't change until a future setlocale */
4255-
return PL_numeric_name;
4256-
}
4257-
4258-
# ifndef LC_ALL
4259-
4260-
/* Without LC_ALL, just return the value */
4261-
return save_to_buffer(querylocale_i(cat_index),
4262-
&PL_setlocale_buf, &PL_setlocale_bufsize);
4263-
4264-
# else
4265-
4266-
/* Here, LC_ALL is available on this platform. It's the one
4267-
* complicating category (because it can contain a toggled LC_NUMERIC
4268-
* value), for all the remaining ones (we took care of LC_NUMERIC
4269-
* above), just return the value */
4270-
if (category != LC_ALL) {
4271-
return save_to_buffer(querylocale_i(cat_index),
4272-
&PL_setlocale_buf, &PL_setlocale_bufsize);
4273-
}
4274-
4275-
bool toggled = FALSE;
4276-
4277-
/* For an LC_ALL query, switch back to the underlying numeric locale
4278-
* (if we aren't there already) so as to get the correct results. Our
4279-
* records for all the other categories are valid without switching */
4280-
if (! PL_numeric_underlying) {
4281-
set_numeric_underlying(__FILE__, __LINE__);
4282-
toggled = TRUE;
4283-
}
4284-
4285-
retval = querylocale_c(LC_ALL);
4286-
4287-
if (toggled) {
4288-
set_numeric_standard(__FILE__, __LINE__);
4289-
}
4290-
4291-
DEBUG_L(PerlIO_printf(Perl_debug_log, "%s\n",
4292-
setlocale_debug_string_i(cat_index, locale, retval)));
4293-
4294-
return save_to_buffer(retval, &PL_setlocale_buf, &PL_setlocale_bufsize);
4295-
4296-
# endif /* Has LC_ALL */
4297-
# endif /* Has LC_NUMERIC */
4298-
4299-
} /* End of querying the current locale */
4300-
4301-
retval = querylocale_i(cat_index);
4302-
4303-
/* If the new locale is the same as the current one, nothing is actually
4304-
* being changed, so do nothing. */
4305-
if ( strEQ(retval, locale)
4306-
&& ( ! affects_LC_NUMERIC(category)
4307-
4308-
# ifdef USE_LOCALE_NUMERIC
4309-
4310-
|| strEQ(locale, PL_numeric_name)
4311-
4312-
# endif
4313-
4314-
)) {
4299+
if (strEQ(current_locale, locale)) {
43154300
DEBUG_L(PerlIO_printf(Perl_debug_log,
4316-
"Already in requested locale: no action taken\n"));
4317-
return save_to_buffer(retval, &PL_setlocale_buf, &PL_setlocale_bufsize);
4301+
"Already in requested locale: no action taken\n"));
4302+
return current_locale;
43184303
}
43194304

43204305
/* Here, an actual change is being requested. Do it */
@@ -4324,19 +4309,34 @@ Perl_setlocale(const int category, const char * locale)
43244309
return NULL;
43254310
}
43264311

4327-
assert(strNE(retval, ""));
4328-
retval = save_to_buffer(querylocale_i(cat_index),
4329-
&PL_setlocale_buf, &PL_setlocale_bufsize);
4312+
/* At this point, the locale has been changed based on the requested value,
4313+
* and the querylocale_i() will return the actual new value that the system
4314+
* has for the category. That may not be the same as the input, as libc
4315+
* may have returned a synonymous locale name instead of the input one; or,
4316+
* if there are locale categories that we are compiled to ignore, any
4317+
* attempt to change them away from "C" is overruled */
4318+
current_locale = querylocale_i(cat_index);
43304319

4331-
/* Now that have changed locales, we have to update our records to
4332-
* correspond. Only certain categories have extra work to update. */
4320+
/* But certain categories need further work. For example we may need to
4321+
* calculate new folding or collation rules. And for LC_NUMERIC, we have
4322+
* to switch into a locale that has a dot radix. */
43334323
if (update_functions[cat_index]) {
4334-
update_functions[cat_index](aTHX_ retval, false);
4324+
update_functions[cat_index](aTHX_ current_locale,
4325+
/* No need to force recalculation, as
4326+
* aren't coming from a situation
4327+
* where Perl hasn't been controlling
4328+
* the locale, so has accurate
4329+
* records. */
4330+
false);
43354331
}
43364332

4337-
DEBUG_L(PerlIO_printf(Perl_debug_log, "returning '%s'\n", retval));
4333+
/* Make sure the result is in a stable buffer for the caller's use, and is
4334+
* in the expected format */
4335+
current_locale = native_querylocale_i(cat_index);
43384336

4339-
return retval;
4337+
DEBUG_L(PerlIO_printf(Perl_debug_log, "returning '%s'\n", current_locale));
4338+
4339+
return current_locale;
43404340

43414341
#endif
43424342

perl.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1329,6 +1329,7 @@ typedef enum {
13291329
typedef enum {
13301330
WANT_VOID,
13311331
WANT_TEMP_PV,
1332+
WANT_PL_setlocale_buf,
13321333
} calc_LC_ALL_return;
13331334

13341335
typedef enum {

proto.h

Lines changed: 4 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)