Multiple Test Failures on Alpine Linux with Threads Enabled #20231

Closed
cjg-cguevara opened this issue Sep 3, 2022 · 4 comments · Fixed by #20241

@cjg-cguevara

Multiple Test Failures on Alpine Linux with Threads Enabled

Examples:
https://perl5.test-smoke.org/report/5022293
https://perl5.test-smoke.org/logfile/5022293
https://perl5.test-smoke.org/report/5022279
https://perl5.test-smoke.org/logfile/5022279

Appears to have started sometime after f7e7b4d.

Note that Alpine Linux uses musl instead of glibc:

$ ldd --version
musl libc (x86_64)
Version 1.2.3
Dynamic Program Loader
Usage: /lib/ld-musl-x86_64.so.1 [options] [--] pathname
@Leont
Contributor

Leont commented Sep 3, 2022

I think debugging this will require a stacktrace.

@bram-perl

On Alpine:

$ ./Configure -des -Dusedevel -Dcc=gcc -Duseithreads
$ make
$ make test_prep
$ cd t
$ ./perl -I../lib ../dist/threads/t/kill.t
locale.c: 864: panic: Unexpected character in locale category name '2E; errno=2
$ LC_ALL=C ./perl -I../lib ../dist/threads/t/kill.t
1..18
ok 1 - Loaded
ok 2 - Created thread
ok 3 - Thread sleeping
ok 4 - Signalled thread
ok 5 - Thread received signal
ok 6 - No thread return value
ok 7 - Thread termination warning
ok 8 - Semaphore created
ok 9 - Created thread
ok 10 - Thread received semaphore
ok 11 - Suspended thread
ok 12 - Thread suspending
ok 13 - Thread resuming
ok 14 - Signalled thread to terminate
ok 15 - Thread caught termination signal
ok 16 - Thread done
ok 17 - Thread return value
ok 18 - Ignore signal to terminated thread
$ env | grep ^LC
LC_COLLATE=C
$ env | grep ^LANG
LANG=C.UTF-8

@bram-perl

The errors appear to have started with commit a7ff7ac, the same commit that caused #20155.
A branch/PR exists that may fix that issue, but it doesn't fix the failures in this ticket :(

A simpler test:

./perl -Ilib -wle 'use threads; use threads::shared;'
locale.c: 864: panic: Unexpected character in locale category name '2E; errno=2

Running with PERL_DEBUG_LOCALE_INIT:

$ PERL_DEBUG_LOCALE_INIT=1 ./perl -Ilib -wle 'BEGIN { print "\n"; } use threads; use threads::shared;'
locale.c: 4731: created C object 7f541bf9e000
locale.c: 733: my_querylocale_i(LC_ALL) on ffffffffffffffff
locale.c: 757: my_querylocale_i(LC_ALL) returning 'C'
locale.c: 1073: locale.c: 418: index of category 6 (LC_ALL) is 6
emulate_setlocale_i input=6 (LC_ALL), mask=0x7fffffff, new locale="C", current locale="C",index=6, object=ffffffffffffffff
locale.c: 1169: (4750): emulate_setlocale_i now using C object=7f541bf9e000
locale.c: 1181: (4750): emulate_setlocale_i will stay in C object
locale.c: 1271: (4750): emulate_setlocale_i now using 7f541bf9e000
locale.c: 1564: calculate_LC_ALL returning 'C'
locale.c: 4775: setlocale(LC_ALL, "") returned "C.UTF-8;C;C;C;C;C"
locale.c: 4801: setlocale(LC_CTYPE, NULL) returned "C.UTF-8"
locale.c: 4801: setlocale(LC_NUMERIC, NULL) returned "C"
locale.c: 4801: setlocale(LC_COLLATE, NULL) returned "C"
locale.c: 4801: setlocale(LC_TIME, NULL) returned "C"
locale.c: 4801: setlocale(LC_MESSAGES, NULL) returned "C"
locale.c: 4801: setlocale(LC_MONETARY, NULL) returned "C"
locale.c: 733: my_querylocale_i(LC_CTYPE) on 7f541bf9e000
locale.c: 757: my_querylocale_i(LC_CTYPE) returning 'C'
locale.c: 1073: locale.c: 418: index of category 0 (LC_CTYPE) is 0
emulate_setlocale_i input=0 (LC_CTYPE), mask=0x1, new locale="C.UTF-8", current locale="C",index=0, object=7f541bf9e000
locale.c: 1169: (5057): emulate_setlocale_i now using C object=7f541bf9e000
locale.c: 1209: (5057): emulate_setlocale_i created 55d854b913f0 by duping the input
locale.c: 1256: (5057): emulate_setlocale_i created 55d854b913f0 while freeing 55d854b913f0
locale.c: 1271: (5057): emulate_setlocale_i now using 55d854b913f0
locale.c: 733: my_querylocale_i(LC_NUMERIC) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_NUMERIC) returning 'C'
locale.c: 1073: locale.c: 418: index of category 1 (LC_NUMERIC) is 1
emulate_setlocale_i input=1 (LC_NUMERIC), mask=0x2, new locale="C", current locale="C",index=1, object=55d854b913f0
locale.c: 1097: (5057): emulate_setlocale_i no-op to change to what it already was
locale.c: 733: my_querylocale_i(LC_COLLATE) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_COLLATE) returning 'C'
locale.c: 1073: locale.c: 418: index of category 3 (LC_COLLATE) is 2
emulate_setlocale_i input=3 (LC_COLLATE), mask=0x8, new locale="C", current locale="C",index=2, object=55d854b913f0
locale.c: 1097: (5057): emulate_setlocale_i no-op to change to what it already was
locale.c: 733: my_querylocale_i(LC_TIME) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_TIME) returning 'C'
locale.c: 1073: locale.c: 418: index of category 2 (LC_TIME) is 3
emulate_setlocale_i input=2 (LC_TIME), mask=0x4, new locale="C", current locale="C",index=3, object=55d854b913f0
locale.c: 1097: (5057): emulate_setlocale_i no-op to change to what it already was
locale.c: 733: my_querylocale_i(LC_MESSAGES) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_MESSAGES) returning 'C'
locale.c: 1073: locale.c: 418: index of category 5 (LC_MESSAGES) is 4
emulate_setlocale_i input=5 (LC_MESSAGES), mask=0x20, new locale="C", current locale="C",index=4, object=55d854b913f0
locale.c: 1097: (5057): emulate_setlocale_i no-op to change to what it already was
locale.c: 733: my_querylocale_i(LC_MONETARY) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_MONETARY) returning 'C'
locale.c: 1073: locale.c: 418: index of category 4 (LC_MONETARY) is 5
emulate_setlocale_i input=4 (LC_MONETARY), mask=0x10, new locale="C", current locale="C",index=5, object=55d854b913f0
locale.c: 1097: (5057): emulate_setlocale_i no-op to change to what it already was
locale.c: 1564: calculate_LC_ALL returning 'LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_COLLATE=C;LC_TIME=C;LC_MESSAGES=C;LC_MONETARY=C;'
locale.c: 733: my_querylocale_i(LC_CTYPE) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_CTYPE) returning 'C.UTF-8'
locale.c: 1861: Entering new_ctype(C.UTF-8)
locale.c: 3829: Entering my_langinfo item=14, using locale C.UTF-8
locale.c: 2810: Copying 'UTF-8' to 55d854b91668
locale.c: 6513: found codeset=UTF-8, is_utf8=1
locale.c: 2058: check_for_problems=1, MB_CUR_MAX=4
locale.c: 733: my_querylocale_i(LC_NUMERIC) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_NUMERIC) returning 'C'
locale.c: 1757: Called new_numeric with C, PL_numeric_name=C
locale.c: 1634: Locale radix is '.', ?UTF-8=0
locale.c: 733: my_querylocale_i(LC_COLLATE) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_COLLATE) returning 'C'


locale.c: 4731: created C object 7f541bf9e000
locale.c: 733: my_querylocale_i(LC_ALL) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_ALL) returning '(null)'
locale.c: 1073: locale.c: 418: index of category 6 (LC_ALL) is 6
emulate_setlocale_i input=6 (LC_ALL), mask=0x7fffffff, new locale="C.UTF-8;C;C;C;C;C", current locale="(null)",index=6, object=55d854b913f0
locale.c: 733: my_querylocale_i(LC_ALL) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_ALL) returning '(null)'
locale.c: 733: my_querylocale_i(LC_ALL) on 55d854b913f0
locale.c: 757: my_querylocale_i(LC_ALL) returning '(null)'
locale.c: 1073: locale.c: 418: index of category 6 (LC_ALL) is 6
emulate_setlocale_i input=6 (LC_ALL), mask=0x7fffffff, new locale="C", current locale="(null)",index=6, object=55d854b913f0
locale.c: 1169: (4750): emulate_setlocale_i now using C object=7f541bf9e000
locale.c: 1181: (4750): emulate_setlocale_i will stay in C object
locale.c: 1271: (4750): emulate_setlocale_i now using 7f541bf9e000
locale.c: 1564: calculate_LC_ALL returning 'C'
locale.c: 471: locale.c: 471: unlocking locale
locale.c: 864: panic: Unexpected character in locale category name '2E; errno=2

Adding some debugging around line 854:

s = C.UTF-8;C;C;C;C;C
e =

With LANG=C, that code is not reached.

Using LANG=en_US:

locale.c: 4734: created C object 7f225a465000
locale.c: 733: my_querylocale_i(LC_ALL) on 564f3edb53f0
locale.c: 757: my_querylocale_i(LC_ALL) returning '(null)'
locale.c: 1076: locale.c: 418: index of category 6 (LC_ALL) is 6
emulate_setlocale_i input=6 (LC_ALL), mask=0x7fffffff, new locale="en_US;en_US;en_US;C;en_US;en_US", current locale="(null)",index=6, object=564f3edb53f0
locale.c: 733: my_querylocale_i(LC_ALL) on 564f3edb53f0
locale.c: 757: my_querylocale_i(LC_ALL) returning '(null)'
locale.c: 733: my_querylocale_i(LC_ALL) on 564f3edb53f0
locale.c: 757: my_querylocale_i(LC_ALL) returning '(null)'
locale.c: 1076: locale.c: 418: index of category 6 (LC_ALL) is 6
emulate_setlocale_i input=6 (LC_ALL), mask=0x7fffffff, new locale="C", current locale="(null)",index=6, object=564f3edb53f0
locale.c: 1172: (4753): emulate_setlocale_i now using C object=7f225a465000
locale.c: 1184: (4753): emulate_setlocale_i will stay in C object
locale.c: 1274: (4753): emulate_setlocale_i now using 7f225a465000
locale.c: 1567: calculate_LC_ALL returning 'C'
s = en_US;en_US;en_US;C;en_US;en_US
e =
locale.c: 471: locale.c: 471: unlocking locale
locale.c: 867: panic: Unexpected character in locale category name '3B; errno=2

@bram-perl

Debugged further with @khwilliamson, who provided the following patch:

diff --git a/locale.c b/locale.c
index ee184f84e2..2806de3619 100644
--- a/locale.c
+++ b/locale.c
@@ -4747,7 +4747,14 @@ Perl_init_i18nl10n(pTHX_ int printwarn)
 #  ifdef USE_PL_CURLOCALES

     /* Initialize our records.  If we have POSIX 2008, we have LC_ALL */
-    void_setlocale_c(LC_ALL, porcelain_setlocale(LC_ALL, NULL));
+    /* void_setlocale_c(LC_ALL, porcelain_setlocale(LC_ALL, NULL)); */
+    for (i = 0; i < NOMINAL_LC_ALL_INDEX; i++) {
+        (void) emulate_setlocale_i(i, curlocales[i],
+                                   RECALCULATE_LC_ALL_ON_FINAL_INTERATION,
+                                   __LINE__);
+    }
+
+

 #  endif


Running make test_harness:

All tests successful.
Files=2720, Tests=1186704, 1510 wallclock secs (112.14 usr 11.34 sys + 930.24 cusr 100.07 csys = 1153.79 CPU)
Result: PASS

khwilliamson added a commit to khwilliamson/perl5 that referenced this issue Sep 3, 2022
This fixes Perl#20231

LC_ALL is a compendium of the individual locale categories, such as
LC_CTYPE, LC_NUMERIC, ....  When all categories are in the same locale,
it acts just like an individual category.  But when the categories are
not in the same locale, some means must be used to indicate that.
Platforms differ in how they represent this.  Alpine uses:

    a;b;c;d;e;f

where each letter is replaced by the correct locale for a given
category.  Which category is in which position is deterministic, and
platform-specific.  Other platforms separate by a '/'.  And glibc uses a
more informative format:

    LC_CTYPE=a;LC_NUMERIC=b; ...

This has the advantage that it's obvious to the reader what is what, and
the order in the string is irrelevant.

It might be possible, but painful, for a Configure probe to figure out
what the syntax is for the current platform.  I chose not to do that.  A
platform might come along with a novel syntax unanticipated by whatever
probe we came up with.

Instead, perl uses the glibc format internally, and when it needs to get
or set LC_ALL from the system, it loops through each category
individually, so that by the time it has done all of them, LC_ALL will
have been implicitly handled.

The breaking commit a7ff7ac failed to do that.
khwilliamson added a commit that referenced this issue Sep 7, 2022
scottchiefbaker pushed a commit to scottchiefbaker/perl5 that referenced this issue Nov 3, 2022
khwilliamson added a commit that referenced this issue May 7, 2023
This reverts commit 9e254b0.
Date:   Wed Apr 5 12:26:26 2023 -0600

This fixes GH #21040

The reverted commit caused failures in platforms using the musl library,
notably Alpine Linux.  I came up with a fix for that, which instead
broke Windows.  In looking at that I realized the original fix is
incomplete, and that things are too precarious to try to fix so close to
5.38.0.  For example, I spent hours, due to a %p format printing 0 for
what turned out to be a non-NULL string pointer.  I think it has to do
with the fact that the failing code is in the middle of transitioning
between threads, and the printing got confused as a result.

The reverted commit was part of a series fixing #20155 and #20231.  But
the earlier part of the series succeeded in fixing those, without that
commit, so reverting it should not cause things to break as a result.

This whole issue has to do with locales and threading.  Those still
don't play well together.  I have a series of well over 200 commits that
address this situation, for applying in early 5.39.  My point is that we
are a long way from solving these kinds of issues; and they don't come
up that much in the field because they just don't get used.  The
reverted commit would help if it worked properly, but it's not the only
thing wrong by a long shot.
khwilliamson added a commit that referenced this issue May 8, 2023
pjacklam pushed a commit to pjacklam/perl5 that referenced this issue May 20, 2023
khwilliamson added a commit to khwilliamson/perl5 that referenced this issue Jul 10, 2023