Skip to content

Handle thread changing with libraries that have own thread status #20361

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Oct 18, 2022

Conversation

khwilliamson
Copy link
Contributor

@khwilliamson khwilliamson commented Sep 29, 2022

Fixes #20155

@khwilliamson khwilliamson changed the title perl.h: Move PERL_CALLCONV definition to earlier Handle thread changing with libraries that have own thread status Sep 29, 2022
@khwilliamson khwilliamson force-pushed the smoke-me/khw-attributes branch from 5ea0082 to d354b7d Compare September 29, 2022 21:10
@khwilliamson khwilliamson force-pushed the smoke-me/khw-attributes branch from ab82fb0 to 66b227e Compare September 30, 2022 13:22
@khwilliamson khwilliamson force-pushed the smoke-me/khw-attributes branch 3 times, most recently from 4418b6a to 306adb3 Compare October 1, 2022 20:32
@khwilliamson khwilliamson marked this pull request as ready for review October 2, 2022 03:59
@khwilliamson khwilliamson requested a review from Leont October 2, 2022 04:19
@khwilliamson khwilliamson force-pushed the smoke-me/khw-attributes branch 4 times, most recently from c0705fc to 18d3710 Compare October 7, 2022 19:49
@khwilliamson
Copy link
Contributor Author

This now depends on #20370

@khwilliamson
Copy link
Contributor Author

#20370 has now been merged

This moves the code for cloning the locale variables to a single place,
consolidating some that had the same cpp directives.  It showed that one
variable was cloned twice; the redundant one is now removed.
A thread is supposed to start with the locale set to the C locale.  We
were duping the parent values, and later overriding.  Better to set to C
from the beginning.
So far this hadn't been an issue, but it will be for a future commit.
If there aren't threads, yes locales are trivially thread-safe, but
the code that gets executed to make them so doesn't need to get
compiled, and that is controlled by this #define.
This has gotten two twisty little mazy over time.  Clean it up, add
comments, and make sure the logic is the same on both.
Some configurations require us to store the current locale for each
category.  Prior to this commit, this was done in the array
PL_curlocales, with the entry for LC_ALL being in the highest element.

Future commits will need just the value for LC_ALL in some other
configurations, without needing the rest of the array.  This commit
splits off the LC_ALL element into its own per-interpreter variable to
accommodate those.  It always had to have special handling anyway beyond
the rest of the array elements,
This is in preparation for it to be used in more instances in future
commits.  It uses a symbol that won't be defined until those commits.
This prevents some unnecessary steps, that the next commit would turn
into memory leaks.
This is a step in solving #20155

The POSIX 2008 locale API introduces per-thread locales.  But the
previous global locale system is retained, probably for backward
compatibility.

The POSIX 2008 interface causes memory to be malloc'd that needs to be
freed.  In order to do this, the caller must first stop using that
memory, by switching to another locale.  perl accomplishes this during
termination by switching to the global locale, which is always available
and doesn't need to be freed.

Perl has long assumed that all that was needed to switch threads was to
change out tTHX.  That's because that structure was intended to hold all
the information for a given thread.  But it turns out that this doesn't
work when some library independently holds information about the
thread's state.  And there are now some libraries that do that.

What was happening in this case was that perl thought that it was
sufficient to switch tTHX to change to a different thread in order to do
the freeing of memory, and then used the POSIX 2008 function to change
to the global locale so that the memory could be safely freed.  But the
POSIX 2008 function doesn't care about tTHX, and actually was typically
operating on a different thread, and so changed that thread to the global
locale instead of the intended thread.  Often that was the top-level
thread, thread 0.  That caused whatever thread it was to no longer be in
the expected locale, and to no longer be thread-safe with regards to
localess,

This commit causes locale_term(), which has always been called from the
actual terminating thread that POSIX 2008 knows about, to change to the
global thread and free the memory.

It also creates a new per-interpreter variable that effectively maps the
tTHX thread to the associated POSIX 2008 memory.  During
perl_destruct(), it frees the memory this variable points to, instead of
blindly assuming the memory to free is the current tTHX thread's.

This fixes the symptoms associtated with #20155, but doesn't solve the
whole problem.  In general, a library that has independent thread status
needs to be updated to the new thread when Perl changes threads using
tTHX.  Future commits will do this.
As noted in the previous commit, some library functions now keep
per-thread state.  So far the only ones we care about are libc
locale-changing ones.

When perl changes threads by swapping out tTHX, those library functions
need to be informed about the new value so that they remain in sync with
what perl thinks the locale should be.

This commit creates a function to do this, and changes the
thread-changing macros to also call this as part of the change.

For POSIX 2008, the function just calls uselocale() using the
per-interpreter object introduced previously.

For Windows, this commit adds a per-interpreter string of the current
LC_ALL, and the function calls setlocale on that.  We keep the same
string for POSIX 2008 implementations that lack querylocale(), so this
commit just enables that variable on Windows as well.  The code is
already in place to free the memory the string occupies when done.

The commit also creates a mechanism to skip this during thread
destruction.  A thread in its death throes doesn't need to have accurate
locale information, and the information needed to map from thread to
what libc needs to know gets destroyed as part of those throes, while
relics of the thread remain.  I couldn't find a way to accurately know
if we are dealing with a relic or not, so the solution I adopted was to
just not switch during destruction.

This commit completes fixing #20155.
This partially reverts ebb1d9c.

The real fix for this problem has now been committed, so the workaround
can be reverted, leaving the tests.
@khwilliamson khwilliamson force-pushed the smoke-me/khw-attributes branch from 42d4db3 to 0f44207 Compare October 12, 2022 12:15
@khwilliamson khwilliamson merged commit 505ba11 into blead Oct 18, 2022
@khwilliamson khwilliamson deleted the smoke-me/khw-attributes branch October 18, 2022 12:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Locale patch breaks dist/threads/t/kill.t by triggering attribute.pm XS version mismatch errors.
1 participant