Avoid locale dependent <ctype.h> functions like isascii(), isdigit(), tolower() #108767
Labels
type-feature
A feature request or enhancement
Comments
vstinner added a commit to vstinner/cpython that referenced this issue on Sep 1, 2023
Convert the following macros to static inline functions:
* Py_ISLOWER()
* Py_ISUPPER()
* Py_ISALPHA()
* Py_ISDIGIT()
* Py_ISXDIGIT()
* Py_ISALNUM()
* Py_ISSPACE()
* Py_TOLOWER()
* Py_TOUPPER()
* Py_CHARMASK()
The use in
vstinner added a commit to vstinner/cpython that referenced this issue on Sep 1, 2023
Replace <ctype.h> locale dependent isdigit() with Python locale independent Py_ISDIGIT() function in _PyBytes_FormatEx().
I also prefer to leave the Windows launcher program unchanged:
vstinner added a commit to vstinner/cpython that referenced this issue on Sep 1, 2023
Replace <ctype.h> locale dependent functions with Python "pyctype.h" locale independent functions:
* Replace isalpha() with Py_ISALPHA().
* Replace isdigit() with Py_ISDIGIT().
* Replace isxdigit() with Py_ISXDIGIT().
* Replace tolower() with Py_TOLOWER().
Leave Modules/_sre/sre.c unchanged, it uses locale dependent functions on purpose.
By the way,
/* On 4.4BSD-descendants, ctype functions serves the whole range of
* wchar_t character set rather than single byte code points only.
* This characteristic can break some operations of string object
* including str.upper() and str.split() on UTF-8 locales. This
* workaround was provided by Tim Robbins of FreeBSD project.
*/
#if defined(__APPLE__)
# define _PY_PORT_CTYPE_UTF8_ISSUE
#endif
#ifdef _PY_PORT_CTYPE_UTF8_ISSUE
#ifndef __cplusplus
/* The workaround below is unsafe in C++ because
* the <locale> defines these symbols as real functions,
* with a slightly different signature.
* See issue #10910
*/
#include <ctype.h>
#include <wctype.h>
#undef isalnum
#define isalnum(c) iswalnum(btowc(c))
#undef isalpha
#define isalpha(c) iswalpha(btowc(c))
#undef islower
#define islower(c) iswlower(btowc(c))
#undef isspace
#define isspace(c) iswspace(btowc(c))
#undef isupper
#define isupper(c) iswupper(btowc(c))
#undef tolower
#define tolower(c) towlower(btowc(c))
#undef toupper
#define toupper(c) towupper(btowc(c))
#endif
#endif
vstinner added a commit that referenced this issue on Sep 1, 2023
Replace <ctype.h> locale dependent functions with Python "pyctype.h" locale independent functions:
* Replace isalpha() with Py_ISALPHA().
* Replace isdigit() with Py_ISDIGIT().
* Replace isxdigit() with Py_ISXDIGIT().
* Replace tolower() with Py_TOLOWER().
Leave Modules/_sre/sre.c unchanged, it uses locale dependent functions on purpose. Include explicitly <ctype.h> in _decimal.c to get isascii().
vstinner added a commit to vstinner/cpython that referenced this issue on Sep 2, 2023
Convert the following macros to static inline functions:
* Py_ISLOWER()
* Py_ISUPPER()
* Py_ISALPHA()
* Py_ISDIGIT()
* Py_ISXDIGIT()
* Py_ISALNUM()
* Py_ISSPACE()
* Py_TOLOWER()
* Py_TOUPPER()
* Py_CHARMASK()
vstinner added a commit to vstinner/cpython that referenced this issue on Sep 3, 2023
Convert the following macros to static inline functions:
* Py_ISLOWER()
* Py_ISUPPER()
* Py_ISALPHA()
* Py_ISDIGIT()
* Py_ISXDIGIT()
* Py_ISALNUM()
* Py_ISSPACE()
* Py_TOLOWER()
* Py_TOUPPER()
* Py_CHARMASK()
Changes:
* sre_lower_ascii() now casts the Py_TOLOWER() argument to "unsigned char" and casts the result to "unsigned int".
* bytesobject.c and bytearrayobject.c now pass an "int" argument to Py_CHARMASK(), instead of a "Py_ssize_t" argument.
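The casts mentioned in this commit message follow from the macro-to-function conversion: a static inline function has one fixed parameter type, so call sites that used to pass any integer type now have to convert explicitly. The sketch below only illustrates that pattern. The names `sketch_charmask`, `sketch_tolower` and `sketch_lower_ascii` are hypothetical stand-ins, not the actual pyctype.h or _sre code, and the code assumes an ASCII-compatible execution character set.

```c
/* Hypothetical stand-in for a Py_CHARMASK()-style helper: takes an int
 * and keeps only the low 8 bits, so a negative (signed char) input can
 * never become a negative table index. */
static inline unsigned char sketch_charmask(int c)
{
    return (unsigned char)(c & 0xff);
}

/* Hypothetical ASCII-only lowercase conversion that never consults
 * the LC_CTYPE locale. */
static inline int sketch_tolower(int c)
{
    unsigned char uc = sketch_charmask(c);
    return (uc >= 'A' && uc <= 'Z') ? uc + ('a' - 'A') : uc;
}

/* Sketch of a caller working on wider code points: only values below
 * 128 are ASCII, so the argument is narrowed to unsigned char and the
 * result widened back to unsigned int explicitly. */
static inline unsigned int sketch_lower_ascii(unsigned int ch)
{
    if (ch < 128) {
        return (unsigned int)sketch_tolower((unsigned char)ch);
    }
    return ch;
}
```

With a macro, an argument of any integer type would be accepted silently; with the inline functions above, the narrowing and widening are visible at the call site.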
Feature or enhancement
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response
Proposal:
The following C files use <ctype.h> functions which depend on the current LC_CTYPE locale:
I propose to replace them with Python C API functions which don't depend on the locale, like Py_ISDIGIT() and Py_TOLOWER().
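To illustrate what "don't depend on the locale" means here, the following is a minimal sketch of ASCII-only classification helpers. The names `ascii_isdigit`, `ascii_isxdigit` and `ascii_tolower` are hypothetical and are not the CPython implementation of Py_ISDIGIT() or Py_TOLOWER(); they only show the intended behavior, assuming an ASCII-compatible execution character set.

```c
#include <stdbool.h>

/* Hypothetical helpers, not CPython's implementation: the result
 * depends only on the byte value, never on the LC_CTYPE locale. */
static inline bool ascii_isdigit(int c)
{
    return c >= '0' && c <= '9';
}

static inline bool ascii_isxdigit(int c)
{
    return ascii_isdigit(c)
        || (c >= 'a' && c <= 'f')
        || (c >= 'A' && c <= 'F');
}

static inline int ascii_tolower(int c)
{
    /* Bytes outside 'A'..'Z' (including all non-ASCII bytes) are
     * returned unchanged. */
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}
```

Unlike the <ctype.h> functions, whose results for bytes above 127 may vary with the current locale, these helpers return the same result for a given byte regardless of any setlocale() call.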
Linked PRs