/(?i:...)/ loses passed in charset #11967
From @khwilliamson
This is a bug report for perl from khw@karl.(none),
Setting flags in a regular expression using the (?foo:...) notation
Spotted by Yves Orton
Flags:
Site configuration information for perl 5.15.7:
Configured by khw at Mon Feb 20 07:52:27 MST 2012.
Summary of my perl5 (revision 5 version 15 subversion 7) configuration:
Locally applied patches:
@INC for perl 5.15.7: /home/khw/blead/lib/perl5/site_perl/5.15.7/i686-linux-thread-multi-64int-ld
Environment for perl 5.15.7: PATH=/home/khw/bin:/home/khw/print/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/usr/games:/home/khw/cxoffice/bin
From @khwilliamson
On 02/20/2012 11:21 AM, karl williamson (via RT) wrote:
Here is a patch for this bug, that was just spotted by Yves. This bug
Should this go into 5.16?
From @khwilliamson
0001-perl-111174-foo-.-loses-passed-in-charset.patch
From b06487a9b23cdb9e91baed3868ac2fa8881235db Mon Sep 17 00:00:00 2001
From: Karl Williamson <[email protected]>
Date: Mon, 20 Feb 2012 11:27:03 -0700
Subject: [PATCH 1/2] [perl #111174] (?foo:...) loses passed in charset
This commit looks for the passed-in charset, and overrides it only if it
is /d and the pattern requires /u. Previously the passed-in value was
ignored.
---
regcomp.c | 9 ++++++---
t/re/pat.t | 11 ++++++++++-
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/regcomp.c b/regcomp.c
index dd5a37c..a0597ca 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -8010,9 +8010,12 @@ S_reg(pTHX_ RExC_state_t *pRExC_state, I32 paren, I32 *flagp,U32 depth)
     U32 posflags = 0, negflags = 0;
     U32 *flagsp = &posflags;
     char has_charset_modifier = '\0';
-    regex_charset cs = (RExC_utf8 || RExC_uni_semantics)
-                       ? REGEX_UNICODE_CHARSET
-                       : REGEX_DEPENDS_CHARSET;
+    regex_charset cs = get_regex_charset(RExC_flags);
+    if (cs == REGEX_DEPENDS_CHARSET
+        && (RExC_utf8 || RExC_uni_semantics))
+    {
+        cs = REGEX_UNICODE_CHARSET;
+    }

     while (*RExC_parse) {
         /* && strchr("iogcmsx", *RExC_parse) */
diff --git a/t/re/pat.t b/t/re/pat.t
index b4b7ac4..184f1f4 100644
--- a/t/re/pat.t
+++ b/t/re/pat.t
@@ -21,7 +21,7 @@ BEGIN {
require './test.pl';
}
-plan tests => 469; # Update this when adding/deleting tests.
+plan tests => 472; # Update this when adding/deleting tests.
run_tests() unless caller;
@@ -1253,6 +1253,15 @@ EOP
$anch_count++ while $str=~/^.*/mg;
is $anch_count, 1, 'while "\n"=~/^.*/mg should match only once';
}
+
+ { # [perl #111174]
+ use re '/u';
+ like "\xe0", qr/(?i:\xc0)/, "(?i: shouldn't lose the passed in /u";
+ use re '/a';
+ unlike "\x{100}", qr/(?i:\w)/, "(?i: shouldn't lose the passed in /a";
+ use re '/aa';
+ unlike 'k', qr/(?i:\N{KELVIN SIGN})/, "(?i: shouldn't lose the passed in /aa";
+ }
} # End of sub run_tests
1;
--
1.7.7.1
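The rule stated in the commit message (keep the passed-in charset, and upgrade it only when it is /d and the pattern requires /u) can be sketched outside of perl as a small standalone C fragment. This is only an illustration under stated assumptions: `charset_t`, `initial_charset_old`, `initial_charset_new` and `charset_selfcheck` are hypothetical names invented here, and the `needs_unicode` parameter stands in for the patch's `RExC_utf8 || RExC_uni_semantics` test. It is not the code in regcomp.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for perl's regex_charset; invented for this sketch. */
typedef enum {
    CHARSET_DEPENDS,     /* /d  */
    CHARSET_LOCALE,      /* /l  */
    CHARSET_UNICODE,     /* /u  */
    CHARSET_ASCII,       /* /a  */
    CHARSET_ASCII_MORE   /* /aa */
} charset_t;

/* Behaviour before the patch: the passed-in charset is ignored, so a
 * (?i:...) group always starts from /u or /d. */
charset_t initial_charset_old(charset_t passed_in, bool needs_unicode)
{
    (void)passed_in;  /* deliberately unused: this is the bug */
    return needs_unicode ? CHARSET_UNICODE : CHARSET_DEPENDS;
}

/* Behaviour after the patch: start from the passed-in charset and
 * override it only when it is /d and the pattern requires /u. */
charset_t initial_charset_new(charset_t passed_in, bool needs_unicode)
{
    if (passed_in == CHARSET_DEPENDS && needs_unicode)
        return CHARSET_UNICODE;
    return passed_in;
}

/* A few cases mirroring the situations exercised by the new tests. */
void charset_selfcheck(void)
{
    /* Old logic drops /a and /aa; new logic keeps them. */
    assert(initial_charset_old(CHARSET_ASCII, false) == CHARSET_DEPENDS);
    assert(initial_charset_new(CHARSET_ASCII, false) == CHARSET_ASCII);
    assert(initial_charset_new(CHARSET_ASCII_MORE, true) == CHARSET_ASCII_MORE);
    /* /d is still upgraded to /u when Unicode semantics are required. */
    assert(initial_charset_new(CHARSET_DEPENDS, true) == CHARSET_UNICODE);
}
```

In this sketch only the /d case can change, which matches the three added tests: a group opened with (?i:...) under `use re '/u'`, `'/a'` or `'/aa'` is expected to keep that charset.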
From @nwc10
On Mon, Feb 20, 2012 at 11:42:35AM -0700, Karl Williamson wrote:
Well, valgrind thinks that it looks like this:
$ valgrind ./perl -Ilib -E ' "\xe0" =~ /(?i:\w)/'
That has some potential for mischief, doesn't it?
Nicholas Clark
The RT System itself - Status changed from 'new' to 'open'
From @khwilliamson
On 02/20/2012 11:46 AM, Nicholas Clark wrote:
That was a poor choice of wording on my part. What I meant was that the
That means the issue raised below is from some other cause, and it
From @rjbs
* Karl Williamson <public@khwilliamson.com> [2012-02-20T13:42:35]
I lean toward being in favor of it. Other opinions? Especially objections?
--
From @cpansprout
On Mon Feb 20 18:17:29 2012, perl.p5p@rjbs.manxome.org wrote:
I think any kind of memory corruption or leak should be exempt from ‘no
--
Father Chrysostomos
From @khwilliamson
On 02/20/2012 12:38 PM, Karl Williamson wrote:
Since I didn't get this to happen on my 32 bit machine, I tried on
From @khwilliamson
commit 96f5488
From [Unknown Contact. See original ticket]
commit 96f5488
@khwilliamson - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#111174 (status was 'resolved')
Searchable as RT111174$