Commit f1fb9b0

enable intuit under anchored \G, and fix a bug
Since 1999, regcomp.c has had approximately the following comment and code:

    /* XXXX Currently intuiting is not compatible with ANCH_GPOS.
       This should be changed ASAP! */
    if ((r->check_substr || r->check_utf8) && !(r->extflags & RXf_ANCH_GPOS)) {
        r->extflags |= RXf_USE_INTUIT;
        ....

However, it appears that since that time, intuit has had (at least some) support for anchored \G added.

Note also that the RXf_USE_INTUIT flag (up until a few commits ago) was only used by *callers* of regexec() to decide whether to call intuit() first; regexec() itself also internally calls intuit() on occasion, and in those cases it directly checks just the check_substr and check_utf8 fields, rather than the RXf_USE_INTUIT flag; so in those cases it's using intuit even in the presence of anchored \G.

So, in the grand perl tradition of "make the change and see if anything in the test suite breaks", that's what I've done for this commit (i.e. removed the RXf_ANCH_GPOS check above). So intuit is now normally called even in the presence of anchored \G. This means that something like

    "aaaa" =~ /\G.*xx/

will now quickly fail in intuit rather than more slowly failing in regmatch().

Note that I have no actual knowledge of whether intuit is *really* anchored-\G-safe.

As it happens one thing in the test suite did break, and this was due to the following code, added back in 1997:

    if ( ....
        && (!(RExC_seen & REG_SEEN_GPOS) || (r->extflags & RXf_ANCH_GPOS)))
        r->extflags |= RXf_CHECK_ALL;

It was clearly meant to say that if either of those \G flags were present, don't set the RXf_CHECK_ALL flag (which enables intuit-only matches). But the '!' was set to cover the first condition only, rather than both. Presumably this had never been spotted before due to skipping intuit under anchored \G.

[Actually this commit broke some other stuff too, not covered by the test suite. See the next commit. Hooray for git rebase -i and history re-writing!]
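To make the "quickly fail in intuit" point concrete, here is a minimal sketch of a required-substring prefilter. It is a toy model, not Perl's implementation: `toy_pattern` and `toy_prefilter` are hypothetical names, only the `check_substr` field name is borrowed from the commit message, and the real intuit code takes far more into account than a plain substring search.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a compiled pattern; NOT Perl's regexp struct. */
struct toy_pattern {
    const char *check_substr; /* literal every match must contain, or NULL */
};

/* Hypothetical prefilter playing the role of intuit(): returns false when
 * the required literal does not occur at or after offset `start`, so the
 * caller can reject the match without running the full matcher. */
bool toy_prefilter(const struct toy_pattern *pat,
                   const char *subject, size_t start)
{
    assert(pat != NULL && subject != NULL);
    assert(start <= strlen(subject));

    if (pat->check_substr == NULL)
        return true; /* no literal known: the full matcher has to decide */

    return strstr(subject + start, pat->check_substr) != NULL;
}
```

With `check_substr` set to `"xx"` (the literal required by `/\G.*xx/`), calling `toy_prefilter` on `"aaaa"` at offset 0 returns false, which corresponds to the cheap rejection the message describes. A true result only means a match is still possible.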
1 parent fefee43 commit f1fb9b0

1 file changed: +2 -4 lines changed


regcomp.c

Lines changed: 2 additions & 4 deletions
@@ -6242,7 +6242,7 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_count,
  && data.last_start_min == 0 && data.last_end > 0
  && !RExC_seen_zerolen
  && !(RExC_seen & REG_SEEN_VERBARG)
- && (!(RExC_seen & REG_SEEN_GPOS) || (r->extflags & RXf_ANCH_GPOS)))
+ && !((RExC_seen & REG_SEEN_GPOS) || (r->extflags & RXf_ANCH_GPOS)))
  r->extflags |= RXf_CHECK_ALL;
  scan_commit(pRExC_state, &data,&minlen,0);
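The one-character move of the '!' in this hunk changes the result whenever the anchored-\G flag is set. A small sketch of the two conditions, reduced to booleans, shows the difference. The function names are hypothetical and the two parameters stand in for `RExC_seen & REG_SEEN_GPOS` and `r->extflags & RXf_ANCH_GPOS`.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as written before this commit: '!' covers the first test only. */
bool gpos_ok_old(bool seen_gpos, bool anch_gpos)
{
    return !seen_gpos || anch_gpos;
}

/* Condition after this commit: true only when neither \G flag is present. */
bool gpos_ok_new(bool seen_gpos, bool anch_gpos)
{
    return !(seen_gpos || anch_gpos);
}

/* Self-check over all four flag combinations. */
void gpos_check_truth_table(void)
{
    /* Neither flag set: both forms allow RXf_CHECK_ALL. */
    assert(gpos_ok_old(false, false) && gpos_ok_new(false, false));
    /* Only seen_gpos set: both forms refuse. */
    assert(!gpos_ok_old(true, false) && !gpos_ok_new(true, false));
    /* anch_gpos set: the old form wrongly allows, the new form refuses. */
    assert(gpos_ok_old(false, true) && !gpos_ok_new(false, true));
    assert(gpos_ok_old(true, true) && !gpos_ok_new(true, true));
}
```

The two forms agree while the anchored flag is clear and disagree as soon as it is set, which fits the message's guess that the bug stayed hidden while intuit was skipped under anchored \G.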

@@ -6339,9 +6339,7 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_count,
  r->check_offset_min = r->float_min_offset;
  r->check_offset_max = r->float_max_offset;
  }
- /* XXXX Currently intuiting is not compatible with ANCH_GPOS.
- This should be changed ASAP! */
- if ((r->check_substr || r->check_utf8) && !(r->extflags & RXf_ANCH_GPOS)) {
+ if ((r->check_substr || r->check_utf8) ) {
  r->extflags |= RXf_USE_INTUIT;
  if (SvTAIL(r->check_substr ? r->check_substr : r->check_utf8))
  r->extflags |= RXf_INTUIT_TAIL;
