Assertion failure in S_find_byclass: ! is_utf8_pat #17278
Comments
Actually, this is a deeper problem than the blamed commit, which merely exposed the issue. perlre says about the (?...) construct:

Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately

The case in this ticket ends up using this construct, but the pattern is encoded in UTF-8. The code that does the matching does not expect a UTF-8 pattern to do /d type matching. Supporting that would be extra work, and I don't think it is worth it, given how long this problem has taken to surface. I would like to change perlre and the compilation code so that, when the caret is encountered, a UTF-8 encoded pattern is compiled as /u instead of /d. Opinions?
In thinking about this some more, I'm unsure what the caret should mean. At the time this was added, the caret was supposed to signify a fresh start, which is why this particular character was chosen: it typically comes at the start of patterns. The test case in this issue actually had the pattern effectively in the scope of 'use locale'. Should the fresh start signified by the caret be whatever is in effect outside the pattern? What if, instead of locale, it was 'use re "/a"'? Or should the caret mean what's in effect for the whole pattern, based on things like 'use locale' but also on the pattern's specific trailing modifiers? To me it seems like the caret should merely reset whatever modifiers have been changed within the pattern to what was in effect at the start of its compilation.
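For reference, the currently documented "fresh start" behaviour of the caret can be observed with a small sketch. It assumes Perl 5.14 or later, with no 'use locale', no 'use re' defaults and no unicode_strings feature in effect; the variable names are illustrative only and are not taken from the ticket.

```perl
use strict;
use warnings;

# qr// stringifies with a leading caret: "(?^...)" means "start from the
# defaults, then apply the flags listed after the caret".
my $plain = qr/abc/;    # no modifiers
my $fold  = qr/abc/i;   # case-insensitive

print "$plain\n";   # (?^:abc)
print "$fold\n";    # (?^i:abc)

# Inside a pattern, a bare (?^) switches off flags that were turned on by the
# trailing modifiers, so the /i below no longer applies to "abc".
my $reset = ("ABC" =~ /(?^)abc/i) ? "match" : "no match";
print "$reset\n";   # no match
```

The open question in the comment above is what the caret should reset to when something like 'use locale' or 'use re "/a"' is in scope; this sketch deliberately avoids those cases.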
This was an assertion failure in regexec.c under rare circumstances. A reduction of the fuzzed test case is now in pat_advanced.t. The root cause of this was that the pattern being compiled was encoded in UTF-8 and 'use locale' was in effect, equivalent to the /l charset, and then the charset was reset inside the pattern to /d. But /d in a UTF-8 pattern is illegal, hence the later assertion failure. The solution is to reset instead to /u when the pattern is UTF-8.
This was an assertion failure in regexec.c under rare circumstances. A reduction of the fuzzed test case is now in pat_advanced.t. The root cause of this was that the pattern being compiled was encoded in UTF-8 and 'use locale' was in effect, equivalent to the /l charset, and then the charset was reset inside the pattern to /d. But /d in a UTF-8 pattern is illegal, hence the later assertion failure. The solution is to reset instead to /u when the pattern is UTF-8. (cherry picked from commit bb58640)
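The difference between the /d and /u charsets that the commit message relies on can be illustrated with a minimal sketch. It assumes an ASCII platform and Perl 5.14 or later; it does not reproduce the assertion failure, it only shows why /d depends on how a string is stored while /u does not. The variable names are illustrative.

```perl
use strict;
use warnings;

# A single native byte 0xE0 (LATIN SMALL LETTER A WITH GRAVE in Latin-1),
# stored without the internal UTF-8 flag.
my $byte = "\xE0";

# /d: native rules for a non-UTF-8 target, so 0xE0 is not a word character.
my $d_match = ($byte =~ /\w/d) ? 1 : 0;

# /u: Unicode rules regardless of how the target is stored.
my $u_match = ($byte =~ /\w/u) ? 1 : 0;

# The same character stored internally as UTF-8 makes /d use Unicode rules.
my $upgraded = $byte;
utf8::upgrade($upgraded);
my $d_upgraded = ($upgraded =~ /\w/d) ? 1 : 0;

print "d on byte string:     $d_match\n";      # 0
print "u on byte string:     $u_match\n";      # 1
print "d on upgraded string: $d_upgraded\n";   # 1
```

Since a UTF-8 encoded pattern already implies Unicode rules, resetting such a pattern to /u rather than /d is consistent with what /d would resolve to anyway.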
This is a bug report for perl from [email protected],
generated with the help of perlbug 1.41 running under perl 5.31.6.
[Please describe your issue here]
While fuzzing perl v5.31.5-213-g9bec17d7c built with afl and run
under libdislocator, I found the following program
BEGIN{$^H=4}
$z="q!\341\200\200\341\200\200\341\200\200\340\240\200\340\240\200\340\240\200\343\200\200\343\200\200\340\240\200\340\240\200\340\240\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200\341\200\200!=~m!(?^i)[\303\200]!";
utf8::decode($z);
eval$z;
to cause an assertion failure
regexec.c:2236: S_find_byclass: Assertion `! is_utf8_pat' failed.
GDB stack trace is
This is a regression between 5.28 and 5.30; bisect points to
commit b229619
Author: Karl Williamson [email protected]
Date: Tue Dec 25 22:56:48 2018 -0700
The removal of the regex compilation pass now makes these feasible and
desirable. Compilation now tries hard to optimize an ANYOF node into
something smaller and/or faster when feasible.
[Please do not change anything below this line]
Flags:
category=core
severity=medium
Site configuration information for perl 5.31.6:
Configured by dur-randir at Fri Nov 8 05:18:19 MSK 2019.
Summary of my perl5 (revision 5 version 31 subversion 6) configuration:
Commit id: 1462134
Platform:
osname=darwin
osvers=13.4.0
archname=darwin-2level
uname='darwin isengard.local 13.4.0 darwin kernel version 13.4.0: mon jan 11 18:17:34 pst 2016; root:xnu-2422.115.15~1release_x86_64 x86_64 '
config_args='-de -Dusedevel -DDEBUGGING'
hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='cc'
ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.9 -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include -DPERL_USE_SAFE_PUTENV'
optimize='-O3 -g'
cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.9 -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include'
ccversion=''
gccversion='4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -mmacosx-version-min=10.9 -fstack-protector -L/usr/local/lib -L/opt/local/lib'
libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/6.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib /opt/local/lib
libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
perllibs=-lpthread -ldl -lm -lutil -lc
libc=
so=dylib
useshrplib=false
libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=bundle
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags=' -mmacosx-version-min=10.9 -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector'
@inc for perl 5.31.6:
lib
/usr/local/lib/perl5/site_perl/5.31.6/darwin-2level
/usr/local/lib/perl5/site_perl/5.31.6
/usr/local/lib/perl5/5.31.6/darwin-2level
/usr/local/lib/perl5/5.31.6
Environment for perl 5.31.6:
DYLD_LIBRARY_PATH (unset)
HOME=/Users/dur-randir
LANG=en_US.UTF-8
LANGUAGE (unset)
LC_CTYPE=en_US.UTF-8
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/Users/dur-randir/perlbrew/bin:/Users/dur-randir/perlbrew/perls/perl-5.26.0/bin:/opt/local/bin:/usr/texbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/Library/TeX/texbin
PERLBREW_HOME=/Users/dur-randir/.perlbrew
PERLBREW_MANPATH=/Users/dur-randir/perlbrew/perls/perl-5.26.0/man
PERLBREW_PATH=/Users/dur-randir/perlbrew/bin:/Users/dur-randir/perlbrew/perls/perl-5.26.0/bin
PERLBREW_PERL=perl-5.26.0
PERLBREW_ROOT=/Users/dur-randir/perlbrew
PERLBREW_SHELLRC_VERSION=0.86
PERLBREW_VERSION=0.86
PERL_BADLANG (unset)
SHELL=/opt/local/bin/zsh