Skip to content

panic: sv_len_utf8 cache 1 real 0 for at #17737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dur-randir opened this issue Apr 21, 2020 · 4 comments · Fixed by #17740
Closed

panic: sv_len_utf8 cache 1 real 0 for at #17737

dur-randir opened this issue Apr 21, 2020 · 4 comments · Fixed by #17740
Assignees

Comments

@dur-randir
Copy link
Member

dur-randir commented Apr 21, 2020

This is a bug report for perl from [email protected],
generated with the help of perlbug 1.41 running under perl 5.31.10.

[Please describe your issue here]

While fuzzing perl v5.31.9-70-g0c96aa4b7b built with afl and run
under libdislocator, I found the following program

length reverse
              chop for split
                            chr,"\x{c00}0"
00000000  6c 65 6e 67 74 68 20 72  65 76 65 72 73 65 0c 63  |length reverse.c|
00000010  68 6f 70 20 66 6f 72 20  73 70 6c 69 74 0b 63 68  |hop for split.ch|
00000020  72 2c 22 5c 78 7b 63 30  30 7d 00 00 30 22        |r,"\x{c00}..0"|

to emit panic():

panic: sv_len_utf8 cache 1 real 0 for at

GDB stack trace at the point of croak() is:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7c24535 in __GI_abort () at abort.c:79
#2  0x00005555557270b6 in Perl_vcroak (pat=0x555555b48a90 "panic: %s cache %lu real %lu for %-p", args=0x7fffffffdc90) at util.c:1745
#3  0x00005555557273e2 in Perl_croak (pat=0x555555b48a90 "panic: %s cache %lu real %lu for %-p") at util.c:1803
#4  0x00005555557d2ad1 in S_assert_uft8_cache_coherent (func=0x555555b489b3 "sv_len_utf8", from_cache=1, real=0, sv=0x555555c3afc8) at sv.c:7877
#5  0x00005555557d0c4e in Perl_sv_len_utf8_nomg (sv=0x555555c3afc8) at sv.c:7257
#6  0x0000555555811f4d in Perl_pp_length () at pp.c:3119
#7  0x00005555557206b2 in Perl_runops_debug () at dump.c:2571
#8  0x00005555555f083e in S_run_body (oldscope=1) at perl.c:2759
#9  0x00005555555efdbc in perl_run (my_perl=0x555555c15260) at perl.c:2682
#10 0x00005555555a2155 in main (argc=2, argv=0x7fffffffe1b8, env=0x7fffffffe1d0) at perlmain.c:134

This is a regression between 5.26 and 5.28, bisect points to 47836a1 is the first bad commit
commit 47836a1
Author: Zefram [email protected]
Date: Fri Dec 8 19:23:29 2017 +0000

don't lose mark when pp_reverse extends stack

Nullary reverse needs to extend the stack to push its result scalar.
It was actually extending the stack, but doing so invalidated MARK,
which it relied upon to place the stack pointer afterwards.  Upon stack
reallocation it was therefore leaving the stack pointer pointing to the
freed stack memory.  Reformulate stack manipulation to not rely on MARK
after extending.  Fixes [perl #132544].
[Please do not change anything below this line]
Flags:
category=core
severity=medium
Site configuration information for perl 5.31.10:

Configured by root at Fri Mar 13 17:15:02 MSK 2020.

Summary of my perl5 (revision 5 version 31 subversion 10) configuration:
Commit id: 0c96aa4
Platform:
osname=linux
osvers=4.19.0-8-amd64
archname=x86_64-linux
uname='linux dorothy 4.19.0-8-amd64 #1 smp debian 4.19.98-1 (2020-01-26) x86_64 gnulinux '
config_args='-de -Dusedevel -Doptimize=-O2'
hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='cc'
ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
optimize='-O2'
cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='8.3.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.28.so
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version='2.28'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

@inc for perl 5.31.10:
lib
/usr/local/lib/perl5/site_perl/5.31.10/x86_64-linux
/usr/local/lib/perl5/site_perl/5.31.10
/usr/local/lib/perl5/5.31.10/x86_64-linux
/usr/local/lib/perl5/5.31.10

Environment for perl 5.31.10:
HOME=/home/afl
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE=en_US.UTF-8
LC_TIME=C
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/afl/perlbrew/bin:/home/afl/perlbrew/perls/perl-5.30.0-dbg/bin:/opt/local/bin:/usr/texbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PERLBREW_HOME=/home/afl/.perlbrew
PERLBREW_MANPATH=/home/afl/perlbrew/perls/perl-5.30.0-dbg/man
PERLBREW_PATH=/home/afl/perlbrew/bin:/home/afl/perlbrew/perls/perl-5.30.0-dbg/bin
PERLBREW_PERL=perl-5.30.0-dbg
PERLBREW_ROOT=/home/afl/perlbrew
PERLBREW_SHELLRC_VERSION=0.88
PERLBREW_VERSION=0.88
PERL_BADLANG (unset)
@hvds
Copy link
Contributor

hvds commented Apr 21, 2020

Here's a slightly simpler version:

% ./miniperl -e 'length reverse for split "-", "\x{100}--0"'
panic: sv_len_utf8 cache 1 real 0 for  at -e line 1.
% 

I think something in pp_reverse should be calling magic_setutf8() on TARG, either directly or indirectly, but I've no idea at what point it should be doing that. I'm mildly surprised it didn't happen on the setsv(). The result, as far as I can tell, is that the utf8 pos/len cache on TARG persists from the first split result ("\x{100}") to the second (empty string) without its data being updated.

I suspect also that whatever is going wrong here, the same pattern may occur in other places too.

Not sure who knows more about utf8 pos/len caching - @khwilliamson, @tonycoz?

tonycoz added a commit to tonycoz/perl5 that referenced this issue Apr 22, 2020
@tonycoz
Copy link
Contributor

tonycoz commented Apr 22, 2020

sv_setsv() doesn't do set magic, which can be confusing. There's a macro SvSetMagicSV() which does do the magic.

We have had this happen before, I'll have a look over core to see if I can find any other examples.

@tonycoz tonycoz self-assigned this Apr 22, 2020
tonycoz added a commit to tonycoz/perl5 that referenced this issue Apr 22, 2020
tonycoz added a commit to tonycoz/perl5 that referenced this issue Apr 22, 2020
@tonycoz
Copy link
Contributor

tonycoz commented Apr 22, 2020

I found one other case of sv_setsv() without a needed SvSETMAGIC() where I could make a test case indicating a problem.

Code like:

$lexical = (index(...) == -1);

is optimized to a single OP_INDEX with flags indicating the comparison and the assignment.

The assignment (via sv_setsv()) doesn't call SvSETMAGIC().

This optimization had another problem in that the assignment was returning the constant assigned to the lexical rather than the lexical lvalue itself.

Both issues are fixed in my https://github.com/tonycoz/perl5/tree/17737-make-reverse-magical branch. which I'll make into a PR once CI is done.

There might be similar problem around line 1301 of gv.c, but I couldn't think of a way to make any misbehaviour visible.

@hvds
Copy link
Contributor

hvds commented Apr 22, 2020

Thanks @tonycoz++

tonycoz added a commit to tonycoz/perl5 that referenced this issue May 25, 2020
tonycoz added a commit to tonycoz/perl5 that referenced this issue May 25, 2020
related to Perl#17737 and fixes Perl#17739

re-work of my original patch that only pushes the final result
xsawyerx pushed a commit that referenced this issue May 27, 2020
related to #17737 and fixes #17739

re-work of my original patch that only pushes the final result
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants