Skip to content

Copy_on_Write COW isn't working when the constant is the result of compile-time constant-folding #20586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
LorenzoTa opened this issue Dec 6, 2022 · 15 comments
Assignees

Comments

@LorenzoTa
Copy link

LorenzoTa commented Dec 6, 2022

This is a bug report for perl from LorenzoTa,
generated with the help of perlbug 1.40 running under perl 5.26.0.


Hello esteemed ones,

it seems (in every version of perl and on every OS) that COW does not work as expected when the constant is the result of compile-time constant-folding.

In short:

my $x = 'a' x (2**30);

Allocates 2.104.612 K of OS memory while Devel::Size::total_size($x) reports 1.00 Gb

We spotted this wrong behaviour at perlmonks in the thread:

www.perlmonks.org/?node_id=11147727

where you can find many more details and test run on different envs, with the very same (unexpected) result, alongside comments of more experienced programmers: notably this one

More details:

# WRONG behaviour as it doubles the memory
my $x = 'a' x (2**30);

# RIGHT behaviour
foreach my $order ( qw(20 24 30 32) ) { my $x = 'a' x ( 2 ** $order ) }

# also RIGHT forcing RHS to runtime
$x = 'a' x (2**${\32})

Thanks for your work!

lorenzo

[Please do not change anything below this line]


Flags:
category=core
severity=medium

Site configuration information for perl 5.26.0:

Configured by strawberry-perl at Sat Sep 2 16:28:54 2017.

Summary of my perl5 (revision 5 version 26 subversion 0) configuration:

Platform:
osname=MSWin32
osvers=6.3
archname=MSWin32-x64-multi-thread
uname='Win32 strawberry-perl 5.26.0.2 #1 Sat Sep 2 16:25:32 2017 x64'
config_args='undef'
hint=recommended
useposix=true
d_sigaction=undef
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=undef
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='gcc'
ccflags =' -s -O2 -DWIN32 -DWIN64 -DCONSERVATIVE -D__USE_MINGW_ANSI_STDIO -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -fwrapv -fno-strict-aliasing -mms-bitfields'
optimize='-s -O2'
cppflags='-DWIN32'
ccversion=''
gccversion='7.1.0'
gccosandvers=''
intsize=4
longsize=4
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='long long'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='g++.exe'
ldflags ='-s -L"C:\EX_D\ulisseDUE\perl5.26.64bit\perl\lib\CORE" -L"C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib"'
libpth=C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib C:\EX_D\ulisseDUE\perl5.26.64bit\c\x86_64-w64-mingw32\lib C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib\gcc\x86_64-w64-mingw32\7.1.0
libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
libc=
so=dll
useshrplib=true
libperl=libperl526.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_win32.xs
dlext=xs.dll
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags='-mdll -s -L"C:\EX_D\ulisseDUE\perl5.26.64bit\perl\lib\CORE" -L"C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib"'


@inc for perl 5.26.0:
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/site/lib/MSWin32-x64-multi-thread
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/site/lib
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/vendor/lib
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/lib


Environment for perl 5.26.0:
HOME (unset)
LANG (unset)
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=C:\EX_D\ulisseDUE\perl5.26.64bit\perl\site\bin;C:\EX_D\ulisseDUE\perl5.26.64bit\perl\bin;C:\EX_D\ulisseDUE\perl5.26.64bit\c\bin;C:\EX_D\ulisseDUE\bin\UnxUtils\usr\local\wbin;C:\WINDOWS;C:\WINDOWS\system32;
PERL_BADLANG (unset)
PERL_JSON_BACKEND=JSON::XS
PERL_RL=Perl
PERL_YAML_BACKEND=YAML
SHELL (unset)

@demerphq
Copy link
Collaborator

demerphq commented Dec 6, 2022

This is a result of the constant optimization kicking in. The expression is evaluated at compile time and stored in the optree, and the result is then copied into the SV. It is not so much a bug as malbehavior of the optimization as we do not make any specific commitments about how much memory a given string will take. (I am being legalistic here, I agree this is undesirable.)

If COW is involved, it is that this is not a COW style copy. But i believe we do not store SV's in the optree, so we cannot use COW.

@richardleach
Copy link
Contributor

But i believe we do not store SV's in the optree, so we cannot use COW.

CONST ops do store pointers to SVs and so, at least on the face of it, COW might be possible. No COW flag on this one:

$ perl -MDevel::Peek -e'Dump("a"x10)'
SV = PV(0x562f4beb40f0) at 0x562f4bee0528
  REFCNT = 1
  FLAGS = (PADTMP,POK,READONLY,PROTECT,pPOK)
  PV = 0x562f4bee93b0 "aaaaaaaaaa"\0
  CUR = 10
  LEN = 16

@richardleach
Copy link
Contributor

Can get:

 ./perl -Ilib -MDevel::Peek -e 'Dump("a"x10)'
SV = PV(0x55f64b2aa070) at 0x55f64b2cd0e0
  REFCNT = 1
  FLAGS = (PADTMP,POK,IsCOW,READONLY,PROTECT,pPOK)
  PV = 0x55f64b2f61c0 "aaaaaaaaaa"\0
  CUR = 10
  LEN = 16
  COW_REFCNT = 0

by messing with S_fold_constants, but I dunno if this is appropriate or not:

diff --git a/op.c b/op.c
index 8851dce046..8706d1349f 100644
--- a/op.c
+++ b/op.c
@@ -4976,6 +4976,8 @@ S_fold_constants(pTHX_ OP *const o)
     is_stringify = type == OP_STRINGIFY && !o->op_folded;
     op_free(o);
     assert(sv);
+    if (SvPOK(sv))
+        SvIsCOW_on(sv);
     if (is_stringify)
         SvPADTMP_off(sv);
     else if (!SvIMMORTAL(sv)) {

Running tests to see if anything fails....

@richardleach
Copy link
Contributor

With blead:

$ /usr/bin/time -v ./perl -e 'my $x = "a"x (2**32)' 2>&1 | grep "Maximum resident"
	Maximum resident set size (kbytes): 7696288

With the above patch:

$ /usr/bin/time -v ./perl -e 'my $x = "a"x (2**32)' 2>&1 | grep "Maximum resident"
	Maximum resident set size (kbytes): 4199680

Two failing test files, but a quick look suggests it might just because the Dump() output doesn't match what is currently hardcoded:

  • ../ext/Devel-Peek/t/Peek.t
  • op/undef.t

But as mentioned, I don't have the experience to know whether this is a suitable patch or not. If longer standing committers think this is a decent approach, I'm happy to make a proper PR out of it.

@demerphq
Copy link
Collaborator

demerphq commented Dec 7, 2022

I think its worthy to investigate more. But, I am not clear that that is the correct approach. I think the IsCOW flag should only happen when the data is copied from the const op, not when it is created. (Maybe i read the output wrong tho.)

BTW, I am confused, i thought we had a hard rule that SV's cant be stored in the optree as it means we cant share optrees between threads. Eg, an optree can be compiled in one thread, thus allocating the SV's out of that threads pool, and then destroyed in another thread, leading to "SV freed to wrong pool" errors. Hence me thinking we had no SV's in the optree. Obviously the situation is a bit more complex than I understood.

@demerphq
Copy link
Collaborator

demerphq commented Dec 7, 2022

Also, note that even using COW, one could only create up to 254 copies of this string before it was duplicated in memory. (IMO that is something we should fix.)

@iabyn
Copy link
Contributor

iabyn commented Dec 7, 2022 via email

@richardleach
Copy link
Contributor

Normal string constants are marked as COW-able by Perl_ck_svconst(). Either that function needs calling against a newly-created const op which is the result of constant folding, or at least the actions in that function need carrying out.

Perl_ck_svconst does fire when S_fold_constants creates the new const:

    newop = newSVOP(OP_CONST, 0, MUTABLE_SV(sv));

However, by that time, sv already has the READONLY flag set:
FLAGS = (PADTMP,POK,READONLY,PROTECT,pPOK)
so the code branch which would make it COW-able does not get followed: https://github.com/Perl/perl5/blob/blead/op.c#L14425

So possibly we need this instead? (Perl_ck_svconst will add the READONLY flag.)

diff --git a/op.c b/op.c
index 8851dce046..a3ad0c823f 100644
--- a/op.c
+++ b/op.c
@@ -4980,7 +4980,9 @@ S_fold_constants(pTHX_ OP *const o)
         SvPADTMP_off(sv);
     else if (!SvIMMORTAL(sv)) {
         SvPADTMP_on(sv);
-        SvREADONLY_on(sv);
+        /* Used to SvREADONLY_on(sv); here, but this will be done
+           by newSVOP anyway and doing it now stops Perl_ck_svconst
+           from marking the SV as COW-able. */
     }
     newop = newSVOP(OP_CONST, 0, MUTABLE_SV(sv));
     if (!is_stringify) newop->op_folded = 1;

@richardleach
Copy link
Contributor

So possibly we need this instead?

The above seemed pretty reasonable to me, so I've turned it into a PR for review: #20595

@demerphq
Copy link
Collaborator

@richardleach I agree that PR should be standalone, but will you follow up with one for "x" with large multipliers? I think that should happen, but i have a feeling you are better placed to do it than me.

Thanks for taking this on. Its been a nice learning experience.

@richardleach
Copy link
Contributor

I agree that PR should be standalone, but will you follow up with one for "x" with large multipliers? I think that should happen, but i have a feeling you are better placed to do it than me.

I don't know how to do this off the top of my head but am happy to poke. We'd have to pick an arbitrary large multiplier, so would we want it to be controllable through e.g. an env var? (If so, I haven't looked at how to do that before either.)

Its been a nice learning experience.

For me too. :)

@richardleach
Copy link
Contributor

one for "x" with large multipliers

I don't know how to do this off the top of my head

Building on case OP_REPEAT: in S_fold_constants() looks like a likely possibility.

@richardleach
Copy link
Contributor

one for "x" with large multipliers

Building on case OP_REPEAT: in S_fold_constants() looks like a likely possibility.

Possibly something like this, with additional overflow checks:

diff --git a/op.c b/op.c
index c04f63c55c..74029cf946 100644
--- a/op.c
+++ b/op.c
@@ -4871,6 +4871,14 @@ S_fold_constants(pTHX_ OP *const o)
         break;
     case OP_REPEAT:
         if (o->op_private & OPpREPEAT_DOLIST) goto nope;
+        if (OP_TYPE_IS(cBINOPo->op_last, OP_CONST)) {
+            SV *constsv = cSVOPx_sv(cBINOPo->op_last);
+            SvIV_please_nomg(constsv);
+            if (SvIOKp(constsv) && (
+                (SvIOK_UV(constsv) && SvUVX(constsv) > 1000000) ||
+                (SvIVX(constsv) > 1000000)))
+                goto nope;
+        }
         break;
     case OP_SREFGEN:
         if (cUNOPx(cUNOPo->op_first)->op_first->op_type != OP_CONST

@richardleach
Copy link
Contributor

You could still defeat the size limit on OP_REPEAT by breaking up the construction, e.g.
my $x = "a"x500001 . "b"x500001
Not sure what comprehensive solution there is for that. Seems like we'd have to let S_fold_constants_eval() run and then examine the results of it to decide whether to keep or discard them. However, as in the example above, the interpreter could have allocated the same amount of memory regardless. So I dunno.

@iabyn
Copy link
Contributor

iabyn commented Dec 12, 2022 via email

@richardleach richardleach self-assigned this Mar 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants