Copy_on_Write COW isn't working when the constant is the result of compile-time constant-folding #20586
Comments
This is a result of the constant-folding optimization kicking in. The expression is evaluated at compile time and stored in the optree, and the result is then copied into the SV. It is not so much a bug as a misbehavior of the optimization, since we do not make any specific commitments about how much memory a given string will take. (I am being legalistic here; I agree this is undesirable.) If COW is involved, the issue is that this is not a COW-style copy. But I believe we do not store SVs in the optree, so we cannot use COW. |
CONST ops do store pointers to SVs and so, at least on the face of it, COW might be possible. No COW flag on this one:
|
Can get:
by messing with S_fold_constants, but I dunno if this is appropriate or not:
Running tests to see if anything fails.... |
With blead:
With the above patch:
Two failing test files, but a quick look suggests it might just be because the Dump() output doesn't match what is currently hardcoded:
But as mentioned, I don't have the experience to know whether this is a suitable patch or not. If longer standing committers think this is a decent approach, I'm happy to make a proper PR out of it. |
I think it's worth investigating more, but I am not sure that is the correct approach. I think the IsCOW flag should only be set when the data is copied from the const op, not when it is created. (Maybe I read the output wrong, though.) BTW, I am confused: I thought we had a hard rule that SVs can't be stored in the optree, as it means we can't share optrees between threads. E.g., an optree can be compiled in one thread, thus allocating the SVs out of that thread's pool, and then destroyed in another thread, leading to "SV freed to wrong pool" errors. Hence me thinking we had no SVs in the optree. Obviously the situation is a bit more complex than I understood. |
Also, note that even using COW, one could only create up to 254 copies of this string before it was duplicated in memory. (IMO that is something we should fix.) |
To reply to various points in this ticket so far.
(I first diagnosed the issue on perlmonks).
The issue is that, while string constants are usually marked as COW-able,
string constants created as a result of constant folding aren't:
$ p -MDevel::Peek -e'my $x = "aaaaaaaaaa"; Dump $x' 2>&1 | grep COW
FLAGS = (POK,IsCOW,pPOK)
COW_REFCNT = 1
$ p -MDevel::Peek -e'my $x = "aaaaa" . "aaaaa"; Dump $x' 2>&1 | grep COW
$ p -MDevel::Peek -e'my $x = "a" x 10; Dump $x' 2>&1 | grep COW
$
Normal string constants are marked as COW-able by Perl_ck_svconst().
Either that function needs calling against a newly-created const op which
is the result of constant folding, or at least the actions in that
function need carrying out.
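
The three Devel::Peek checks above can be scripted so the cases are compared side by side. This is only a minimal sketch and not something from the thread: `cow_flag_for` is a hypothetical helper, and it assumes Devel::Peek is installed and that `$^X` is the perl under test. The result depends on the perl version and build options, so the script just reports what it sees.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): run one assignment in a child
# perl, Dump $x, and report whether the IsCOW flag shows up in the output.
sub cow_flag_for {
    my ($assignment) = @_;

    # Devel::Peek's Dump writes to STDERR, so the child first redirects
    # its STDERR onto STDOUT, which the parent reads through the pipe.
    my $code = 'open STDERR, ">&", \*STDOUT or die $!; '
             . $assignment . ' Dump $x;';

    open(my $child, '-|', $^X, '-MDevel::Peek', '-e', $code)
        or die "cannot run $^X: $!";
    my $dump = do { local $/; <$child> };
    close $child;

    return (defined $dump && $dump =~ /\bIsCOW\b/) ? 1 : 0;
}

# The same three assignments as in the shell session above.
my @cases = (
    [ 'literal',       'my $x = "aaaaaaaaaa";'      ],
    [ 'folded concat', 'my $x = "aaaaa" . "aaaaa";' ],
    [ 'folded repeat', 'my $x = "a" x 10;'          ],
);

for my $case (@cases) {
    my ($label, $assignment) = @$case;
    printf "%-14s IsCOW: %s\n", $label,
        cow_flag_for($assignment) ? 'yes' : 'no';
}
```

On a perl that shows the behaviour reported here, only the first case should print `yes`.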
On threaded builds, the pointer from an OP_CONST op to the constant SV
is stored in the pad (similarly to GV pointers for OP_GVSV etc), and
there's a different pad for each thread.
As for the 255 COW copy limit, that doesn't bother me much. Most of the
time the COW count is going to be a small number, often just 1, as a
string gets "passed along" between various variables, like in
my ($string, ...) = @_;
On the rare occasions where the 254 limit is exceeded, then the string
gets copied each time, which is no worse than when we didn't have COW at
all.
Finally, a comment about constant folding and the 'x' operator.
At the moment we just blindly fold, regardless of the (constant) value of
the RHS. So all these get constant-folded:
if (rare_condition_1) {
$x = 'a' x (1<<30);
} elsif (rare_condition_2) {
$x = 'b' x (1<<30);
} elsif ...
so even though they may never be executed, multiple 1GB constant strings
have been created at compile time. Similarly,
@A = (1) x (1<<30)
creates (at compile time) an AV with a billion elements.
Perhaps we should only constant fold 'x' when the RHS is below a certain
threshold?
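
Whether a particular 'x' expression was folded can be seen in the compiled optree: a folded expression leaves a const op and no repeat op. The following is a rough sketch for checking that, not part of the thread. `ops_in` is a hypothetical helper, and it assumes B::Concise (shipped with core perl) is available to the perl at `$^X`.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): compile a snippet in a child
# perl under B::Concise and return a hash of the op names in its optree.
sub ops_in {
    my ($snippet) = @_;

    # -MO=Concise compiles the snippet and prints the optree to STDOUT
    # without running it. "-e syntax OK" still goes to the child's STDERR.
    open(my $child, '-|', $^X, '-MO=Concise', '-e', $snippet)
        or die "cannot run $^X: $!";

    my %seen;
    while (my $line = <$child>) {
        # Each op line carries a class marker such as <2> or <$>,
        # followed by the op name.
        $seen{$1}++ if $line =~ /<.>\s+([\w-]+)/;
    }
    close $child;

    die "no optree output for: $snippet\n" unless %seen;
    return \%seen;
}

# A constant RHS versus an RHS that is only known at run time.
for my $snippet ('$x = "a" x 10', '$n = 10; $x = "a" x $n') {
    my $ops = ops_in($snippet);
    printf "%-26s %s\n", $snippet,
        $ops->{repeat} ? 'repeat op present (evaluated at run time)'
                       : 'no repeat op (folded at compile time)';
}
```

The same check could be pointed at larger multipliers to see where a given perl stops folding, if it ever does.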
|
However, by that time, So possibly we need this instead? (
|
The above seemed pretty reasonable to me, so I've turned it into a PR for review: #20595 |
@richardleach I agree that PR should be standalone, but will you follow up with one for "x" with large multipliers? I think that should happen, but I have a feeling you are better placed to do it than me. Thanks for taking this on. It's been a nice learning experience. |
I don't know how to do this off the top of my head but am happy to poke. We'd have to pick an arbitrary large multiplier, so would we want it to be controllable through e.g. an env var? (If so, I haven't looked at how to do that before either.)
For me too. :) |
Building on |
Possibly something like this, with additional overflow checks:
|
You could still defeat the size limit on |
On Sat, Dec 10, 2022 at 05:19:49PM -0800, Richard Leach wrote:
> You could still defeat the size limit on `OP_REPEAT` by breaking up the
> construction, e.g. `my $x = "a"x500001 . "b"x500001`
> Not sure what comprehensive solution there is for that. Seems like we'd have to let `S_fold_constants_eval()` run and then examine the results of it to decide whether to keep or discard them. However, as in the example above, the interpreter could have allocated the same amount of memory regardless. So I dunno.
I think the upper limit should be quite small (and not user-modifiable):
say 100 or 1000 in the case of a string, and perhaps even smaller (or
zero) in the case of list 'x'. If it doesn't get folded, then the only bad
outcome is that each time round a loop or function call, the 'x' is being
executed, which of itself is not terribly inefficient. It might be worth
observing how PADTMPs with buffer swiping, and/or COW, behave in practice
with multiple uses of 'x' in likely large-number scenarios, such as
foo("\0" x (1024*104));
my $buf = "\0" x (1024*104);
etc. Especially in terms of whether buffers are repeatedly malloc()ed and
free()d, or just done once and moved around between PADTMPs, SVs etc.
Also worth thinking about how many buffers are in existence at the same time
and whether constant folding decreases that number.
[Basically there are lots of considerations which I don't have the answer
to right now, and am unlikely to have the time to find out right now ]
|
This is a bug report for perl from LorenzoTa,
generated with the help of perlbug 1.40 running under perl 5.26.0.
Hello esteemed ones,
it seems (in every version of perl and on every OS) that COW does not work as expected when the constant is the result of compile-time constant-folding.
In short:
my $x = 'a' x (2**30);
Allocates 2.104.612 K of OS memory while Devel::Size::total_size($x) reports 1.00 Gb
We spotted this wrong behaviour at perlmonks in the thread:
www.perlmonks.org/?node_id=11147727
where you can find many more details and test run on different envs, with the very same (unexpected) result, alongside comments of more experienced programmers: notably this one
More details:
# WRONG behaviour as it doubles the memory
my $x = 'a' x (2**30);
# RIGHT behaviour
foreach my $order ( qw(20 24 30 32) ) { my $x = 'a' x ( 2 ** $order ) }
# also RIGHT forcing RHS to runtime
$x = 'a' x (2**${\32})
Thanks for your work!
lorenzo
[Please do not change anything below this line]
Flags:
category=core
severity=medium
Site configuration information for perl 5.26.0:
Configured by strawberry-perl at Sat Sep 2 16:28:54 2017.
Summary of my perl5 (revision 5 version 26 subversion 0) configuration:
Platform:
osname=MSWin32
osvers=6.3
archname=MSWin32-x64-multi-thread
uname='Win32 strawberry-perl 5.26.0.2 #1 Sat Sep 2 16:25:32 2017 x64'
config_args='undef'
hint=recommended
useposix=true
d_sigaction=undef
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=undef
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='gcc'
ccflags =' -s -O2 -DWIN32 -DWIN64 -DCONSERVATIVE -D__USE_MINGW_ANSI_STDIO -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -fwrapv -fno-strict-aliasing -mms-bitfields'
optimize='-s -O2'
cppflags='-DWIN32'
ccversion=''
gccversion='7.1.0'
gccosandvers=''
intsize=4
longsize=4
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='long long'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='g++.exe'
ldflags ='-s -L"C:\EX_D\ulisseDUE\perl5.26.64bit\perl\lib\CORE" -L"C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib"'
libpth=C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib C:\EX_D\ulisseDUE\perl5.26.64bit\c\x86_64-w64-mingw32\lib C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib\gcc\x86_64-w64-mingw32\7.1.0
libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
libc=
so=dll
useshrplib=true
libperl=libperl526.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_win32.xs
dlext=xs.dll
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags='-mdll -s -L"C:\EX_D\ulisseDUE\perl5.26.64bit\perl\lib\CORE" -L"C:\EX_D\ulisseDUE\perl5.26.64bit\c\lib"'
@inc for perl 5.26.0:
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/site/lib/MSWin32-x64-multi-thread
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/site/lib
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/vendor/lib
C:/EX_D/ulisseDUE/perl5.26.64bit/perl/lib
Environment for perl 5.26.0:
HOME (unset)
LANG (unset)
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=C:\EX_D\ulisseDUE\perl5.26.64bit\perl\site\bin;C:\EX_D\ulisseDUE\perl5.26.64bit\perl\bin;C:\EX_D\ulisseDUE\perl5.26.64bit\c\bin;C:\EX_D\ulisseDUE\bin\UnxUtils\usr\local\wbin;C:\WINDOWS;C:\WINDOWS\system32;
PERL_BADLANG (unset)
PERL_JSON_BACKEND=JSON::XS
PERL_RL=Perl
PERL_YAML_BACKEND=YAML
SHELL (unset)