SV PV COW 255 doesn't support strings under 9 SvCUR() bytes long #23261

Open
bulk88 opened this issue May 5, 2025 · 5 comments
@bulk88
Contributor

bulk88 commented May 5, 2025

Description
from sv.c

#ifndef SV_COW_MAX_WASTE_FACTOR_THRESHOLD
#    define SV_COW_MAX_WASTE_FACTOR_THRESHOLD   2    /* COW iff len < (cur * K) */
#endif
#ifndef SV_COWBUF_WASTE_FACTOR_THRESHOLD
#    define SV_COWBUF_WASTE_FACTOR_THRESHOLD    2    /* COW iff len < (cur * K) */
#endif

8 bytes or smaller strings can't 255 COW because of ^^^^^ added in 5.19.12 in

#13800
e8c6a47

Author: Yves Orton
Date: 5/11/2014 6:37:33 AM
Message:
Implement "max waste" thresholds to avoid problems with COW and deliberately overallocated pvs
https://rt.perl.org/Ticket/Display.html?id=121796
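
To see why the cutoff falls between 8 and 9 bytes, here is a minimal sketch of the waste-factor rule exactly as the comment in the defines states it ("COW iff len < (cur * K)"). The function name `waste_factor_allows_cow` is made up for illustration and is not the macro that sv.c actually uses; the only other assumption is the 16-byte minimum buffer (LEN = 16) visible in the dumps further down.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from sv.c: the rule from the comment,
 * "COW iff len < (cur * K)", where K is the waste factor (2 by default). */
static bool waste_factor_allows_cow(size_t cur, size_t len, size_t k)
{
    return len < cur * k;
}
```

With len = 16 and K = 2, cur = 8 gives 16 < 16 (false), cur = 9 gives 16 < 18 (true), and cur = 4 gives 16 < 8 (false), which lines up with the three cases in the reproduction output.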

Steps to Reproduce


use Devel::Peek;
 #use Inline  ('force', 'noclean');

use Inline C => Config =>
 PRE_HEAD => '#define PERL_NO_GET_CONTEXT 1';
use Inline C => Config => BUILD_NOISY => 1;

use Inline C => <<'END_OF_C_CODE';


SV* do3(SV* ssv) {
 dTHX;
 U32 flags = 0;
    SV* dsv;
    if ((SvFLAGS(ssv) & ((SVf_OK|SVs_GMG) &~(SVp_NOK))) == SVf_NOK)
        dsv = newSVnv(SvNVX(ssv));
    else if ((SvFLAGS(ssv) & ((SVf_OK|SVs_GMG) &~(SVp_IOK))) == SVf_IOK) {
        IV iv = SvIVX(ssv);
        dsv = SvUOK(ssv) ? newSVuv((UV)iv) : newSViv(iv);
    }
    else {
        U32 type = SvPOK(ssv) ? SVt_PV : SvNOK(ssv) ? SVt_NV : SVt_IV;
        if (flags & SVs_TEMP)
            dsv = newSV_type_mortal(type);
        else
            dsv = newSV_type(type);
        sv_dump_depth(dsv, 3);
        sv_dump_depth(ssv, 3);
        sv_setsv_flags(dsv, ssv, SV_GMAGIC | SV_NOSTEAL
            | SV_COW_SHARED_HASH_KEYS | SV_COW_OTHER_PVS | SV_DO_COW_SVSETSV);
        sv_dump_depth(ssv, 3);
        sv_dump_depth(dsv, 3);
        return dsv;
    }
    if (flags & SVs_TEMP)
        dsv = sv_2mortal(dsv);
    return dsv;
}


END_OF_C_CODE


$, = "\n";

use v5.30;
use strict;
use warnings;
#$DB::single = 1;

#$DB::single = 1;

use version;
my $s1;
my $s2;
my $s3;
my $s4;
warn "\nwith 8 bytes\n\n";
$s1 = substr(int(rand(10)),0,1).'.345678';
$s2 = do3($s1);
warn "\nnow with 9 bytes\n\n";
$s1 = substr(int(rand(10)),0,1).'.3456789';
$s3 = do3($s1);
warn "\nwith 4 bytes\n\n";
$s1 = substr(int(rand(10)),0,1).'.34';
$s4 = do3($s1);

Output:

with 8 bytes

SV = PV(0x2abb18) at 0x2ab0f0
  REFCNT = 1
  FLAGS = ()
  PV = 0
SV = PV(0x2abae8) at 0x244ce88
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x231e048 "4.345678"\0
  CUR = 8
  LEN = 16
SV = PV(0x2abae8) at 0x244ce88
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x231e048 "4.345678"\0
  CUR = 8
  LEN = 16
SV = PV(0x2abb18) at 0x2ab0f0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x2496e98 "4.345678"\0
  CUR = 8
  LEN = 16

now with 9 bytes

SV = PV(0x2abb48) at 0x2ab0f0
  REFCNT = 1
  FLAGS = ()
  PV = 0
SV = PV(0x2abae8) at 0x244ce88
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x231e048 "7.3456789"\0
  CUR = 9
  LEN = 16
SV = PV(0x2abae8) at 0x244ce88
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x231e048 "7.3456789"\0
  CUR = 9
  LEN = 16
  COW_REFCNT = 1
SV = PV(0x2abb48) at 0x2ab0f0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x231e048 "7.3456789"\0
  CUR = 9
  LEN = 16
  COW_REFCNT = 1

with 4 bytes

SV = PV(0x2abb68) at 0x2ab0f0
  REFCNT = 1
  FLAGS = ()
  PV = 0
SV = PV(0x2abae8) at 0x244ce88
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x2495098 "8.34"\0
  CUR = 4
  LEN = 16
SV = PV(0x2abae8) at 0x244ce88
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x2495098 "8.34"\0
  CUR = 4
  LEN = 16
SV = PV(0x2abb68) at 0x2ab0f0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x2497018 "8.34"\0
  CUR = 4
  LEN = 16

Expected behavior
That tiny strings can SVPV 255 COW. Strings of 8 bytes and under are often tokens, fields, keys, property names, stringified integers, stringified 1 or 0 booleans, API function or variable names, or JSON property names. All of those show up at very high frequency at runtime in end-user code.

Perl_sv_grow()/PERL_STRLEN_ROUNDUP/PERL_STRLEN_EXPAND_SHIFT/PERL_STRLEN_NEW_MIN/newSVpvn/etc should not discriminate against short strings held in Newx()-backed buffers of the Perl API's minimum buffer length, SvLEN() == 0x10.

Perl configuration

Summary of my perl5 (revision 5 version 41 subversion 7) configuration:
  Derived from: 73172a67eaae5671dffc06b427f005810d151472
  Platform:
    osname=MSWin32
    osvers=6.1.7601
    archname=MSWin32-x64-multi-thread
    uname=''
    config_args='undef'
    hint=recommended
    useposix=true
    d_sigaction=undef
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=undef
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cl'
    ccflags ='-nologo -GF -W3 -MD -TC -DWIN32 -D_CONSOLE -DNO_STRICT -DWIN64 -D_
CRT_SECURE_NO_DEPRECATE -D_CRT_NONSTDC_NO_DEPRECATE -D_WINSOCK_DEPRECATED_NO_WAR
NINGS -DPERL_TEXTMODE_SCRIPTS -DMULTIPLICITY -DPERL_IMPLICIT_SYS -DWIN32_NO_REGI
STRY -DUSE_PERLIO'
    optimize='-O1 -Zi -GL -fp:precise'
    cppflags='-DWIN32'
    ccversion='19.36.32535'
    gccversion=''
    gccosandvers=''
    intsize=4
    longsize=4
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=undef
    longlongsize=8
    d_longdbl=define
    longdblsize=8
    longdblkind=0
    ivtype='__int64'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='__int64'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='link'
    ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"c:\pb64\
lib\CORE" -machine:AMD64 -subsystem:console,"5.02"'
    libpth="C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSV
C\14.36.32532\lib\x64"
    libs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.li
b advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.l
ib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib
 vcruntime.lib ucrt.lib
    perllibs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg3
2.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_
32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt
.lib vcruntime.lib ucrt.lib
    libc=ucrt.lib
    so=dll
    useshrplib=true
    libperl=perl541.lib
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs
    dlext=dll
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"c:
\pb64\lib\CORE" -machine:AMD64 -subsystem:console,"5.02"'


Characteristics of this binary (from libperl):
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_TIMES
    HAVE_INTERP_INTERN
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_IMPLICIT_SYS
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_NO_REGISTRY
    USE_PERLIO
    USE_PERL_ATOF
    USE_THREAD_SAFE_LOCALE
  Locally applied patches:
    uncommitted-changes
  Built under MSWin32
  Compiled at Dec 20 2024 10:03:46
  %ENV:
    PERL_JSON_BACKEND="Cpanel::JSON::XS"
    PERL_YAML_BACKEND="YAML::XS"
  @INC:
    C:/pb64/site/lib/MSWin32-x64-multi-thread
    C:/pb64/site/lib
    C:/pb64/lib
@khwilliamson
Contributor

I don't understand why COW is advantageous on small strings.

@bulk88
Contributor Author

bulk88 commented May 9, 2025

I don't understand why COW is advantageous on small strings.

Do you want 500 or 1000 copies of the 5-letter string "undef" in your perl process? Plus rounding to 16 bytes, plus 8 or 16 bytes for malloc internals, plus 24 bytes for threaded win perl?

How about hv_iternext() and list-to-list assignment: 500 or 1000 more copies of the 5-byte string "undef". How many times is perl supposed to malloc()/free() cycle a short string like "1" or "undef"? It's a performance problem.

A profiling tool says the PVX buffer of newSVpvs(""); on win64 costs 70 bytes. Most allocs in a perl VM on the bell curve are in the mid 70s of bytes, peaking at 82-85 bytes. So yes, COW for strings of 8 bytes and under is very necessary. They cost TEN times their own length in final memory cost. I don't know why 70 bytes is the lowest alloc perl makes on win64. It's not a multiple of eight, so I'm guessing the profiler has private access to the Windows malloc system, and Windows keeps 8 bytes behind the pointer, plus a secret 4 bytes of overhead somewhere else in the address space, and the other four bytes are anti-exploit integrity metadata.
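
The per-copy cost argument above can be put into rough numbers. The following is a back-of-the-envelope sketch only: the 16-byte rounding and the 8 or 16 bytes of malloc overhead come from the comment, the function names are hypothetical, and real allocators (including whatever produces the 70-byte figure from the profiler) will differ.

```c
#include <stddef.h>

/* Round n up to the next multiple of 16, the buffer granularity
 * described in this thread. */
static size_t round_up_16(size_t n)
{
    return (n + 15) & ~(size_t)15;
}

/* Estimated heap bytes for one private copy of a string of `cur` bytes:
 * payload plus trailing NUL, rounded up to 16, plus an assumed
 * per-block malloc overhead (8 or 16 in the comment above). */
static size_t bytes_per_copy(size_t cur, size_t malloc_overhead)
{
    return round_up_16(cur + 1) + malloc_overhead;
}

/* Estimated total for n independent copies. A shared COW buffer would
 * pay bytes_per_copy() only once, however many SVs point at it. */
static size_t bytes_for_copies(size_t n, size_t cur, size_t malloc_overhead)
{
    return n * bytes_per_copy(cur, malloc_overhead);
}
```

Under these assumptions, "undef" (5 bytes) with 16 bytes of allocator overhead is estimated at 32 bytes per copy, so 1000 private copies come to about 32000 bytes against 32 bytes for a single shared buffer.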

@tonycoz
Contributor

tonycoz commented May 15, 2025

8 bytes or smaller strings can't 255 COW because of ^^^^^ added in 5.19.12 in

$ perl -MDevel::Peek -e 'Dump("Hello")'
SV = PV(0x55a3bb5c8f10) at 0x55a3bb5f4df8
  REFCNT = 1
  FLAGS = (POK,IsCOW,READONLY,PROTECT,pPOK)
  PV = 0x55a3bb5fd5f0 "Hello"\0
  CUR = 5
  LEN = 10
  COW_REFCNT = 0
$ perl -MDevel::Peek -e 'my $x = "Hello"; Dump($x)'
SV = PV(0x55afb6cebea0) at 0x55afb6d17ff0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x55afb6d205f0 "Hello"\0
  CUR = 5
  LEN = 10
  COW_REFCNT = 1

though this is a special case for literals and copies of them.

From memory when the current CoW was introduced CoWing short strings ended up slower than doing a normal copy, YMMV for systems with a slow malloc() (looking at a certain popular non-POSIX platform).

@bulk88
Contributor Author

bulk88 commented May 16, 2025

8 bytes or smaller strings can't 255 COW because of ^^^^^ added in 5.19.12 in

$ perl -MDevel::Peek -e 'Dump("Hello")'
SV = PV(0x55a3bb5c8f10) at 0x55a3bb5f4df8
  REFCNT = 1
  FLAGS = (POK,IsCOW,READONLY,PROTECT,pPOK)
  PV = 0x55a3bb5fd5f0 "Hello"\0
  CUR = 5
  LEN = 10
  COW_REFCNT = 0
$ perl -MDevel::Peek -e 'my $x = "Hello"; Dump($x)'
SV = PV(0x55afb6cebea0) at 0x55afb6d17ff0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x55afb6d205f0 "Hello"\0
  CUR = 5
  LEN = 10
  COW_REFCNT = 1

though this is a special case for literals and copies of them.

From memory when the current CoW was introduced CoWing short strings ended up slower than doing a normal copy, YMMV for systems with a slow malloc() (looking at a certain popular non-POSIX platform).

I'd be open to, or give a thumbs up to, doing a memcpy() into the lval SV* instead of a COW for strings of 16 or 32 bytes and under, if and only if the lval SV* already has a Newx() block big enough to accept the new string. I'm thinking of pad SV*s specifically, but there might be other cases in the perl C API where destsv already has a pre-existing Newx() block in it. I'm against doing a free() just to stuff a COW 255 into destsv's SvPVX for the sake of 50 or 100 bytes of malloc memory. A realloc() counts as a free() in my book, since a realloc() will always be doing a free() internally: Perl always rounds up to the next 16-byte unit for all the SVPV APIs, which guarantees a bucket jump on most OSes.

Calling free() sends 500 or 1000 bytes of CPU instructions through the CPU, just to save 50 bytes with a COW. Eghhh. Modern systems have enough DDR RAM to not be that desperate over private bytes or spinning-rust paging file access anymore. But when the choice is COW 255 vs Newx()+memcpy because SvPVX is empty, COW 255 should always be done. There are a huge number of sv_setsv() calls inside the interp core, mostly guarding against GMG weirdness and against SvREADONLY() immortals or SV* grammar literals. These need to be FAST. They aren't right now for <= 8 byte strings.
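
The policy described in the two paragraphs above can be summarized as a small decision function. This is only a sketch of the stated preference, not code from the Perl core: the enum, the function name, and the `small_cutoff` parameter (16 or 32 in the comment) are all hypothetical, and cases the comment does not settle are deliberately left to whatever the existing rules do.

```c
#include <stddef.h>

enum assign_strategy {
    COPY_INTO_EXISTING,  /* memcpy() into the buffer dest already owns */
    SHARE_COW,           /* point dest at the source buffer (COW) */
    EXISTING_RULES       /* cases the comment above does not settle */
};

/* src_cur:      length of the source string in bytes
 * dst_len:      size of dest's current buffer, 0 if it has none
 * small_cutoff: "16 or 32 bytes and under" from the comment above */
static enum assign_strategy
choose_strategy(size_t src_cur, size_t dst_len, size_t small_cutoff)
{
    if (dst_len == 0)
        return SHARE_COW;            /* nothing to free, so always share */
    if (src_cur <= small_cutoff && dst_len > src_cur)
        return COPY_INTO_EXISTING;   /* fits (with NUL), no free() needed */
    return EXISTING_RULES;
}
```

For example, a 5-byte source assigned to a fresh mortal with no buffer would be shared, while the same source assigned to a pad SV that already holds a 16-byte buffer would be copied in place.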

For sv_newmortal() or mortalized brand new SVs, COW 255 absolutely needs to be working. Mortal SVs almost always come from the PP = operator, PP foreach(), or PP map(): even if sourcesv is ultimately never modified, they make mortal SV* copies of incoming list-context stuff, because all SV GMG getters need to fire off before the "real work" is done by =, foreach(), map(), or TIEHASH/TIESCALAR. Or consider sub new { my $self = shift; $self->{fields} = [customer_type_init(4)]; } sub customer_type_init { my $type = $_[0]; if ($type == 4) { return ('a', 'b', 'c'); } }. I want the array ref in $self->{fields} to hold SVPV COW 255s, not Newx blocks.

The current COW APIs in blead, IMO have a very unoptimized path, that if(SV_THINKFIRST(sv)) force_normal_drop_cow(sv); sv_catpvs(sv, "longer than 16 bytes !!!!!!!!"); is a very painful F11 single step path to happen, since force_normal_drop_cow(sv); can't over-alloc the Newx() size, for the "longer than 16 bytes !!!!!!!!" string 1 microsecond away. Easy to fix design flaw, but tuits budget at Perl Bankcorp LLC is low.
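
To illustrate the kind of fix hinted at above, here is a toy model in which dropping a shared buffer takes a size hint, so the append that follows does not need a second allocation. Everything here is hypothetical (`struct toy_str`, `toy_unshare_with_hint`, `toy_append`); it does not use or mirror the real SV structures or the core's un-COW routines, and it assumes `buf` is never NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Toy string, not Perl's SV: `buf` may be borrowed from another owner. */
struct toy_str {
    char  *buf;
    size_t cur;     /* bytes in use, excluding the trailing NUL */
    size_t len;     /* allocated bytes */
    int    shared;  /* nonzero while buf is borrowed (COW-like) */
};

/* Give `s` a private buffer with room for `extra` more bytes, so an
 * append that follows immediately needs no second allocation.
 * Returns 0 on success, -1 if malloc() fails. */
static int toy_unshare_with_hint(struct toy_str *s, size_t extra)
{
    size_t want = s->cur + extra + 1;
    char *p;

    if (!s->shared && s->len >= want)
        return 0;               /* already private and big enough */

    p = malloc(want);
    if (!p)
        return -1;
    memcpy(p, s->buf, s->cur);
    p[s->cur] = '\0';
    if (!s->shared)
        free(s->buf);           /* only free a buffer we own */
    s->buf = p;
    s->len = want;
    s->shared = 0;
    return 0;
}

/* Append n bytes; unsharing and sizing happen in one step. */
static int toy_append(struct toy_str *s, const char *src, size_t n)
{
    if (toy_unshare_with_hint(s, n) != 0)
        return -1;
    memcpy(s->buf + s->cur, src, n);
    s->cur += n;
    s->buf[s->cur] = '\0';
    return 0;
}
```

The design point is simply that the caller knows the upcoming append length at the moment the buffer is unshared, so passing it down avoids a copy into a minimum-size block followed immediately by a grow.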

I made this bug ticket, since I don't know if the right logic should be to ignore the slack/waste logic if total string length <= 16 or <=8 and PVs always get COW 255 copied, or the slack/waste logic needs other algo tweaks.

#ifndef SV_COW_THRESHOLD
#    define SV_COW_THRESHOLD                    0   /* COW iff len > K */
#endif
#ifndef SV_COWBUF_THRESHOLD
#    define SV_COWBUF_THRESHOLD                 1250 /* COW iff len > K */
#endif
#ifndef SV_COW_MAX_WASTE_THRESHOLD
#    define SV_COW_MAX_WASTE_THRESHOLD          80   /* COW iff (len - cur) < K */
#endif
#ifndef SV_COWBUF_WASTE_THRESHOLD
#    define SV_COWBUF_WASTE_THRESHOLD           80   /* COW iff (len - cur) < K */
#endif 

1250 and 80 are the other 2 cutoffs. My complaint is about SvLEN()/2 cutoff, I don't have a tech opinion on 1250 and 80 cutoffs. 80 is obviously a console line. But maybe it should be 2^7 == 128, or 128-2, or 96, or 96-2 instead or something similar. That handles a small amount of latin-1 2 byte utf8 chars, and \r + \n to fit inside the 80 cutoff.
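
For discussion purposes, the non-COWBUF cutoffs can be combined into one predicate, together with one of the possible tweaks raised in this ticket. This is a sketch built only from the comments on the defines quoted in this thread: it assumes the three conditions are ANDed together and that cur <= len, and the names (`cow_ok_as_documented`, `cow_ok_small_exempt`, `small_len`) are hypothetical rather than anything in sv.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values copied from the defines quoted in this thread. */
#define COW_THRESHOLD            0   /* COW iff len > K */
#define COW_MAX_WASTE_THRESHOLD  80  /* COW iff (len - cur) < K */
#define COW_MAX_WASTE_FACTOR     2   /* COW iff len < (cur * K) */

/* Assumption: the three documented conditions are ANDed together.
 * Requires cur <= len. */
static bool cow_ok_as_documented(size_t cur, size_t len)
{
    return len > COW_THRESHOLD
        && len - cur < COW_MAX_WASTE_THRESHOLD
        && len < cur * COW_MAX_WASTE_FACTOR;
}

/* Hypothetical variant: skip the waste tests when the whole buffer is
 * no larger than small_len (for example 16), so minimum-size buffers
 * are always eligible for COW. */
static bool cow_ok_small_exempt(size_t cur, size_t len, size_t small_len)
{
    if (len > COW_THRESHOLD && len <= small_len)
        return true;
    return cow_ok_as_documented(cur, len);
}
```

With a 16-byte buffer, the documented rules accept cur = 9 but reject cur = 8 and cur = 4, while the exempt variant with small_len = 16 accepts all three and leaves larger buffers under the existing rules.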

@richardleach
Contributor

https://rt.perl.org/Ticket/Display.html?id=121796

I've only skimmed that ticket, but the key point/blocker seemed to be that Perl_sv_gets needed reworking. Steffan and Yves noted that:

we can speed up the code in sv_gets() by something like an order of
magnitude (or so) by using a scan and then copy strategy.

Did that happen, or is that a task still to be done?
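
For readers unfamiliar with the phrase, a "scan and then copy" strategy generally means finding the end of the record first and then moving it with a single bulk copy, rather than testing and storing one byte at a time. The sketch below shows that general idea on a plain memory buffer; it is not Perl_sv_gets, it ignores PerlIO buffering and multi-character record separators, and the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy one line (up to and including '\n') from src[0..n) into dst.
 * Scans first with memchr() to find the length, then does one memcpy().
 * If no newline is found, the whole input counts as the line.
 * Returns the number of bytes copied, or 0 if n == 0 or the line does
 * not fit in dst (capacity cap). No NUL terminator is written. */
static size_t copy_line_scan_then_copy(const char *src, size_t n,
                                       char *dst, size_t cap)
{
    const char *nl = memchr(src, '\n', n);
    size_t take = nl ? (size_t)(nl - src) + 1 : n;

    if (take == 0 || take > cap)
        return 0;
    memcpy(dst, src, take);
    return take;
}
```

Because the length is known before any bytes are moved, a caller could also size the destination buffer exactly once up front, which is the part that interacts with the overallocation and waste thresholds discussed in this issue.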

These open issues refer to sv_gets, so I guess there's still work to be done there:

Would we be best served by a single meta-issue that summarises the problems with sv_gets succinctly and has links to the above?
