Commit 13b0f67

re-enable Copy-on-Write by default.
COW was first introduced (and enabled by default) in 5.17.7. It was disabled by default in 5.17.10, because it was thought to have too many rough edges for the 5.18.0 release. By re-enabling it now, early in the 5.19.x release cycle, hopefully it will be ready for production use by 5.20. This commit mainly reverts 9f351b4 and e1fd413 (with modifications), then updates perldelta.
1 parent 13d1b68 commit 13b0f67

7 files changed: +123 -31 lines changed

perl.h

Lines changed: 4 additions & 5 deletions
@@ -2302,11 +2302,10 @@ typedef AV PAD;
 typedef AV PADNAMELIST;
 typedef SV PADNAME;

-/* XXX for 5.18, disable the COW by default
-* #if !defined(PERL_OLD_COPY_ON_WRITE) && !defined(PERL_NEW_COPY_ON_WRITE) && !defined(PERL_NO_COW)
-* # define PERL_NEW_COPY_ON_WRITE
-* #endif
-*/
+/* enable PERL_NEW_COPY_ON_WRITE by default */
+#if !defined(PERL_OLD_COPY_ON_WRITE) && !defined(PERL_NEW_COPY_ON_WRITE) && !defined(PERL_NO_COW)
+# define PERL_NEW_COPY_ON_WRITE
+#endif

 #if defined(PERL_OLD_COPY_ON_WRITE) || defined(PERL_NEW_COPY_ON_WRITE)
 # if defined(PERL_OLD_COPY_ON_WRITE) && defined(PERL_NEW_COPY_ON_WRITE)

pod/perldelta.pod

Lines changed: 66 additions & 1 deletion
@@ -87,7 +87,21 @@ There may well be none in a stable release.

 =item *

-XXX
+Perl has a new copy-on-write mechanism that avoids the need to copy the
+internal string buffer when assigning from one scalar to another. This
+makes copying large strings appear much faster. Modifying one of the two
+(or more) strings after an assignment will force a copy internally. This
+makes it unnecessary to pass strings by reference for efficiency.
+
+This feature was already available in 5.18.0, but wasn't enabled by
+default. It is the default now, and so you no longer need build perl with
+the F<Configure> argument:
+
+-Accflags=PERL_NEW_COPY_ON_WRITE
+
+It can be disabled (for now) in a perl build with:
+
+-Accflags=PERL_NO_COW

 =back

@@ -338,6 +352,57 @@ well.

 =item *

+Perl's new copy-on-write mechanism (which is now enabled by default),
+allows any C<SvPOK> scalar to be automatically upgraded to a copy-on-write
+scalar when copied. A reference count on the string buffer is stored in
+the string buffer itself.
+
+For example:
+
+$ perl -MDevel::Peek -e'$a="abc"; $b = $a; Dump $a; Dump $b'
+SV = PV(0x260cd80) at 0x2620ad8
+REFCNT = 1
+FLAGS = (POK,IsCOW,pPOK)
+PV = 0x2619bc0 "abc"\0
+CUR = 3
+LEN = 16
+COW_REFCNT = 1
+SV = PV(0x260ce30) at 0x2620b20
+REFCNT = 1
+FLAGS = (POK,IsCOW,pPOK)
+PV = 0x2619bc0 "abc"\0
+CUR = 3
+LEN = 16
+COW_REFCNT = 1
+
+Note that both scalars share the same PV buffer and have a COW_REFCNT
+greater than zero.
+
+This means that XS code which wishes to modify the C<SvPVX()> buffer of an
+SV should call C<SvPV_force()> or similar first, to ensure a valid (and
+unshared) buffer, and to call C<SvSETMAGIC()> afterwards. This in fact has
+always been the case (for example hash keys were already copy-on-write);
+this change just spreads the COW behaviour to a wider variety of SVs.
+
+One important difference is that before 5.18.0, shared hash-key scalars
+used to have the C<SvREADONLY> flag set; this is no longer the case.
+
+This new behaviour can still be disabled by running F<Configure> with
+B<-Accflags=-DPERL_NO_COW>. This option will probably be removed in Perl
+5.22.
+
+=item *
+
+C<PL_sawampersand> is now a constant. The switch this variable provided
+(to enable/disable the pre-match copy depending on whether C<$&> had been
+seen) has been removed and replaced with copy-on-write, eliminating a few
+bugs.
+
+The previous behaviour can still be enabled by running F<Configure> with
+B<-Accflags=-DPERL_SAWAMPERSAND>.
+
+=item *
+
 The functions C<my_swap>, C<my_htonl> and C<my_ntohl> have been removed.
 It is unclear why these functions were ever marked as I<A>, part of the
 API. XS code can't call them directly, as it can't rely on them being
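
Reviewer note on the perldelta entry above: the copy-on-write change is meant to be invisible at the Perl level. The following is a minimal sketch, not part of the commit, with variable names invented for illustration. It shows only the observable behaviour: a write to one scalar after an assignment leaves the other untouched. Whether the buffer is actually shared before the write depends on the perl version and build options, and can be inspected with C<Dump> from Devel::Peek as in the perldelta text.

```perl
use strict;
use warnings;

# Assign a large string to a second scalar. On a perl built with
# copy-on-write, this may share the string buffer instead of copying it.
my $original = "x" x 1_000_000;
my $copy     = $original;

# Writing to one of the scalars forces it to get a private buffer.
substr($copy, 0, 1) = "y";

print substr($original, 0, 1), "\n";   # x
print substr($copy, 0, 1), "\n";       # y
print length($copy), "\n";             # 1000000
```

The result is the same with or without COW; only the cost of the assignment differs.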

pod/perlre.pod

Lines changed: 20 additions & 11 deletions
@@ -91,6 +91,10 @@ X</p> X<regex, preserve> X<regexp, preserve>
 Preserve the string matched such that ${^PREMATCH}, ${^MATCH}, and
 ${^POSTMATCH} are available for use after matching.

+In Perl 5.20 and higher this is ignored. Due to a new copy-on-write
+mechanism, ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} will be available
+after the match regardless of the modifier.
+
 =item g and c
 X</g> X</c>

@@ -872,30 +876,35 @@ B<NOTE>: Failed matches in Perl do not reset the match variables,
 which makes it easier to write code that tests for a series of more
 specific cases and remembers the best match.

-B<WARNING>: Once Perl sees that you need one of C<$&>, C<$`>, or
+B<WARNING>: If your code is to run on Perl 5.16 or earlier,
+beware that once Perl sees that you need one of C<$&>, C<$`>, or
 C<$'> anywhere in the program, it has to provide them for every
-pattern match. This may substantially slow your program. Perl
-uses the same mechanism to produce C<$1>, C<$2>, etc, so you also pay a
-price for each pattern that contains capturing parentheses. (To
-avoid this cost while retaining the grouping behaviour, use the
+pattern match. This may substantially slow your program.
+
+Perl uses the same mechanism to produce C<$1>, C<$2>, etc, so you also
+pay a price for each pattern that contains capturing parentheses.
+(To avoid this cost while retaining the grouping behaviour, use the
 extended regular expression C<(?: ... )> instead.) But if you never
 use C<$&>, C<$`> or C<$'>, then patterns I<without> capturing
 parentheses will not be penalized. So avoid C<$&>, C<$'>, and C<$`>
 if you can, but if you can't (and some algorithms really appreciate
 them), once you've used them once, use them at will, because you've
-already paid the price. As of 5.17.4, the presence of each of the three
-variables in a program is recorded separately, and depending on
-circumstances, perl may be able be more efficient knowing that only C<$&>
-rather than all three have been seen, for example.
+already paid the price.
 X<$&> X<$`> X<$'>

-As a workaround for this problem, Perl 5.10.0 introduces C<${^PREMATCH}>,
+Perl 5.16 introduced a slightly more efficient mechanism that notes
+separately whether each of C<$`>, C<$&>, and C<$'> have been seen, and
+thus may only need to copy part of the string. Perl 5.20 introduced a
+much more efficient copy-on-write mechanism which eliminates any slowdown.
+
+As another workaround for this problem, Perl 5.10.0 introduced C<${^PREMATCH}>,
 C<${^MATCH}> and C<${^POSTMATCH}>, which are equivalent to C<$`>, C<$&>
 and C<$'>, B<except> that they are only guaranteed to be defined after a
 successful match that was executed with the C</p> (preserve) modifier.
 The use of these variables incurs no global performance penalty, unlike
 their punctuation char equivalents, however at the trade-off that you
-have to tell perl when you want to use them.
+have to tell perl when you want to use them. As of Perl 5.20, these three
+variables are equivalent to C<$`>, C<$&> and C<$'>, and C</p> is ignored.
 X</p> X<p modifier>

 =head2 Quoting metacharacters
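
Reviewer note on the perlre changes above: the C</p> modifier and the C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}> variables can be exercised with a short script. This is an illustrative sketch rather than text from the commit; the string and variable names are assumptions chosen to mirror the C<abc:def:ghi> example in perlvar. Keeping C</p> in the pattern makes the script behave the same on perls before and after this change.

```perl
use strict;
use warnings;

my $text   = "abcdefghi";
my $result = "";

# /p asks perl to preserve the match parts. Per the documentation change
# above, the flag is ignored from 5.20 on, but it is still accepted, so
# keeping it lets the same code run on 5.10 through 5.18 as well.
if ($text =~ /def/p) {
    $result = join ":", ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH};
}

print "$result\n";   # abc:def:ghi
```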

pod/perlreref.pod

Lines changed: 2 additions & 0 deletions
@@ -275,13 +275,15 @@ There is no quantifier C<{,n}>. That's interpreted as a literal string.
 ${^MATCH} Entire matched string
 ${^POSTMATCH} Everything after to matched string

+Note to those still using Perl 5.18 or earlier:
 The use of C<$`>, C<$&> or C<$'> will slow down B<all> regex use
 within your program. Consult L<perlvar> for C<@->
 to see equivalent expressions that won't cause slow down.
 See also L<Devel::SawAmpersand>. Starting with Perl 5.10, you
 can also use the equivalent variables C<${^PREMATCH}>, C<${^MATCH}>
 and C<${^POSTMATCH}>, but for them to be defined, you have to
 specify the C</p> (preserve) modifier on your regular expression.
+In Perl 5.20, the use of C<$`>, C<$&> and C<$'> makes no speed difference.

 $1, $2 ... hold the Xth captured expr
 $+ Last parenthesized pattern match
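
Reviewer note on the perlreref text above: the "equivalent expressions" it points to in perlvar are C<substr> calls built from the match offset arrays C<@-> and C<@+>. A small sketch follows, with a sample string and variable names that are assumptions made for the example. It never mentions C<$`>, C<$&> or C<$'>, so it avoids the penalty on older perls.

```perl
use strict;
use warnings;

my $x = "the cat caught the mouse";
my ($pre, $match, $post) = ("", "", "");

if ($x =~ /cat/) {
    # $-[0] is the offset where the whole match starts,
    # $+[0] is the offset just past where it ends.
    $pre   = substr($x, 0, $-[0]);                # same text as $`
    $match = substr($x, $-[0], $+[0] - $-[0]);    # same text as $&
    $post  = substr($x, $+[0]);                   # same text as $'
}

print "[$pre][$match][$post]\n";   # [the ][cat][ caught the mouse]
```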

pod/perlretut.pod

Lines changed: 9 additions & 3 deletions
@@ -900,7 +900,10 @@ of the string after the match. An example:

 In the second match, C<$`> equals C<''> because the regexp matched at the
 first character position in the string and stopped; it never saw the
-second 'the'. It is important to note that using C<$`> and C<$'>
+second 'the'.
+
+If your code is to run on Perl versions earlier than
+5.20, it is worthwhile to note that using C<$`> and C<$'>
 slows down regexp matching quite a bit, while C<$&> slows it down to a
 lesser extent, because if they are used in one regexp in a program,
 they are generated for I<all> regexps in the program. So if raw
@@ -913,8 +916,11 @@ C<@+> instead:
 $' is the same as substr( $x, $+[0] )

 As of Perl 5.10, the C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}>
-variables may be used. These are only set if the C</p> modifier is present.
-Consequently they do not penalize the rest of the program.
+variables may be used. These are only set if the C</p> modifier is
+present. Consequently they do not penalize the rest of the program. In
+Perl 5.20, C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}> are available
+whether the C</p> has been used or not (the modifier is ignored), and
+C<$`>, C<$'> and C<$&> do not cause any speed difference.

 =head2 Non-capturing groupings

pod/perlvar.pod

Lines changed: 22 additions & 10 deletions
@@ -783,7 +783,7 @@ we have not made another match:
 $1 is Mutt; $2 is Jeff
 $1 is Wallace; $2 is Grommit

-Due to an unfortunate accident of Perl's implementation, C<use
+If you are using Perl v5.18 or earlier, note that C<use
 English> imposes a considerable performance penalty on all regular
 expression matches in a program because it uses the C<$`>, C<$&>, and
 C<$'>, regardless of whether they occur in the scope of C<use
@@ -800,6 +800,9 @@ Since Perl v5.10.0, you can use the C</p> match operator flag and the
 C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> variables instead
 so you only suffer the performance penalties.

+If you are using Perl v5.20.0 or higher, you do not need to worry about
+this, as the three naughty variables are no longer naughty.
+
 =over 8

 =item $<I<digits>> ($1, $2, ...)
@@ -822,7 +825,8 @@ The string matched by the last successful pattern match (not counting
 any matches hidden within a BLOCK or C<eval()> enclosed by the current
 BLOCK).

-The use of this variable anywhere in a program imposes a considerable
+In Perl v5.18 and earlier, the use of this variable
+anywhere in a program imposes a considerable
 performance penalty on all regular expression matches. To avoid this
 penalty, you can extract the same substring by using L</@->. Starting
 with Perl v5.10.0, you can use the C</p> match flag and the C<${^MATCH}>
@@ -836,9 +840,11 @@ Mnemonic: like C<&> in some editors.
 X<${^MATCH}>

 This is similar to C<$&> (C<$MATCH>) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
+performance penalty associated with that variable.
+In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
+C<${^MATCH}> does the same thing as C<$MATCH>.

 This variable was added in Perl v5.10.0.

@@ -853,7 +859,8 @@ The string preceding whatever was matched by the last successful
 pattern match, not counting any matches hidden within a BLOCK or C<eval>
 enclosed by the current BLOCK.

-The use of this variable anywhere in a program imposes a considerable
+In Perl v5.18 and earlier, the use of this variable
+anywhere in a program imposes a considerable
 performance penalty on all regular expression matches. To avoid this
 penalty, you can extract the same substring by using L</@->. Starting
 with Perl v5.10.0, you can use the C</p> match flag and the
@@ -868,9 +875,11 @@ Mnemonic: C<`> often precedes a quoted string.
 X<$`> X<${^PREMATCH}>

 This is similar to C<$`> ($PREMATCH) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
+performance penalty associated with that variable.
+In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
+C<${^PREMATCH}> does the same thing as C<$PREMATCH>.

 This variable was added in Perl v5.10.0

@@ -889,7 +898,8 @@ enclosed by the current BLOCK). Example:
 /def/;
 print "$`:$&:$'\n"; # prints abc:def:ghi

-The use of this variable anywhere in a program imposes a considerable
+In Perl v5.18 and earlier, the use of this variable
+anywhere in a program imposes a considerable
 performance penalty on all regular expression matches.
 To avoid this penalty, you can extract the same substring by
 using L</@->. Starting with Perl v5.10.0, you can use the C</p> match flag
@@ -904,9 +914,11 @@ Mnemonic: C<'> often follows a quoted string.
 X<${^POSTMATCH}> X<$'> X<$POSTMATCH>

 This is similar to C<$'> (C<$POSTMATCH>) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
+performance penalty associated with that variable.
+In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
+C<${^POSTMATCH}> does the same thing as C<$POSTMATCH>.

 This variable was added in Perl v5.10.0.

t/re/pat_rt_report.t

Lines changed: 0 additions & 1 deletion
@@ -1147,7 +1147,6 @@ EOP

 {
 # [perl #4289] First mention $& after a match
-local $::TODO = "these tests fail without Copy-on-Write enabled";
 fresh_perl_is(
 '$_ = "abc"; /b/g; $_ = "hello"; print eval q|$&|, "\n"',
 "b\n", {}, '$& first mentioned after match');

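
Reviewer note on the t/re/pat_rt_report.t change: the un-TODO'd test covers the case where C<$&> is first mentioned only after the match has run and the matched string has been overwritten. The standalone sketch below reproduces that scenario outside the test harness. It is an approximation of the test, not the test itself, and the variable name C<$seen> is made up. Per the test's expectation, a perl with copy-on-write enabled should report C<b>; on a perl built without it the result may differ, which is what the removed TODO marker recorded.

```perl
use strict;
use warnings;

$_ = "abc";
/b/g;                    # successful match against "abc"
$_ = "hello";            # overwrite the string that was matched

# The string eval is compiled only now, so this is the first time
# perl sees any mention of the whole-match variable in this program.
my $seen = eval q|$&|;

print defined $seen ? "$seen\n" : "(undef)\n";
```

Using a string eval is the point of the exercise: the variable is not visible to perl when the match itself is compiled and executed.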