Commit 3d2ba98

author
Zefram
committed
better documentation of reference counts
1 parent 6cc7638 commit 3d2ba98

1 file changed

+98
-61
lines changed

pod/perlguts.pod

Lines changed: 98 additions & 61 deletions
@@ -798,68 +798,116 @@ Perl uses a reference count-driven garbage collection mechanism. SVs,
798798
AVs, or HVs (xV for short in the following) start their life with a
799799
reference count of 1. If the reference count of an xV ever drops to 0,
800800
then it will be destroyed and its memory made available for reuse.
801-
802-
This normally doesn't happen at the Perl level unless a variable is
803-
undef'ed or the last variable holding a reference to it is changed or
804-
overwritten. At the internal level, however, reference counts can be
805-
manipulated with the following macros:
801+
At the most basic internal level, reference counts can be manipulated
802+
with the following macros:
806803

807804
int SvREFCNT(SV* sv);
808805
SV* SvREFCNT_inc(SV* sv);
809806
void SvREFCNT_dec(SV* sv);
810807

811-
However, there is one other function which manipulates the reference
812-
count of its argument. The C<newRV_inc> function, you will recall,
813-
creates a reference to the specified argument. As a side effect,
814-
it increments the argument's reference count. If this is not what
815-
you want, use C<newRV_noinc> instead.
816-
817-
For example, imagine you want to return a reference from an XSUB function.
818-
Inside the XSUB routine, you create an SV which initially has a reference
819-
count of one. Then you call C<newRV_inc>, passing it the just-created SV.
820-
This returns the reference as a new SV, but the reference count of the
821-
SV you passed to C<newRV_inc> has been incremented to two. Now you
822-
return the reference from the XSUB routine and forget about the SV.
823-
But Perl hasn't! Whenever the returned reference is destroyed, the
824-
reference count of the original SV is decreased to one and nothing happens.
825-
The SV will hang around without any way to access it until Perl itself
826-
terminates. This is a memory leak.
827-
828-
The correct procedure, then, is to use C<newRV_noinc> instead of
829-
C<newRV_inc>. Then, if and when the last reference is destroyed,
830-
the reference count of the SV will go to zero and it will be destroyed,
831-
stopping any memory leak.
808+
(There are also suffixed versions of the increment and decrement macros,
809+
for situations where the full generality of these basic macros can be
810+
exchanged for some performance.)
811+
812+
However, the way a programmer should think about references is not so
813+
much in terms of the bare reference count, but in terms of I<ownership>
814+
of references. A reference to an xV can be owned by any of a variety
815+
of entities: another xV, the Perl interpreter, an XS data structure,
816+
a piece of running code, or a dynamic scope. An xV generally does not
817+
know what entities own the references to it; it only knows how many
818+
references there are, which is the reference count.
819+
820+
To correctly maintain reference counts, it is essential to keep track
821+
of what references the XS code is manipulating. The programmer should
822+
always know where a reference has come from and who owns it, and be
823+
aware of any creation or destruction of references, and any transfers
824+
of ownership. Because ownership isn't represented explicitly in the xV
825+
data structures, only the reference count need be actually maintained
826+
by the code, and that means that this understanding of ownership is not
827+
actually evident in the code. For example, transferring ownership of a
828+
reference from one owner to another doesn't change the reference count
829+
at all, so may be achieved with no actual code. (The transferring code
830+
doesn't touch the referenced object, but does need to ensure that the
831+
former owner knows that it no longer owns the reference, and that the
832+
new owner knows that it now does.)
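
The point about transfers needing no reference-count code can be sketched with a toy model. Everything here (C<ToyXV>, C<ToyOwner>, C<toy_transfer>, C<toy_share>) is a hypothetical stand-in invented for illustration, not part of the Perl API; it only assumes that a value carries a count and that an owner holds one counted reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not Perl API: a value that carries only a
 * reference count, and an owner holding at most one counted reference. */
typedef struct { int refcnt; } ToyXV;
typedef struct { ToyXV *held; } ToyOwner;

/* Transfer: the reference moves, so the count is untouched. The only
 * work is the bookkeeping about who owns it. */
static void toy_transfer(ToyOwner *from, ToyOwner *to)
{
    assert(from->held != NULL && to->held == NULL);
    to->held = from->held;
    from->held = NULL;          /* former owner no longer owns it */
}

/* Share: a second reference comes into existence, so the count rises. */
static void toy_share(const ToyOwner *from, ToyOwner *to)
{
    assert(from->held != NULL && to->held == NULL);
    from->held->refcnt++;
    to->held = from->held;
}

/* Count observed after handing the sole reference to a new owner. */
static int toy_count_after_transfer(void)
{
    ToyXV xv = { 1 };
    ToyOwner a = { &xv }, b = { NULL };
    toy_transfer(&a, &b);
    return xv.refcnt;
}

/* Count observed after giving a second owner its own reference. */
static int toy_count_after_share(void)
{
    ToyXV xv = { 1 };
    ToyOwner a = { &xv }, b = { NULL };
    toy_share(&a, &b);
    return xv.refcnt;
}
```

In this sketch the transfer leaves the count at 1 while sharing raises it to 2. The difference between the two lives entirely in which owner is recorded as holding the reference, which is the part real XS code usually keeps only in the programmer's head.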
833+
834+
An xV that is visible at the Perl level should not become unreferenced
835+
and thus be destroyed. Normally, an object will only become unreferenced
836+
when it is no longer visible, often by the same means that makes it
837+
invisible. For example, a Perl reference value (RV) owns a reference to
838+
its referent, so if the RV is overwritten that reference gets destroyed,
839+
and the no-longer-reachable referent may be destroyed as a result.
840+
841+
Many functions have some kind of reference manipulation as
842+
part of their purpose. Sometimes this is documented in terms
843+
of ownership of references, and sometimes it is (less helpfully)
844+
documented in terms of changes to reference counts. For example, the
845+
L<newRV_inc()|perlapi/newRV_inc> function is documented to create a new RV
846+
(with reference count 1) and increment the reference count of the referent
847+
that was supplied by the caller. This is best understood as creating
848+
a new reference to the referent, which is owned by the created RV,
849+
and returning to the caller ownership of the sole reference to the RV.
850+
The L<newRV_noinc()|perlapi/newRV_noinc> function instead does not
851+
increment the reference count of the referent, but the RV nevertheless
852+
ends up owning a reference to the referent. It is therefore implied
853+
that the caller of C<newRV_noinc()> is relinquishing a reference to the
854+
referent, making this conceptually a more complicated operation even
855+
though it does less to the data structures.
856+
857+
For example, imagine you want to return a reference from an XSUB
858+
function. Inside the XSUB routine, you create an SV which initially
859+
has just a single reference, owned by the XSUB routine. This reference
860+
needs to be disposed of before the routine is complete, otherwise it
861+
will leak, preventing the SV from ever being destroyed. So to create
862+
an RV referencing the SV, it is most convenient to pass the SV to
863+
C<newRV_noinc()>, which consumes that reference. Now the XSUB routine
864+
no longer owns a reference to the SV, but does own a reference to the RV,
865+
which in turn owns a reference to the SV. The ownership of the reference
866+
to the RV is then transferred by the process of returning the RV from
867+
the XSUB.
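
The XSUB scenario above can be imitated in plain C without the Perl headers. This is a toy model under stated assumptions: C<ToySV>, C<toy_new>, C<toy_dec>, C<toy_newrv_inc> and C<toy_newrv_noinc> are hypothetical names that only mimic the documented reference-count effects of the real functions, and C<toy_live> is an invented counter used to observe leaks.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy value, not a real SV. */
typedef struct ToySV {
    int refcnt;
    struct ToySV *referent;   /* non-NULL when this value plays the RV role */
} ToySV;

static int toy_live = 0;      /* toy values created and not yet destroyed */

static ToySV *toy_new(void)
{
    ToySV *sv = malloc(sizeof *sv);
    if (sv == NULL)
        abort();
    sv->refcnt = 1;           /* the caller owns the sole reference */
    sv->referent = NULL;
    toy_live++;
    return sv;
}

/* Destroy one reference; at zero, free the value and release what it owns. */
static void toy_dec(ToySV *sv)
{
    assert(sv->refcnt > 0);
    if (--sv->refcnt == 0) {
        ToySV *referent = sv->referent;
        free(sv);
        toy_live--;
        if (referent != NULL)
            toy_dec(referent);
    }
}

/* Mimics newRV_noinc: the new RV takes over the caller's reference. */
static ToySV *toy_newrv_noinc(ToySV *referent)
{
    ToySV *rv = toy_new();
    rv->referent = referent;
    return rv;
}

/* Mimics newRV_inc: the new RV gets its own, additional reference. */
static ToySV *toy_newrv_inc(ToySV *referent)
{
    referent->refcnt++;
    return toy_newrv_noinc(referent);
}

/* Values left alive after the returned RV is eventually dropped. */
static int toy_leaked_with_inc(void)
{
    int before = toy_live;
    ToySV *sv = toy_new();
    ToySV *rv = toy_newrv_inc(sv);  /* routine still owns its reference */
    int leaked;
    toy_dec(rv);                    /* recipient later drops the RV */
    leaked = toy_live - before;     /* sv survives with refcnt 1 */
    toy_dec(sv);                    /* tidy up so the demo itself is clean */
    return leaked;
}

static int toy_leaked_with_noinc(void)
{
    int before = toy_live;
    ToySV *sv = toy_new();
    ToySV *rv = toy_newrv_noinc(sv);  /* consumes the routine's reference */
    toy_dec(rv);                      /* dropping the RV frees sv as well */
    return toy_live - before;
}
```

With the incrementing variant the routine's own reference is never disposed of, so one value outlives the RV; with the non-incrementing variant nothing is left behind. A routine that used the incrementing variant would have to release its own reference separately before returning.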
832868

833869
There are some convenience functions available that can help with the
834870
destruction of xVs. These functions introduce the concept of "mortality".
835-
An xV that is mortal has had its reference count marked to be decremented,
836-
but not actually decremented, until "a short time later". Generally the
837-
term "short time later" means a single Perl statement, such as a call to
838-
an XSUB function. The actual determinant for when mortal xVs have their
839-
reference count decremented depends on two macros, SAVETMPS and FREETMPS.
840-
See L<perlcall> and L<perlxs> for more details on these macros.
841-
842-
"Mortalization" then is at its simplest a deferred C<SvREFCNT_dec>.
843-
However, if you mortalize a variable twice, the reference count will
844-
later be decremented twice.
845-
846-
"Mortal" SVs are mainly used for SVs that are placed on perl's stack.
847-
For example an SV which is created just to pass a number to a called sub
848-
is made mortal to have it cleaned up automatically when it's popped off
849-
the stack. Similarly, results returned by XSUBs (which are pushed on the
850-
stack) are often made mortal.
851-
852-
To create a mortal variable, use the functions:
871+
Much documentation speaks of an xV itself being mortal, but this is
872+
misleading. It is really I<a reference to> an xV that is mortal, and it
873+
is possible for there to be more than one mortal reference to a single xV.
874+
For a reference to be mortal means that it is owned by the temps stack,
875+
one of perl's many internal stacks, which will destroy that reference
876+
"a short time later". Usually the "short time later" is the end of
877+
the current Perl statement. However, it gets more complicated around
878+
dynamic scopes: there can be multiple sets of mortal references hanging
879+
around at the same time, with different death dates. Internally, the
880+
actual determinant for when mortal xV references are destroyed depends
881+
on two macros, SAVETMPS and FREETMPS. See L<perlcall> and L<perlxs>
882+
for more details on these macros.
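
The idea of mortal references with different death dates can be sketched as a toy temps stack. This is only loosely modelled on the description above: C<ToyVal>, C<toy_2mortal>, C<toy_savetmps> and C<toy_freetmps> are hypothetical names, the fixed-size array is an assumption made for brevity, and the real macros differ in detail (for instance, in how the previous floor is saved and restored).

```c
#include <assert.h>

/* Hypothetical toy value, not a real xV: only the count is modelled. */
typedef struct { int refcnt; } ToyVal;

enum { TOY_TMPS_MAX = 16 };
static ToyVal *toy_tmps[TOY_TMPS_MAX];  /* references owned by the stack */
static int toy_tmps_ix = 0;             /* next free slot */
static int toy_tmps_floor = 0;          /* toy_freetmps() stops here */

/* The caller gives up one reference; the temps stack now owns it. */
static ToyVal *toy_2mortal(ToyVal *v)
{
    assert(toy_tmps_ix < TOY_TMPS_MAX);
    toy_tmps[toy_tmps_ix++] = v;
    return v;
}

/* Start a new set of mortals; returns the old floor for later restoring. */
static int toy_savetmps(void)
{
    int old_floor = toy_tmps_floor;
    toy_tmps_floor = toy_tmps_ix;
    return old_floor;
}

/* Destroy every mortal reference created since the current floor. */
static void toy_freetmps(void)
{
    while (toy_tmps_ix > toy_tmps_floor) {
        ToyVal *v = toy_tmps[--toy_tmps_ix];
        assert(v->refcnt > 0);
        v->refcnt--;        /* a real value would be destroyed at zero */
    }
}

typedef struct {
    int inner_after_inner;  /* inner count after the inner set is freed */
    int outer_after_inner;  /* outer count at that same moment */
    int outer_after_outer;  /* outer count after the outer set is freed */
} ToyResult;

static ToyResult toy_nested_demo(void)
{
    ToyVal outer = { 1 }, inner = { 1 };
    ToyResult r;
    int saved_floor;

    toy_tmps_ix = toy_tmps_floor = 0;   /* start from an empty stack */

    toy_2mortal(&outer);                /* dies with the outer set */
    saved_floor = toy_savetmps();
    toy_2mortal(&inner);                /* dies with the inner set */
    toy_freetmps();
    r.inner_after_inner = inner.refcnt;
    r.outer_after_inner = outer.refcnt;

    toy_tmps_floor = saved_floor;       /* leave the inner scope */
    toy_freetmps();
    r.outer_after_outer = outer.refcnt;
    return r;
}

/* Two mortal references to one value: both are destroyed later. */
static int toy_double_mortal_demo(void)
{
    ToyVal v = { 2 };                   /* caller owns two references */
    toy_tmps_ix = toy_tmps_floor = 0;
    toy_2mortal(&v);
    toy_2mortal(&v);
    toy_freetmps();
    return v.refcnt;
}
```

In this sketch the inner mortal reference is destroyed at the first C<toy_freetmps()> while the outer one survives until the floor is restored, and a value mortalised twice loses two references.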
883+
884+
Mortal references are mainly used for xVs that are placed on perl's
885+
main stack. The stack is problematic for reference tracking, because it
886+
contains a lot of xV references, but doesn't own those references: they
887+
are not counted. Currently, there are many bugs resulting from xVs being
888+
destroyed while referenced by the stack, because the stack's uncounted
889+
references aren't enough to keep the xVs alive. So when putting an
890+
(uncounted) reference on the stack, it is vitally important to ensure that
891+
there will be a counted reference to the same xV that will last at least
892+
as long as the uncounted reference. But it's also important that that
893+
counted reference be cleaned up at an appropriate time, and not unduly
894+
prolong the xV's life. For there to be a mortal reference is often the
895+
best way to satisfy this requirement, especially if the xV was created
896+
specifically to be put on the stack and would otherwise be unreferenced.
897+
898+
To create a mortal reference, use the functions:
853899

854900
SV* sv_newmortal()
855-
SV* sv_2mortal(SV*)
856901
SV* sv_mortalcopy(SV*)
902+
SV* sv_2mortal(SV*)
857903

858-
The first call creates a mortal SV (with no value), the second converts an existing
859-
SV to a mortal SV (and thus defers a call to C<SvREFCNT_dec>), and the
860-
third creates a mortal copy of an existing SV.
861-
Because C<sv_newmortal> gives the new SV no value, it must normally be given one
862-
via C<sv_setpv>, C<sv_setiv>, etc. :
904+
C<sv_newmortal()> creates an SV (with the undefined value) whose sole
905+
reference is mortal. C<sv_mortalcopy()> creates an xV whose value is a
906+
copy of a supplied xV and whose sole reference is mortal. C<sv_2mortal()>
907+
mortalises an existing xV reference: it transfers ownership of a reference
908+
from the caller to the temps stack. Because C<sv_newmortal> gives the new
909+
SV no value, it must normally be given one via C<sv_setpv>, C<sv_setiv>,
910+
etc. :
863911

864912
SV *tmp = sv_newmortal();
865913
sv_setiv(tmp, an_integer);
@@ -868,17 +916,6 @@ As that is multiple C statements it is quite common to see this idiom instead:
868916

869917
SV *tmp = sv_2mortal(newSViv(an_integer));
870918

871-
872-
You should be careful about creating mortal variables. Strange things
873-
can happen if you make the same value mortal within multiple contexts,
874-
or if you make a variable mortal multiple
875-
times. Thinking of "Mortalization"
876-
as deferred C<SvREFCNT_dec> should help to minimize such problems.
877-
For example if you are passing an SV which you I<know> has a high enough REFCNT
878-
to survive its use on the stack you need not do any mortalization.
879-
If you are not sure then doing an C<SvREFCNT_inc> and C<sv_2mortal>, or
880-
making a C<sv_mortalcopy> is safer.
881-
882919
The mortal routines are not just for SVs; AVs and HVs can be
883920
made mortal by passing their address (cast to C<SV*>) to the
884921
C<sv_2mortal> or C<sv_mortalcopy> routines.
