Skip to content

Plug PVLV REGEXP body leak #19115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Conversation

nwc10
Copy link
Contributor

@nwc10 nwc10 commented Sep 11, 2021

This was leaking a body:

#!./perl -w

use strict;

my %h;

sub bar {
    my $r = qr/abc/;
    $_[0] = $$r;
};

bar($h{foo});

The leak has actually been present since 8d919b0 but was only exposed by tests added in 2017.

With this leak fixed, and the obscure leak in hv.c fixed, the testsuite is clean under -DPURIFY

PVLV scalars are used to implement various obscure features of perl, such as
elements in tied hashes and hash lookups used as function parameters. To
make these work, PVLVs must be capable of holding all scalar types. First
class REGEXPs are larger than PVLVs, hence there is special case code to
attach REGEXP bodies to PVLVs, for code that both dereferences a regex object
and then assigns that value to a PVLV scalar. This was first implemented in
Oct 2012 by commit 8d919b0:
    Allow regexp-to-pvlv assignment

and improved in July 2017 with commit df6b4bd:

It turns out that the implementation would leak a scalar body, for the
"normal" use case (of this obscure feature). This wasn't noticed for two
reasons

1) bodies are allocated from arenas, and arenas are correctly freed during
   full destruction.
2) the body is not leaked if the the PVLV scalar is then modified

The regression tests added by commit 8d919b0 were also testing
the corner case of concatenating to one of these values - ie that
sv_force_normal_flags() handled unwinding the special cases correctly.
To avoid code duplication, the tests added by that commit caused all PVLV
scalars to be passed to sv_force_normal_flags(), so didn't actually test
regular destruction of such scalars.

The tests added in Aug 2017 by commit 622c613:
    PVLV-as-REGEXP: avoid PVX double free

    With v5.27.2-30-gdf6b4bd, I changed the way that PVLVs store a regexp
    value (by making the xpv_len field point to a regexp struct). There was a
    bug in this, which caused the PVX buffer to be double freed.

    Several REGEXP SVs can share a PVX buffer. Only one of them will have a
    non-zero xpv_len field, and that SV gets to free the buffer.

    After the commit above, the non-zero xpv_len was triggering an extra free.

    This was showing up in smokes as failures in re/recompile.t when invoked
    with PERL_DESTRUCT_LEVEL=2 (which t/TEST does).

were actually the first to expose this bug, even though it had been present
since 2012.

The bug is only visible with a full leak test (eg valgrind --leak-check=full
or an ASAN build) with -DPURIFY (to disable arenas), hence why it hasn't
been noticed.
hv_common can perform read-only actions on PL_strtab, but not write actions.
The code that detects this and croaks had been just after the allocation of
a new HE *, and hence was leaking it. Re-order the code to avoid the leak.

The leak usually wasn't noticeable as HEs are allocated from arenas, and
arenas are correctly freed during full destruction. However, building with
-DPURIFY replaces arenas with individual allocations, making this leak
visible. It's unlikely to have been hit by any production code, but it was
causing leaks during the regression tests.

Also change embed.fnc so that S_new_HE's prototype is not declared under
-DPURIFY, as the static function itself is not defined in this case. This
fixes a compiler warning.
@nwc10 nwc10 requested review from khwilliamson and iabyn and removed request for khwilliamson September 11, 2021 15:26
@nwc10
Copy link
Contributor Author

nwc10 commented Sep 11, 2021

@iabyn it's not your leak, but you've looked at that regexp code more recently than me, so I hope you can spot if I did something stupid.

@nwc10 nwc10 changed the title Plug PVIV REGEXP body leak Plug PVLV REGEXP body leak Sep 11, 2021
@nwc10
Copy link
Contributor Author

nwc10 commented Sep 23, 2021

For reference/to avoid duplication - there's been discussion about two commits on the branch that this PR references over on the PR #19140

@nwc10
Copy link
Contributor Author

nwc10 commented Sep 26, 2021

These commits were subsumed into #19140

@nwc10 nwc10 closed this Sep 26, 2021
@nwc10 nwc10 deleted the plug-PVLV-REGEXP-body-leak branch September 26, 2021 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant