Skip to content

Add optimization for simple multideref function calls #20991

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

atrodo
Copy link
Contributor

@atrodo atrodo commented Mar 30, 2023

As an experiment and research into the perl internals, I am attempting to add an optimization for simple, but somewhat common, functions that simply return values from the first argument. This draft PR represents my efforts so far, but I am running into issues with my poor understanding of how the stack operates.

The optimization looks for functions that are of the form { return $_[0]->{X} }, marks them as a multideref accessor, and then runs the mutlideref instead of creating a stack frame and calling the body of the function. It is currently disabled unless the PERL_MULTIDEREF_ACC environment variable has a non-zero number. Below is script that exercises the optimization when enabled; currently functions x and n are optimized, but I am hoping to get all the functions, except for new, to be optimizable.

use strict;
use warnings;

my $n = bless {};

package Point {
    sub new {
        return bless { x => rand, n => $n }, 'Point';
    }
    sub x {
        return $_[0]->{x};
    }
    sub y {
        my $self = shift;
        return $self->{x};
    }
    sub z {
        1;
        return $_[0]->{x};
    }
    sub n  { return $_[0]->{n} }
    sub nn { 1; return $_[0]->{n} }
    sub s  { $_[0]->{s} = $_[1] }
    sub m {
        return $_[0]->{m}
          if @_ > 1;
        $_[0]->{m} = $_[1];
    }
}

my $a = Point->new;
warn $a->n;
warn $a->{x};
warn $a->x;
warn $a->n->x;

__END__
use Benchmark qw/cmpthese/;

cmpthese(
    30_000_000,
    {
        'multideref'   => sub { $a->x },
        'static_fn'    => sub { Point::x($a) },
        'two_step'     => sub { $a->y },
        'const_prefix' => sub { $a->z },
        'direct'       => sub { $a->{x} },
    }
);

There are several things that I'm unsure about, specifically the location of the optimization since no optimization is in peep although that felt the most appropriate place, the addition of the multideref accessor flag on control ops, the global static AV optimization I used (which I know I need to figure out), and generally if I am doing/structuring things right, even though it mostly works.

However, the main issue I'm having is that, in the script above, calling $a->n->x causes a Can't call method "x" on an undefined value error even though $a->n returns the correct object. I suspect its because I'm putting the stack into a unexpected state, but I don't know how the stack works well enough in order to know what the expected state is suppose to be. Any assistance in understanding what I am doing wrong would be appreciated.

@iabyn
Copy link
Contributor

iabyn commented Mar 30, 2023 via email

@atrodo
Copy link
Contributor Author

atrodo commented Mar 31, 2023

While we try and encourage new contributors, I'm not minded to devote more
than the few minutes I've already spent trying to review your work because
you have made it hard to do so.

I am very sorry, I did not mean to cause any frustration on anyone's part. I created this as a draft PR in GitHub and thought about it in those terms, but I see now that I could have done a lot more to make this easier on others. This optimisation and PR are part of an educational process for me, I should have made that clearer. I am in the process of trying to understand more of the perl internals personally, and was using this idea as a way to do such. I know that this optimisation has a long way to go before before it could be included, if its viable at all.

I greatly appreciate and value the amount of time you have taken to respond and as such will keep my responses short.

I think it works by marking candidate optimisible subs (ones
consisting of just a single multideref op) by adding some info in the
sub's first NEXTSTATE op.

Then in pp_entersub, you've got some runtime code which, when about to
call a perl (as opposed to XS) sub, checks whether the sub has this info
in the NS op, and instead calls the body of of pp_multideref, which has
been extracted from the pp function as a static sub.

Is this right?

Your understanding of the operation is correct.

First, some major observations.

Optimisations like this makes me extremely twitchy. It potentially slows
down every sub call and every multideref, and it's not clear whether
that slowdown is worth it for any possible speed up when calling
accessor-type methods. Ad an aside, I can't see the results of
benchmarking in your email.

I agree, I am not yet sure this optimization has any value. I am currently working through this optimisation, asking the question, personally, if this is even viable. You have some great questions and I don't have answers to them at this point. I will make sure I can answer these questions and concerns in any future PRs.

I intentionally did not add any benchmarking numbers as I have no faith in whatever results I do have. Since the code currently breaks some normal function calls, I am unable to conduct any good benchmarks yet.

Other observations:

Setting an environment variable isn't the usual way to enable an
optimisation. In perl it's usually always enabled (or via a build option
if its new and controversial). It absolutely should not be used at
run-time - i.e. being checked for by every sub call.

The environment variable is a side effect of my process: without being able to selectively disable the optimsation, make cannot finish. It is not a permanent fixture.

You say you're confused by the stack, but it's difficult to help as you
don't make it clear what areas of it confuse you. In general, perl ops
expect pointers to their args on the stack, pop them off, and pop their
results onto the stack. So pp_add() pops two args off the stack and pushes
one result. List ops expect a variable number of args, the base of which
is indicated by the top entry on the markstack.

I appreciate the explanation. The actual issue I am having trouble with is around the actual calling of S_multideref. My understanding of the code that I have right now is that it clears the stack (SP = MARK) and closes the arguments list (PUTBACK) before calling S_multideref. The ordering of those doesn't make sense to me, but swapping the order of them means the argument stack is not cleared. In simple cases like $object->accessor, the result ends up on the stack. In the slightly more complex case of $object->accessor->accessor, undef ends up on the stack instead. I suspect it's because of my misunderstanding of how these macros interact with each other. I will investigate more.

It you want to indicate that the sub is optimised for a direct multideref
call, it would be better to set a CVf_ flag in the CV's CvFLAGS() field:
it would then be quicker for pp_entersub to check for this special case.

This is a good suggestion. The way it is structured now is because of I discovered how things work; the body of a CV is passed to the peephole optimizer, not the CV itself, and that was easier to use since all CV bodies are passed to it. A single if statement in pp_entersub would be better than what I have now, so I will aim for that.

For fine-grained benchmarks, you should be looking into adding tests to
t/perf/benchmarks and running Porting/bench.pl on them.

I appreciate the tip, I will make sure to use that to validate if this optimsation creates a gain, not a general slowdown.

Again, I am very sorry about making this difficult. I should have done a better job of expressing the problem I was actually looking for help with. I will take all of your feedback and plan to close this PR in the near future.

@demerphq
Copy link
Collaborator

The environment variable is a side effect of my process: without being able to selectively disable the optimsation, make cannot finish. It is not a permanent fixture.

Add a var to intrpvar.h and then initialize it from the environment in perl_construct(). You can find similar logic for how hash seed initialization is done. Be aware that anything set up in intrpvar.h needs support in Perl_parser_dup() in sv.c.

@iabyn
Copy link
Contributor

iabyn commented Mar 31, 2023 via email

@atrodo
Copy link
Contributor Author

atrodo commented Apr 1, 2023

Thank you for that explanation, it is extremely helpful in understanding what is happening in both pp_multideref and pp_entersub. That will be very helpful as I continue my discovery.

@atrodo atrodo closed this Apr 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants