-
Notifications
You must be signed in to change notification settings - Fork 577
Smoke me/fix caller issues #18568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Smoke me/fix caller issues #18568
Conversation
I wanted to add some demonstration of the bug this PR fixes. Consider the following program:
This produces the following output:
So somehow we end up with a set of stack traces from the same code path, taken at different depths, and each one is different, and the data they show is clearly wrong. lib/Mod5.pm could not be source the second deepest call frame. There is a clear inconsistency, the frame before shows that a "use Mod0;" call was executed, yet the next frame think it was called from lib/Mod5.pm line 2, which is obviously wrong, and very misleading to someone not familiar with this bug. Now, some may argue exactly what is correct here. I have seen some discussion about what type of lie to tell about these frames, since they do not strictly speaking come from code directly and I do not stake our a position here beyond that I think what this patch does is a reasonable thing. However, what I am really concerned with and pretty insistent on is that the current behavior is not correct. A stack trace is supposed to show the path through the code. It cannot be that taking the Kth step on the path changes the record of what previous steps were. Sure, there is useful information interspersed with this junk data, but IMO we would better not showing anything than showing what we do, and IMO Dave M's patch is an even better solution. That does not mean that there is not an alternative better solution, but I think there cannot be any debate that the current behavior is buggy and confusing. Compare the above mess with what the patch produces:
We now have clarity and consistency. The bottom of the stack stays consistent as we go deeper into the dependency tree. This is more or less what I would expect to see and I expect anyone reading it would understand (with a bit of doc reading) what it was showing. |
In case it isnt obvious consider the following modification of the test code:
With this PR applied:
Without this PR:
I think it incumbent on anyone objecting to this PR to either provide an alternative solution, or provide a compelling reason why this test isn't reasonable. Surely we all accept that the stack should stay consistent as we go deeper as we look at it at different levels? It seems to me to be absurd to argue otherwise. |
Furthermore on this I checked into the failure in DBIx::Class. It would seem that t/53lean_startup.t uses a set of heuristics (which aren't entirely clear to me) to notice what plug ins are being loaded. The old behavior of caller() meant that those heuristics started noticing that Class::C3::Componentised is loading a module, MRO::Compat, which it has been loading all along. The solution is pretty simple:
After applying this patch the test pass. It seems to me that the reality is the test code is making faulty assumptions in the first place since it seems it failed to notice this before, even though it was indeed being loaded. The stack frame counter variable $up changes from 3 to 13 in the old to new code because in the old code it thinks the BEGIN block is called from main, but it not, the BEGIN is executed in package Class::C3::Componentized, so with a fixed caller() the heuristic no longer gets confused and notices what it should have noticed in the first place. I dont really understand the strategy employed in this test file. If i was testing this I wouldn't do any of this crazy stack walking. I would just track what was required() and then later check to see that it doesn't contain anything I don't expect, without any heuristics about the stack. The stack contains unreliable data and the code doesn't detect when that data is not reliable nor do anything about it. It is a very error prone strategy. GIGO. |
If it is required, I could check how it works on recent perl's. |
Sorr
On Fri, 16 Sept 2022, 23:40 James E Keenan, ***@***.***> wrote:
@demerphq <https://github.com/demerphq> is this PR closable? It looks
like the collective unreversion has since been applied by f6387cf
<f6387cf>
.
@demerphq <https://github.com/demerphq>, as @hvds
<https://github.com/hvds> asked: Is this p.r. closable? If it should
remain open, can you resolve merge conflicts so @iabyn
<https://github.com/iabyn> can
Sorry i won't be able to review this properly until Thursday. If anyone
wants to do their own check please do. I thought this was all merged
already tho...
Yves
… |
That would be appreciated, as it would permit us, once blead is modified, to consider a backport to the maintenance branch of 5.36 (and maybe 5.34). |
I am closing this ticket. The fix was finally applied in the 5.35.x and released in 5.36. Thanks for the report confirming this @KES777 |
There are and have been multiple bug reports over the years related to caller() showing bogus data about require call frames.
For example issue #15109 and I believe #4125 and someewhere there is one I created as well. I am sure if we dig we will find more. The issue comes from using a single global variable to store the file and line data for the "fake" parts of the expansion of
use Foo;
into
BEGIN{ require Foo; }
Using one variable meant that certain exceptions were correct, but walking the callstack would end up pointers to the same data structure from multiple distinct frames in the stack, thus making every require in the call stack look as though there were "from" the same line and file, that of the most recent require. For the exact same code, depending on where in the call stack an exception is thrown mutliple frames will appear to change their origin.
This is not and cannot be correct. A given BEGIN block fires only once, it cannot appear in the callstack more than one, and it cannot be recursive.
Dave M then fixed the problem with various patches. These fixes then broke modules which had come to depend on the buggy implementation and the patches were reverted at the time. This PR reverts those reverts. The code which depends on the buggy stack data should be fixed so the rest of us have sane callstacks to debug our code with.
See Issue #17663 for details on the revert. I believe the statement contained in #15109 outlines the problem sufficiently. See also the test in the PR.