
Struct redefinition caching, to support "undo" workflow #59127


Draft
wants to merge 3 commits into
base: master

NHDaly
Member

@NHDaly NHDaly commented Jul 29, 2025

JuliaCon 2025 Hackathon Project!

This is to help improve the user experience of struct definition with Revise.
The focus of this change is to improve the following scenario, which could happen today on master, after Revise supports invalidation of methods and structs from struct changes:

  1. A user creates some complex test data of type X via a slow process. They run their tests, and eventually decide there's something to improve in a struct X.
  2. They change the definition of X in src/, then rerun their test.
  3. Revise will delete all of the methods defined on the old X, and reevaluate all the methods w/ the new X.
  4. The user gets a MethodError because no methods exist for typeof(test_data), @world(X, past...). Hopefully the error message explains that they need to recreate their test data.
  5. At this point: the user decides that it's not worth it to change the struct right now. They'll try this change in the future. So they UNDO THEIR CHANGES TO the struct X, hoping to resume testing.
  6. Now they try to run the test again, but the tests still fail since the old X doesn't exist, even though the restored definition of the struct X exactly matches the definition of typeof(test_data).

After this change, step 6 would succeed, since we will find the existing definition of typeof(test_data) and use that cached definition when evaluating the restored definition for X.
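The intended behavior can be sketched with a toy model. This is only an illustration under stated assumptions, not the PR's implementation: `TypeObject`, `Binding`, and `define!` are hypothetical names, and a real definition match involves far more than comparing field lists.

```julia
# Toy model only: TypeObject stands in for a DataType, and `history`
# stands in for the past definitions reachable from a binding.
mutable struct TypeObject
    name::Symbol
    fields::Vector{Pair{Symbol,Symbol}}   # field name => field type name
end

mutable struct Binding
    current::Union{TypeObject,Nothing}
    history::Vector{TypeObject}           # every definition evaluated so far
end

Binding() = Binding(nothing, TypeObject[])

# Evaluate a struct definition: reuse a past type object when the
# definition matches exactly, otherwise create a fresh one.
function define!(b::Binding, name::Symbol, fields::Vector{Pair{Symbol,Symbol}})
    for old in b.history
        if old.name == name && old.fields == fields
            b.current = old               # "undo": restore the cached type object
            return old
        end
    end
    fresh = TypeObject(name, fields)
    push!(b.history, fresh)
    b.current = fresh
    return fresh
end

b = Binding()
x_v1 = define!(b, :X, [:a => :Int])                  # original definition (type of test_data)
x_v2 = define!(b, :X, [:a => :Int, :b => :String])   # user edits the struct
x_v3 = define!(b, :X, [:a => :Int])                  # user undoes the edit

println(x_v2 !== x_v1)   # the edit produced a distinct type object
println(x_v3 === x_v1)   # the undo restored the original type object
```

In this model the test data created under `x_v1` becomes usable again after the undo, because the binding points back at the same object rather than at a new, structurally identical one.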


TODO:

  • Adjust lowering to get the binding and pass it through to typebody!, removing the current "hack" to find the binding from the module by name.
  • Make equiv_type a proper function declaration in julia-internal.h, with the normal naming conventions.

Co-Authored-By: @topolarity

@NHDaly NHDaly requested review from Keno, topolarity and timholy July 29, 2025 04:11
@Keno
Member

Keno commented Jul 29, 2025

The concept seems sound to me. However, there is a bit of a semantic complication with respect to the type parameters. The reason that lowering does this whole complicated thing in the first place is that it will try to use the old type parameters to facilitate the equivalence query. When I added the partitioning, I added some code to re-substitute the parameters from the old to the new. I think in order to make this work, you'll want to essentially do the opposite of that: scan all the existing partitions, check equiv_typedef, and if it matches, substitute in the variables from the old dt and then check field equivalence. I think that should work. Also note that there may be more than one past type that passes equiv_typedef, so you do need to check all of them.
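A rough sketch of that scan, using a symbolic toy model rather than real DataTypes and binding partitions. Everything here is an assumption made for illustration: `Definition`, `rename`, `same_header`, `same_fields`, and `find_cached` are hypothetical names, and `same_header` is only a loose stand-in for what equiv_typedef checks.

```julia
# Toy model: a definition is a name, a list of type-parameter names, and
# field types written as symbols or expressions over those parameters.
struct Definition
    name::Symbol
    params::Vector{Symbol}
    fieldnames::Vector{Symbol}
    fieldtypes::Vector{Any}
end

# Rename type parameters inside a field-type expression.
rename(x, sub) = x
rename(s::Symbol, sub) = get(sub, s, s)
rename(e::Expr, sub) = Expr(e.head, (rename(a, sub) for a in e.args)...)

# Header check (loose stand-in for equiv_typedef): same name and arity.
same_header(a::Definition, b::Definition) =
    a.name == b.name && length(a.params) == length(b.params)

# Field check after substituting the old definition's parameters into the new one.
function same_fields(old::Definition, new::Definition)
    sub = Dict(zip(new.params, old.params))
    old.fieldnames == new.fieldnames || return false
    return old.fieldtypes == [rename(t, sub) for t in new.fieldtypes]
end

# Scan every past definition. More than one may pass the header check,
# so keep scanning until the fields match as well.
function find_cached(past::Vector{Definition}, new::Definition)
    for old in Iterators.reverse(past)
        same_header(old, new) && same_fields(old, new) && return old
    end
    return nothing
end

past = [
    Definition(:Box, [:T], [:value], Any[:T]),
    Definition(:Box, [:T], [:value], Any[:(Vector{T})]),
]
# Same as the first past definition, but with the parameter spelled S.
restored = Definition(:Box, [:S], [:value], Any[:S])
unseen = Definition(:Box, [:S], [:value], Any[:Int])

println(find_cached(past, restored) === past[1])
println(find_cached(past, unseen) === nothing)
```

Both past definitions pass the header check here, which is why the loop cannot stop at the first header match.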

@Keno
Member

Keno commented Jul 29, 2025

  • Make equiv_type a proper function declaration in julia-internal.h, with the normal naming conventions.

Not required, since it's only used in one place. In fact, you'll be able to delete the _equiv_typedef intrinsic.

@timholy
Member

timholy commented Jul 29, 2025

We may need a mechanism to signal this outcome to Revise. The code that checks equivalence is currently in https://github.com/timholy/Revise.jl/blob/437e2d1ce846d6330be731cdc190e39818f1eeb7/src/lowered.jl#L452-L470 (not yet merged, part of timholy/Revise.jl#894). Would an alternative option be to have Revise do this analysis itself?

There might be one more complication: suppose the user performs the following sequence:

  • modify Foo
  • define at the REPL struct UsesFoo f::Foo end
  • edit Foo's definition back to the old version

So now UsesFoo is effectively dead until it gets redefined to use the old Foo, which Revise should do automatically. But now you're in an interesting situation of not being able to use world age to decide what's current: you can have either an old Foo and a new UsesFoo, or a new Foo and an old UsesFoo. What happens if you edit both in a single file change, which takes priority?

@Keno
Member

Keno commented Jul 29, 2025

Why does Revise care? If the type is not the same as the immediate prior one, it still needs to redefine all the methods.

@NHDaly
Member Author

NHDaly commented Jul 29, 2025

Yeah, I agree with Keno - I don't think Revise needs to care. From Revise's perspective, we have still redefined the struct. We've just revised it "back" to one that was defined in the past. But Revise still needs to do exactly the same things it would without this PR: delete all the invalidated methods, consts and structs, and re-evaluate them according to the new type. (It's just that that "new" type happens to have been one that julia has seen before, at some point in the past.)

@timholy
Member

timholy commented Jul 29, 2025

To be clearer, imagine the following baseline situation:

  • file1.jl: contains the definition of Foo
  • file2.jl: no struct definition

Now revise the definition of Foo and let Revise do its work.

Then in one go (without issuing a REPL command in between):

  • edit file1.jl back to the old definition
  • edit file2.jl to introduce UsesFoo

Revise does not have a clear definition of the order in which changes should be made. Suppose it initially revises file2.jl, in which case it will define UsesFoo in terms of the intermediate definition of Foo. But then it revises file1.jl and discovers the changed definition of Foo.

However, it's not 100% clear that the undo functionality makes this more complicated to handle. Revise probably could handle it with a dependency tree that ignores world-age. But circular definitions (which we allow) could make this a little harder.

@NHDaly
Member Author

NHDaly commented Jul 29, 2025

Interesting. I'm still not sure that I see the issue.

If you revise file1.jl first, then when you hit file2.jl, you will define UsesFoo in terms of the new definition (which points to an older Type object in memory).

Conversely, if you revise file2.jl first, you will define UsesFoo using the "current" Foo definition. Then, when you revise file1.jl, you will evaluate the new definition of Foo, which restores the Foo binding back to an older Type object in memory. Then, just as if you had made any change to Foo, you must now follow all the invalidations and re-evaluate UsesFoo again, too.

Indeed, I think that nothing about that process changes as a result of this PR.
From Revise's perspective, the work remains the same: Any time the source text changes for a struct, we must reevaluate that struct. If the output type has changed, we must follow all invalidations and reevaluate those methods, consts, and structs as well.
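To make the two orderings concrete, here is a toy simulation. It is a sketch under assumptions, not Revise's actual algorithm: `TypeObj`, `CACHE`, `CURRENT`, `evaluate!`, `revise!`, and `scenario` are all hypothetical, and cyclic definitions are not modeled.

```julia
# Toy model: a type object records its source text and the type objects it
# refers to. Evaluating an already-seen definition returns the cached object.
mutable struct TypeObj
    name::Symbol
    src::String
    deps::Vector{TypeObj}
end

const CACHE = TypeObj[]
const CURRENT = Dict{Symbol,TypeObj}()

function evaluate!(name::Symbol, src::String, depnames::Vector{Symbol}=Symbol[])
    deps = TypeObj[CURRENT[d] for d in depnames]
    for old in CACHE
        if old.name == name && old.src == src &&
           length(old.deps) == length(deps) &&
           all(a === b for (a, b) in zip(old.deps, deps))
            return CURRENT[name] = old
        end
    end
    obj = TypeObj(name, src, deps)
    push!(CACHE, obj)
    return CURRENT[name] = obj
end

# Apply the changed definitions in the given order, then follow
# invalidations until no struct refers to a stale type object.
function revise!(sources, order)
    for name in order
        src, deps = sources[name]
        evaluate!(name, src, deps)
    end
    changed = true
    while changed
        changed = false
        for (name, (src, deps)) in sources
            stale = any(CURRENT[name].deps[i] !== CURRENT[d] for (i, d) in enumerate(deps))
            if stale
                evaluate!(name, src, deps)
                changed = true
            end
        end
    end
end

function scenario(order)
    empty!(CACHE); empty!(CURRENT)
    foo_v1 = evaluate!(:Foo, "struct Foo; a::Int; end")
    evaluate!(:Foo, "struct Foo; a::Int; b::Int; end")        # intermediate edit
    sources = Dict(
        :Foo => ("struct Foo; a::Int; end", Symbol[]),         # file1.jl: edit undone
        :UsesFoo => ("struct UsesFoo; f::Foo; end", [:Foo]),   # file2.jl: new struct
    )
    revise!(sources, order)
    return CURRENT[:Foo] === foo_v1 && CURRENT[:UsesFoo].deps[1] === foo_v1
end

println(scenario([:Foo, :UsesFoo]))   # file1.jl revised first
println(scenario([:UsesFoo, :Foo]))   # file2.jl revised first
```

In this model both orders end in the same state: Foo is bound to the original type object, and UsesFoo refers to it. When file2.jl goes first, UsesFoo is evaluated once against the intermediate Foo and then re-evaluated by the invalidation pass.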

Does that sound right to you?

@timholy
Member

timholy commented Jul 29, 2025

Agreed, except for the possibility of cycles. However, I think these have to be internal to a single type. Example: #42297
