@vtjnash commented May 30, 2025

A bit of hacking to get back close to the previous performance: use the GlobalRef to optimize the `getglobal` lookup for now, and avoid the extra Vararg function indirection, which forced some extra boxing and lookups.

    julia> @btime foo(1.5)
      22.892 ns (1 allocation: 16 bytes)  # v1.11
     141.543 ns (3 allocations: 48 bytes) # master
      38.759 ns (2 allocations: 32 bytes) # PR
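
The definition of `foo` is not shown in this description (it lives in the linked issue), so the following is only a sketch of the kind of call being measured. `foo_sketch` is a hypothetical stand-in, assuming a function that calls `Base.sin` in the latest world; it also shows that a `GlobalRef` and a dynamic `getglobal` lookup name the same binding.

```julia
# Hypothetical stand-in for the benchmarked function; the real `foo`
# is not part of this PR description.
foo_sketch(x) = Base.invokelatest(Base.sin, x)

# A GlobalRef records the module and name of a binding up front ...
ref = GlobalRef(Base, :sin)

# ... whereas getglobal resolves the binding from a module and a symbol
# when it is called.
f_dynamic = getglobal(Base, :sin)
f_via_ref = getglobal(ref.mod, ref.name)

# Both routes resolve to the same function object.
println(f_dynamic === f_via_ref)
println(foo_sketch(1.5) == sin(1.5))
```

Timings and allocation counts for a sketch like this will differ from the numbers above, since they depend on the Julia version and on how `foo` is actually written.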

The remaining difference is split about equally between the new need to box the world counter value for `invoke_in_world` and the extra cost of scanning the partition table for `Base.sin` to find the current entry.
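
As a rough illustration of the first cost, the sketch below calls `Base.invoke_in_world` by hand with the current world counter. This is an assumption about what the mechanism looks like from user code, not the code path this PR changes; the variable names are illustrative.

```julia
# The world counter is a plain unsigned integer.
world = Base.get_world_counter()

# invoke_in_world takes the world as an ordinary argument, followed by the
# function and its arguments. Per the description above, passing the world
# value this way is where the extra boxing comes from.
y = Base.invoke_in_world(world, Base.sin, 1.5)

println(typeof(world))
println(y == sin(1.5))
```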

Fix #58334

@vtjnash added the "backport 1.12" label (Change should be backported to release-1.12) May 30, 2025
@vtjnash added the "merge me" label (PR is reviewed. Merge when all tests are passing) Jun 2, 2025
@KristofferC mentioned this pull request Jun 2, 2025 (58 tasks)
@JeffBezanson merged commit f12256b into master Jun 2, 2025 (8 checks passed)
@JeffBezanson deleted the jn/58334 branch June 2, 2025 18:45
@inkydragon removed the "merge me" label (PR is reviewed. Merge when all tests are passing) Jun 3, 2025
@KristofferC mentioned this pull request Jun 6, 2025 (60 tasks)
topolarity pushed a commit that referenced this pull request Jul 1, 2025
(cherry picked from commit f12256b)
@topolarity removed the "backport 1.12" label (Change should be backported to release-1.12) Jul 1, 2025
Linked issue: getglobal with binding partitions is much slower than a dynamic dispatch