[opt] store properties of stack alloc_ref in tuple #30810
If possible, create a tuple that holds the stored properties of a class and replace as many uses of the class as possible with the tuple. Then store the tuple elements into the class properties. This allows both LLVM and the SIL optimizer to produce better codegen.
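As a rough source-level analogy for the transformation described above (the actual pass works on SIL instructions such as alloc_ref and ref_element_addr, not on Swift source), the sketch below contrasts a non-escaping class instance with a tuple holding the same stored properties. The `Point` type and both function names are hypothetical and exist only for illustration.

```swift
// Hypothetical class, used only to illustrate the idea.
final class Point {
    var x: Int
    var y: Int
    init(x: Int, y: Int) {
        self.x = x
        self.y = y
    }
}

// Before: the object never escapes this function, so its allocation
// is a candidate for stack promotion, but every property access still
// goes through the object's memory.
func sumWithClass() -> Int {
    let p = Point(x: 1, y: 2)
    p.x += 10
    return p.x + p.y
}

// After (conceptually): the stored properties live in a tuple, which
// the optimizer can split into plain values and fold.
func sumWithTuple() -> Int {
    var fields = (x: 1, y: 2)
    fields.x += 10
    return fields.x + fields.y
}
```

Both functions compute the same result; the second form simply has no object whose memory and lifetime the optimizer needs to reason about.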
@swift-ci please benchmark
* Ref element addrs that had uses outside "firstUnknonwUse" would still be replaced.
* Some strong releases need to be removed when erasing the alloc_ref.
@swift-ci please benchmark
@swift-ci please benchmark
Performance: -O
Code size: -O
The observer benchmarks seemed to show huge improvements in #30743, but I don't see them here. I'll see if I can figure out why. I think most of the regression comes from alloc_refs with tail elements. Those don't seem to benefit much, if at all, from this optimization and get hurt (as seen in the substring tests especially). I'll try skipping those and see what the impact is. #30812 and this patch together will be really great.
It seems like this may be causing some regressions without gains in any cases, so skipping those alloc_refs is beneficial.
@swift-ci please benchmark
@swift-ci please benchmark
Performance: -O
Code size: -O
Welp. I guess that wasn't the solution 😕 Sorry for wasting buildbot resources.
@eeckstein sorry for all the messages. What do you think of the direction here? If you think I should go ahead with this patch then I can look into the build errors / tests.
@zoecarver I think your optimization makes sense. I still have to review the changes in detail.
@eeckstein no rush. I'll add some more tests/clean it up a bit.
@zoecarver Your approach using the dominator tree is not working, because the dominance relation cannot be used to decide if one instruction is executed "before" another instruction.
All three blocks 'b', 'c' and 'd' are direct dominator children of 'a', which means the dominator tree is not telling you that 'd' is executed after 'b' and 'c'. Even sorting (in getOrderedNonDebugUses) will not work because the dominance relation is not a total order ('not a dom b' does not imply 'b dom a'). What you would need to solve this problem is a data flow analysis.
Thinking about it again, I realized that we already have this kind of optimization: it's redundant load elimination and dead store elimination. These optimizations are doing the data flow analysis required to solve the problem. The only difference is that they are not grouping the object content into a tuple, but use a separate SSA value for each property.
I tried a simple example and found: it does not work! It looks like access markers (begin_access, end_access) prevent redundant load elimination. We should fix this.
PS: Please don't be discouraged by my review comments. I really like that you are working on the optimizer!
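The example CFG this comment refers to is not preserved above. The sketch below assumes a diamond-shaped graph (a branches to b and c, and both rejoin at d), which is consistent with b, c and d all being direct dominator children of a. It computes dominator sets with the textbook iterative formulation and is only an illustration, not code from the Swift compiler; all names in it are made up for this example.

```swift
// Assumed diamond CFG: a -> b, a -> c, b -> d, c -> d.
let blocks = ["a", "b", "c", "d"]
let predecessors: [String: [String]] = [
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
]

// Iterative dominator computation:
//   dom(entry) = {entry}
//   dom(n)     = {n} plus the intersection of dom(p) over all predecessors p
func computeDominators(entry: String) -> [String: Set<String>] {
    var dom: [String: Set<String>] = [:]
    for block in blocks {
        dom[block] = (block == entry) ? Set([entry]) : Set(blocks)
    }
    var changed = true
    while changed {
        changed = false
        for block in blocks where block != entry {
            var newSet = Set(blocks)
            for pred in predecessors[block, default: []] {
                newSet.formIntersection(dom[pred]!)
            }
            newSet.insert(block)
            if newSet != dom[block]! {
                dom[block] = newSet
                changed = true
            }
        }
    }
    return dom
}

let dom = computeDominators(entry: "a")

// True when block `x` dominates block `y`.
func dominates(_ x: String, _ y: String) -> Bool {
    return dom[y]!.contains(x)
}

print(dominates("a", "d"))  // a dominates d
print(dominates("b", "d"))  // b does not dominate d
print(dominates("d", "b"))  // d does not dominate b either
```

In this graph b can run before d, yet neither block dominates the other, so sorting uses by dominance cannot recover execution order.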
@eeckstein that's good to know. Thanks for all that information. I didn't realize load/store elimination already did this. I agree that updating those will be a much better solution than this patch, which is kind of a hack. I was running into the begin/end access issue in this patch too; that's why I created #30812. I don't think we ever completely remove alloc_refs even if they're not used. If that's true, I'll try to fix that too, because otherwise their lifetime functions get in the way of LLVM optimizations.
Just so I make sure I understand what you're saying with the example you gave: this implementation won't ever cause miscompiles, but it might miss some things that it could optimize, right? In other words, using dominance info in the way I am is "safe" but misses some cases?
@eeckstein fixed in #31078 :)
To answer your question: no, this implementation would cause miscompiles (most likely)
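One way such a miscompile could arise, sketched at source level under the same assumed diamond-shaped control flow: the load after the merge must observe whichever store ran on the path taken. A transform that ordered uses by dominance alone might treat the initializing store as the last write before the load, since the entry block is the load's only dominator. The `Counter` class and `loadAfterMerge` function are hypothetical, and whether this is the exact failure mode of the patch is an assumption.

```swift
// Hypothetical class, used only to illustrate the hazard.
final class Counter {
    var value: Int
    init(_ value: Int) {
        self.value = value
    }
}

func loadAfterMerge(flag: Bool) -> Int {
    let counter = Counter(0)   // block a: initial store of 0
    if flag {
        counter.value = 1      // block b: store 1
    } else {
        counter.value = 2      // block c: store 2
    }
    // block d: this load is dominated only by block a, so forwarding the
    // value stored in a would yield 0, which is wrong on both paths.
    return counter.value
}
```

A data flow analysis, as used by redundant load elimination and dead store elimination, tracks the available value along each path and would decline to forward here because the two paths disagree.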
Refs #30736 #30787 #30743
As discussed in the above PRs, the SIL optimizer is much better at optimizing tuples than classes. This patch moves all stored properties of a class into a tuple so that they can be optimized before being stored back into the class. If possible, the class is removed altogether.
This is only one of the many possible ways to implement this optimization. If we decide that this is the path we want to take, I will add more extensive tests.