Performance improved a lot in the last 2 weeks #1624
Comments
Good to see the rusty handbrake cable repaired after all these years :-)
Thanks. =) Funnily enough, @sandwichmaker and I have known each other for about 15 years, and yet I think this might be the first time we've directly worked on something together. Hopefully later today (or tomorrow) I'll have TLS straightened out (support for non-keyword TLS, fast on Android, etc.). It's getting close. I'll need @sandwichmaker's help to run the Android benchmarks, though... nudge nudge.
I've uploaded #1625, which might eliminate a few more cycles. I'd be interested in knowing if it did, since I'm not measuring cycle counts, just time.
Unfortunately, the ongoing problems with the TLS code led me to make the old memory.c the default again in 0.3.3. You can still get the latest revision of the new code (based on PR #1739, with interim fixes for some new races it introduced) by building with USE_TLS=1. Hopefully this can be made the default in 0.3.4 once remaining issues like those seen in #1735 are understood and resolved.
[Not a bug; I just wanted to put this somewhere useful on GitHub]
Below is a performance summary from git 36c4523 to what you get at the end of my other pull request, on a Core i9 CPU (so with AVX512); basically everything done in the last two weeks. At the small-matrix end you can see the benefit of the work from @sandwichmaker and @oon3m0oo, in the middle of the range the threading improvements, and across the whole range the AVX512 gains.