-
Notifications
You must be signed in to change notification settings - Fork 18k
cmd/compile: eliminate some bounds checks #5364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The calls to runtime.panicindex are annotated by the line number of the index expressions so that they show nicely in stack traces. Panics on the same line could be merged into a single one but I think being able to differentiate them by their PC is useful for debugging. The subject of redundant bounds check has already been raised a few times, I'm surprised I can't find an existing issue about it. |
I saw in the Go 1.6 planning that @dr2chase might want to look at bounds check elimination. I think work is still on-going. It seems like a perfect fit for the SSA compiler. For fun I have created all the cases I can think of, and added it as a boundscheck.go gist. It can be copied directly to the playground. My main thought is that with a SSA tree it would be possible to attach a possible "minimum" and "maximum" value to each slice and variable. For values that we don't know, or escape, we set it to the maximum/minimum value of the type. |
There are a bunch more involving more complicated expressions. Some examples
Or the more common (and slightly more complicated)
And there are lots involving multiple slices
|
It's slipped to 1.7, the analysis ought to exist in buggy form later this week or early next week. In the short run I'm not targeting multiplication, but I am trying to get the following cases:
The multiplication expression is harder than you might think because of integer overflow.
Suppose m = 4, n = 1<<62. Just remember, we design languages to let integers overflow and wrap because it's "more efficient". |
In the long term some way of indicating a guaranteed slice length for the loop - for instance by reslicing, eg. Combined that with some way of indicating a maximum lookup value |
By-the-way, this idiom ought to work somewhat better (for analysis purposes):
I'll put this in my queue. |
Our actual use case involves matrix views, and so it can't be as simple as the above code. Our real code is closer to
note in particular that in general stride != m. Honestly though, I think the highest impact you could have on optimizing gonum code is (code snippets assume correct bounds checking has been done above)
All of these also come in the "strided" form you mention above, where instead i is incremented by a constant value. If these three expressions could have bounds checks removed and SIMD, there would be a huge speedup in the BLAS benchmarks. If you're interested for testing purposes, we have a bunch of benchmarks at godoc.org/github.com/gonum/blas/native, and godoc.org/github.com/gonum/matrix/mat64 . Thanks for thinking about compiler optimizations! |
Oh, to add, those benchmarks at present use assembly for the three cases I mentioned above. The pure go versions can be run with the -noasm tag. |
Same problem with potential overflow here:
because
and the inner loop can run and run and run. |
I see. If the check is
Is that then correct? |
(also assuming m, n and stride are each individually checked to be >= 0) |
I suspect it is not correct, given suitably hacked modulo arithmetic.
Let stride = n = 1<<62, m = 4, then maxIdx = 0. I think that division is ultimately what you want, though I'm nowhere near there yet on that inference. |
Fixed. See:
(And maybe other SSA changes) |
The text was updated successfully, but these errors were encountered: