Skip to content

replace @mulWithOverflow(u64, x, radix, &x) with @mul(u64, a, b) that returns a double width type #1350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
shawnl opened this issue Aug 7, 2018 · 9 comments
Labels
optimization proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@shawnl
Copy link
Contributor

shawnl commented Aug 7, 2018

`@mul(comptime T: type, a: T, b: T) @inttype(t.is_signed, T.bit_count * 2)

some architectures, such as mips[1] and sh4[2] do not have overflowing multiplication. Instead they return a multiplication result that is twice as big as the input types. This is how multiplication is done in hardware, the result. For 128X128-,>256 we will need a new u256 type and upstream work (also for 256X256->512[3] and 512X512->1024[4])

[1] https://stackoverflow.com/questions/16050338/mips-integer-multiplication-and-division
[2] http://www.shared-ptr.com/sh_insns.
[3] https://stackoverflow.com/questions/34234407/is-there-hardware-support-for-128bit-integers-in-modern-processors/50776753#50776753 The x86/x64, however, is a superscalar CPU, and the registers you know of are merely the architectural registers. Behind the scenes, there are a lot more registers that help optimize the CPU pipeline to perform out of order instructions using multiple ALUs. While the x64 may not be a 128-bit CPU, SSE/SSE2 introduced native 128-bit math, AVX introduced 256-bit native integer math, and AVX2 introduced 512-bit integer math. When returning from functions you will return the value in the 128-bit XMM0 SSE/SSE2 register, 256-bit AVX results in YMM0, and 512-bit AVX2 results in ZMM0; these, however, are add-ons to the x86/x64, not the primary architecture and support is entirely compiler and release platform (such as Python) dependent.
[4] https://en.wikipedia.org/wiki/AVX-512

@andrewrk
Copy link
Member

andrewrk commented Aug 7, 2018

Please elaborate:

  • explanation of use case
  • proposed API
  • example test case that should pass if this is implemented

I'll re-open when you provide these things.

@andrewrk andrewrk closed this as completed Aug 7, 2018
@shawnl shawnl changed the title @multiply that returns result twice as wide as input. replace @mulWithOverflow(u64, x, radix, &x) with @multply(u64, x, radix, &low, &high) Aug 7, 2018
@shawnl shawnl changed the title replace @mulWithOverflow(u64, x, radix, &x) with @multply(u64, x, radix, &low, &high) replace @mulWithOverflow(u64, x, radix, &x) with @multiply(u64, x, radix, &low, &high) Aug 7, 2018
@shawnl
Copy link
Contributor Author

shawnl commented Aug 7, 2018

Sorry I was on my phone

@shawnl shawnl changed the title replace @mulWithOverflow(u64, x, radix, &x) with @multiply(u64, x, radix, &low, &high) replace @mulWithOverflow(u64, x, radix, &x) with @mul(u64, x, radix, &low, &high) Aug 7, 2018
@andrewrk andrewrk reopened this Aug 7, 2018
@andrewrk
Copy link
Member

andrewrk commented Aug 7, 2018

Thanks! No worries, just wanted to throw the ball back in your court

@andrewrk andrewrk added this to the 0.4.0 milestone Aug 7, 2018
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Aug 7, 2018
@shawnl shawnl changed the title replace @mulWithOverflow(u64, x, radix, &x) with @mul(u64, x, radix, &low, &high) replace @mulWithOverflow(u64, x, radix, &x) with @mul(u64, x, radix, &high, &low) Aug 7, 2018
@shawnl
Copy link
Contributor Author

shawnl commented Aug 7, 2018

can you add the optimization tag, because this would provide access to these instructions.

@andrewrk andrewrk added optimization upstream An issue with a third party project that Zig uses. labels Aug 7, 2018
@andrewrk
Copy link
Member

andrewrk commented Aug 7, 2018

I also added upstream, because arguably this should exist in LLVM. https://llvm.org/docs/LangRef.html#llvm-smul-with-overflow-intrinsics

@shawnl
Copy link
Contributor Author

shawnl commented Aug 7, 2018

the sparc origins of llvm is showing. reported upstream: https://bugs.llvm.org/show_bug.cgi?id=38475

@shawnl
Copy link
Contributor Author

shawnl commented Aug 8, 2018

If you look at the upstream hug it looks like we do not need upstream work because value range propagation uses the 32*32->64 instructions.

@andrewrk
Copy link
Member

andrewrk commented Aug 8, 2018

I see, thanks for the follow-up. Here is my counter proposal:

@mul(comptime T: type, x: T, y: T) @IntType(T.is_signed, T.bit_count * 2)

No possibility of failure, no pointers, it just has a double-wide return type.

@andrewrk andrewrk removed the upstream An issue with a third party project that Zig uses. label Aug 8, 2018
@shawnl shawnl changed the title replace @mulWithOverflow(u64, x, radix, &x) with @mul(u64, x, radix, &high, &low) replace @mulWithOverflow(u64, x, radix, &x) with @mul(u64, a, b) that returns a double width type Aug 8, 2018
@andrewrk
Copy link
Member

andrewrk commented Feb 15, 2019

I think this may be better as a userland function. The language intrinsic should probably stay mapped directly to the LLVM intrinsic.

in std.math:

fn wideMul(comptime T: type, x: T, y: T) @IntType(T.is_signed, T.bit_count * 2) {
    const ResultInt = @IntType(T.is_signed, T.bit_count * 2);
    return ResultInt(x) * ResultInt(y);
}

This is what Zig would generate anyway for this intrinsic, so making it a compiler primitive would not be an optimization.

Feel free to make additional arguments for this intrinsic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimization proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests

2 participants