Change zig int promotion / cast rules #7967
I want to add that if there is interest in this feature, it's possible to describe the above algorithm in a much simpler manner. But if there is scant interest I won't bother. To reiterate, the behaviour is basically "behave as if the calculation is done with infinite bit width everywhere, then truncate with a runtime trap/UB if the truncation is not lossless, and issue a compile time error if the target type is smaller than one or more of the original types of the operands".
Thanks for the synopsis, @lerno. It looks like point 9 and your second example are in conflict. The target type in your second example is... Doesn't the overall size depend on the operation? I.e. addition can only add a bit (assuming the operands have the same signedness), but multiplication can add the bit lengths together. For instance, multiplying two... I am already a little concerned that there is a lot of automatic coercion going on, and that it violates the Zen of Zig. I thought the rules were roughly:
Etc.
@kyle-github I think I need to clarify a bit. Basically any arithmetic operation would in itself trap/UB, so essentially we have two places we trap: (1) in the narrowing, (2) in the operation itself. This is why there is no need to do the widening you do in 3/4. I've updated the algorithm above significantly. Hopefully it's clearer now.
I would like to amend the proposal with a simpler model:
An open question is how casts interact with this. Behaviour in the "complex examples" is the same for the above algorithm. The downside here is that something like... Because the algorithm uses u64 -> i128 to prevent intermediary value overflow, we have similar issues with pure u64. So...
What does BABW mean?
Promote
Explicit casts are usually very undescriptive/implicit about behaviour (take C++ with 4 different casts and more methods to cast pointers up or down). Rust is thinking about how to move away from them completely, and most languages try to, because the behaviour is bad for code review.
@matu3ba I define BABW as the lowest bit width used to perform calculations: typically 32 bits on 32/64-bit targets, 16 bits on 16-bit targets.
Yes, the left-hand side. The change is depth first; casting the leaves will naturally propagate the change upwards. Given...
An example I often take is:
I've read through the correspondence on the Rust mailing list around 2014. There were many different ideas at the time. I didn't see any proposal for using a bigger type. The cost of trapping was discussed a lot, and ideally many would have preferred traps in both release and debug, but fear of bad performance seemed to have weighed most heavily. There is the "As-if infinitely ranged integer" model (also known as the AIR model) which potentially allowed trapping with delayed checks, so that the performance cost would only be in the range of 6%. However, since this was an academic paper with unclear ramifications, their tight deadline (they were going to 1.0 before the end of the year) made them dismiss this solution. Some argued for wrapping semantics with special trapping overflow operators, but not enough people liked that idea.
In regards to... However, I also want to stress that the problem is not just about trapping, but also that unsigned integer overflow has very significant footguns: (1) +/- is no longer associative nor commutative, and (2) it is very complicated to safely combine unsigned and signed numbers. This problem becomes worse when the widening is implicit. The below would be valid Zig:

someU32 = someU8 * someU16 + someU32;

But in Rust the widening must be explicit:

someU32 = u32::from(u16::from(someU8) * someU16) + someU32;

Here it's easier to see where potential overflows can occur. Since I need to cast anyway, it's simpler to do:

someU32 = u32::from(someU8) * u32::from(someU16) + someU32;

which doesn't run into the same risk of overflow. Because the casts are explicit, the unsigned overflow trapping isn't as dangerous. Because the widening is implicit in Zig, it's even less clear where we run into chances of overflow.
@lerno Footguns are always bad, and so is not being able to specify local expression behaviour with a minimal amount of keypresses and visual clutter. What about defining compiler intrinsics like

someU32 = @op_to_lhs(someU8 * someU16 + someU32);
someU32 = @safe(someU8 * someU16 + someU32);
someU32 = @fast(someU8 * someU16 + someU32);

or adapting the grammar like

someU32 =cast_lhs someU8 * someU16 + someU32;
someU32 =safe someU8 * someU16 + someU32;
someU32 =fast someU8 * someU16 + someU32;

or, shorter,

someU32 =c someU8 * someU16 + someU32;
someU32 =s someU8 * someU16 + someU32;
someU32 =f someU8 * someU16 + someU32;

? I think the default semantic of...
Although you could do this... @matu3ba, I think the problem is deeper and is about a more fundamental stance on implicit widening and trapping. In order to make code secure and bug free, the language should try to make the correct way the obvious solution. If we consider the simple...
This seems like it would fix this footgun I've run into a few times:

// u1 + u1 will overflow if both conditions are true instead
// of producing the desired `2`.
var index: u32 = @boolToInt(foo) + @boolToInt(bar);
I have a lot of differently-sized types and seem to run into this issue all over the place.

var x: u8 = 0;
var y: u8 = 0;
var index: isize = y * 256 + x;

Gives this error: ...

Fixing this would cut down on a ton of needless manual casting that makes the code less readable.
I wanted to add some more thoughts to this. If we look at something like (I'm going to use a simplified notation here to just show the types)

i32 = i16 + i8 + i32

this is a very dangerous expression in Zig with today's semantics, because it is unclear to the reader what the actual semantics are. The above will in code be... The widening and arbitrary-bitsize addition destroys associativity... To me it was tempting to treat... To illustrate why this could be bad, consider if the left-hand side is a struct that you did not define. If the struct widens the storage, your code changes semantics. We have safe widenings though... When I say "safe" I mean that there is no way to reorder the widenings for different behaviour. So if we again look at... Thus it is unsafe, but this is safe: ... I've tested this in my language C3 and so far it has been extremely low friction: the cases where you cast are actually the cases where you really need it. To summarize the C3 approach:
Previous approaches I've tested and discarded:
Optimal for the register allocator would be to handle things as uniformly as possible (case 2), unless values become bigger than the addressable register size (the word size of the architecture), or unless there is register pressure and things are moved to the stack (then case 1 is faster, as more stuff fits into L1 cache; that's probably also why Zig chose to use it for now). As far as I understand the consensus so far, Zig does not want to support arbitrary reordering like C with its sequence-point footguns, with the argument that the use cases supported by Zig would not gain a significant perf advantage from it (it is time-consuming to optimize long expressions anyway). So the options that you suggest, and my opinion on them, are:
@matu3ba I think you misunderstand my point. There are no "options I suggest". I think it is clear that neither implicit promotion method works well. The point is not about whether the compiler is allowed to reorder arguments, but rather whether the reader understands the semantics of the code, and whether those semantics are susceptible to hidden spooky action at a distance. My main point is that an expression... What is notable is that by disallowing the cast on binary arithmetics, approaches (1) and (2) always yield the same result for the allowed cases.
I'm closing this in favour of #16310.
Some background and comparisons can be found in the following blog articles: "On arithmetics and overflow" and "Overflow trapping in practice".
The current rules in Zig, using peer resolution plus the assigned type, have issues. One is that occasionally the type of a literal cannot be resolved, but the peer resolution also gives unexpected results. This proposal affects the mod arithmetics as well, since they become less useful.
The main idea of the proposal is to make all integer arithmetics act as if it were done with bigint, but trap / UB if any truncating conversion is not lossless.
EDIT I've updated the algorithm quite a bit, although the overall behaviour (as stated above) should be the same.
EDIT2 Added blog articles as references.
To explain the algorithm, we need to define the following:
In addition to this we define a base arithmetics bit width (BABW), typically 32 or 64, which is the platform-dependent minimum width that is efficient to perform calculations on, and a max arithmetics bit width (MABW), which is the corresponding maximum efficient width.
The following occurs:
Given `i16 b = (a + b) * c` we first analyse `a`, then `b`, then `a + b` and `c`, then `(a + b) * c`. In each of these we pass along the i16 type. `i16 s = 1 + @as(i16, 1)` would be an error, as the right sub-expression is of type i32, which is wider than i16.

Examples of resulting types:

i16 + i31 => i32
i32 + u32 => i64
i8 + u16 => i32
u8 + u16 => u32

If the required width would exceed MABW, then this is a compile time error. For `u64 = i8 + u16` the resulting type is `i64`.
.Examples:
Some more complex examples:
As an optional extra rule we can say that, given an expression with a high type A and a low type B, doing `&` with a constant of bit size N will reduce the size of the low type to N.

Note that #7416 also touches on some aspects of this.