Description
_Summary_
I propose a space optimization for variables of type `Option<E>` when `E` is a nullary, integral enum type.
_Motivation_
There's no need to waste memory for storing a separate tag in variables of type `Option<E>` if `E` is an integral enum type and the set of valid values of `E` does not cover all possible bit patterns. Any bit pattern (of the size of `E`) that doesn't represent a valid value of type `E` could be used by the compiler to represent the `None` value of type `Option<E>`.
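To make the motivation concrete, here is a minimal sketch that prints the sizes involved. The enum `Color` and both helper functions are hypothetical names chosen for illustration. Every bit pattern of `u8` is valid, so `Option<u8>` must spend extra space on a tag, whereas `Color` uses only three of the 256 patterns of its byte. Whether a given compiler actually folds `None` into one of the spare patterns is a layout decision, so the sketch only reports what it observes.

```rust
use std::mem::size_of;

// Hypothetical nullary enum: only the bit patterns 0, 1 and 2 are valid.
#[allow(dead_code)]
#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// Size of `Option<u8>`: all 256 patterns of `u8` are taken,
/// so `None` cannot live inside the payload byte.
fn option_u8_size() -> usize {
    size_of::<Option<u8>>()
}

/// Size of `Option<Color>`: 253 spare patterns exist in the payload byte,
/// which is what the proposed optimization would exploit.
fn option_color_size() -> usize {
    size_of::<Option<Color>>()
}

fn main() {
    println!("u8: {} byte(s), Option<u8>: {} byte(s)", size_of::<u8>(), option_u8_size());
    println!(
        "Color: {} byte(s), Option<Color>: {} byte(s)",
        size_of::<Color>(),
        option_color_size()
    );
}
```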
_Details_
Given a nullary, integral enum type `E`, the compiler should check if some bit pattern exists which does not represent a valid value of type `E` (the only valid values are the ones determined by the nullary enum variants of `E`). If such "invalid" bit patterns are found, the compiler should use one of them to represent the `None` value of type `Option<E>` and omit storing the tag in variables of type `Option<E>`. If more than one such "invalid" bit pattern exists, there should be a language-defined method to deterministically determine which one of those bit patterns is used to represent the `None` value. I think the bit pattern of `None` should be language defined rather than implementation defined in order to make `Option<E>` values serialized to disk more stable between different compilers / compiler versions.
In determining whether a certain value of such a space-optimized type `Option<E>` is `None` or not, the algorithm should simply check whether or not the binary representation of said value is equal to the binary representation of the language-defined "invalid" value.
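The scheme described above can be sketched by hand in library code. This is only an illustration under stated assumptions, not how any compiler implements it: `Direction`, `CompactOption`, `from_bits` and `none_pattern` are hypothetical names, and the selection rule used here (the smallest `u8` value that no variant uses) is just one possible deterministic rule; the proposal itself does not fix one.

```rust
/// Hypothetical nullary enum with explicit discriminants.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum Direction {
    North = 0,
    East = 1,
    South = 2,
    West = 3,
}

impl Direction {
    const ALL: [Direction; 4] = [
        Direction::North,
        Direction::East,
        Direction::South,
        Direction::West,
    ];

    /// Returns the variant whose discriminant equals `bits`, if any.
    fn from_bits(bits: u8) -> Option<Direction> {
        Direction::ALL.iter().copied().find(|d| *d as u8 == bits)
    }
}

/// Assumed selection rule: the smallest bit pattern that is not a valid `Direction`.
fn none_pattern() -> u8 {
    (0..=u8::MAX)
        .find(|bits| Direction::from_bits(*bits).is_none())
        .expect("every u8 value is a valid Direction; no spare pattern exists")
}

/// One-byte stand-in for `Option<Direction>` with no separate tag.
#[derive(Clone, Copy, Debug, PartialEq)]
struct CompactOption(u8);

impl CompactOption {
    fn none() -> Self {
        CompactOption(none_pattern())
    }

    fn some(d: Direction) -> Self {
        CompactOption(d as u8)
    }

    /// The `None` check is a plain comparison against the reserved pattern.
    fn is_none(self) -> bool {
        self.0 == none_pattern()
    }

    fn get(self) -> Option<Direction> {
        if self.is_none() {
            None
        } else {
            Direction::from_bits(self.0)
        }
    }
}

fn main() {
    let a = CompactOption::some(Direction::South);
    let b = CompactOption::none();
    println!("{:?} {:?}", a.get(), b.get());
    println!("size = {} byte(s)", std::mem::size_of::<CompactOption>());
}
```

Because the reserved pattern is derived by a fixed rule from the set of variants, two independent implementations following the same rule would agree on the byte written for `None`, which is the serialization-stability argument made above.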
Activity
`Option<T>` for integral enum `T` rust-lang/rfcs#84
`partial_cmp` method to `PartialOrd` rust-lang/rfcs#100
erickt commented on Jun 3, 2014
I whipped up a dummy example to see how this would optimize `Option<Result<T, E>>` types, and it has a pretty nice 8% speedup: https://gist.github.com/erickt/8a6be5c8a2542eaf0c45. This would be especially helpful for my `libserialize` rewrite RFC.
pczarn commented on Jun 5, 2014
Don't you think it could be done transitively for all tagged unions and integral enum types?
I have three optimizations on my mind:
- `#[repr(int type)]` on non-integral enums
- `#[packed]` on enums to force all possible optimizations

I think the bit pattern of a variant should be as close to 0 as possible in the order of declaration, just like in integral enums. It could stay undefined or become implementation defined.
huonw commented on Jun 5, 2014
Why wouldn't this be the default? (That is, why would all optimisations be applied by default.)
pczarn commented on Jun 13, 2014
@huonw, because it's a space-time tradeoff. Ideally, all possible values of an ADT could be represented within a minimal number of bits. However, values can be moved out of enums, so they should exist somewhere with proper alignment. Matching complex packed enums could still get expensive without simple discriminants:
pczarn commented on Oct 1, 2014
Of course, sorry, I was referring to something else.
`discriminant_value` intrinsic #20907