Add section on records being constant. #2413

lrhn · 2022-08-18T13:37:52Z

A proposal for constant specifying records being constant.
Hope I covered all the corners.

Basically, only two paragraphs are normative, the rest is elaboration.

My proposal for constant behavior.

eernstg

For now, I've just commented on the situation in this review, but looking at #2337, I think the matter could benefit from some further discussion in that issue (or in some other issue, if needed).

working/0546-patterns/records-feature-specification.md

eernstg · 2022-08-19T08:57:51Z

working/0546-patterns/records-feature-specification.md

+_This means that a record expression can be used in the initializer list
+of a constant non-redirecting generative constructor._
+
+Constant *object* instantiations create deeply immutable and canonicalied objects.


Suggested change

Constant *object* instantiations create deeply immutable and canonicalied objects.

Constant *object* instantiations create deeply immutable and canonicalized objects.

This whole paragraph would be 'rationale', so the typical formatting would be to use an italic font, just like commentary.

Looking at the rest of the document, it doesn't seem to be as strongly separating rationale from normative text.
I tried making everything except paragraph 1 and 3 italic, and it looked weird.

I think that's OK. It can look weird in this context, but that's just because we ought to change the rest of this document similarly, and in the meantime there's nothing lost by doing it here.

I try to use italics for non-normative text when it's in the middle of otherwise clearly normative sections, mostly as a signal to implementers that they don't have to treat it as gospel. But in introductory sections that are higher level or motivational, almost the whole thing is non-normative, so I don't bother highlighting it.

I admit I could probably be clearer and more disciplined about this. :)

eernstg · 2022-08-19T09:17:38Z

working/0546-patterns/records-feature-specification.md

+if all its record field expressions are potentially constant or constant expressions.
+
+_This means that a record expression can be used in the initializer list
+of a constant non-redirecting generative constructor._


This is a subtle and probably substantial extension of the expressive power of (potentially) constant expressions.

I can see that the next paragraphs contain considerations about the questions that came to my mind immediately; I've responded to those considerations below, and there is a certain amount of overlap, but I still think it makes sense to have the following considerations in one place:

It means that we would allow declarations like const C(int i): this.myRecord = (1, i);.

However, it is not a given that we will also allow const D(int i): this.myList = const [1, i];, or similar declarations using sets or maps (not to mention <constObjectExpression>s).

The obvious difference is that it is "not true" that [1, i] is a constant list, even though it will become a constant list after actual argument substitution in all those cases where the constructor is invoked with a constant expression as the actual argument passed to i. So we could allow const [1, i] in the initializer list, and then make it an error at each constructor invocation if the substitution of the actual argument for i doesn't produce a constant expression.

Do we intend to do the same thing for records, or would we accept C(DateTime.now().millisecondsSinceEpoch);? Is the treatment of identical() on records sufficiently supportive of the current notion of canonicalization to say that constant records are canonicalized? Or perhaps we just don't say that?

How about using an explicit const modifier at the front of a record expression in order to stipulate that this particular record is constant, thus giving records the same treatments as other composite entities like lists and maps?

It means that we would allow declarations like const C(int i): this.myRecord = (1, i);.

That's actually a consequence of the "potentially constant" paragraph below, not this one. Its true that we'd allow it.

We should definitely not allow const [1, i] (with or without the leading const). We use different implementation classes with different behavior for constant and non-constant lists.
A C(1, i) (class constant constructor invocation with constant/potentially constant arguments) is more likely, because that uses the same class for constant and non-constant invocations, but because constructors can execute (some) code, we can get into problems with recursive constructor calls. And it differs wrt. canonicalization, which records don't (at least you can't tell if they do when a single record can be non-identical to itself).

Records are special. They may or may not correspond to an object at all, and you can't tell if they do.
A record like (c1, c2) really is just the juxtaposition of its values. It's just two values, nothing more or less. If each value can occur, so can both.

A list has structure on top of its contents. We use different classes for const lists and non-const lists. Creating a list is an object creation expression.

Not so for records. Example:

class C { final int a, b; final (int, int) p1; const C(int a, int b, int c, int d) : a = a, b = b, p = (c, d); (int, int) get p2 => (a, b); }

There is no fundamental difference between the initializers for a, b, p1.$0 and p1.$1, and I don't want there to be one. The only difference is the grouping of the values, where p.$0 and p.$1 can be accessed together as p1, just as a and b can be accessed through p2. The p1 variable still just (potentially) two adjacent storage locations in the class.

We can add const in front of a record, but it feels spurious. It doesn't cause canonicalization, because records don't have a consistent identity anyway. It ensures that the field values are constants, but either that doesn't matter, it's enforced anyway, or you can put const on each field expression if necessary, like ({x = (const C(), const [])}).
(I'd be fine with allowing const expression for any expression, which then puts that expression into a constant context. The const (e1, e2) would just be shorthand for (const e1, const e2), and const 4 is valid and constant.

I don't know if we intend to do the same thing for structs. If we can, I'd probably be happy with it. A StructName(arg1,...,argn) would be constant (aka. is valid where a constant is required) if it invokes the primary constructor, and the argument expressions are constant.
However, since a struct can have non-primary constructors too, and those would presumably require const to invoke, it would be better for consistency, with other struct constructors and other const constructors in general, to require a const in front. So, "probably not" to doing the same for structs, or other constructs with constructors.

Records are special. They may or may not correspond to an object at all, and you can't tell if they do.

I can't see any way to support a reference of type Object whose value is assigned/initialized to a record, other than boxing (or using an existing boxed representation of) the record, unless that whole reference can be removed during optimization. The type could also be any top type, or it could be a type variable whose value is Object or a top type.

This means that records can be known to have a heap entity representation at some point. Any usage of that representation at the record type might unbox it, but the Object typed reference is not a weak reference and the fact that it is a reference to a heap entity makes it more object-like than the wording

It's just two values, nothing more or less

implies.

We can add const in front of a record, but it feels spurious. It doesn't cause canonicalization

It could cause canonicalization.

allowing const expression for any expression

Looks like that would be a thin layer of syntactic sugar: Putting const freshVariableName = expression; at the top level and then replacing const expression by freshVariableName.

eernstg · 2022-08-19T10:37:54Z

working/0546-patterns/records-feature-specification.md

+Records are always unmodifiable, and if their field values are deeply immutable,
+like constants values, the records are also deeply immutable.
+It's meaningless to consider whether record constants are canonicalized,
+since records do not have a persistent identity.


The purpose of having explicitly constant records and specifying canonicalization would be that (1) this clarifies the intent, and makes it an error if some components of a const record literal are not constant expressions, and (2) this serves as a request for physical canonicalization. Granted, if there is a location in code where this kind of record is unboxed then we would not promise to find the canonical object if and when it is re-boxed. Still, I'd expect the explicit request for canonicalization to be helpful, in order to make identical work as a fast path for testing equality.

If you plan to depend on canonicalization, you're setting yourself up for failure.

I'd expect a significant percentage of record-typed instance variables to be inlined in the object structure. You lose canonicalization for those immediately at creation time.

Asking for canonicalization is asking for something you should not expect getting, which is why I actually prefer not giving you the option.

If you plan to depend on canonicalization, you're setting yourself up for failure.

That is true!

But it still clarifies the intent and allows for compile-time checking that we have a constant.

Of course, a top level const freshVariableName = ... would do that, too. So the question is just whether or not we want this syntactic sugar.

Yeah, I think canonicalization is a non-starter for records. The language has no formal notion of what it means to assign a value or bind it to a variable. So even if you have a canonicalized constant record, there's no guarantee that when you mention it that doing so doesn't copy it to a thing with a new identity. Even binding it to the parameter to the identical() function itself could cause the canonicalized record to be copied to a new stack-allocated storage and lose its original identity.

eernstg · 2022-08-19T10:44:00Z

working/0546-patterns/records-feature-specification.md

+to be a constant, like there is for object creation expressions. 
+A record expression with field values that are constant-created values, 
+will be indistinguishable from a similar expression created in a constant 
+context, since identity cannot be used as a distinguishing trait.


Identity cannot be used as a negatively distinguishing trait, that is, we can't conclude non-sameness from identical(r1, r2) == false.

However, this doesn't imply that identical(r1, r2) == true is useless, and we might be able to use explicitly constant records to ensure that this occurs more frequently.

I really, really hope people won't start trying to force canonicalization and depend on the identity of records for anything.
It's an optimization. If your code can't run well if identical returns false on all records, you should fix it. Or stop using records for something where identity matters.

I really, really hope people won't start trying to force canonicalization and
depend on the identity of records for anything.

Right, we might very well want a lint against statically known records as arguments to identical, except for the use of exactly this approach in the implicitly induced implementation of == on record types.

But the point is that an explicit request for canonicalization could ensure that == runs faster, more frequently.

I'm strongly opposed to giving people the idea that they can affect canonicalization of records, and that it matters in any way.

You really cannot know that canonicalization will make anything run faster. It might. It's definitely true that more canonicalized objects gives more chances of a positive identical check.

But you also cannot know that requesting canonicalization actually will cause more canonicalization. It's perfectly reasonable for a compiler to canonicalize all the constant records expressions it can see anyway, and if the compiler chooses not to, perhaps it's for a good reason. And it still can choose not to, even if you request canonicalization.

If I was a compiler writer implementing records, I'd completely ignore any user request for canonicalization of records, because I can, and do whatever I think is best for the compilation.

So why give the user a button to press, if it's not connected to anything anyway?

Even if the compiler does canonicalize every single record, as soon as you mention it, the compiler could silently copy it to a new unboxed storage and you've lost the fact that it once referred to some canonical instance.

working/0546-patterns/records-feature-specification.md

munificent

I really like this, better than the naive approach I took in #2422. If you want to merge it under me, I'm fine with it. I'll rebase my PR to handle it.

munificent · 2022-08-19T23:09:55Z

working/0546-patterns/records-feature-specification.md

+Records are always unmodifiable, and if their field values are deeply immutable,
+like constants values, the records are also deeply immutable.
+It's meaningless to consider whether record constants are canonicalized,
+since records do not have a persistent identity.


Yeah, I think canonicalization is a non-starter for records. The language has no formal notion of what it means to assign a value or bind it to a variable. So even if you have a canonicalized constant record, there's no guarantee that when you mention it that doing so doesn't copy it to a thing with a new identity. Even binding it to the parameter to the identical() function itself could cause the canonicalized record to be copied to a new stack-allocated storage and lose its original identity.

munificent · 2022-08-19T23:12:05Z

working/0546-patterns/records-feature-specification.md

+to be a constant, like there is for object creation expressions. 
+A record expression with field values that are constant-created values, 
+will be indistinguishable from a similar expression created in a constant 
+context, since identity cannot be used as a distinguishing trait.


Even if the compiler does canonicalize every single record, as soon as you mention it, the compiler could silently copy it to a new unboxed storage and you've lost the fact that it once referred to some canonical instance.

munificent · 2022-08-19T23:13:42Z

working/0546-patterns/records-feature-specification.md

+_This means that a record expression can be used in the initializer list
+of a constant non-redirecting generative constructor._
+
+Constant *object* instantiations create deeply immutable and canonicalied objects.


I try to use italics for non-normative text when it's in the middle of otherwise clearly normative sections, mostly as a signal to implementers that they don't have to treat it as gospel. But in introductory sections that are higher level or motivational, almost the whole thing is non-normative, so I don't bother highlighting it.

I admit I could probably be clearer and more disciplined about this. :)

munificent · 2022-08-19T23:15:51Z

working/0546-patterns/records-feature-specification.md

+since records do not have a persistent identity.
+
+Because of that, there is no need for a `const (1, 2)` syntax to force a record 
+to be a constant, like there is for object creation expressions. 


This is a really good insight! I like the idea of records being automagically const. They're immutable and have no persistent identity, so there's literally nothing preventing a compiler from treating any record expression with compile-time-known field values as a constant. And if that's the case, I see no compelling reason to require users to write const before it.

On the other hand, auto-const for records might lead users to want other immutable classes to be auto-constable too, and then get annoyed that they aren't. Perhaps something to think about with structs.

True.
My perspective is that where const Foo(..) creates a new Foo object, (a, b) does not create anything. It just places a next to b. It's just two values instead of one value.
It's not that you create a constant pair, it's that a pair of constant values are ... well, two constant values. They can be used wherever two constant values are needed.
You never need "a constant pair", you need a pair of constant values.

We can allow an initializer list of the form:

const C((int, int) pair) : (this.x, this.y) = pair;

(We should!)

We could even allow

(final x, const y) = (1, 2);

(We probably shouldn't.)

I don't think there is anything particularly tricky about

class C { final int x, y; const C((int, int) pair) : (x, y) = pair; }

All we need is the ability to desugar the initializer list to x = pair.$0, y = pair.$1. But the opposite direction is more tricky:

class D { final (int, int) pair; const D(int x, int y) : pair = (x, y); }

And even more:

class E { final Object o; const E(int x, int y, bool b) : o = b ? (x, y) : x + y; }

Canonicalization is currently based on a compile-time version of identical applied to the values of each instance variable, but identical applied to a pair of records isn't guaranteed to ever return true, so we wouldn't have any guarantees about canonicalization of instances of E, unless we specify that this compile-time-identical behaves differently.

Perhaps it is enough to say that identical at compile-time will always perform a complete traversal in case it is invoked on two record values (of course, bailing out early if there is a non-match, but traversing the entire structure in order to obtain the value true).

Currently a non-record is canonicalized based on the `identical` comparison of its fields against previously created constant values of the same type. Since a record field does not have a predictably or useful `identical` comparison result, we instead define a comparison structurally on the record, and use that instead. Also formatting tweaks.

lrhn · 2022-08-23T12:30:52Z

Landed. Your merge :)

Add section on records being constant.

f37f1a0

My proposal for constant behavior.

lrhn requested review from munificent and eernstg August 18, 2022 13:37

eernstg reviewed Aug 19, 2022

View reviewed changes

lrhn mentioned this pull request Aug 19, 2022

Can records be const? #2337

Closed

munificent mentioned this pull request Aug 19, 2022

Address a bunch of records issues. #2422

Merged

munificent approved these changes Aug 19, 2022

View reviewed changes

munificent approved these changes Aug 22, 2022

View reviewed changes

lrhn merged commit fd0d6ff into master Aug 23, 2022

lrhn deleted the lrhn-record-constant-spec branch August 23, 2022 12:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add section on records being constant. #2413

Add section on records being constant. #2413

lrhn commented Aug 18, 2022 •

edited

Loading

eernstg left a comment

eernstg Aug 19, 2022

eernstg Aug 19, 2022

lrhn Aug 19, 2022

eernstg Aug 19, 2022

munificent Aug 19, 2022

eernstg Aug 19, 2022

lrhn Aug 19, 2022

eernstg Aug 19, 2022

eernstg Aug 19, 2022

lrhn Aug 19, 2022

eernstg Aug 19, 2022

munificent Aug 19, 2022

eernstg Aug 19, 2022

lrhn Aug 19, 2022

eernstg Aug 19, 2022

lrhn Aug 19, 2022

munificent Aug 19, 2022

munificent left a comment

munificent Aug 19, 2022

munificent Aug 19, 2022

munificent Aug 19, 2022

munificent Aug 19, 2022

lrhn Aug 22, 2022

eernstg Aug 23, 2022

lrhn commented Aug 23, 2022

	Constant object instantiations create deeply immutable and canonicalied objects.
	Constant object instantiations create deeply immutable and canonicalized objects.

Add section on records being constant. #2413

Add section on records being constant. #2413

Conversation

lrhn commented Aug 18, 2022 • edited Loading

eernstg left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

munificent left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lrhn commented Aug 23, 2022

lrhn commented Aug 18, 2022 •

edited

Loading