Skip to content

Add typeclass derivation #5540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 56 commits into from
Jan 19, 2019
Merged

Add typeclass derivation #5540

merged 56 commits into from
Jan 19, 2019

Conversation

odersky
Copy link
Contributor

@odersky odersky commented Nov 29, 2018

An implementation of typeclass derivation. Direct link to docs:

https://github.com/dotty-staging/dotty/blob/add-derive-2/docs/docs/reference/derivation.md

Based on #5518.

@odersky odersky force-pushed the add-derive-2 branch 5 times, most recently from 9d8f1f3 to 0bf30ce Compare December 3, 2018 13:22
@odersky odersky removed the stat:wip label Dec 3, 2018
inline erasedValue[Elems] match {
case _: (elem *: elems1) =>
tryPickle[elem](buf, elems(n).asInstanceOf[elem])
pickleElems[elems1](buf, elems, n + 1)
Copy link
Contributor

@julienrf julienrf Dec 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit fragile that we have to manually manage the n counter. Would it be possible to get the corresponding field mirror as a parameter? Or maybe to extract it?

case _: (mirrorred(elem, label) *: elems1) =>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means we'd have to handle extractors in the inlining framework, which is not yet possible.

Copy link
Contributor

@julienrf julienrf Dec 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so maybe forget about using an extractor. Still, I think it would be better to provide the typeclass implementers a correctly typed elem value (to avoid the asInstanceOf) along with its label, without requiring them to manage an elements counter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can see that as part of a framework, which could also be built on top of what we have here.

if (n == ordinal) unpickleCase[T, elems](r, buf, ordinal)
else unpickleCases[T, alts1](r, buf, ordinal, n + 1)
case _ =>
throw new IndexOutOfBoundsException(s"unexpected ordinal number: $ordinal")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s unclear to me at what time is this exception raised. Is it at compile-time or at run-time?

Also, how do we report at compile-time to the user that a typeclass instance for the shape of its type can not be generically derived?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's at runtime. There's a typelevel.error method that gives a compile-time error message.

@odersky odersky force-pushed the add-derive-2 branch 11 times, most recently from 790779a to 48dcb89 Compare December 9, 2018 14:53
@odersky
Copy link
Contributor Author

odersky commented Dec 9, 2018

@OlivierBlanvillain @milessabin @propensive This is more or less baked now. Feedback appreciated!

It's probably best to start with the docs https://github.com/dotty-staging/dotty/blob/add-derive-2/docs/docs/reference/derivation.md.

Besides for `enums`, typeclasses can also be derived for other sets of classes and objects that form an algebraic data type. These are:

- individual case classes or case objects
- sealed classes or traits that have only case classes and case objects as children.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can they also have sealed classes as children?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they are not case classes, no.


A trait or class can appear in a `derives` clause if

- it has a single type parameter, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any restriction on the kind of that type parameter? For instance, is it possible to write derives Functor (assuming a trait Functor[F[_]] definition)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such restriction. But there's a caveat: The Shape type is always first-kinded. If you want to derive a higher-kinded type class instance from that you have to do some lifting of the Shape type to the type constructor level. That's possible using the type match machinery. @OlivierBlanvillain has done a PoC.

If the ADT in question does not have a `derives` clause, an implicit `Generic` instance
would still be synthesized by the compiler at the point where `derived` is called.
This is similar to the situation with type tags or class tags: If no implicit instance
is found, the compiler will synthesize one.
Copy link
Contributor

@julienrf julienrf Dec 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the synthesis of Generic be kept as an implementation detail? Do we need to be aware of where it comes from, if anyway we can always get a Generic[A] when we ask one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could keep it as a detail. It's just that the synthesis does generate a significant amount of code, so it's good to know that this will be done only once per class with a derives clause instead of once per usage.

@OlivierBlanvillain
Copy link
Contributor

OlivierBlanvillain commented Dec 14, 2018

What would we lose if we stick to the traditional sum-of-product representation instead of Mirror/Case/Cases/GenericClass? I feel that the added complexity needs very good motivation. The documentation claims faster compilation times and better runtime performance, it would be interesting to benchmarks both points.

The generated code for Eq[Tree[T]] is still far away from a handwritten version that doesn't allocate. It's impressive that we can get that far with inlining alone, but I feel that staging would be a more appropriate tool to get to zero allocation.

Here is a proof of concept implementation for typeclass derivation for Eq that uses staging and a Shapeless style sum-of-product representation. The generated code doesn't allocate any intermediate representation: the HList is only materialized at compile time and completely folded away by the staging framework to generate direct field access. The code is also type-safe (no casts), but I didn't measure compilation times.

If staging can be used reliably to do typeclass derivation I feel that Generic interface proposed in this PR sits in an awkward spot: it's both more complicated than shapeless.Generic and generates suboptimal code...

@odersky
Copy link
Contributor Author

odersky commented Dec 14, 2018

What would we lose if we stick to the traditional sum-of-product representation instead of Mirror/Case/Cases/GenericClass? I feel that the added complexity needs very good motivation.

I believe a Sum-Of-Products representation is more complex than what we have! In particularly, representing sums and products as binary constructs obscures things a lot. I feel it's more natural to use tuples and indexing. But I am happy to be proven wrong on this.

We now have test cases (typeclass-derivation2a.scala/typeclass-derivation3.scala) that could be adapted to a different scheme and we could compare.

Here are the criteria by which I would like to measure solutions, in decreasing order of importance:

  • size of generated code
  • runtime efficiency of generated instances (both baseline and ability to be optimized)
  • compile times
  • ability to support typeclasses of all kinds
  • difficulty of developing generic typeclasses

odersky and others added 26 commits January 19, 2019 17:38
`shapeWithClassType` does not return a NoType anymore.
 - harden addChild to also work if existing children don't have positions
   (this can happen if they came from other compilation units)
 - tweak position of Deriver, so that it always points to the deriving class
   template.
Since Shape is no longer generated separately we do not need to make
an exception for synthesized members.
Children now appear in the order of their definition, where before
there was no guaranteed order and the actual order was more likely the
reverse of definition order.
Better not to mix different proposals.
The erasure of these classes should be Product not Object. Otherwise
we cannot discriminate in pattern matching between pairs and ().
Cases in a match could be empty in case of errors
@odersky odersky merged commit 2d16d9f into scala:master Jan 19, 2019
@allanrenucci allanrenucci deleted the add-derive-2 branch January 19, 2019 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants