Skip to content

[WIP] Scala with Explicit Nulls #5747

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 127 commits into from
Closed

[WIP] Scala with Explicit Nulls #5747

wants to merge 127 commits into from

Conversation

abeln
Copy link
Contributor

@abeln abeln commented Jan 18, 2019

This PR sketches how to change the Scala type system so that reference types are no longer implicitly nullable. Instead, nullability can be recovered via union types (e.g. nullable strings have type String|Null).

See the accompanying doc for a description of the design: https://gist.github.com/abeln/9f79774bac111d99b3ae2cb9016a33e6

The changes include

  • a new type hierarchy where Null <: Any directly, and of no other type
  • a type "translation" layer for Java interop, to expose the true nullability of types in Java fields/methods
  • a simple form of flow-sensitive type inference to make it easier to work with nullable values

Unfortunately, the changes end up touching many components within the compiler (parser, typer, implicits, etc.), since assumptions about Null are baked in in many places. However, the code that
does the actually-interesting stuff is not much; most of the modified files are fixed tests.

There are still a few (very) important TODOs:

  1. fix all the 37 remaining failing tests (listed here https://pastebin.com/k5qbsrGn)
  2. port the Scala standard library to the new type system
  3. bootstrap Dotty with the new type system

However, before embarking on 2. and 3. we wanted to get feedback from the Dotty team and the community on how things are looking.

Any and all feedback is greatly appreciated!

abeln added 30 commits January 15, 2019 16:53
This is the initial change disconnecting Null from the bottom of the
type hierarchy. Modified multiple places in the compiler where the
notion that Null is a subtype of any reference type is hardcoded.

At this point, there are many failing tests and the compiler can no
longer bootstrap.
The two replaced cases are:
  * ascriptions: (null: String) => (??? : String)
  * vals/vars: val x: String = null => val x: String = ???
The JLS (https://docs.oracle.com/javase/specs/jls/se7/html/jls-14.html#jls-14.18)
explains that "throw" should accept a null argument (in which case it
throws an NPE):

```
If evaluation of the Expression completes normally, producing a null value, then
an instance V' of class NullPointerException is created and thrown instead of null.
The throw statement then completes abruptly, the reason being a throw with value V'.
```

So changed the typer so the prototype allows for null.
This time we convert defs as well:

def foo(x: String): String = null

=>

def foo(x: String): String = ???
Commit cc64374 incorrectly updated
`isBottomClass` in the bakend interface.

Revert the change because in the backend types are nullable.
This first version of the transform adds "|Null" to field types and
method argument and return types of Java classes.

e.g.

class C {
  String foo(String x);
}

becomes

class C {
  String|Null foo(String|Null x);
}

Type parameters also get nullified (e.g. "ArrayList[T] =>
ArrayList[T|Null]").
JavaNull is defined as `type JavaNull = Null @JavaNullAnnot`

On selections from an expression of type `T | JavaNull`, we select as if
we were selecting from T. This is intended to make Java interop more
user-friendly, because null values coming from Java are typed as `T |
JavaNull`. Of course, this means selections on Java-retured values can
fail with NPEs.
When the compiler encounters a method with a varargs argument, the type
of the varargs is initially represnted as an Array[T]. Later, it is
transformed into a RepeatedParamType[T].

However, the nullability transform makes it so that the varargs has type
`Array[T|JavaNull]|JavaNull`. We need to teach `arrayToRepeated` how to
handle that case so we can get `RepeatedParamType[T|JavaNull]|JavaNull`
as the result.
Instead of changing typedSelect, add the special case for JavaNull in
Types#findMember. Additionally, cleaned up the tests, which now pass
without -Ychecks but fail with -Ychecks.

Additionally, there's a problem where the compiler won't infer a union
type:

e.g.
```
val x = new ArrayList[String]()
val r = x.get(0)
```

The compiler will infer `r: Object` and not `r: String|JavaNull`.
Need to address separately.
Tag as nullable TypeParamRefs, so we can handle polymorphic Java
methods.
When an enum is read from Java code, the compiler synthesizes a bunch
of classes/modules/fields for it.

One of the synthesized entities is a class that extends java.lang.Enum
and calls its constructor. The first argument of the constructor is a
string, so we were passing null which failed. Pass the empty string
instead. This is ok because the synthesized Java code isn't run: it's
just there for typechecking.
Before erasure, reference types are non-nullable, but after it they
should be nullable again, because JVM types are nullable.

This fixes tests/pos/i536 by changing the notion of a nullable type to
take into consideration the current phase id.

A similar thing is already done in TypeComparer in a different case:
https://github.com/lampepfl/dotty/blob/master/compiler/src/dotty/tools/dotc/core/TypeComparer.scala#L676
… type `Nothing`

When desugaring pattern matching code for expressions where the
matched value has type `Null` or `Nothing`, we used to generate code
that's type-incorrect.

Example:
```
val Some(x) = null
```

got desugared into
```
val x: Nothing =
      matchResult1[Nothing]:
        {
          case val x1: Null @unchecked = null: Null @unchecked
          if x1.ne(null) then
            {
              case val x: Nothing = x1.value.asInstanceOf[Nothing]
              return[matchResult1] x: Nothing
            }
           else ()
          return[matchResult1] throw new MatchError(x1)
        }
```

There were two problems here:
1) `x1.ne(null)`
2) `x1.value`

In both cases, we're trying to invoke methods that don't exist for type
`Nothing` (and #2 doesn't exist for `Null`).

This commit changes the desugaring so we generate a no-op for unapply
when the value matched has type `Nothing` or `Null`. This works because
the code we used to generate is never executed (because the `x1.ne(null)`) check.
This adds a second TypeMap, specifically for constructors.
For constructors, we nullify the argument types, but not the return
type.

Once we nullify the arguments of constructors, all case classes were
breaking, because the logic to generate synthetic methods for case
classes relies on finding the symbol for IndexOutOfBoundException, which
changed with this CL. Patch up that logic as well.
abeln added 18 commits January 15, 2019 17:00
Null <: Any makes for a cleaner type hierarchy.
For example, we can now abstract over non-nullable types with
`def foo[T <: AnyRef](x: T) = ...`

However, both AnyRef and Null need to be comparable with reference
equality, so we add a new trait RefEq
```
trait RefEq {
  def eq(that: RefEq): Boolean
  def ne(that: RefEq): Boolean
}
```

and make both AnyRef and Null extend RefEq.

RefEq is completely synthetic, and it gets erased to Object.
Go back to _not_ ignoring nullability during override checks.
The original motivation was twofold:
  1) make migration more easy (less type errors to fix in code that
overrides java classes)
  2) enable binary compatibility with pre and post nullability versions
of a Scala library

However, since our current approach to binary compatibility is to not
do anything, point 2 is now moot.

If and when we need to re-enable this in the future we can always do so.
Until then, this eliminates a source of unsoundness.
In some cases during the null transform we see Java types of the form
`A & B`, which weren't previously handled.

Handle intersections by

nf(A & B) = nf(A) & nf(B) | JavaNull (& binds stronger)

but take care not to add JavaNull again while nullifying A and B.
Copy link
Member

@dottybot dottybot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, and thank you for opening this PR! 🎉

All contributors have signed the CLA, thank you! ❤️

Commit Messages

We want to keep history, but for that to actually be useful we have
some rules on how to format our commit messages (relevant xkcd).

Please stick to these guidelines for commit messages:

  1. Separate subject from body with a blank line
  2. When fixing an issue, start your commit message with Fix #<ISSUE-NBR>:
  3. Limit the subject line to 72 characters
  4. Capitalize the subject line
  5. Do not end the subject line with a period
  6. Use the imperative mood in the subject line ("Add" instead of "Added")
  7. Wrap the body at 80 characters
  8. Use the body to explain what and why vs. how

adapted from https://chris.beams.io/posts/git-commit

Have an awesome day! ☀️

@smarter
Copy link
Member

smarter commented Jan 19, 2019

Very cool! Before we get to reviewing this PR, it'd be helpful if it was cleaned up a bit, there's many commits like "fix tests" which are not really meaningful on their own. Ideally, commits in a PR should be atomic: serve a clear purpose detailed in their commit message, and pass all tests (we don't actually enforce that in the dotty repo currently). This PR also needs to be rebased.

Would it be possible to gate the invasive semantic changes of this PR behind a compiler flag ? This way we could merge it even if we're not sure if we'll accept it, and it'll allow more experimentation in the wild.

https://gist.github.com/abeln/9f79774bac111d99b3ae2cb9016a33e6 states that JavaNull is non-denotable. Could the documentation be expanded to justify this ? I'm personally very wary of inferring types that the users cannot write down or talk about, it makes it much harder for users to reason about their code.

@odersky
Copy link
Contributor

odersky commented Jan 19, 2019

@abeln I am excited about this PR! Its timing was a bit unfortunate, since we just merged a large change how positions get computed which affected many lines. So it will take some effort to rebase, I am afraid.

@smarter
Copy link
Member

smarter commented Jan 19, 2019

It looks like the behavior of this proposal is being discussed at https://contributors.scala-lang.org/t/wip-scala-with-explicit-nulls/2761 which is just as well since it means we can limit the comments on this PR to discussing the implementation.

@abeln
Copy link
Contributor Author

abeln commented Jan 22, 2019

@smarter I think we should be able to gate the changes behind a flag.

I don't know how to make the changes atomic, since the algorithm for developing the feature so far has been

  1. change the type hierarchy
  2. a million tests brake
  3. add fixes to gradually fix the tests

Some test fixes involve changing just the tests, but others modify the compiler.
Even now there are a ~30 broken tests, one of which is the standard library.

The one way to have a less atomic change, but one that keeps the tests passing, would be to squash all the commits into a one.

For reviewing the PR, I think the best way is to go file-by-file, and not commit-by-commit. There's really not that much code to review within the compiler.

So on my end, I can

  1. gate the feature behind a flag
  2. update the PR as per the changes proposed in the Scala contributors thread
  3. fix all remaining positive and run tests
  4. rebase

How does that sound?

@smarter
Copy link
Member

smarter commented Jan 22, 2019

Sounds good. If having atomic commits require big commits that's fine too, as long as the commit message is detailed enough, think about someone doing git blame on your code in five years and what kind of things could help them understand what's going on :).

petrpan26 and others added 2 commits January 22, 2019 14:59
This improves flow sensitive inference so that it handles a bunch of
previously-unsupported cases:
  * conditions inside blocks
  * inlined code
  * isInstanceOf checks
  * reference equality checks: eq and ne
@abeln
Copy link
Contributor Author

abeln commented Jan 22, 2019

Closing this for now while I rebase and fix the rest of the failing tests.

@abeln abeln closed this Jan 22, 2019
@abgruszecki
Copy link
Contributor

@abeln - it might be useful for you to be aware of #4004. Current status quo is that isInstanceOf[Null] is permitted by Dotty, but should be rejected with an error (as is the case in Scalac). I saw suggestions on Scala contributors to equate x == null with x.isInstanceOf[Null] so in the case you go in that direction, please remember to update #4004 as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants