-
Notifications
You must be signed in to change notification settings - Fork 30
Union Types #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Union Types #1
Conversation
Co-Authored-By: Gabriel Caruso <[email protected]>
Discussions about the previous RFC:
|
I really like it. I've encountered many situations where I wanted a method to accept multiple types while retaining type safety. The solution so far was using docblock type hints + static analysis + runtime checks. Union types on a language level would be a much nicer solution! In addition to the nullable ( |
For this reason, support for the `false` pseudo-type is included in this proposal. A `true` pseudo-type is *not* part of the proposal, because similar historical reasons for its necessity do not exist. | ||
|
||
The `false` pseudo-type cannot be used as a standalone type, it can only be used as part of a union. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some clarity on why true
is not included as a pseudo-type is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A
true
pseudo-type is not part of the proposal, because similar historical reasons for its necessity do not exist.
it's already there ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. A bit weak, IMO, so if there isn't much complexity involved, I'd suggest introducing it to reduce the edge cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that true should be included as a pseudo type. Without reading the RFC one would assume it's included. It helps documentation & tutorials since no effort is needed to explain why it's not included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. A bit weak, IMO, so if there isn't much complexity involved, I'd suggest introducing it to reduce the edge cases.
@mnapoli I noticed your downvote. Care to elaborate? (Personally, I value your opinion.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mnapoli I noticed your downvote. Care to elaborate?
@joshdifabio it's easier to simply downvote what you find on the internet rather than write a proper argument, I went with the lazy way 😄
Nikita summed it up nicely already, I don't have a strong opinion on this and I don't think my opinion matters much here, but: false
would exist for legacy reasons. true
would have no real reason to exist except consistency, and I would favor aiming for what's essential over consistency or completeness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic here is sound; my one concern is that allowing false as an explicit return type seems to indicate that it's an OK/good thing to do. I think most agree these days that returning something|false
is bad design, but PHP (and sadly much of userspace) is full of that bad design.
Is there some way, perhaps just in documentation, that we could call out that returning foo|false
is supported but a bad idea? Or at least avoid giving the implication that it's a good idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think people should just fall back to Foo|bool
for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially now that the syntax for nullable type has switched to T|null
, it should be made clear that not only false
, but also both ?false
and false|null
, are unsupported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding support for false
in favor of bool
without proper support for literal types helps to promote bad design decisions
Appreciate the new RFC medium. |
This feature would be great! 😍 @nikic can this feature bring performance issues? |
rfcs/0000-union-types-v2.md
Outdated
array_filter(array $array, callable $callback, ArrayFilterFlags $flag): array; | ||
``` | ||
|
||
Proper enums are likely a better solution to this problem space, though depending on the implementation they may not be retrofitted to existing functions for backwards-compatibility reasons. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really love to see both custom types and enums, as separate language features, so hope we keep the door open to both. In particular, I am strongly of the opinion that an enum value should not be coercible to a plain type, because that leads to nonsense like MONDAY == JANUARY
; as you say, that would rule out using them for existing function signatures. Custom types, meanwhile, could have more complex definitions - e.g. combining unions, intersections, constrained callables, and domains such as int{>=0,<100}
.
class B extends A {} | ||
|
||
class Test { | ||
public A|B $prop; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this should not be allowed at all? Since A
type already includes B
as well, using A|B
type has no added value over just A
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted elsewhere in the RFC, this can't be known in the general case at compile-time, because classes A
and B
may not have been defined yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and type analysers can do the complaining for you (though at least in my case it'll just treat as public A $prop
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, makes sense. Yeah, I'm sure PHPStan will handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be moved to a separate section: Known problems
or Limitations
because while it should work as @enumag says, the engine doesn't make it possible ATM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see there would be some value if the language was able to point this out, but there's plenty of ways of writing redundant code, so I don't see it as a "limitation" particularly. For instance, you can already write if ( $x instanceof A || $x instanceof B )
, or even if ( $x instanceof A || $x instanceof A )
, which is clearly redundant. A tool like PHPStan may well spot this for you, and can probably spot redundant union types as well, because it can analyse your whole project.
Type aliases is great. |
If an RFC is to be on Github and in Markdown anyway, I wonder if it couldn't be made as a pull request to the language specification repo instead. It could be beneficial to that spec for it not to be updated later after RFCs are passed, but instead the spec update could be the RFC. Perhaps this is too radical however :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
The comments above are mostly tiny details, I think the proposal is very much sound and very welcome too.
I hope it is the right place to ask for something? Its the first RFC I read, thats why please be patient. IMO many programmers are going to use it like a mixed-typehint with "strict_types = 1". (something like : ":int|float|string|bool") Also IMO PHP was getting typified, harded, easier to read and write since version 7. Thx for that, but this feature could go in another direction. Maybe a realworld example could help me to understand, which sourcecode is getting easier to maintain or to write with this new feature? |
|
||
What is likely intended instead is `?T`, which allows returning either `T` or `null`. | ||
|
||
### `false` pseudo-type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO adding the false
pseudo-type because of library implementation details is the wrong choice. Adding a language-level feature to cope with a historical design flaw is only adding to the pile. It's legacy as soon as it's implemented.
I absolutely understand your reasoning; I myself would prefer to know a function cannot return true. However, if typehinting |bool
over |false
is such an issue, it should be fixed at the root (return null instead of false) instead of adding a band-aid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you're getting at, but rather than "legacy as soon as it's implemented" it could also be seen as a pre-cursor for a more general feature, where you can specify a union of values rather than just types. See "Literal Types" in the Future Scope section for a brief discussion of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to BC constraints, changes to return type like that have a devastating potential to break uncountable (read infinite) amount of code on the internet, affecting adoption rate for the upgrade. I don't think you are wrong to say it's legacy the moment it's implemented, but it's part of coping with the 20 years baggage of PHP.
I think discussing the merits of BC breaks is not worth anybody's time at this RFC as Internals have been debating that for several months already (that I know of).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with false
is really one of meaning as opposed to technical requirement. With all the "standard" types, developers adopt a general definition of what they represent. The value false
is generally defined similarly to:
// The two values of a logical condition.
enum Boolean {
False,
True,
}
As we can see, false
was never meant to represent an error condition much in the same way that the string "apple" or the number 1234 don't generally represent error conditions either. Including false
as a pseudo-type simply perpetuates this misrepresentation.
Furthermore, I would argue that may actually be detrimental to developers as its usage in this manner typically indicates an error.
- In the case of a function or method that should return a value, what was likely meant was
null
or another type similar to Rust'sOption<T>
orResult<T, E>
which are meant to represent error conditions.
This also covers literal types, because they are generally used to represent a subset of some other type. Sincetrue
is invalid in this context andfalse
should not be combined with anything (its supertype does not have any other values),false
can no longer represent its intended meaning, one of two values. - In the case of a function or method that should not return a value, the correct result should actually be a thrown error instead. Using
false
in this second case is especially bad, as it may also prevent static analysis from narrowing the intended type.
class Map {
function has(key: TKey): bool {}
function get(key: TKey): T | false {}
}
function foo(): T | null {
$map = new Map();
if ($map->has("bar")) {
// Here we expect `get()` to always return `T`, however because of our poor
// design decision to not throw an exception, the type cannot be narrowed,
// and an error would be reported here because `T | false` != `T | null`.
return $map->get("bar");
}
return null;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you're getting at, but rather than "legacy as soon as it's implemented" it could also be seen as a pre-cursor for a more general feature, where you can specify a union of values rather than just types. See "Literal Types" in the Future Scope section for a brief discussion of this.
Maybe it would be better to leave false out of this proposal but flesh it out more as a part of literal types? If someone wants to use that paradigm, allowing type false = false
would allow the end use to pull off those sort of Shenanigans without encouraging poor practices by specifically implementing them in code.
|
||
### Property types | ||
|
||
Property types are invariant, which means that types must stay the same during inheritance. However, the "same" type may be expressed in different ways. Prior to union types, one such possibility was to have two aliased classes `A` and `B`, in which case a property type may legally change from `A` to `B` or vice versa. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because I had to think this through for a moment, I'd suggest explaining why property types are invariant during inheritance. Specifically that setting a property would allow for contravariance, but that reading would only allow for covariance. The intersection of those rules is invariance.
I don't if it was already asked before, but why would you want this over method overloading? If a method accepts both a string and an integer wouldn't you want to have different behaviours in the method? In which method overloading creates a much clearer code base than a method that both handles a string and an integer. |
@robbertstevens Dan has aggregated the current state of why method overloading is probably never going to happen in PHP here: https://github.com/Danack/RfcCodex/blob/master/method_overloading.md |
There are some issues I would like to address. Union types are necessary, but I believe that some improvements must be made to this proposal. Type aliasingLet's look at the following example. use Foo\bar;
function test(bar $bar) {
// is $bar an object?
} If the namespace Foo;
type bar = int|float;
// Usable as \Foo\bar from elsewhere ProposalIntroduce use Foo\bar;
use type Foo\baz;
use type int|float|string|bool as scalar; // local type
function test1(baz $baz, scalar $scalar) {
// Don't make assumptions, check type definition.
}
function test2(bar $bar) {
// It's an object
} Nullable typesNullable types should not be allowed when defining a union type. use type ?(T1|T2|T3) as foo;
use type bar;
// What's this? Nullable nullable type?
function test(?foo $val) {
}
// Is bar nullable?
function test2(bar $val) {
// Should I check if $val is null?
if (is_null($val)) {
}
} ProposalKeep it simple. No nullable types inside unions use type T1|T2|T3 as foo;
// Not nullable
function test(foo $val) {
// $val can't be null
}
// Nullable
function test2(?foo $val) {
// I should check if $val is null
if (is_null($val)) {
}
} Keep the syntax cleanThere should be a single way of declaring a union type. The in place way of declaring a union type should be removed. use type float|int as number;
// Confusing..
function sum(float|int $a, int|float $b): number{
}
// ??
function example(int|float $value, int $flag = 2|4|8|16|32, string|object $item = null): float|int|false {
} It's hard to read, hard to follow and the syntax is super ugly. ProposalKeep the syntax unchanged by forcing the user to either export a union type or declare it locally with the help of use type float|int as number;
// Makes sense now..
function sum(number $a, number $b): number{
} This way the class ReflectionType {
isUnionType(): bool;
getUnionTypes(): array;
} Putting all together.. type scalar=int|float|string|bool; // export union type
use type int|float as number; // local union type
class Foo
{
private number $foo; // non-nullable property
private ?number $bar; // nullable property
// non-nullable arguments & return type
function sum(number $a, number b):number {
}
// nullable arguments and return type
function test(?scalar $value): ?scalar {
}
} |
@mattacosta @kocsismate in PHP, there is a concept of " |
To add to what @Ocramius said, typed properties default to uninitialized, so you can simply use Unfortunately, there is no easy way to distinguish between an uninitialized and a null property right now (only possible via reflection), so this still falls short in practice. But generally this would be the correct way to go about it, I think. |
@nikic Yes, I agree to you, and I already missed an easy-to-use functionality to check uninitialized properties when I wanted to lazy-load other properties which had type declarations. As far as I remember I had to downgrade type declarations to PHPDoc at the end :/ . Speaking about |
I have always hated that typed properties added an extra "uninitialized" state, which can't be detected, and only exists in certain scenarios. I still think it was a bad design. If you want more than one terminal value, then some kind of enum or algebraic data type is probably the ideal, and a union with specific values can emulate this a bit, e.g. |
I admit that my P.S: I believe the uninitialized state is a "necessary evil", otherwise we had trouble declaring non-nullable class types for properties or we ended up in a situation where Java is in now (where classes are implicitly nullable). |
The reason I assumed that your code could work with booleans is because if not, then the property could just as easily have been typed as All of that is irrelevant to my original question of if a restriction would be feasible though. Assuming it is, I ask because the primary argument for As a side note: It also kind of sucks that the |
Please make all RFC's available on Github like this from now on! |
A similar argument could be said in favour of scalar types for, e.g., |
This comment has been minimized.
This comment has been minimized.
Proposal+, GH RFC - good practice! Thx @nikic. |
|
||
The `null` type is supported as part of unions, such that `T1|T2|null` can be used to create a nullable union. The existing `?T` notation is considered a shorthand for the common case of `T|null`. | ||
|
||
An earlier version of this RFC proposed to use `?(T1|T2)` for nullable union types instead, to avoid having two ways of expressing nullability in PHP. However, this notation is both rather awkward syntactically, and differs from the well-established `T1|T2|null` syntax used by phpdoc comments. The discussion feedback was overwhelmingly in favor of supporting the `T1|T2|null` notation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just ?T1|T2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ambiguous syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we have more than two types, like T1|T2|T3
, then it's become complicated to place ce question marks (?T1|T2?|T3?
or ?T1|?T2|T3?
).
The T1|T2|T3|null
syntax looks more comprehensive in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthieu88160 The second question mark was outside as I was asking a question. More Types would look like this ?T1|T2|T3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Majkl578 But is it really that ambiguous? The question mark already establishes to allow null. ?T1 which is current is the same as null|T1. Whether someone reads that as (null|T1)|T2|T3 or null|T1|T2|T3 isn't going to make a difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(null|T1)|T2|T3
and null|T1|T2|T3
is equivalent.
Please read the discussion in this thread where this choice was discussed: #1 (comment)
Voting has closed with 61 votes in favor and 5 against. As such, the RFC is accepted. |
Nice! Meta: will RFCs be opened to broader discussion via GitHub in the future as well? Also would it make sense to actually merge the accepted RFCs into the repo, instead of just closing the PR? |
@rask See https://externals.io/message/107747. The tl;dr for now is: RFCs can be initially submitted against GitHub, but have to move to the PHP wiki before voting. |
Sad but true. Nullable types are fine. I also understand that a lot of builtin function return false if error, so something like falseable is also good. For example ?int - falsable int, ??int - nullable int. But the most of examples I see here https://gist.github.com/nikic/64ff90c5038522606643eac1259a9dae are just funny. RFC must prohibit such thigs: Maybe some words as "scalar", "numeric" are acceptable, but such a full freedom in types definition is a.. is a... is a bottom|disaster|php-sunset|distrubance|upset|etc |
to save space? |
It's terrible feature. I can't accept that. I like that one variable has one type. |
Proposal for the addition of union types
T1|T2|...
. There has been a previous proposal on this topic, which has been declined.Final RFC and Vote
(Original Rendered Proposal)