Do not materialize optional undefined properties inside actual types #28727
This is the approach I suggested when I heard about the problem that #28707 was opened to fix.
Fixes #28540 without making the input code into an error 😉
This is done by tracking the freshness state of the unions which contain (or used to contain) fresh object literals, and generating the missing properties on the fly as needed in any given function. This greatly reduces our memory usage in pathological examples like the given test, to the point where we can check it pretty easily - the test actually takes slightly less time to execute than the `largeControlFlowGraph.ts` test, which is only a third of the length of the new test.

Also in this PR:
Subtype reduction is supposedly an optimization; but the gamble is that the powerset comparison within the union is beaten by the potential savings of eliminating union members for future comparisons. #28707 simply disables subtype reduction and reports an error to prevent followup issues with larger types. Instead, this lets the non-reduced type through and handles larger types better. For truly large unions in practice, subtype reduction is simply rarely a benefit - as normally these things get assigned once or twice and then disappear, rather than compared and narrowed extensively (like smaller unions do). In effect, if the number of followup comparisons exceeds the square of the number of union members not eliminated, then subtype reduction is a plus - however if it regularly eliminates few or no members from the union, then for large unions, it represents a huge cost for little potential gain.
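As a rough illustration of that trade-off, here is a back-of-the-envelope model - the names and numbers below are illustrative only, not the checker's actual accounting:

```typescript
// Subtype reduction on a union of `size` members costs roughly one subtype
// check per ordered pair of members; it only pays off if the members it
// eliminates would otherwise take part in more comparisons than that.
function reductionWorthIt(
    size: number,
    membersEliminated: number,
    laterComparisonsPerMember: number
): boolean {
    const reductionCost = size * size; // pairwise subtype checks up front
    const savings = membersEliminated * laterComparisonsPerMember;
    return savings > reductionCost;
}

// A 1,000-member union assigned once and never narrowed: ~1,000,000 checks
// spent up front for almost no savings - the pathological shape described above.
console.log(reductionWorthIt(1000, 0, 2)); // false
// A small union that gets mostly eliminated and then compared a lot: worth it.
console.log(reductionWorthIt(10, 9, 50));  // true
```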
It's possible there's an efficient middleground in subtype reduction here for large types where, rather than outright disabling subtype reduction, we divide the union into object and non-object type buckets, perform normal reduction on the non-object types (since there are usually a limited number of them, unless someone has 10000 different string literal variants in a single union for some reason), scan the object types of the union for the "least specific type" and then attempt to assign each union element to only the least specific type - that grows only linearly with union size, rather than quadratically, and still would handle cases where, e.g., there's a lot of very specific types and a single base type in the array to unify them. In abstract, the idea is to spend one pass over the list grouping things into similarity classes which are likely a subtype of a single easily identifiable thing in the class, then doing a single pass on each group to check if it can be reduced to the candidate in that class, then combining the candidates (if a group succeeds in being entirely a subtype of its candidate) or classes back together. Just spitballing here, though - mostly a sketch of trying to use the sorta-transitivity of the subtype relationship to remodel subtype reduction into a divide-and-conquer style algorithm.
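That sketch could look something like the following - entirely hypothetical code, where "fewest properties" stands in for "least specific" and literal deduplication stands in for full reduction of the non-object bucket:

```typescript
type Constituent =
    | { kind: "literal"; value: string }
    | { kind: "object"; props: Set<string> };

// Structural stand-in: o is a subtype of c if it has at least all of c's props.
function isObjectSubtype(o: { props: Set<string> }, c: { props: Set<string> }): boolean {
    return [...c.props].every(p => o.props.has(p));
}

function reduceLargeUnion(members: Constituent[]): Constituent[] {
    const literals: { kind: "literal"; value: string }[] = [];
    const objects: { kind: "object"; props: Set<string> }[] = [];
    for (const m of members) {
        if (m.kind === "literal") literals.push(m); else objects.push(m);
    }
    // Normal reduction on the (usually small) non-object bucket; for these
    // literal stand-ins that just means deduplication by value.
    const seen = new Set<string>();
    const keptLiterals = literals.filter(l => !seen.has(l.value) && (seen.add(l.value), true));
    if (objects.length === 0) return keptLiterals;
    // One linear scan for the "least specific" object (fewest properties here)...
    let candidate = objects[0];
    for (const o of objects) {
        if (o.props.size < candidate.props.size) candidate = o;
    }
    // ...then one linear pass checking every object against only that candidate,
    // instead of against every other object.
    const allCollapse = objects.every(o => isObjectSubtype(o, candidate));
    return allCollapse ? [...keptLiterals, candidate] : [...keptLiterals, ...objects];
}
```

This keeps the object-bucket work linear in the union size, while still collapsing the "many specific types plus one base type" shape mentioned above.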
Anyways, back to smaller things in this PR:
On top of this, assignments involving the widened and unwidened forms of large unions are now trivial - that relationship is tracked (for the purpose of fetching properties lazily rather than storing them in a ton of maps and consuming space), so the comparison can be effectively skipped. With this PR, these large unions of types can often have no actual costly comparisons performed on them at all.
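A minimal sketch of that bookkeeping, with hypothetical names standing in for the checker's internals:

```typescript
// Track where each widened form came from, so "is this the widened form of
// that?" becomes an identity lookup instead of a structural comparison.
interface FakeType {
    id: number;
}

const widenedFrom = new WeakMap<FakeType, FakeType>();

function widen(original: FakeType): FakeType {
    const widened: FakeType = { id: original.id };
    widenedFrom.set(widened, original); // remember the unwidened origin
    return widened;
}

// An assignment from a type to its own widened form is trivially fine and
// needs no per-property comparison.
function isTriviallyAssignable(source: FakeType, target: FakeType): boolean {
    return source === target || widenedFrom.get(target) === source;
}
```

Because the relation is recorded at widening time, checking it later costs one map lookup regardless of how many members or properties the union has.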
- A hard limit on the length of a type string we generate before applying truncation rules, even if `noTruncation` is set - this is done to avoid an OOM crash for huge output types, as the materialized AST nodes for the types we can now create and reason about are, well, huge. I've capped this at 100,000 characters right now, which is incredibly generous compared to the normal truncation limit of 160 characters. If you hit the 100,000 character limit, you're probably not reading the type anyway, and we'd probably crash on trying to parse the type back in anyway. This isn't strictly necessary for this PR to pass at this point, but I added it so we wouldn't OOM on the type baselines for the new test before I added `skipTypeAndSymbol` (see next bullet). I'd understand if we'd rather not see this in this PR.
- The ability to `skipTypeAndSymbol` baselines for a given test (because type and symbol baselines for a 30k-line-long file are never going to be reviewed anyway, and they take around 30 seconds to generate besides, because of how big the types are, even with the 100,000 character limit).
- Minor optimizations to union property generation - rather than using `appendIfUnique`, we push symbols into a map, to avoid the cost of checking uniqueness repeatedly in unions with a huge number of members.
- `undefined` property symbols are now globally cached and don't include a declaration - meaning we produce just one `undefined` symbol for each property name, rather than one for each name, possibly carrying a declaration from the first property of that name we saw. This has the nice side effect of removing duplicated symbol declaration entries from union symbols, with the minor downside that if a property is narrowed to just an implied `undefined` symbol, it will no longer have an associated declaration at that site.

And one smaller change that might be a regression/bug but also might not, depending on how flexible we are: