diff --git a/text/0000-hir.md b/text/0000-hir.md new file mode 100644 index 00000000000..be8ef62c1f2 --- /dev/null +++ b/text/0000-hir.md @@ -0,0 +1,81 @@ +- Feature Name: N/A +- Start Date: 2015-07-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + + +# Summary + +Add a high-level intermediate representation (HIR) to the compiler. This is +basically a new (and additional) AST more suited for use by the compiler. + +This is purely an implementation detail of the compiler. It has no effect on the +language. + +Note that adding a HIR does not preclude adding a MIR or LIR in the future. + + +# Motivation + +Currently the AST is used by libsyntax for syntactic operations, by the compiler +for pretty much everything, and in syntax extensions. I propose splitting the +AST into a libsyntax version that is specialised for syntactic operation and +will eventually be stabilised for use by syntax extensions and tools, and the +HIR which is entirely internal to the compiler. + +The benefit of this split is that each AST can be specialised to its task and we +can separate the interface to the compiler (the AST) from its implementation +(the HIR). Specific changes I see that could happen are more ids and spans in +the AST, the AST adhering more closely to the surface syntax, the HIR becoming +more abstract (e.g., combining structs and enums), and using resolved names in +the HIR (i.e., performing name resolution as part of the AST->HIR lowering). + +Not using the AST in the compiler means we can work to stabilise it for syntax +extensions and tools: it will become part of the interface to the compiler. + +I also envisage all syntactic expansion of language constructs (e.g., `for` +loops, `if let`) moving to the lowering step from AST to HIR, rather than being +AST manipulations. That should make both error messages and tool support better +for such constructs. It would be nice to move lifetime elision to the lowering +step too, in order to make the HIR as explicit as possible. + + +# Detailed design + +Initially, the HIR will be an (almost) identical copy of the AST and the +lowering step will simply be a copy operation. Since some constructs (macros, +`for` loops, etc.) are expanded away in libsyntax, these will not be part of the +HIR. Tools such as the AST visitor will need to be duplicated. + +The compiler will be changed to use the HIR throughout (this should mostly be a +matter of change the imports). Incrementally, I expect to move expansion of +language constructs to the lowering step. Further in the future, the HIR should +get more abstract and compact, and the AST should get closer to the surface +syntax. + + +# Drawbacks + +Potentially slower compilations and higher memory use. However, this should be +offset in the long run by making improvements to the compiler easier by having a +more appropriate data structure. + + +# Alternatives + +Leave things as they are. + +Skip the HIR and lower straight to a MIR later in compilation. This has +advantages which adding a HIR does not have, however, it is a far more complex +refactoring and also misses some benefits of the HIR, notably being able to +stabilise the AST for tools and syntax extensions without locking in the +compiler. + + +# Unresolved questions + +How to deal with spans and source code. We could keep the AST around and +reference back to it from the HIR. Or we could copy span information to the HIR +(I plan on doing this initially). Possibly some other solution like keeping the +span info in a side table (note that we need less span info in the compiler than +we do in libsyntax, which is in turn less than tools want).