Description
Life has a melody, Gaius. A rhythm of notes that become your existence once they're played in harmony with God's plan. Come. See the face of the shape of things to come. -- "6" from Battlestar Galactica
NOTE: This plan covers 2025 and beyond as it is very ambitious.
Nil/not nil
The plan for this feature changed again and it is now:
- `ref/ptr T`: Remains unchecked so that it is backwards compatible with every Nim version ever released. But eventually this will be written as `unchecked ref/ptr T`.
- `ref/ptr T not nil`: Checked at compile-time (we need to fix the bugs in the implementation).
- `nil ref/ptr T`: Can be nil and it is checked at compile-time that every deref operation is within a guard like `if x != nil`.
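The guard pattern that the third variant is meant to enforce can be sketched in today's Nim. This is only an illustration: the `Node` type and the `valueOr` proc are hypothetical names, and the planned `nil ref T` spelling is deliberately not used because it is not available in a released compiler. With a plain `ref` the compiler does not yet insist on the guard.

```nim
type
  Node = ref object   # hypothetical example type
    value: int

proc valueOr(n: Node; default: int): int =
  ## Dereferences `n` only inside an explicit nil guard.
  if n != nil:
    result = n.value   # safe: guarded by `n != nil`
  else:
    result = default

echo valueOr(Node(value: 3), 5)
echo valueOr(nil, 5)
```

Under the plan described above, removing the `if n != nil` guard would become a compile-time error for a parameter declared as nilable.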
This feature is not scheduled for any particular release.
Version 2.4
- [ ] Enables strict "definite assignment" analysis which enforces at compile-time that variables have been initialized.
- Offers an experimental "type-bound operations" mode.
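As a rough sketch of what definite assignment analysis accepts, the example below uses the `strictDefs` experimental switch that exists in Nim 2.x. Whether the 2.4 mode behaves exactly like this switch is an assumption, and `describe` is a hypothetical proc.

```nim
{.experimental: "strictDefs".}  # assumption: the Nim 2.x experimental switch

proc describe(n: int): string =
  var label: string            # declared without an initial value
  if n < 0:
    label = "negative"         # assigned on this path ...
  else:
    label = "non-negative"     # ... and on this one, so the read below is fine
  result = label

echo describe(-1)
echo describe(2)
```

Dropping the `else` branch would leave `label` unassigned on one path, which is the kind of mistake the analysis is meant to report.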
Version 3
Version 3 will be achieved via a combination of compiler phase rewrites, code reuse, refactorings and porting of compiler code. The primary goal is to finally give us a Nim that offers:
- Incremental recompilations.
- No forward declarations for procs and types required.
- Explicit cyclic module dependencies.
- Type-checked generics.
- An end to the phase ordering problems that plagued Nim for a long time: destructors and other `=hooks` can be invoked before they have been synthesized successfully, which is hard for users to understand.
Implementation
The implementation will use NIF for everything: It is the data format used for communication between the different compiler phases.
Internally most (if not all) phases work on streams of NIF tokens, no tree constructions are required. It is expected that this reduces the amount of memory allocations the compiler has to perform by an order of magnitude.
Arguably a token stream enforces a principled approach to compiler development where, by design, no subtree can be accidentally skipped during traversal and handling. Roughly, a NIF token corresponds to a `PNode` in the old compiler. A NIF token takes 8 bytes in total, a `PNode` of today's compiler takes 40 bytes. Therefore it is expected that the new compiler takes 5 times less memory than the current compiler (40 / 8 = 5). Since precompiled modules are loaded lazily, this factor should be even higher.
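To make the "stream of tokens, no tree" idea concrete, here is a minimal sketch. The `Token` layout, the `TokKind` enum and `skipSubtree` are hypothetical and are not NIF's actual data structures; they only show that an 8-byte token in a flat sequence is enough to walk or skip a subtree by counting parentheses.

```nim
type
  TokKind = enum
    tkParLe, tkParRi, tkAtom
  Token = object          # hypothetical layout, not the real NIF token
    kind: TokKind         # 1 byte plus padding
    payload: uint32       # e.g. an index into a table of tags or literals

static:
  doAssert sizeof(Token) == 8   # the size quoted in the text

proc skipSubtree(s: openArray[Token]; start: int): int =
  ## Returns the index just past the subtree that begins at `start`.
  var depth = 0
  var i = start
  while i < s.len:
    case s[i].kind
    of tkParLe: inc depth
    of tkParRi: dec depth
    of tkAtom: discard
    inc i
    if depth == 0: break
  result = i

# Token stream for something shaped like (add 1 (mul 2 3)).
let stream = @[
  Token(kind: tkParLe, payload: 0),   # (add
  Token(kind: tkAtom, payload: 1),    # 1
  Token(kind: tkParLe, payload: 1),   # (mul
  Token(kind: tkAtom, payload: 2),    # 2
  Token(kind: tkAtom, payload: 3),    # 3
  Token(kind: tkParRi, payload: 0),   # )
  Token(kind: tkParRi, payload: 0)    # )
]

echo skipSubtree(stream, 2)   # index just past the inner (mul 2 3)
```

No nodes are allocated here: a consumer that does not care about the inner expression simply advances an index past it.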
The phases of compilation are:
- Pure parsing (nifler): Turn Nim code into a dialect of NIF.
- Semantic checking phase 1 (nimony): symbol lookups, type checking, template & macro expansions.
- Semantic checking phase 2 (nimony): Effect inference.
- Iterator inlining (lowerer).
- Lambda lifting (lowerer).
- Inject derefs (and the corresponding mutation checking) (lowerer).
- Inject dups (lowerer).
- Lower control flow expressions to control flow statements (eliminate the expr/nkStmtListExpr construct) (lowerer).
- Inject destructors (lowerer).
- Map builtins like `new` and `+` to "compiler procs" (lowerer).
- Translate exception handling (lowerer).
- Generate NIFC code (gear3).
These phases have been collected into different tools with dedicated names.
NIF
NIF is a general purpose text format designed for compiler construction. Think of it as a "JSON for compilers". NIF has a short precise specification and a Nim library implementation.
While NIF is almost a classical Lisp, it innovates in these aspects:
- It uses a separate namespace for tags ("builtins") and source code identifiers. It is most extensible and supports a wide range of abstraction levels. Code that is very high level can be represented effectively as well as code that is close to machine code.
- Line information is carried around during all phases of compilation for easy debugging and code introspection tools. The line information is based on the difference between a parent and its child node so that the resulting text representation is kept small.
- Declaration and use of a symbol are clearly distinguished in the syntax allowing for many different tasks to be implemented with a fraction of the usual complexity: "find definition", "find all uses" and "inline this code snippet" are particularly easy to implement.
- There is an additional format called Nif-index that allows for the lazy on-demand loading of symbols. This is most essential for incremental compilations.
Status: Implemented in 2024. No further changes anticipated.
Nifler
The Nifler tool encapsulates the initial Nim-to-NIF translation step and is generally useful for other tools that want to process Nim code without importing the Nim compiler as a library.
Nifler can also evaluate Nim's configuration system, the `nim.cfg` files and the NimScript files so that tools like `nimsuggest` get precise `--path` information and configuration settings without having to import the Nim compiler as a library.
Status: Implemented in 2024. Minor adjustments required.
Nimony
The primary point of Nifler is to shield "Nimony" from Nim's compiler internals. Nimony is a new frontend for Nim, designed from day one for:
- Low memory consumption.
- Incremental compilation.
- Tooling. The compiler continues after an error and supports "find all usages" and "goto definition" which work much more reliably since generics and macros are type-checked too.
- Efficient handling of code bases that make heavy use of generics, type computations and macros.
- Reducing the bug count by orders of magnitude. A "Minimal Redundancy Internal Representation" is used, ensuring that there is only one access path to a piece of data. The internally used data structures cannot get out of sync.
Status: In heavy development.
Lowerer
The lowerer's job is to "lower" high level Nim code to low level Nim code that does not use features such as closures, iterators and automatic memory management. It is planned to only support Nim's ARC/ORC scheme of doing memory management. In the old compiler ARC/ORC was very complex to support as the problem was not as well understood as it is now: A key insight here is to split up the tasks into multiple well-defined subtasks:
- Inject dups/copies: This can produce weird constructs like `while (;let tmp = f(); tmp.value)`.
- Lower control flow expressions to statements. This means `if` and `case` do not produce values anymore.
- Inject destructors: Now that values have been bound to temporaries explicitly and the control flow has been simplified it is rather easy to inject `=destroy(x)` calls at scope exits.
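The second subtask can be illustrated by hand in ordinary Nim. This is not actual lowerer output, and `pickExpr` and `pickLowered` are hypothetical names. It only shows the shape of the rewrite: an `if` expression becomes a temporary plus an `if` statement.

```nim
proc pickExpr(cond: bool): int =
  # High-level form: `if` is used as an expression that yields a value.
  let x = if cond: 1 else: 2
  result = x

proc pickLowered(cond: bool): int =
  # Lowered form: a temporary is declared first, then `if` is a plain
  # statement that assigns to it on every branch.
  var tmp: int
  if cond:
    tmp = 1
  else:
    tmp = 2
  result = tmp

echo pickExpr(true), " ", pickLowered(true)
```

Once every value lives in a named temporary like `tmp`, a later pass has an obvious place to attach a destructor call for it.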
As previously mentioned, the lowerer also does:
- Map builtins to Nim's runtime.
- Iterator inlining.
- Eliminate closures by performing "lambda lifting".
- Inject pointer derefs and implement "pass by reference".
- Translate exception handling constructs to NIFC's supported error handling.
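Lambda lifting, as listed above, can be sketched by writing the transformation out manually. The names `makeAdder`, `AdderEnv` and `adderLifted` are hypothetical, and the real lowerer's output will differ in detail. The idea is that captured variables move into an explicit environment object that is passed as an extra parameter.

```nim
# Before: an inner proc captures `n` from the enclosing scope (a closure).
proc makeAdder(n: int): proc (x: int): int =
  result = proc (x: int): int = x + n

# After (hand-lifted): the captured variable lives in an explicit
# environment object and the former closure is an ordinary top-level proc.
type
  AdderEnv = ref object
    n: int

proc adderLifted(x: int; env: AdderEnv): int =
  x + env.n

let add3 = makeAdder(3)
let env = AdderEnv(n: 3)
echo add3(4), " ", adderLifted(4, env)
```

After this rewrite no proc refers to variables of another proc's stack frame, which is what allows later phases to ignore closures entirely.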
Status: In heavy development.
Expander
The expander ("Gear 3") performs backend tasks that need to operate on multiple NIF files at once:
- It copies used imported symbols into the current NIF file. This is a fixed-point operation that repeats until no foreign symbols are left.
- `importc`'ed symbols are replaced by their `.c` variants.
- `importc`'ed symbols might lead to `(incl "file.h")` injections.
- Nim types must be translated to NIFC types.
- Types and procs must be moved to toplevel statements.
Status: Implemented in 2024. Major adjustments expected.
NIFC: C/C++ Backends based on NIF
NIFC is a dialect of NIF designed to be very close to C. Its benefits are:
- NIFC is easier to generate than generating C/C++ code directly because:
  - It uses NIF's regular syntax.
  - It allows for an arbitrary order of declarations without the need for forward declarations.
- NIFC improves upon C's quirky array and pointer type confusion by clearly distinguishing between `array` which is always a value type, `ptr` which always points to a single element and `aptr` which points to an array of elements.
- Inheritance is modelled directly in the type system as opposed to C's quirky type aliasing rule that is concerned with aliasings between a struct and its first element.
- NIFC can also produce C++ code without information loss because inheritance and exception handling are directly supported.
Status: Implemented in 2024. Bugfixes required.