|
| 1 | +# Supporting a new language |
| 2 | + |
| 3 | +This section helps developers implement support for a new language in `rust-code-analysis`. |
| 4 | + |
| 5 | +To implement a new language, two steps are required: |
| 6 | + |
| 7 | +1. Generate the grammar |
| 8 | +2. Add the grammar to `rust-code-analysis` |
| 9 | + |
| 10 | +A number of [metrics are supported](https://mozilla.github.io/rust-code-analysis/metrics.html); help with implementing them is covered elsewhere in the documentation. |
| 11 | + |
| 12 | +## Generating the grammar |
| 13 | + |
| 14 | +As a **prerequisite** for adding a new grammar, a [tree-sitter](https://github.com/tree-sitter) grammar must exist for the desired language, in a version that matches the [`tree-sitter` version used in this project](https://github.com/mozilla/rust-code-analysis/blob/master/Cargo.toml). |
| 15 | + |
| 16 | +The grammars are generated by a project in this repository called [enums](https://github.com/mozilla/rust-code-analysis/tree/master/enums). The following steps add the language support from the language crate and generate an enum file that is then used as the grammar in this project to evaluate metrics. |
| 17 | + |
| 18 | +1. Add the language-specific `tree-sitter` crate to the `enums` crate, making sure to tie it to the `tree-sitter` version used in the `rust-code-analysis` crate. For example, for the Rust support at time of writing the following line exists in the [/enums/Cargo.toml](https://github.com/mozilla/rust-code-analysis/blob/master/enums/Cargo.toml): `tree-sitter-rust = "version number"`. |
| 19 | +2. Append the language to the list in the `enums` crate in [/enums/src/languages.rs](https://github.com/mozilla/rust-code-analysis/blob/master/enums/src/languages.rs). Keeping with Rust as the example, the line would be `(Rust, tree_sitter_rust)`. The first parameter is the name of the Rust enum that will be generated; the second is the `tree-sitter` function to call to get the language's grammar. |
| 20 | +3. Add a case to the end of the match in the `mk_get_language` macro rule in [/enums/src/macros.rs](https://github.com/mozilla/rust-code-analysis/blob/master/enums/src/macros.rs), e.g. for Rust `Lang::Rust => tree_sitter_rust::language()`. |
| 21 | +4. Lastly, execute the [/recreate-grammars.sh](https://github.com/mozilla/rust-code-analysis/blob/master/recreate-grammars.sh) script, which runs the `enums` crate to generate the grammar for the new language. |
| 22 | + |
| 23 | +At this point we should have a new grammar file for the new language in [/src/languages/](https://github.com/mozilla/rust-code-analysis/tree/master/src/languages). See [/src/languages/language_rust.rs](https://github.com/mozilla/rust-code-analysis/blob/master/src/languages/language_rust.rs) as an example of the generated enum. |
| 24 | + |
| 25 | +## Adding the new grammar to rust-code-analysis |
| 26 | + |
| 27 | +1. Add the language-specific `tree-sitter` crate to the `rust-code-analysis` project, making sure to tie it to the `tree-sitter` version used in this project. For example, for the Rust support at time of writing the following line exists in the [Cargo.toml](https://github.com/mozilla/rust-code-analysis/blob/master/Cargo.toml): `tree-sitter-rust = "0.19.0"`. |
| 28 | +2. Next, add the new `tree-sitter` language namespace to [/src/languages/mod.rs](https://github.com/mozilla/rust-code-analysis/blob/master/src/languages/mod.rs), e.g.: |
| 29 | + |
| 30 | +```rust |
| 31 | +pub mod language_rust; |
| 32 | +pub use language_rust::*; |
| 33 | +``` |
| 34 | + |
| 35 | +3. Lastly, add a definition of the language to the arguments of the `mk_langs!` macro in [/src/langs.rs](https://github.com/mozilla/rust-code-analysis/blob/master/src/langs.rs): |
| 36 | + |
| 37 | +```rust |
| 38 | +// 1) Name for enum |
| 39 | +// 2) Language description |
| 40 | +// 3) Display name |
| 41 | +// 4) Empty struct name to implement |
| 42 | +// 5) Parser name |
| 43 | +// 6) tree-sitter function to call to get a Language |
| 44 | +// 7) file extensions |
| 45 | +// 8) emacs modes |
| 46 | +( |
| 47 | + Rust, |
| 48 | + "The `Rust` language", |
| 49 | + "rust", |
| 50 | + RustCode, |
| 51 | + RustParser, |
| 52 | + tree_sitter_rust, |
| 53 | + [rs], |
| 54 | + ["rust"] |
| 55 | +) |
| 56 | +``` |