Commit eb70225

dburriss and marco-c authored
New language implementation reference (mozilla#692)
Co-authored-by: Marco Castelluccio <[email protected]>
1 parent fc11d8d commit eb70225

4 files changed: +128 -11 lines changed

rust-code-analysis-book/src/SUMMARY.md

Lines changed: 2 additions & 0 deletions
@@ -8,3 +8,5 @@
 - [Nodes](commands/nodes.md)
 - [Rest API](commands/rest.md)
 - [Developers Guide](developers/README.md)
+- [How-to: Add a new language](developers/new-language.md)
+- [How-to: Implement LoC](developers/loc.md)

rust-code-analysis-book/src/developers/loc.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
# Lines of Code (LoC)

This document gives guidance on implementing the LoC metrics available in this crate.
[Lines of code](https://en.wikipedia.org/wiki/Source_lines_of_code) is a software metric that indicates the size of a piece of source code by counting its lines.
There are several types of LoC, so we first explain them by way of an example.

## Types of LoC

```rust
/*
Instruction: Implement factorial function
For extra credit, do not use mutable state or an imperative loop like `for` or `while`.
*/

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}
```

The example above will be used to illustrate each of the **LoC** metrics described below.

### SLOC

A straight count of all lines in the file, including code, comments, and blank lines.
METRIC VALUE: 11

### PLOC

A count of the lines that contain instructions. This includes lines holding only brackets or similar syntax.
Comments and blank lines are not counted.
METRIC VALUE: 3

### LLOC

The "logical" lines of code: a count of the statements in the code. What counts as a statement depends on the language.
In the example above there is a single statement, which is the call to `product` on the `Iterator`.
METRIC VALUE: 1

### CLOC

A count of the comment lines in the code. The type of comment does not matter, i.e. single-line, block, and doc comments are all counted.
METRIC VALUE: 6

### BLANK

Last but not least, this metric counts the blank lines present in the code.
METRIC VALUE: 2

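The line-based counts above (SLOC, PLOC, CLOC, and BLANK) can be approximated with a plain text scan. The sketch below is only an illustration and is not how this crate computes the metrics: `LineCounts`, `count_lines`, and `EXAMPLE` are names invented here, and the scan assumes a comment never shares a line with code. LLOC is left out because counting statements needs a parser.

```rust
/// Counts for the line-based LoC metrics (type invented for this sketch).
#[derive(Debug, Default, PartialEq)]
struct LineCounts {
    sloc: usize,
    ploc: usize,
    cloc: usize,
    blank: usize,
}

/// Classifies each line by its text alone.
/// Assumption: a comment never shares a line with code.
fn count_lines(source: &str) -> LineCounts {
    let mut counts = LineCounts::default();
    let mut in_block_comment = false;
    for line in source.lines() {
        let trimmed = line.trim();
        counts.sloc += 1; // every line counts towards SLOC
        if in_block_comment {
            counts.cloc += 1;
            if trimmed.contains("*/") {
                in_block_comment = false;
            }
        } else if trimmed.is_empty() {
            counts.blank += 1;
        } else if trimmed.starts_with("//") {
            counts.cloc += 1; // covers both `//` and `///` comments
        } else if trimmed.starts_with("/*") {
            counts.cloc += 1;
            in_block_comment = !trimmed.contains("*/");
        } else {
            counts.ploc += 1; // anything else is treated as code
        }
    }
    counts
}

/// The factorial example from this page, without the surrounding fence.
const EXAMPLE: &str = r#"/*
Instruction: Implement factorial function
For extra credit, do not use mutable state or an imperative loop like `for` or `while`.
*/

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}"#;
```

Calling `count_lines(EXAMPLE)` yields the same SLOC, PLOC, CLOC, and BLANK values as listed in the sections above.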
## Implementation

To implement the LoC-related metrics described above, you need to implement the `Loc` trait for the language you want to support.

This requires implementing the `compute` function.
See [/src/metrics/loc.rs](https://github.com/mozilla/rust-code-analysis/blob/master/src/metrics/loc.rs) for where to implement it, as well as for examples from other languages.
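
As a rough illustration of what a per-language `compute` might do, the sketch below walks over simplified syntax nodes and updates the counters. Everything in it is an assumption made for this example: `Node` and `Stats` are stand-in types, the node kinds are illustrative strings, and the trait is a simplified stand-in whose signature may differ from the real `Loc` trait in `loc.rs`.

```rust
use std::collections::HashSet;

/// Stand-in for a syntax-tree node (hypothetical; the crate uses its own node type).
struct Node<'a> {
    kind: &'a str,
    start_row: usize,
    end_row: usize,
}

/// Stand-in for the accumulated metrics (hypothetical).
#[derive(Default)]
struct Stats {
    comment_rows: HashSet<usize>,
    code_rows: HashSet<usize>,
    lloc: usize,
}

impl Stats {
    fn cloc(&self) -> usize {
        self.comment_rows.len()
    }
    fn ploc(&self) -> usize {
        self.code_rows.len()
    }
}

/// Simplified stand-in for the `Loc` trait; check loc.rs for the real signature.
trait Loc {
    fn compute(node: &Node, stats: &mut Stats);
}

/// Empty struct representing one language.
struct RustCode;

impl Loc for RustCode {
    fn compute(node: &Node, stats: &mut Stats) {
        match node.kind {
            // Every row spanned by a comment is a comment line.
            "line_comment" | "block_comment" => {
                stats.comment_rows.extend(node.start_row..=node.end_row);
            }
            // Illustrative statement kinds: one logical line each.
            "expression_statement" | "let_declaration" => {
                stats.lloc += 1;
                stats.code_rows.extend(node.start_row..=node.end_row);
            }
            // Any other token marks its starting row as code.
            _ => {
                stats.code_rows.insert(node.start_row);
            }
        }
    }
}
```

Storing rows in sets rather than incrementing counters keeps a line from being counted twice when several nodes start on it.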

rust-code-analysis-book/src/developers/new-language.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# Supporting a new language

This section helps developers implement support for a new language in `rust-code-analysis`.

To implement a new language, two steps are required:

1. Generate the grammar
2. Add the grammar to `rust-code-analysis`

A number of [metrics are supported](https://mozilla.github.io/rust-code-analysis/metrics.html), and guidance on implementing them is covered elsewhere in the documentation.

## Generating the grammar

As a **prerequisite** for adding a new grammar, there must be a [tree-sitter](https://github.com/tree-sitter) grammar crate for the desired language that matches the `tree-sitter` [version used in this project](https://github.com/mozilla/rust-code-analysis/blob/master/Cargo.toml).

The grammars are generated by a project in this repository called [enums](https://github.com/mozilla/rust-code-analysis/tree/master/enums). The following steps add the language support from the language crate and generate an enum file, which this project then uses as the grammar when evaluating metrics.

1. Add the language-specific `tree-sitter` crate to the `enums` crate, making sure to tie it to the `tree-sitter` version used in the `rust-code-analysis` crate. For example, for the Rust support at time of writing the following line exists in the [/enums/Cargo.toml](https://github.com/mozilla/rust-code-analysis/blob/master/enums/Cargo.toml): `tree-sitter-rust = "version number"`.
2. Append the language to the `enums` crate in [/enums/src/languages.rs](https://github.com/mozilla/rust-code-analysis/blob/master/enums/src/languages.rs). Keeping with Rust as the example, the line would be `(Rust, tree_sitter_rust)`. The first parameter is the name of the Rust enum that will be generated; the second is the `tree-sitter` function to call to get the language's grammar.
3. Add a case to the end of the match in the `mk_get_language` macro rule in [/enums/src/macros.rs](https://github.com/mozilla/rust-code-analysis/blob/master/enums/src/macros.rs), e.g. for Rust: `Lang::Rust => tree_sitter_rust::language()`.
4. Lastly, execute the [/recreate-grammars.sh](https://github.com/mozilla/rust-code-analysis/blob/master/recreate-grammars.sh) script, which runs the `enums` crate to generate the grammar for the new language.

At this point we should have a new grammar file for the new language in [/src/languages/](https://github.com/mozilla/rust-code-analysis/tree/master/src/languages). See [/src/languages/language_rust.rs](https://github.com/mozilla/rust-code-analysis/blob/master/src/languages/language_rust.rs) as an example of the generated enum.

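Steps 2 and 3 above amount to registering a `(Name, module)` pair and mapping each enum variant to that module's `language()` function. The sketch below shows the idea in a self-contained form. It is not the code in `/enums/src/macros.rs`: the macro `mk_get_language_sketch!`, the function `get_language`, and the two stub modules are hypothetical, and the stubs return a string where the real crates return a tree-sitter grammar.

```rust
// Stand-ins for the per-language `tree-sitter-*` crates (hypothetical stubs).
// Each exposes a `language()` function; here it returns a label, not a grammar.
mod tree_sitter_rust {
    pub fn language() -> &'static str {
        "rust-grammar"
    }
}

mod tree_sitter_python {
    pub fn language() -> &'static str {
        "python-grammar"
    }
}

// Hypothetical, simplified macro: takes one `(Variant, module)` pair per
// language and generates the enum plus the match that picks the grammar.
macro_rules! mk_get_language_sketch {
    ( $( ($variant:ident, $module:ident) ),* $(,)? ) => {
        #[derive(Clone, Copy, Debug, PartialEq)]
        pub enum Lang {
            $( $variant, )*
        }

        pub fn get_language(lang: Lang) -> &'static str {
            match lang {
                $( Lang::$variant => $module::language(), )*
            }
        }
    };
}

// Supporting another language means appending one more pair here.
mk_get_language_sketch!((Rust, tree_sitter_rust), (Python, tree_sitter_python));
```

With this in place, `get_language(Lang::Rust)` returns the value produced by the stub `tree_sitter_rust::language()`.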
## Adding the new grammar to rust-code-analysis

1. Add the language-specific `tree-sitter` crate to the `rust-code-analysis` project, making sure to tie it to the `tree-sitter` version used in this project. For example, for the Rust support at time of writing the following line exists in the [Cargo.toml](https://github.com/mozilla/rust-code-analysis/blob/master/Cargo.toml): `tree-sitter-rust = "0.19.0"`.
2. Next, add the new `tree-sitter` language namespace to [/src/languages/mod.rs](https://github.com/mozilla/rust-code-analysis/blob/master/src/languages/mod.rs), e.g.:

```rust
pub mod language_rust;
pub use language_rust::*;
```

3. Lastly, add a definition of the language to the arguments of the `mk_langs!` macro in [/src/langs.rs](https://github.com/mozilla/rust-code-analysis/blob/master/src/langs.rs), e.g.:

```rust
// 1) Name for enum
// 2) Language description
// 3) Display name
// 4) Empty struct name to implement
// 5) Parser name
// 6) tree-sitter function to call to get a Language
// 7) file extensions
// 8) emacs modes
(
    Rust,
    "The `Rust` language",
    "rust",
    RustCode,
    RustParser,
    tree_sitter_rust,
    [rs],
    ["rust"]
)
```
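
To give a feel for how a tuple like the one above can drive generated code, here is a much-reduced sketch. It is not the real `mk_langs!` macro: `mk_langs_sketch!` is a hypothetical macro that keeps only the enum name, display name, and file extensions, and the `Toy` entry is invented for the example.

```rust
// Hypothetical, much-reduced macro in the spirit of `mk_langs!`.
// Each entry is `(EnumName, "display name", [extensions])`.
macro_rules! mk_langs_sketch {
    ( $( ($variant:ident, $display:expr, [ $( $ext:ident ),* ]) ),* $(,)? ) => {
        #[derive(Clone, Copy, Debug, PartialEq)]
        pub enum Lang {
            $( $variant, )*
        }

        impl Lang {
            /// Display name taken from the macro arguments.
            pub fn name(self) -> &'static str {
                match self {
                    $( Lang::$variant => $display, )*
                }
            }

            /// Looks a language up by file extension (without the dot).
            pub fn from_extension(ext: &str) -> Option<Lang> {
                $(
                    if [ $( stringify!($ext) ),* ].contains(&ext) {
                        return Some(Lang::$variant);
                    }
                )*
                None
            }
        }
    };
}

// `Toy` is an invented language used only to show a second entry.
mk_langs_sketch!((Rust, "rust", [rs]), (Toy, "toy", [toy, ty]));
```

Here `Lang::from_extension("rs")` gives `Some(Lang::Rust)`, and an unknown extension gives `None`. Keeping all per-language data in one macro invocation means a new language is added in a single place.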

rust-code-analysis-book/src/languages.md

Lines changed: 13 additions & 11 deletions
@@ -3,14 +3,16 @@
 This is the list of programming languages parsed by
 **rust-code-analysis**.

-* C++
-* C#
-* CSS
-* Go
-* HTML
-* Java
-* JavaScript
-* The JavaScript used in Firefox internal
-* Python
-* Rust
-* Typescript
+- [x] C++
+- [ ] C#
+- [ ] CSS
+- [ ] Go
+- [ ] HTML
+- [ ] Java
+- [x] JavaScript
+- [x] The JavaScript used in Firefox internal
+- [x] Python
+- [x] Rust
+- [x] Typescript
+
+A check indicates which languages have metrics implemented.
