Commit 9cf7dcb

Merge branch 'main' into ber.a/diagnosticData
2 parents 2be0669 + 1f37bd6 commit 9cf7dcb

28 files changed, with 2174 additions and 578 deletions

docs/changing-the-ast.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
1+
---
2+
title: Changing the AST
3+
category: Compiler Internals
4+
categoryindex: 200
5+
index: 800
6+
---
7+
# Changing the AST
8+
9+
Making changes to the AST is a common task when working on new F# compiler features or when working on developer tooling.
10+
This document describes the process of making changes to the AST.
11+
12+
The easiest way to modify the AST is to start with the type definitions in `SyntaxTree.fsi` and `SyntaxTree.fs` and then let the compiler guide you to the places where you need to make changes.
13+
Let's look at an example: We want to extend the AST to include the range of the `/` symbol in a `SynRationalConst.Rational`.
14+
15+
There are two solutions to choose from:
16+
- Add a new field to the `Rational` union case
17+
- Add a dedicated trivia type to the union case which contains the new range and maybe move the existing ranges to the trivia type as well
18+
19+
The pros of introducing a dedicated trivia type are:
20+
- Having the additional information in a separate structure allows it to grow more easily over time. Adding new information to an existing trivia type won't disrupt most FCS consumers.
21+
- It is clear that it is information that is not relevant for the compilation.
22+
23+
The cons are:
24+
- It can be a bit excessive to introduce for a single field.
25+
- The existing AST node might already contain fields that are historically more suited for trivia, but predate the SyntaxTrivia module.
26+
27+
In this example, we'll go with the first solution and add a new field named `divRange` to the `Rational` union case as it felt a bit excessive to introduce a new trivia type for a single field.
28+
But these are the kinds of decisions that need to be made when changing the AST.
29+
30+
```fsharp
31+
type SynRationalConst =
32+
33+
// ...
34+
35+
| Rational of
36+
numerator: int32 *
37+
numeratorRange: range *
38+
divRange: range * // our new field
39+
denominator: int32 *
40+
denominatorRange: range *
41+
range: range
42+
43+
// ...
44+
```
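For comparison, the second solution (a dedicated trivia type) could look roughly like the sketch below. This is only an illustration under stated assumptions: `SynRationalConstTrivia` and its `DivRange` field are hypothetical names, `range` is a small stand-in for the compiler's own range type, and the union is reduced to the single case discussed here.

```fsharp
// Stand-in for the compiler's range type so the sketch compiles on its own.
type range = { StartColumn: int; EndColumn: int }

// Hypothetical trivia record; the name and shape are illustrative only.
type SynRationalConstTrivia = { DivRange: range }

// Reduced to the single union case discussed in this document.
type SynRationalConst =
    | Rational of
        numerator: int32 *
        numeratorRange: range *
        denominator: int32 *
        denominatorRange: range *
        range: range *
        trivia: SynRationalConstTrivia

let mkRange startColumn endColumn =
    { StartColumn = startColumn
      EndColumn = endColumn }

// `23/42`: `23` at columns 0-2, `/` at 2-3, `42` at 3-5.
let example =
    SynRationalConst.Rational(23, mkRange 0 2, 42, mkRange 3 5, mkRange 0 5, { DivRange = mkRange 2 3 })

// Consumers reach the `/` range through the trivia record.
let divRangeOf (SynRationalConst.Rational(trivia = t)) = t.DivRange
```

With this shape, further ranges can later be added to the record without changing the number of fields of the union case.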
45+
46+
After modifying `SyntaxTree.fsi` and `SyntaxTree.fs`, the compiler will report errors in `pars.fsy`. If it doesn't, the `fsy` file wasn't processed by the compilation. In this case, a rebuild of `FSharp.Compiler.Service.fsproj` should help.
47+
`pars.fsy` is the parser specification of F#, a list of rules that describe how to parse F# code. Don't be scared by the size of the file or the unfamiliar content.
48+
It's easier than it looks.
49+
The F# compiler uses a parser generator called [fsyacc](https://github.com/fsprojects/FsLexYacc) to generate the parser from the specification in `pars.fsy`.
50+
Let's look at the most relevant syntax parts of a `.fsy` file:
51+
52+
```fsharp
53+
rationalConstant:
54+
| INT32 INFIX_STAR_DIV_MOD_OP INT32
55+
{ if $2 <> "/" then reportParseErrorAt (rhs parseState 2) (FSComp.SR.parsUnexpectedOperatorForUnitOfMeasure())
56+
if fst $3 = 0 then reportParseErrorAt (rhs parseState 3) (FSComp.SR.parsIllegalDenominatorForMeasureExponent())
57+
if (snd $1) || (snd $3) then errorR(Error(FSComp.SR.lexOutsideThirtyTwoBitSigned(), lhs parseState))
58+
SynRationalConst.Rational(fst $1, rhs parseState 1, fst $3, rhs parseState 3, lhs parseState) }
59+
| // ...
60+
```
61+
62+
The first line is the name of the rule, `rationalConstant` in this case. It's a so-called non-terminal symbol, in contrast to a terminal symbol like `INT32` or `INFIX_STAR_DIV_MOD_OP`. The individual cases of the rule are separated by `|` and are called productions.
63+
64+
By now, you should be able to see the similarities between an fsyacc rule and the pattern matching you know from F#.
65+
The code between the curly braces is the code that gets executed when the rule is matched and is _real_ F# code. After compilation, it ends up in
66+
`.\artifacts\obj\FSharp.Compiler.Service\Debug\netstandard2.0\pars.fs`, generated by fsyacc.
67+
68+
The first three lines do error checking and report errors if the input is invalid.
69+
Then the code calls the `Rational` constructor of `SynRationalConst` and passes some values to it. Here we need to make changes to adjust the parser to our modified type definition.
70+
The values or symbols that matched the rule are available as `$1`, `$2`, `$3` etc. in the code. As you can see, `$1` is a tuple, consisting of the parsed number and a boolean that, judging by the `lexOutsideThirtyTwoBitSigned` check above, is `true` when the literal does not fit into a 32-bit signed integer.
71+
The code is executed in the context of the parser, so you can use the `parseState` variable, an instance of `IParseState`, to access the current state of the parser. There are helper functions defined in `ParseHelpers.fs` that make it easier to work with it.
72+
`rhs parseState 1` returns the range of the first symbol that matched the rule, here `INT32`. So, it returns the range of `23` in `23/42`.
73+
`rhs` is short for _right hand side_.
74+
Another helper is `rhs2`. For example, `rhs2 parseState 2 3` returns the range spanning from the second to the third symbol that matched the rule. Given `23/42`, it would return the range of `/42`.
75+
`lhs parseState` returns the range of the whole rule, `23/42` in our example.
76+
When parser recovery is of concern for a rule, it's preferred to use `rhs2` over `lhs`.
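To make the three helpers more concrete, here is a toy model of what they compute for `23/42`. It is not the real implementation: the actual helpers live in `ParseHelpers.fs` and operate on `IParseState`, whereas `ToyRange` and `ToyParseState` below are hypothetical stand-ins that only track column offsets.

```fsharp
// Toy model only; the real helpers work on IParseState and real ranges.
type ToyRange = { Start: int; End: int }

// Hypothetical stand-in for the parser state:
// the ranges of the symbols matched by the current production, in order.
type ToyParseState = { SymbolRanges: ToyRange list }

/// Range of the n-th matched symbol (1-based, like `$1`, `$2`, ...).
let rhs (state: ToyParseState) n = state.SymbolRanges.[n - 1]

/// Range from the start of the n-th symbol to the end of the m-th symbol.
let rhs2 (state: ToyParseState) n m =
    { Start = (rhs state n).Start
      End = (rhs state m).End }

/// Range of the whole production (in this toy model: first to last symbol).
let lhs (state: ToyParseState) =
    rhs2 state 1 state.SymbolRanges.Length

// `23/42`: INT32 at columns 0-2, `/` at 2-3, INT32 at 3-5.
let state =
    { SymbolRanges =
        [ { Start = 0; End = 2 }
          { Start = 2; End = 3 }
          { Start = 3; End = 5 } ] }

let divRange = rhs state 2 // the `/` symbol
let divAndDenominator = rhs2 state 2 3 // `/42`
let whole = lhs state // `23/42`
```

In this model `rhs state 2` is exactly the value the new `divRange` field needs.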
77+
78+
Circling back to our original example of adding a new field to `SynRationalConst`, we need to add a new parameter to the call of the `Rational` constructor. We want to pass the range of the `/` symbol, so we need to add `rhs parseState 2` as the third parameter to the constructor call:
79+
80+
```fsharp
81+
SynRationalConst.Rational(fst $1, rhs parseState 1, rhs parseState 2, fst $3, rhs parseState 3, lhs parseState)
82+
```
83+
84+
That's it. Adjusting the other constructor calls of `Rational` in `pars.fsy` should be enough to have a working parser again which returns the modified AST.
85+
While fixing the remaining compiler errors outside of `pars.fsy`, it's a good idea to use named access to the fields of the `SynRationalConst.Rational` union case instead of positional access. This way, the compilation won't fail if additional fields are added to the union case in the future.
86+
After a successful compilation, you can run the parser tests in `SyntaxTreeTests.fs` to verify that everything works as expected.
87+
It's likely that you'll need to update the baseline files as described in `SyntaxTreeTests.fs`.
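The difference between positional and named access mentioned above can be sketched as follows. The types are simplified stand-ins (a minimal `range` record and a two-case union with an assumed `Integer` case), not the real definitions from `SyntaxTree.fs`.

```fsharp
// Stand-in for the compiler's range type so the sketch compiles on its own.
type range = { StartColumn: int; EndColumn: int }

// Simplified stand-in for the real union.
type SynRationalConst =
    | Integer of value: int32
    | Rational of
        numerator: int32 *
        numeratorRange: range *
        divRange: range *
        denominator: int32 *
        denominatorRange: range *
        range: range

// Positional access: every field must be listed,
// so this pattern breaks as soon as a field is added or removed.
let denominatorPositional rationalConst =
    match rationalConst with
    | Rational(_, _, _, denominator, _, _) -> Some denominator
    | _ -> None

// Named access: only the field of interest is mentioned,
// so adding further fields to the case keeps this code compiling.
let denominatorNamed rationalConst =
    match rationalConst with
    | Rational(denominator = d) -> Some d
    | _ -> None

let r startColumn endColumn =
    { StartColumn = startColumn
      EndColumn = endColumn }

// `23/42`
let example = Rational(23, r 0 2, r 2 3, 42, r 3 5, r 0 5)
```

Both functions return the same value for `example`; only the named version would have survived the addition of `divRange` without changes.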

docs/fcs/editor.fsx

Lines changed: 1 addition & 1 deletion
@@ -184,7 +184,7 @@ let decls =
184184

185185
// Print the names of available items
186186
for item in decls.Items do
187-
printfn " - %s" item.Name
187+
printfn " - %s" item.NameInList
188188

189189
(**
190190

docs/fcs/filesystem.fsx

Lines changed: 3 additions & 0 deletions
@@ -130,6 +130,9 @@ let B = File1.A + File1.A"""
130130
member _.IsStableFileHeuristic(path) =
131131
defaultFileSystem.IsStableFileHeuristic(path)
132132

133+
member this.ChangeExtensionShim(path, extension) =
134+
defaultFileSystem.ChangeExtensionShim(path, extension)
135+
133136
let myFileSystem = MyFileSystem()
134137
FileSystem <- MyFileSystem()
135138

docs/fcs/syntax-visitor.fsx

Lines changed: 189 additions & 0 deletions
@@ -0,0 +1,189 @@
1+
(**
2+
---
3+
title: Tutorial: SyntaxVisitorBase
4+
category: FSharp.Compiler.Service
5+
categoryindex: 300
6+
index: 301
7+
---
8+
*)
9+
(*** hide ***)
10+
#I "../../artifacts/bin/FSharp.Compiler.Service/Debug/netstandard2.0"
11+
(**
12+
Compiler Services: Using the SyntaxVisitorBase
13+
=========================================
14+
15+
Syntax tree traversal is a common topic when interacting with the `FSharp.Compiler.Service`.
16+
As established in [Tutorial: Expressions](./untypedtree.html#Walking-over-the-AST), the [ParsedInput](../reference/fsharp-compiler-syntax-parsedinput.html) can be traversed by a set of recursive functions.
17+
It can be tedious to always construct these functions from scratch.
18+
19+
As an alternative, a [SyntaxVisitorBase](../reference/fsharp-compiler-syntax-syntaxvisitorbase-1.html) can be used to traverse the syntax tree.
20+
Consider the following code sample:
21+
*)
22+
23+
let codeSample = """
24+
module Lib
25+
26+
let myFunction paramOne paramTwo =
27+
()
28+
"""
29+
30+
(**
31+
Imagine we wish to grab the `myFunction` name from the `headPat` in the [SynBinding](../reference/fsharp-compiler-syntax-synbinding.html).
32+
Let's introduce a helper function to construct the AST:
33+
*)
34+
35+
#r "FSharp.Compiler.Service.dll"
36+
open FSharp.Compiler.CodeAnalysis
37+
open FSharp.Compiler.Text
38+
open FSharp.Compiler.Syntax
39+
40+
let checker = FSharpChecker.Create()
41+
42+
/// Helper to construct a ParsedInput from a code snippet.
43+
let mkTree codeSample =
44+
let parseFileResults =
45+
checker.ParseFile(
46+
"FileName.fs",
47+
SourceText.ofString codeSample,
48+
{ FSharpParsingOptions.Default with SourceFiles = [| "FileName.fs" |] }
49+
)
50+
|> Async.RunSynchronously
51+
52+
parseFileResults.ParseTree
53+
54+
(**
55+
And create a visitor to traverse the tree:
56+
*)
57+
58+
let visitor =
59+
{ new SyntaxVisitorBase<string>() with
60+
override this.VisitPat(path, defaultTraverse, synPat) =
61+
// First check if the pattern is what we are looking for.
62+
match synPat with
63+
| SynPat.LongIdent(longDotId = SynLongIdent(id = [ ident ])) ->
64+
// Next we can check if the current path of visited nodes matches our expectations.
65+
// The path will contain all the ancestors of the current node.
66+
match path with
67+
// The parent node of `synPat` should be a `SynBinding`.
68+
| SyntaxNode.SynBinding _ :: _ ->
69+
// We return a `Some` option to indicate we found what we are looking for.
70+
Some ident.idText
71+
// If the parent is something else, we can skip it here.
72+
| _ -> None
73+
| _ -> None }
74+
75+
let result = SyntaxTraversal.Traverse(Position.pos0, mkTree codeSample, visitor) // Some "myFunction"
76+
77+
(**
78+
Instead of traversing manually from `ParsedInput` to `SynModuleOrNamespace` to `SynModuleDecl.Let` to `SynBinding` to `SynPat`, we leverage the default navigation that happens in `SyntaxTraversal.Traverse`.
79+
A `SyntaxVisitorBase` will short-circuit all other code paths once a single `VisitXYZ` override has returned a result.
80+
81+
Our code sample of course only had one let binding, so we didn't need any further logic to differentiate between multiple bindings.
82+
Let's consider a second example where we know the user's cursor inside an IDE is placed after `c` and we are interested in the body expression of the let binding.
83+
*)
84+
85+
let secondCodeSample = """
86+
module X
87+
88+
let a = 0
89+
let b = 1
90+
let c = 2
91+
"""
92+
93+
let secondVisitor =
94+
{ new SyntaxVisitorBase<SynExpr>() with
95+
override this.VisitBinding(path, defaultTraverse, binding) =
96+
match binding with
97+
| SynBinding(expr = e) -> Some e }
98+
99+
let cursorPos = Position.mkPos 6 5
100+
101+
let secondResult =
102+
SyntaxTraversal.Traverse(cursorPos, mkTree secondCodeSample, secondVisitor) // Some (Const (Int32 2, (6,8--6,9)))
103+
104+
(**
105+
Due to our passed cursor position, we did not need to write any code to exclude the expressions of the other let bindings.
106+
`SyntaxTraversal.Traverse` will check whether the current position is inside any syntax node before drilling deeper.
107+
108+
Lastly, some `VisitXYZ` overrides receive a `defaultTraverse` argument. This helper allows you to continue the default traversal when you hit a node that is not of interest.
109+
Consider `1 + 2 + 3 + 4`: this is represented as a nested infix application expression.
110+
If the cursor is at the end of the entire expression, we can grab the value of `4` using the following visitor:
111+
*)
112+
113+
let thirdCodeSample = "let sum = 1 + 2 + 3 + 4"
114+
115+
(*
116+
AST will look like:
117+
118+
Let
119+
(false,
120+
[SynBinding
121+
(None, Normal, false, false, [],
122+
PreXmlDoc ((1,0), Fantomas.FCS.Xml.XmlDocCollector),
123+
SynValData
124+
(None, SynValInfo ([], SynArgInfo ([], false, None)), None,
125+
None),
126+
Named (SynIdent (sum, None), false, None, (1,4--1,7)), None,
127+
App
128+
(NonAtomic, false,
129+
App
130+
(NonAtomic, true,
131+
LongIdent
132+
(false,
133+
SynLongIdent
134+
([op_Addition], [], [Some (OriginalNotation "+")]),
135+
None, (1,20--1,21)),
136+
App
137+
(NonAtomic, false,
138+
App
139+
(NonAtomic, true,
140+
LongIdent
141+
(false,
142+
SynLongIdent
143+
([op_Addition], [],
144+
[Some (OriginalNotation "+")]), None,
145+
(1,16--1,17)),
146+
App
147+
(NonAtomic, false,
148+
App
149+
(NonAtomic, true,
150+
LongIdent
151+
(false,
152+
SynLongIdent
153+
([op_Addition], [],
154+
[Some (OriginalNotation "+")]), None,
155+
(1,12--1,13)),
156+
Const (Int32 1, (1,10--1,11)), (1,10--1,13)),
157+
Const (Int32 2, (1,14--1,15)), (1,10--1,15)),
158+
(1,10--1,17)), Const (Int32 3, (1,18--1,19)),
159+
(1,10--1,19)), (1,10--1,21)),
160+
Const (Int32 4, (1,22--1,23)), (1,10--1,23)), (1,4--1,7),
161+
Yes (1,0--1,23), { LeadingKeyword = Let (1,0--1,3)
162+
InlineKeyword = None
163+
EqualsRange = Some (1,8--1,9) })
164+
*)
165+
166+
let thirdCursorPos = Position.mkPos 1 22
167+
168+
let thirdVisitor =
169+
{ new SyntaxVisitorBase<int>() with
170+
override this.VisitExpr(path, traverseSynExpr, defaultTraverse, synExpr) =
171+
match synExpr with
172+
| SynExpr.Const (constant = SynConst.Int32 v) -> Some v
173+
// We do want to continue to traverse when nodes like `SynExpr.App` are found.
174+
| otherExpr -> defaultTraverse otherExpr }
175+
176+
let thirdResult =
177+
SyntaxTraversal.Traverse(thirdCursorPos, mkTree thirdCodeSample, thirdVisitor) // Some 4
178+
179+
(**
180+
`defaultTraverse` is especially useful when you do not know upfront what syntax tree you will be walking.
181+
This is a common case when dealing with IDE tooling. You won't know what actual code the end-user is currently processing.
182+
183+
**Note: SyntaxVisitorBase is designed to find a single value inside a tree!**
184+
This is not an ideal solution when you are interested in all nodes of a certain shape.
185+
It will always verify if the given cursor position is still matching the range of the node.
186+
As a fallback the first branch will be explored when you pass `Position.pos0`.
187+
By design, it is meant to find a single result.
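To illustrate the difference between finding a single value and collecting all nodes of a certain shape, here is a minimal sketch on a toy expression tree. The `Expr` type and both functions are hypothetical and unrelated to the real syntax tree types. For a real `ParsedInput`, the "collect all" variant would be a set of recursive functions like those in the Expressions tutorial.

```fsharp
// Toy expression tree; a hypothetical stand-in for the real syntax tree.
type Expr =
    | Const of int
    | Add of Expr * Expr

/// Find-first: stops at the first constant it encounters,
/// comparable to a visitor that returns a single result.
let rec tryFindConst expr =
    match expr with
    | Const v -> Some v
    | Add(l, r) -> tryFindConst l |> Option.orElseWith (fun () -> tryFindConst r)

/// Collect-all: visits every node and gathers all constants.
let rec collectConsts expr =
    match expr with
    | Const v -> [ v ]
    | Add(l, r) -> collectConsts l @ collectConsts r

// ((1 + 2) + 3) + 4
let sum = Add(Add(Add(Const 1, Const 2), Const 3), Const 4)

let first = tryFindConst sum
let all = collectConsts sum
```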
188+
189+
*)

docs/fcs/tokenizer.fsx

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ file name of the source code. The defined symbols are required because the
3636
tokenizer handles `#if` directives. The file name is required only to specify
3737
locations of the source code (and it does not have to exist):
3838
*)
39-
let sourceTok = FSharpSourceTokenizer([], Some "C:\\test.fsx")
39+
let sourceTok = FSharpSourceTokenizer([], Some "C:\\test.fsx", Some "PREVIEW", None)
4040
(**
4141
Using the `sourceTok` object, we can now (repeatedly) tokenize lines of
4242
F# source code.

docs/fcs/untypedtree.fsx

Lines changed: 7 additions & 11 deletions
@@ -111,11 +111,11 @@ used more often):
111111
/// Walk over a pattern - this is for example used in
112112
/// let <pat> = <expr> or in the 'match' expression
113113
let rec visitPattern = function
114-
| SynPat.Wild(_) ->
114+
| SynPat.Wild _ ->
115115
printfn " .. underscore pattern"
116-
| SynPat.Named(name, _, _, _) ->
116+
| SynPat.Named(ident = SynIdent(ident = name)) ->
117117
printfn " .. named as '%s'" name.idText
118-
| SynPat.LongIdent(LongIdentWithDots(ident, _), _, _, _, _, _, _) ->
118+
| SynPat.LongIdent(longDotId = SynLongIdent(id = ident)) ->
119119
let names = String.concat "." [ for i in ident -> i.idText ]
120120
printfn " .. identifier: %s" names
121121
| pat -> printfn " .. other pattern: %A" pat
@@ -145,9 +145,7 @@ let rec visitExpression e =
145145
// for 'let .. = .. and .. = .. in ...'
146146
printfn "LetOrUse with the following bindings:"
147147
for binding in bindings do
148-
let (SynBinding(
149-
access, kind, isInline, isMutable, attrs, xmlDoc, data,
150-
headPat, retInfo, init, equalsRange, debugPoint, trivia)) = binding
148+
let (SynBinding(headPat = headPat; expr = init)) = binding
151149
visitPattern headPat
152150
visitExpression init
153151
// Visit the body expression
@@ -177,9 +175,7 @@ let visitDeclarations decls =
177175
// Let binding as a declaration is similar to let binding
178176
// as an expression (in visitExpression), but has no body
179177
for binding in bindings do
180-
let (SynBinding(
181-
access, kind, isInline, isMutable, attrs, xmlDoc,
182-
valData, pat, retInfo, body, equalsRange, debugPoint, trivia)) = binding
178+
let (SynBinding(headPat = pat; expr = body)) = binding
183179
visitPattern pat
184180
visitExpression body
185181
| _ -> printfn " - not supported declaration: %A" declaration
@@ -194,7 +190,7 @@ with multiple `namespace Foo` declarations:
194190
/// does not explicitly define it..
195191
let visitModulesAndNamespaces modulesOrNss =
196192
for moduleOrNs in modulesOrNss do
197-
let (SynModuleOrNamespace(lid, isRec, isMod, decls, xml, attrs, accessibility, range)) = moduleOrNs
193+
let (SynModuleOrNamespace(longId = lid; decls = decls)) = moduleOrNs
198194
printfn "Namespace or module: %A" lid
199195
visitDeclarations decls
200196
(**
@@ -238,7 +234,7 @@ in the previous step:
238234
match tree with
239235
| ParsedInput.ImplFile(implFile) ->
240236
// Extract declarations and walk over them
241-
let (ParsedImplFileInput(fn, script, name, _, _, modules, _, _)) = implFile
237+
let (ParsedImplFileInput(contents = modules)) = implFile
242238
visitModulesAndNamespaces modules
243239
| _ -> failwith "F# Interface file (*.fsi) not supported."
244240
(**

eng/Version.Details.xml

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
11
<?xml version="1.0" encoding="utf-8"?>
22
<Dependencies>
33
<ProductDependencies>
4-
<Dependency Name="Microsoft.SourceBuild.Intermediate.source-build-reference-packages" Version="8.0.0-alpha.1.23451.1">
4+
<Dependency Name="Microsoft.SourceBuild.Intermediate.source-build-reference-packages" Version="8.0.0-alpha.1.23455.3">
55
<Uri>https://github.com/dotnet/source-build-reference-packages</Uri>
6-
<Sha>0030d238c7929b0e9b06576837b60ad90037b1d2</Sha>
6+
<Sha>75ec14a961f43446d952c64b5b3330df750db54f</Sha>
77
<SourceBuild RepoName="source-build-reference-packages" ManagedOnly="true" />
88
</Dependency>
99
<Dependency Name="Microsoft.SourceBuild.Intermediate.msbuild" Version="17.7.0-preview-23217-02">
