Commit df283fd

kallentu authored and Commit Queue committed
[linter] Update documentation and regex for unintended_html_in_doc_comment lint.

Fix the documentation and make it more clear for users. Added the allowlist of tags that won't be linted. The regex for what is considered an unintended HTML tag has been updated.

Fixes https://github.com/dart-lang/linter/issues/5055
Bug: https://github.com/dart-lang/linter/issues/5050
Change-Id: I1963eb6878dd15d4a408be4ba6c2a4ba5f1d2e49
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/379519
Reviewed-by: Phil Quitslund <[email protected]>
Commit-Queue: Kallen Tu <[email protected]>
Reviewed-by: Lasse Nielsen <[email protected]>

1 parent 531a4c8 commit df283fd

2 files changed: +137 -41 lines changed

pkg/linter/lib/src/rules/unintended_html_in_doc_comment.dart

Lines changed: 85 additions & 32 deletions
@@ -13,15 +13,51 @@ const _desc = r'Use of angle brackets in a doc comment is treated as HTML by '
     'Markdown.';
 
 const _details = r'''
-**DO** reference only in-scope identifiers in doc comments.
+**DON'T** use angle-bracketed text, `<…>`, in a doc comment unless you want to
+write an HTML tag or link.
 
-When a developer writes a reference with angle brackets within a doc comment,
-the angle brackets are interpreted as HTML. The text within pairs of opening and
-closing angle brackets generally get swallowed by the browser, and will not be
-displayed.
+Markdown allows HTML tags as part of the Markdown code, so you can write, for
+example, `T<sub>1</sub>`. Markdown does not restrict the allowed tags, it just
+includes the tags verbatim in the output.
 
-You can use a code block or code span to wrap the text containing angle
-brackets. You can also replace `<` with `&lt;` and `>` with `&gt;`.
+Dartdoc only allows some known and valid HTML tags, and will omit any disallowed
+HTML tag from the output. See the list of allowed tags and directives below.
+Your doc comment should not contain any HTML tags that are not on this list.
+
+Markdown also allows you to write an "auto-link" to an URL as for example
+`<https://example.com/page.html>`, delimited only by `<...>`. Such a link is
+allowed by Dartdoc as well.
+A `<...>` delimited text is an auto-link if it is a valid absolute URL, starting
+with a scheme of at least two characters followed by a colon, like
+`<mailto:[email protected]>`.
+
+Any other other occurrence of `<word...>` or `</word...>` is likely a mistake
+and this lint will warn about it.
+If something looks like an HTML tag, meaning it starts with `<` or `</`
+and then a letter, and it has a later matching `>`, then it's considered an
+invalid HTML tag unless it is an auto-link, or it starts with an *allowed*
+HTML tag.
+
+Such a mistake can, for example, happen if writing Dart code with type arguments
+outside of a code span, for example `The type List<int> is ...`, where `<int>`
+looks like an HTML tag. Missing the end quote of a code span can have the same
+effect: ``The type `List<int> is ...`` will also treat `<int>` as an HTML tag.
+
+Allowed HTML directives are: HTML comments, `<!-- text -->`, processing
+instructions, `<?...?>`, CDATA-sections, `<[CDATA...]>`, and the allowed HTML
+tags are:
+`a`, `abbr`, `address`, `area`, `article`, `aside`, `audio`, `b`,
+`bdi`, `bdo`, `blockquote`, `br`, `button`, `canvas`, `caption`,
+`cite`, `code`, `col`, `colgroup`, `data`, `datalist`, `dd`, `del`,
+`dfn`, `div`, `dl`, `dt`, `em`, `fieldset`, `figcaption`, `figure`,
+`footer`, `form`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `header`, `hr`,
+`i`, `iframe`, `img`, `input`, `ins`, `kbd`, `keygen`, `label`,
+`legend`, `li`, `link`, `main`, `map`, `mark`, `meta`, `meter`, `nav`,
+`noscript`, `object`, `ol`, `optgroup`, `option`, `output`, `p`,
+`param`, `pre`, `progress`, `q`, `s`, `samp`, `script`, `section`,
+`select`, `small`, `source`, `span`, `strong`, `style`, `sub`, `sup`,
+`table`, `tbody`, `td`, `template`, `textarea`, `tfoot`, `th`, `thead`,
+`time`, `title`, `tr`, `track`, `u`, `ul`, `var`, `video` and `wbr`.
 
 **BAD:**
 ```dart
@@ -172,21 +208,6 @@ class _UnintendedTag {
 }
 
 class _Visitor extends SimpleAstVisitor<void> {
-  // Matches autolinks: starting angle bracket, starting alphabetic character,
-  // any alphabetic character or `-`, `+`, `.`, a semi-colon with optionally two
-  // `/`s then anything but whitespace until a closing angle bracket.
-  static final _autoLinkPattern =
-      RegExp(r'<(([a-zA-Z][a-zA-Z\-\+\.]+):(?://)?[^\s>]*)>');
-
-  // Matches codespans: starting backtick with anything but a backtick until a
-  // closing backtick.
-  static final _codeSpanPattern = RegExp(r'`([^`]+)`');
-
-  // Matches unintential tags: starting `>`, optionally an opening `/` then one
-  // or more valid tag characters then anything but a `>` until a closing `>`.
-  static final _nonHtmlPattern =
-      RegExp("<(?!/?(${_validHtmlTags.join("|")})[>])[^>]*[>]");
-
   final LintRule rule;
 
   _Visitor(this.rule);
@@ -215,18 +236,50 @@ class _Visitor extends SimpleAstVisitor<void> {
   /// Finds tags that are not valid HTML tags, not contained in a code span, and
   /// are not autolinks.
   List<_UnintendedTag> _findUnintendedHtmlTags(String text) {
-    var codeSpanOrAutoLink = [
-      ..._codeSpanPattern.allMatches(text),
-      ..._autoLinkPattern.allMatches(text)
-    ];
-    var unintendedHtmlTags = _nonHtmlPattern.allMatches(text);
+    var markdownTokenPattern = RegExp(
+        // Escaped Markdown character.
+        r'\\.'
+
+        // Or code span, from "`"*N to "`"*N or just the start if it's
+        // unterminated, to avoid "```a``" matching the "``a``".
+        // The ```-sequence is atomic.
+        r'|(?<cq>`+)(?:[^]*?\k<cq>)?'
+
+        // Or autolink, start with scheme + `:`.
+        r'|<[a-z][a-z\d\-+.]+:[^\x00-\x20\x7f<>]*>'
+
+        // Or HTML comments.
+        r'|<!--(?:-?>|[^]*?-->)'
+
+        // Or HTML declarations.
+        r'|<![a-z][^]*?!>'
+
+        // Or HTML processing instructions.
+        r'|<\?[^]*?\?>'
+
+        // Or HTML CDATA sections sections.
+        r'|<\[CDATA[^]*\]>'
+
+        // Or valid HTML tag.
+        // Matches `<validTag>`, `<validTag ...>`, `<validTag/>`, `</validTag>`
+        // and `</validTag ...>.
+        r'|<(?<et>/?)(?:'
+        '${_validHtmlTags.join('|')}'
+        r')'
+        r'(?:/(?=\k<et>)>|>|[\x20\r\n\t][^]*?>)'
+
+        // Or any of the following matches which are considered invalid tags.
+        // If the "nh" capture group is participating, one of these matched.
+        r'|(?<nh>)(?:'
+
+        // Any other `</?tag ...>` sequence.
+        r'</?[a-z][^]*?>'
+        r')', caseSensitive: false);
 
     var matches = <_UnintendedTag>[];
-    for (var htmlTag in unintendedHtmlTags) {
-      // If the tag is in a code span or is an autolink, we won't report it.
-      if (!codeSpanOrAutoLink.any((match) =>
-          match.start <= htmlTag.start && htmlTag.end <= match.end)) {
-        matches.add(_UnintendedTag(htmlTag.start, htmlTag.end - htmlTag.start));
+    for (var match in markdownTokenPattern.allMatches(text)) {
+      if (match.namedGroup('nh') != null) {
+        matches.add(_UnintendedTag(match.start, match.end - match.start));
       }
     }
     return matches;
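
The hunk above replaces three separate patterns with one alternation that consumes the text token by token, so a `<...>` inside a code span or autolink is swallowed before the "invalid tag" branch can see it. A minimal standalone sketch of that idea follows. It is not the rule's exact regex: `allowedTags`, `tokenPattern`, and `findUnintendedTags` are hypothetical names, the allowlist is cut down to five tags, and the comment, declaration, processing-instruction, and CDATA branches are omitted.

```dart
// Reduced allowlist for illustration only (assumption); the rule's own list
// is much longer.
const allowedTags = ['a', 'b', 'code', 'sub', 'sup'];

/// Simplified token pattern (assumption: a cut-down variant of the pattern in
/// the diff, without the comment, declaration, PI, and CDATA branches).
final tokenPattern = RegExp(
    // Escaped Markdown character.
    r'\\.'
    // Code span: N backticks up to the next run of N backticks, or only the
    // opening run when the span is unterminated.
    r'|(?<cq>`+)(?:[^]*?\k<cq>)?'
    // Autolink: scheme of at least two characters, a colon, then no
    // whitespace, control characters, or angle brackets.
    r'|<[a-z][a-z\d\-+.]+:[^\x00-\x20\x7f<>]*>'
    // Allowed tag: opening or closing, optionally self-closed or followed by
    // attributes.
    '|</?(?:${allowedTags.join('|')})'
    r'(?:/?>|[\x20\r\n\t][^]*?>)'
    // Anything else that looks like a tag. The empty "nh" group participates
    // only when this branch matched.
    r'|(?<nh>)</?[a-z][^]*?>',
    caseSensitive: false);

/// Returns the substrings that this sketch would report as unintended HTML.
List<String> findUnintendedTags(String text) => [
      for (final match in tokenPattern.allMatches(text))
        if (match.namedGroup('nh') != null) match[0]!,
    ];

void main() {
  print(findUnintendedTags('The type List<int> is ...')); // [<int>]
  print(findUnintendedTags('The type `List<int>` is ...')); // []
  print(findUnintendedTags('See <https://example.com/page.html>.')); // []
  print(findUnintendedTags('T<sub>1</sub> and n < 0 || n > 512')); // []
}
```

Because every branch is tried at the same position in order, the earlier branches act as "skip" tokens and only the last branch produces a report. This is also why `n < 0 || n > 512` is no longer flagged: the final branch requires a letter directly after `<` or `</`.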

pkg/linter/test/rules/unintended_html_in_doc_comment_test.dart

Lines changed: 52 additions & 9 deletions
@@ -52,6 +52,23 @@ class C {}
 ''');
   }
 
+  test_codeSpan_backSlashEscaped() async {
+    await assertDiagnostics(r'''
+/// \\\`List<int> <tag>`
+class C {}
+''', [
+      lint(12, 5), // <int>
+      lint(18, 5), // <tag>
+    ]);
+  }
+
+  test_codeSpan_multiple() async {
+    await assertNoDiagnostics(r'''
+/// `<` or `>`
+class C {}
+''');
+  }
+
   test_hangingAngleBracket_left() async {
     await assertNoDiagnostics(r'''
 /// n < 12
@@ -66,13 +83,48 @@ class C {}
 ''');
   }
 
+  test_html_cData() async {
+    await assertNoDiagnostics(r'''
+/// <[CDATA[aaa]]>
+class C {}
+''');
+  }
+
+  test_html_comment() async {
+    await assertNoDiagnostics(r'''
+/// <!--comment-->
+class C {}
+''');
+  }
+
+  test_html_declaration() async {
+    await assertNoDiagnostics(r'''
+/// <!DOCTYPE html>
+class C {}
+''');
+  }
+
+  test_html_processingInstruction() async {
+    await assertNoDiagnostics(r'''
+/// <?aaa?>
+class C {}
+''');
+  }
+
   test_notDocComment() async {
     await assertNoDiagnostics(r'''
 // List<int> <tag>
 class C {}
 ''');
   }
 
+  test_notHtml_space() async {
+    await assertNoDiagnostics(r'''
+/// n < 0 || n > 512
+class C {}
+''');
+  }
+
   test_unintendedHtml() async {
     await assertDiagnostics(r'''
 /// Text List<int>.
@@ -153,15 +205,6 @@ class C {}
     ]);
   }
 
-  test_unintendedHtml_notIdentifier() async {
-    await assertDiagnostics(r'''
-/// n < 0 || n > 512
-class C {}
-''', [
-      lint(6, 10), // < 0 || n >
-    ]);
-  }
-
   test_unintendedHtml_reference() async {
     await assertDiagnostics(r'''
 /// Text [List<int>].
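
The expected values in `test_codeSpan_backSlashEscaped` can be cross-checked by hand. In that input, `\\` is an escaped backslash and the following `` \` `` is an escaped backtick, so no code span opens and both `<int>` and `<tag>` are reported. The sketch below recomputes the offsets, assuming that `lint(offset, length)` takes a character offset into the analyzed source (the inline comments in the test suggest this, but the helper's definition is not part of this diff). `tagOffsets` is a hypothetical helper, not part of the test suite.

```dart
/// Returns the offset of the first occurrence of each tag in [source].
/// Hypothetical helper for illustration only.
List<int> tagOffsets(String source, List<String> tags) =>
    [for (final tag in tags) source.indexOf(tag)];

void main() {
  // Same source text as the test; the newline directly after the opening
  // quotes of a multi-line string is not part of the string.
  const source = r'''
/// \\\`List<int> <tag>`
class C {}
''';

  print(tagOffsets(source, ['<int>', '<tag>'])); // [12, 18]
  print('<int>'.length); // 5
}
```

The four characters `/// ` plus the four characters `` \\\` `` plus `List` put `<int>` at offset 12, and `<tag>` follows one space after it at offset 18.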
