Issue #13560 defined the exact form for a comment marking a file as auto-generated.
As linked at https://golang.org/s/generatedcode and then repeated at
https://golang.org/cmd/go/#hdr-Generate_Go_files_by_processing_source,
the rule is:
Generated files are marked by a line of text that matches the regular expression, in Go syntax:
^// Code generated .* DO NOT EDIT\.$
The .* means the tool can put whatever folderol it wants in there, but the comment must be a single line and must start with Code generated and end with DO NOT EDIT., with a period. The text may appear anywhere in the file.
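As a rough illustration of the rule as it stands today, the check amounts to testing every line of the file against the pattern. The sketch below assumes "\n" line endings and writes the final period escaped so that it matches only a literal period; the function name isGeneratedAnywhere is hypothetical and not part of any Go tool.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// generatedRx is the pattern from the rule, applied to one line at a time.
var generatedRx = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGeneratedAnywhere reports whether any line of src matches the pattern,
// no matter where in the file it appears. Hypothetical helper for
// illustration; it assumes "\n" line endings.
func isGeneratedAnywhere(src string) bool {
	for _, line := range strings.Split(src, "\n") {
		if generatedRx.MatchString(line) {
			return true
		}
	}
	return false
}

func main() {
	atTop := "// Code generated by \"stringer -type=Pill\"; DO NOT EDIT.\n\npackage painkiller\n"
	noPeriod := "package p\n\n// Code generated by hand, DO NOT EDIT\n"

	fmt.Println(isGeneratedAnywhere(atTop))    // true
	fmt.Println(isGeneratedAnywhere(noPeriod)) // false: the trailing period is missing
}
```

Because the pattern is anchored at both ends, an indented comment or one with trailing text does not match.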
@robpike's original suggestion for placement was to match the rule for // +build:
[Narrator: Good idea; I am going to suggest matching the rule for //go:build below.]
The text must be a // comment, and that comment must appear before but not be attached to the package clause.
After some bikeshedding about other comments it changed to:
The text must appear as the first line of a properly formatted Go // comment, and that comment must appear before but not be attached to the package clause and before any /* */ comment. This is similar to the rules for build tags.
@bradfitz then raised the question of non-Go source files saying:
I think it's a mistake for this proposal to require knowing anything about Go syntax.
It should be confined to a pattern being matched on the first N lines only, ignoring everything about packages or package comments, etc.
@myitcv raised a question about Go files with syntax errors and suggested anywhere in the file is fine, comparing to go generate.
@robpike then revised to the “text may appear anywhere in the file” rule, which is what was finally accepted.
I was writing a generator the other day and @dmitshur helpfully pointed out that I'd accidentally marked the generator itself as generated by writing this code in the generator:
const header = `
// Copyright 2019 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Code generated by "makestatic"; DO NOT EDIT.
package static
`
If the rule we've defined flags this program as auto-generated, because it is itself a generator, that seems to me more like a bug in the rule than a bug in the generator.
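To make the false positive concrete, here is a small sketch that compares the line-based rule with what the Go parser sees in a generator like the one above. The generator source is abbreviated, and the helper names (generatorSource, matchesLine, matchesRealComment) are hypothetical. The line-based check fires, even though the file contains no actual comment matching the pattern: the text lives inside a raw string literal.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"regexp"
	"strings"
)

var generatedRx = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// generatorSource returns an abbreviated generator program whose raw string
// literal holds the header it writes into its output.
func generatorSource() string {
	return strings.Join([]string{
		"package main",
		"",
		"const header = `",
		"// Copyright 2019 The Go Authors. All rights reserved.",
		"// Code generated by \"makestatic\"; DO NOT EDIT.",
		"",
		"package static",
		"`",
		"",
		"func main() {}",
	}, "\n")
}

// matchesLine applies the current rule: any line in the file may match.
func matchesLine(src string) bool {
	for _, line := range strings.Split(src, "\n") {
		if generatedRx.MatchString(line) {
			return true
		}
	}
	return false
}

// matchesRealComment reports whether any actual comment in the parsed file
// matches the pattern.
func matchesRealComment(src string) (bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "gen.go", src, parser.ParseComments)
	if err != nil {
		return false, err
	}
	for _, group := range f.Comments {
		for _, c := range group.List {
			if generatedRx.MatchString(c.Text) {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	src := generatorSource()
	fmt.Println(matchesLine(src)) // true: the generator is flagged as generated
	ok, err := matchesRealComment(src)
	fmt.Println(ok, err) // false <nil>: the text is string data, not a comment
}
```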
I propose we change the rule to match the new //go:build rules (#41184).
Specifically, the comment must appear before the first non-blank, non-comment text.
/* */ comments before the // Code generated ... comment would be allowed.
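One possible reading of the proposed placement rule, sketched without any knowledge of Go syntax beyond comments: scan from the top, skip blank lines, // comments, and /* */ comments, and stop at the first other text. The function name isGeneratedAtTop is hypothetical. The sketch assumes "\n" line endings and assumes that a matching line inside a /* */ comment does not count, since it is not a // comment.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var generatedRx = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGeneratedAtTop reports whether a matching // comment line appears before
// the first non-blank, non-comment text in src. Hypothetical sketch of the
// proposed rule, not the implementation used by any Go tool.
func isGeneratedAtTop(src string) bool {
	inBlock := false // true while inside a /* */ comment that spans lines
	for _, line := range strings.Split(src, "\n") {
		rest := line
		if inBlock {
			end := strings.Index(rest, "*/")
			if end < 0 {
				continue // the whole line is inside the block comment
			}
			inBlock = false
			rest = rest[end+2:]
		} else if generatedRx.MatchString(line) {
			return true
		}
		// Consume whatever follows on this line: blanks and comments are
		// fine, anything else ends the header region.
		for {
			rest = strings.TrimSpace(rest)
			if rest == "" || strings.HasPrefix(rest, "//") {
				break
			}
			if !strings.HasPrefix(rest, "/*") {
				return false // first non-comment text reached
			}
			end := strings.Index(rest[2:], "*/")
			if end < 0 {
				inBlock = true
				break
			}
			rest = rest[2+end+2:]
		}
	}
	return false
}

func main() {
	generated := "// Copyright notice.\n\n/* license */\n// Code generated by x. DO NOT EDIT.\npackage p\n"
	generator := "package main\n\nconst header = `\n// Code generated by \"makestatic\"; DO NOT EDIT.\n`\n"

	fmt.Println(isGeneratedAtTop(generated)) // true
	fmt.Println(isGeneratedAtTop(generator)) // false: the package clause comes first
}
```

Under this reading, the generator from the example above is no longer flagged, because scanning stops at its package clause before the raw string literal is reached.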
The rationale is:
- There's now a sensible placement rule we can share (the //go:build one).
- The //go:build rule applies equally well to Go and non-Go files, addressing @bradfitz's concern.
- The //go:build rule avoids any problem with syntax errors, because the lines must be before any syntax, addressing @myitcv's concern.
- The lines are for people to find, and they are much easier to find if they are at the top (perhaps after a copyright notice and other comments), as opposed to needing to skim through the entire file.
  - (It's of course fine for tools to help find them, but we work hard to keep Go a language that does not require tool support for a good development experience.)
- The comparison with //go:generate does not really apply. The lax placement there is because we want to allow putting the //go:generate line next to what it applies to, such as putting a //go:generate stringer line above the type that is being stringered. In this case, the // Code generated ... comment applies to the entire file. It should be above the file.
- Changing the rule avoids a false positive when a generator itself includes the magic comment in a multiline raw string literal.
It's reasonable to ask: Isn't it too late to change this? What about generators that put the comment later in the file? Their output won't be recognized as generated anymore.
I looked into that using the module mirror-based Go corpus I assembled back in March. Of all the Go source files I have in that corpus, I found:
- 659,188 files that contained a // Code generated comment.
  - 653,750 of them appear before the package statement.
  - 4,034 of them appear after the package statement but before imports/decls.
  - 1,391 of them appear after the imports but before decls.
  - 13 of them appear after decls.
So making this change would require 0.825% of generated files to be updated. And until they are updated, no big deal - people will still see the comment, and only a few tools care. If we fix the top five generators causing these lines (xo, cdproto-gen, minimock, msgp, chromedp-gen), we'd be left with only 1,008 mismatched files, or 0.15%. In any event, when we first adopted the rule we had to update essentially 100% of generated files. Now we're talking about under 1/100 of that, so the impact here is not large.
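For anyone who wants to check the arithmetic, the percentages follow directly from the counts listed above. The helper name percent is just for illustration.

```go
package main

import "fmt"

// percent returns part as a percentage of total.
func percent(part, total int) float64 {
	return 100 * float64(part) / float64(total)
}

func main() {
	const total = 659188 // files containing a // Code generated comment

	// Files whose comment appears after the package statement.
	late := 4034 + 1391 + 13
	fmt.Printf("%d files, %.3f%%\n", late, percent(late, total)) // 5438 files, 0.825%

	// Files still mismatched after fixing the top five generators.
	fmt.Printf("%.2f%%\n", percent(1008, total)) // 0.15%
}
```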
On the other hand, consider generators. I had the same scan look for the magic text inside string literals. It found 2,272 instances of string literals containing the text but not at the start/end of a line; those are correctly skipped by the current rule. It also found 2,350 instances of multiline string literals like the one I'd written; all those generators are incorrectly flagged as themselves auto-generated by the current rule.
That is, just over half of the generators I found are doing it wrong by the current rule.
This strongly suggests the rule should be changed.
I've attached the non-top-of-file results as autogen.txt if anyone wants to look into the details here.