Skip to content

Commit 1803ab1

Browse files
author
Bryan C. Mills
committed
cmd/go: validate pseudo-versions against module paths and revision metadata
Previously, most operations involving pseudo-versions allowed any arbitrary combination of version string and date, and would resolve to the underlying revision (typically a Git commit hash) as long as that revision existed. There are a number of problems with that approach: • The pseudo-version participates in minimal version selection. If its version prefix is inaccurate, the pseudo-version may appear to have higher precedence that the releases that follow it, effectively “pinning” the module to that commit. For release tags, module authors are the ones who make the decision about release tagging; they should also have control over the pseudo-version precedence within their module. • The commit date within the pseudo-version provides a total order among pseudo-versions. If it is not accurate, the pseudo-version will sort into the wrong place relative to other commits with the same version prefix. To address those problems, this change restricts the pseudo-versions that the 'go' command accepts, rendering some previously accepted-but-not-canonical versions invalid. A pseudo-version is now valid only if all of: 1. The tag from which the pseudo-version derives points to the named revision or one of its ancestors as reported by the underlying VCS tool, or the pseudo-version is not derived from any tag (that is, has a "vX.0.0-" prefix before the date string and uses the lowest major version appropriate to the module path). 2. The date string within the pseudo-version matches the UTC timestamp of the revision as reported by the underlying VCS tool. 3. The short name of the revision within the pseudo-version (such as a Git hash prefix) is the same as the short name reported by the underlying cmd/go/internal/modfetch/codehost.Repo. Specifically, if the short name is a SHA-1 prefix, it must use the same number of hex digits (12) as codehost.ShortenSHA1. 4. The pseudo-version includes a '+incompatible' suffix only if it is needed for the corresponding major version, and only if the underlying module does not have a go.mod file. We believe that all releases of the 'go' tool have generated pseudo-versions that meet these constraints. However, a few pseudo-versions edited by hand or generated by third-party tools do not. If we discover invalid-but-benign pseudo-versions in widely-used existing dependencies, we may choose to add a whitelist for those specific path/version combinations. ― To work around invalid dependencies in leaf modules, users may add a 'replace' directive from the invalid version to its valid equivalent. Note that the go command's go.mod parser automatically resolves commit hashes found in 'replace' directives to the appropriate pseudo-versions, so in most cases one can write something like: replace github.com/docker/docker v1.14.0-0.20190319215453-e7b5f7dbe98c => github.com/docker/docker e7b5f7dbe98c and then run any 'go' command (such as 'go list' or 'go mod tidy') to resolve it to an appropriate pseudo-version. Note that the invalid version will still be used in minimal version selection, so this use of 'replace' directives is an incomplete workaround. ― One of the common use cases for higher-than-tagged pseudo-versions is for projects that do parallel development on release branches. For example, if a project cuts a 'v1.2' release branch at v1.2.0, they may want future commits on the main branch to show up as pre-releases for v1.3.0 rather than for v1.2.1 — especially if v1.2.1 is already tagged on the release branch. (On the other hand, a backport of a patch to the v1.2 branch should not show up as a pre-release for v1.3.0.) To address this use-case, module authors can make use of our existing support for pseudo-versions derived from pre-release tags: if the author adds an explicit pre-release tag (such as 'v1.3.0-devel') to the first commit after the branch, then the pseudo-versions for that commit and its descendents will be derived from that tag and will sort appropriately in version selection. ― Updates #27171 Fixes #29262 Fixes #27173 Fixes #32662 Fixes #32695 Change-Id: I0d50a538b6fdb0d3080aca9c9c3df1040da1b329 Reviewed-on: https://go-review.googlesource.com/c/go/+/181881 Run-TryBot: Bryan C. Mills <[email protected]> Reviewed-by: Jay Conrod <[email protected]>
1 parent 851616d commit 1803ab1

20 files changed

+1170
-201
lines changed

doc/go1.13.html

+58
Original file line numberDiff line numberDiff line change
@@ -161,6 +161,64 @@ <h2 id="tools">Tools</h2>
161161
TODO
162162
</p>
163163

164+
<h3 id="modules">Modules</h3>
165+
166+
<h4 id="version-validation">Version validation</h4><!-- CL 181881 -->
167+
168+
<p>
169+
When extracting a module from a version control system, the <code>go</code>
170+
command now performs additional validation on the requested version string.
171+
</p>
172+
173+
<p>
174+
The <code>+incompatible</code> version annotation bypasses the requirement
175+
of <a href="/cmd/go/#hdr-Module_compatibility_and_semantic_versioning">semantic
176+
import versioning</a> for repositories that predate the introduction of
177+
modules. The <code>go</code> command now verifies that such a version does not
178+
include an explicit <code>go.mod</code> file.
179+
</p>
180+
181+
<p>
182+
The <code>go</code> command now verifies the mapping
183+
between <a href="/cmd/go#hdr-Pseudo_versions">pseudo-versions</a> and
184+
version-control metadata. Specifically:
185+
<ul>
186+
<li>The version prefix must be derived from a tag on the named revision or
187+
one of its ancestors, or be of the form <code>vX.0.0</code>.</li>
188+
189+
<li>The date string must match the UTC timestamp of the revision.</li>
190+
191+
<li>The short name of the revision must use the same number of characters as
192+
what the <code>go</code> command would generate. (For SHA-1 hashes as used
193+
by <code>git</code>, a 12-digit prefix.)</li>
194+
</ul>
195+
</p>
196+
197+
<p>
198+
If the main module directly requires a version that fails the above
199+
validation, a corrected version can be obtained by redacting the version to
200+
just the commit hash and re-running a <code>go</code> command such as <code>go
201+
list -m all</code> or <code>go mod tidy</code>. For example,
202+
<pre>require github.com/docker/docker v1.14.0-0.20190319215453-e7b5f7dbe98c</pre>
203+
can be redacted to
204+
<pre>require github.com/docker/docker e7b5f7dbe98c</pre>
205+
which resolves to
206+
<pre>require github.com/docker/docker v0.7.3-0.20190319215453-e7b5f7dbe98c</pre>
207+
</p>
208+
209+
<p>
210+
If the main module has a transitive requirement on a version that fails
211+
validation, the invalid version can still be replaced with a valid one through
212+
the use of a <a href="/cmd/go/#hdr-The_go_mod_file"><code>replace</code>
213+
directive</a> in the <code>go.mod</code> file of
214+
the <a href="/cmd/go/#hdr-The_main_module_and_the_build_list">main module</a>.
215+
If the replacement is a commit hash, it will be resolved to the appropriate
216+
pseudo-version. For example,
217+
<pre>replace github.com/docker/docker v1.14.0-0.20190319215453-e7b5f7dbe98c => github.com/docker/docker e7b5f7dbe98c</pre>
218+
resolves to
219+
<pre>replace github.com/docker/docker v1.14.0-0.20190319215453-e7b5f7dbe98c => github.com/docker/docker v0.7.3-0.20190319215453-e7b5f7dbe98c</pre>
220+
</p>
221+
164222
<h3 id="compiler">Compiler toolchain</h3>
165223

166224
<p><!-- CL 170448 -->

src/cmd/go/internal/modconv/convert_test.go

+1-1
Original file line numberDiff line numberDiff line change
@@ -128,7 +128,7 @@ func TestConvertLegacyConfig(t *testing.T) {
128128

129129
{
130130
// golang.org/issue/24585 - confusion about v2.0.0 tag in legacy non-v2 module
131-
"github.com/fishy/gcsbucket", "v0.0.0-20150410205453-618d60fe84e0",
131+
"github.com/fishy/gcsbucket", "v0.0.0-20180217031846-618d60fe84e0",
132132
`module github.com/fishy/gcsbucket
133133
134134
require (

src/cmd/go/internal/modfetch/cache.go

+5-13
Original file line numberDiff line numberDiff line change
@@ -216,29 +216,21 @@ func (r *cachingRepo) Latest() (*RevInfo, error) {
216216
return &info, nil
217217
}
218218

219-
func (r *cachingRepo) GoMod(rev string) ([]byte, error) {
219+
func (r *cachingRepo) GoMod(version string) ([]byte, error) {
220220
type cached struct {
221221
text []byte
222222
err error
223223
}
224-
c := r.cache.Do("gomod:"+rev, func() interface{} {
225-
file, text, err := readDiskGoMod(r.path, rev)
224+
c := r.cache.Do("gomod:"+version, func() interface{} {
225+
file, text, err := readDiskGoMod(r.path, version)
226226
if err == nil {
227227
// Note: readDiskGoMod already called checkGoMod.
228228
return cached{text, nil}
229229
}
230230

231-
// Convert rev to canonical version
232-
// so that we use the right identifier in the go.sum check.
233-
info, err := r.Stat(rev)
234-
if err != nil {
235-
return cached{nil, err}
236-
}
237-
rev = info.Version
238-
239-
text, err = r.r.GoMod(rev)
231+
text, err = r.r.GoMod(version)
240232
if err == nil {
241-
checkGoMod(r.path, rev, text)
233+
checkGoMod(r.path, version, text)
242234
if err := writeDiskGoMod(file, text); err != nil {
243235
fmt.Fprintf(os.Stderr, "go: writing go.mod cache: %v\n", err)
244236
}

src/cmd/go/internal/modfetch/codehost/codehost.go

+24-8
Original file line numberDiff line numberDiff line change
@@ -79,14 +79,16 @@ type Repo interface {
7979
// nested in a single top-level directory, whose name is not specified.
8080
ReadZip(rev, subdir string, maxSize int64) (zip io.ReadCloser, actualSubdir string, err error)
8181

82-
// RecentTag returns the most recent tag at or before the given rev
83-
// with the given prefix. It should make a best-effort attempt to
84-
// find a tag that is a valid semantic version (following the prefix),
85-
// or else the result is not useful to the caller, but it need not
86-
// incur great expense in doing so. For example, the git implementation
87-
// of RecentTag limits git's search to tags matching the glob expression
88-
// "v[0-9]*.[0-9]*.[0-9]*" (after the prefix).
89-
RecentTag(rev, prefix string) (tag string, err error)
82+
// RecentTag returns the most recent tag on rev or one of its predecessors
83+
// with the given prefix and major version.
84+
// An empty major string matches any major version.
85+
RecentTag(rev, prefix, major string) (tag string, err error)
86+
87+
// DescendsFrom reports whether rev or any of its ancestors has the given tag.
88+
//
89+
// DescendsFrom must return true for any tag returned by RecentTag for the
90+
// same revision.
91+
DescendsFrom(rev, tag string) (bool, error)
9092
}
9193

9294
// A Rev describes a single revision in a source code repository.
@@ -105,6 +107,20 @@ type FileRev struct {
105107
Err error // error if any; os.IsNotExist(Err)==true if rev exists but file does not exist in that rev
106108
}
107109

110+
// UnknownRevisionError is an error equivalent to os.ErrNotExist, but for a
111+
// revision rather than a file.
112+
type UnknownRevisionError struct {
113+
Rev string
114+
}
115+
116+
func (e *UnknownRevisionError) Error() string {
117+
return "unknown revision " + e.Rev
118+
}
119+
120+
func (e *UnknownRevisionError) Is(err error) bool {
121+
return err == os.ErrNotExist
122+
}
123+
108124
// AllHex reports whether the revision rev is entirely lower-case hexadecimal digits.
109125
func AllHex(rev string) bool {
110126
for i := 0; i < len(rev); i++ {

src/cmd/go/internal/modfetch/codehost/git.go

+88-17
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ import (
1010
"io"
1111
"io/ioutil"
1212
"os"
13+
"os/exec"
1314
"path/filepath"
1415
"sort"
1516
"strconv"
@@ -318,7 +319,7 @@ func (r *gitRepo) stat(rev string) (*RevInfo, error) {
318319
hash = rev
319320
}
320321
} else {
321-
return nil, fmt.Errorf("unknown revision %s", rev)
322+
return nil, &UnknownRevisionError{Rev: rev}
322323
}
323324

324325
// Protect r.fetchLevel and the "fetch more and more" sequence.
@@ -378,17 +379,30 @@ func (r *gitRepo) stat(rev string) (*RevInfo, error) {
378379

379380
// Last resort.
380381
// Fetch all heads and tags and hope the hash we want is in the history.
382+
if err := r.fetchRefsLocked(); err != nil {
383+
return nil, err
384+
}
385+
386+
return r.statLocal(rev, rev)
387+
}
388+
389+
// fetchRefsLocked fetches all heads and tags from the origin, along with the
390+
// ancestors of those commits.
391+
//
392+
// We only fetch heads and tags, not arbitrary other commits: we don't want to
393+
// pull in off-branch commits (such as rejected GitHub pull requests) that the
394+
// server may be willing to provide. (See the comments within the stat method
395+
// for more detail.)
396+
//
397+
// fetchRefsLocked requires that r.mu remain locked for the duration of the call.
398+
func (r *gitRepo) fetchRefsLocked() error {
381399
if r.fetchLevel < fetchAll {
382-
// TODO(bcmills): should we wait to upgrade fetchLevel until after we check
383-
// err? If there is a temporary server error, we want subsequent fetches to
384-
// try again instead of proceeding with an incomplete repo.
385-
r.fetchLevel = fetchAll
386400
if err := r.fetchUnshallow("refs/heads/*:refs/heads/*", "refs/tags/*:refs/tags/*"); err != nil {
387-
return nil, err
401+
return err
388402
}
403+
r.fetchLevel = fetchAll
389404
}
390-
391-
return r.statLocal(rev, rev)
405+
return nil
392406
}
393407

394408
func (r *gitRepo) fetchUnshallow(refSpecs ...string) error {
@@ -411,7 +425,7 @@ func (r *gitRepo) fetchUnshallow(refSpecs ...string) error {
411425
func (r *gitRepo) statLocal(version, rev string) (*RevInfo, error) {
412426
out, err := Run(r.dir, "git", "-c", "log.showsignature=false", "log", "-n1", "--format=format:%H %ct %D", rev, "--")
413427
if err != nil {
414-
return nil, fmt.Errorf("unknown revision %s", rev)
428+
return nil, &UnknownRevisionError{Rev: rev}
415429
}
416430
f := strings.Fields(string(out))
417431
if len(f) < 2 {
@@ -648,7 +662,7 @@ func (r *gitRepo) readFileRevs(tags []string, file string, fileMap map[string]*F
648662
return missing, nil
649663
}
650664

651-
func (r *gitRepo) RecentTag(rev, prefix string) (tag string, err error) {
665+
func (r *gitRepo) RecentTag(rev, prefix, major string) (tag string, err error) {
652666
info, err := r.Stat(rev)
653667
if err != nil {
654668
return "", err
@@ -681,7 +695,7 @@ func (r *gitRepo) RecentTag(rev, prefix string) (tag string, err error) {
681695

682696
semtag := line[len(prefix):]
683697
// Consider only tags that are valid and complete (not just major.minor prefixes).
684-
if c := semver.Canonical(semtag); c != "" && strings.HasPrefix(semtag, c) {
698+
if c := semver.Canonical(semtag); c != "" && strings.HasPrefix(semtag, c) && (major == "" || semver.Major(c) == major) {
685699
highest = semver.Max(highest, semtag)
686700
}
687701
}
@@ -716,12 +730,8 @@ func (r *gitRepo) RecentTag(rev, prefix string) (tag string, err error) {
716730
}
717731
defer unlock()
718732

719-
if r.fetchLevel < fetchAll {
720-
// Fetch all heads and tags and see if that gives us enough history.
721-
if err := r.fetchUnshallow("refs/heads/*:refs/heads/*", "refs/tags/*:refs/tags/*"); err != nil {
722-
return "", err
723-
}
724-
r.fetchLevel = fetchAll
733+
if err := r.fetchRefsLocked(); err != nil {
734+
return "", err
725735
}
726736

727737
// If we've reached this point, we have all of the commits that are reachable
@@ -738,6 +748,67 @@ func (r *gitRepo) RecentTag(rev, prefix string) (tag string, err error) {
738748
return tag, err
739749
}
740750

751+
func (r *gitRepo) DescendsFrom(rev, tag string) (bool, error) {
752+
// The "--is-ancestor" flag was added to "git merge-base" in version 1.8.0, so
753+
// this won't work with Git 1.7.1. According to golang.org/issue/28550, cmd/go
754+
// already doesn't work with Git 1.7.1, so at least it's not a regression.
755+
//
756+
// git merge-base --is-ancestor exits with status 0 if rev is an ancestor, or
757+
// 1 if not.
758+
_, err := Run(r.dir, "git", "merge-base", "--is-ancestor", "--", tag, rev)
759+
760+
// Git reports "is an ancestor" with exit code 0 and "not an ancestor" with
761+
// exit code 1.
762+
// Unfortunately, if we've already fetched rev with a shallow history, git
763+
// merge-base has been observed to report a false-negative, so don't stop yet
764+
// even if the exit code is 1!
765+
if err == nil {
766+
return true, nil
767+
}
768+
769+
// See whether the tag and rev even exist.
770+
tags, err := r.Tags(tag)
771+
if err != nil {
772+
return false, err
773+
}
774+
if len(tags) == 0 {
775+
return false, nil
776+
}
777+
778+
// NOTE: r.stat is very careful not to fetch commits that we shouldn't know
779+
// about, like rejected GitHub pull requests, so don't try to short-circuit
780+
// that here.
781+
if _, err = r.stat(rev); err != nil {
782+
return false, err
783+
}
784+
785+
// Now fetch history so that git can search for a path.
786+
unlock, err := r.mu.Lock()
787+
if err != nil {
788+
return false, err
789+
}
790+
defer unlock()
791+
792+
if r.fetchLevel < fetchAll {
793+
// Fetch the complete history for all refs and heads. It would be more
794+
// efficient to only fetch the history from rev to tag, but that's much more
795+
// complicated, and any kind of shallow fetch is fairly likely to trigger
796+
// bugs in JGit servers and/or the go command anyway.
797+
if err := r.fetchRefsLocked(); err != nil {
798+
return false, err
799+
}
800+
}
801+
802+
_, err = Run(r.dir, "git", "merge-base", "--is-ancestor", "--", tag, rev)
803+
if err == nil {
804+
return true, nil
805+
}
806+
if ee, ok := err.(*RunError).Err.(*exec.ExitError); ok && ee.ExitCode() == 1 {
807+
return false, nil
808+
}
809+
return false, err
810+
}
811+
741812
func (r *gitRepo) ReadZip(rev, subdir string, maxSize int64) (zip io.ReadCloser, actualSubdir string, err error) {
742813
// TODO: Use maxSize or drop it.
743814
args := []string{}

src/cmd/go/internal/modfetch/codehost/vcs.go

+12-2
Original file line numberDiff line numberDiff line change
@@ -347,7 +347,7 @@ func (r *vcsRepo) fetch() {
347347
func (r *vcsRepo) statLocal(rev string) (*RevInfo, error) {
348348
out, err := Run(r.dir, r.cmd.statLocal(rev, r.remote))
349349
if err != nil {
350-
return nil, vcsErrorf("unknown revision %s", rev)
350+
return nil, &UnknownRevisionError{Rev: rev}
351351
}
352352
return r.cmd.parseStat(rev, string(out))
353353
}
@@ -392,7 +392,7 @@ func (r *vcsRepo) ReadFileRevs(revs []string, file string, maxSize int64) (map[s
392392
return nil, vcsErrorf("ReadFileRevs not implemented")
393393
}
394394

395-
func (r *vcsRepo) RecentTag(rev, prefix string) (tag string, err error) {
395+
func (r *vcsRepo) RecentTag(rev, prefix, major string) (tag string, err error) {
396396
// We don't technically need to lock here since we're returning an error
397397
// uncondititonally, but doing so anyway will help to avoid baking in
398398
// lock-inversion bugs.
@@ -405,6 +405,16 @@ func (r *vcsRepo) RecentTag(rev, prefix string) (tag string, err error) {
405405
return "", vcsErrorf("RecentTag not implemented")
406406
}
407407

408+
func (r *vcsRepo) DescendsFrom(rev, tag string) (bool, error) {
409+
unlock, err := r.mu.Lock()
410+
if err != nil {
411+
return false, err
412+
}
413+
defer unlock()
414+
415+
return false, vcsErrorf("DescendsFrom not implemented")
416+
}
417+
408418
func (r *vcsRepo) ReadZip(rev, subdir string, maxSize int64) (zip io.ReadCloser, actualSubdir string, err error) {
409419
if r.cmd.readZip == nil {
410420
return nil, "", vcsErrorf("ReadZip not implemented for %s", r.cmd.vcs)

0 commit comments

Comments
 (0)