-
Notifications
You must be signed in to change notification settings - Fork 18k
cmd/compile: make "byte offset on line" error position info available #10324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I'll note that both gccgo and the go/parser package track column information. I'm not sure either tracks a complete range, though. |
What is a column? |
filename:line:column: msg where column is counted in bytes (at least for go/parser) The go/ast has full range information for each node. |
The "complement" of a line: a position (usually one-based index) counting from the last newline character. As opposed to a position counting from the start of the file (an absolute offset). BTW, it would be preferable to report Unicode character positions rather than raw byte positions. |
I suspect Rob was making the point that "column" only makes sense if the text layout is a grid (i.e. fixed-width fonts). A few people, including @robpike, are known for using proportional fonts in their editors. In this context "column" is less meaningful. |
Not just that. Some characters, such as those in some Asian languages are wider than Latin characters. The concept of column is parochial. I deeply object to the noise that the proposed style of error reporting introduces but it seems likely I will lose this fight. The best I can do is point out its technical inadequacies. |
Also: The idea of Unicode character position is also poorly defined. The concept of character is poorly defined, and then the concept of column is poorly defined on top of that. |
as cmd/gc doesn't currently track column (or byte, rune offset) For IDEs, it's much better to use go/parser and go/types for this |
"column" is just the term that it is called because its what it looks like on fixed-width fonts. Yes there are no columns in non-fixed-width fonts, but the term and the value are still useful. Just as software "bugs" are not actual insects, and when we save files we no longer save them in "floppy disks", but that is still the icon most graphical applications use. You can call it what you want, as long as the tool gives the value I want... :) I did give a definition of what I meant by "column" in my previous post, but I will clarify it:
Alternatively, instead of a byte index, a Unicode code point index could be provided. This is preferable. |
My usage of "character" is not poorly defined: I was using the same definition that the Go language specification uses:
A source text can be viewed as a sequence of Unicode code points. When I mention code point index (or position), I mean the index in that sequence where said code point occurs. This should clarify it. As for the term column, clarified it in the previous post. |
This also breaks badly for generated code. If //line is being used, the compiler should suppress "column" information. |
@rsc "for punch card IDEs" ?... Is this supposed to be a joke? 😮 Virtually all modern IDEs and code editors support the display of errors overlayed in the source editor, in the form of a line subrange (which means they need a start column, possibly an end column as well). For example, Eclipse, IntelliJ, Visual Studio Code, Atom, Visual Studio , and likely many others that I didn't bother searching for screenshots. I'm fairly sure even the ancient Vim and Emacs understand the concept of columns as well, with regards to error messages (although I didn't check). So, if the implication is that all those IDEs and editors above are archaic, I'd really like to know what wonderful tool y'all are using to edit source code... |
@bruno-medeiros Regarding your question: Virtually nobody in the Go team uses an IDE which explains the lack of enthusiasm for more detailed position information. Don't take this as a value judgement, it's just a fact and a preference. We understand that a lot of people prefer to use some sort of IDE. That said, as I have pointed out before, the go/ast provides full range information. There is some not insignificant cost associated with tracking exact position information in a compiler, and the existing syntax tree in the compiler doesn't do it. There some significant engineering required to make it work correctly and reasonably efficiently. It may be done at some point, but it's not a high priority, hence the milestone of "unplanned". |
To respond to an earlier comment, if we do this it should be byte offset, not Unicode code point offset. There's almost no Go setting where Unicode code point makes sense: byte offsets give you efficient random access into a UTF-8 representation of the source, code point offsets give that up, and neither actually matches characters on the screen (due to complications like combining accents). I do understand that Unicode code point offset makes sense in Java because that's the native string representation and so it would give efficient random access in that setting. But Go should target Go, not Java. |
@griesemer Thanks for the answer. It does make sense that there would be little interest in this functionality if none of the Go team is using an IDE (or graphical editor). As for efficiency though, wouldn't it be okay if the detailed tracking was only enabled with a compiler option? This way, an implementation that wasn't as efficient as ideally possible could be allowed in. @rsc A byte offset instead of code point would be very welcome too. The conversion back to code point could always be done back in the IDE, and it would be relatively straightforward. |
@griesemer even if not using an IDE, many text editors can take Most of the time the line number is enough, but at times it can be confusing. Example: https://play.golang.org/p/eFauWxodFX This may sound silly since you can very quickly see the func signature and figure out which of the two |
@mvdan Thanks for the example. Again, I think we all understand the rationale, and again, go/parser and friends provide column information (byte offset from start of line). As pointed out before, there's some significant plumbing required to make this work. I agree it would be nice to have. It's simply unplanned for now because we have more important things to work on at the moment. |
To add 2 cents, it was added in the D compiler against Walter's opinion ("slow downs lexer") and some debate. It turned out to be less noisy than expected and useful for various editor/IDE integration. I think people would defend column information with their teeth now. |
Which teeth? Be specific. https://en.wikipedia.org/wiki/Universal_Numbering_System :) |
CL https://golang.org/cl/34244 mentions this issue. |
…hen rune count for column value This will only become user-visible if error messages show column information. Per the discussion in #10324. For #10324. Change-Id: I5959c1655aba74bb1a22fdc261cd728ffcfa6912 Reviewed-on: https://go-review.googlesource.com/34244 Run-TryBot: Robert Griesemer <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Matthew Dempsky <[email protected]>
We now have precise position information (line# and byte offset per line). Should provide a mechanism to enable/disable byte offsets in error messages (and decide on a default). |
CL https://golang.org/cl/37970 mentions this issue. |
When a Go source file has an error, only the line number is reported by the go compiler. So for example, with:
the Go compiler error message:
other\blah.go:4: undefined: fmtxxx
This is okay for command line usage, but when Go is invoked by editors and IDEs, it would be an improvement if a full source range was reported - that is, the start offset (line+column) of the error, and the end offset (line+column).
As an example, here's the error output from the Rust compiler for a similar error:
The Rust compiler goes even further and prints the range in the command-line UI too. That's not required though, what I'm asking is just the extra offset info reported in the first line. "8:5" is the start of the error range and "8:15" the end of the range. Reporting the error location as an absolute character offset (as opposed to line+column), would also work fine.
With this, IDE's can reported errors as squiggly lines in a source editor. Example, see how "xxx" is underlined:
As a sidenote, https://github.com/GoClipse/goclipse in particular already supports this functionality, but it needs the compiler to report the full source ranges.
The text was updated successfully, but these errors were encountered: