Description
This issue is an umbrella for several proposed additions to SCIP, based on discussions with @donsbot on Mastodon.
SymbolInformation.display_name
This would be the name of the symbol, which is both helpful for local variables and avoids parsing the name from the symbol. The field could be `name` instead of `display_name`; we use `display_name` in SemanticDB to emphasize that this name is meant to be displayed (and should therefore not have special encoding for non-ASCII characters like emojis).
SymbolInformation.owner
Alternative name: `parent`. The thinking with this field is that it avoids parsing the owner from the symbol, and it allows us to emit an owner for local symbols.
SymbolInformation.kind
An enum that specifies what kind of symbol this is (enum/interface/method/...). Currently, `Descriptor.Suffix` doesn't encode enough fine-grained information (and it's intentionally named "suffix" to emphasize that it's primarily related to the syntax of the symbol).
SymbolInformation.signature_documentation
A string-formatted rendering of the signature. Currently, indexers emit this information in the `documentation` field as markdown-formatted code blocks. Having a separate field makes it cleaner to extract only the signature. I propose we reserve the field `SymbolInformation.signature` for fully typed/structured signatures (not string-formatted signatures).
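To make the proposal concrete, here is a rough Python sketch of the four fields and of why a separate signature field is cleaner than digging through `documentation`. The dataclass is only a stand-in for the protobuf message, `kind` is a plain string instead of an enum, and the helper `signature_of` is a hypothetical name, not part of any SCIP binding.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable

@dataclass
class SymbolInformation:
    """Stand-in for the protobuf message; only the fields discussed here."""
    symbol: str
    documentation: List[str] = field(default_factory=list)
    display_name: str = ""             # proposed
    owner: str = ""                    # proposed (alternative name: parent)
    kind: str = ""                     # proposed; an enum in the real schema
    signature_documentation: str = ""  # proposed

_CODE_BLOCK = re.compile(
    re.escape(FENCE) + r"[^\n]*\n(.*?)" + re.escape(FENCE), re.DOTALL
)

def signature_of(info: SymbolInformation) -> Optional[str]:
    """Prefer the dedicated field; otherwise fall back to the first
    fenced code block found in the documentation (today's workaround)."""
    if info.signature_documentation:
        return info.signature_documentation
    for doc in info.documentation:
        match = _CODE_BLOCK.search(doc)
        if match:
            return match.group(1).strip()
    return None
```

With the new field a client reads the signature directly; the regex fallback is only needed for indexes produced before the field exists.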
Activity
Title changed from "Proposall: add several fields to `SymbolInformation`" to "Proposal: add several fields to `SymbolInformation`"
donsbot commented on May 23, 2023
SymbolInformation.kind
For symbol kinds, a variant of the lsif/vscode spec. E.g. currently used by Glean: https://github.com/facebookincubator/Glean/blob/main/glean/schema/source/codemarkup.types.angle#L38 ; these are used for search and outline views, e.g. to constrain a search to methods or classes only, or, when searching by name, to show the kind of each result.
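As a rough illustration of the search use case, the sketch below filters results by kind. The `Kind` members and the `SearchResult` type are made up for the example; they are not the enum from Glean, LSIF, or SCIP.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional, Set

class Kind(Enum):
    # Illustrative members only; the real enum would be much larger.
    CLASS = "class"
    INTERFACE = "interface"
    METHOD = "method"

@dataclass
class SearchResult:
    display_name: str
    kind: Kind

def search(results: Iterable[SearchResult], query: str,
           kinds: Optional[Set[Kind]] = None) -> List[SearchResult]:
    """Match by substring on the name and, if given, restrict to some kinds."""
    return [
        r for r in results
        if query in r.display_name and (kinds is None or r.kind in kinds)
    ]
```

For example, `search(results, "vec", {Kind.METHOD})` would drop a class named `Vec` while keeping a method named `vec`, and each remaining result still carries its kind for display.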
SymbolInformation.signature_documentation
This is somewhat important, as the SCIP index conflates "Documentation". E.g. in rust-analyzer the type signature and the documentation markdown are combined. They should ideally be separate "tables" in the output, keyed by scip.Symbol, as depending on the scenario you might only want to show the signature (e.g. a search page), or docs (an API page).
SymbolInformation.display_name
This is another search-related application. Imagine taking the SCIP data and efficiently listing all symbols called 'vec'. We'd need a way to extract the local name. Another scenario: search for foo::bar::vec, where 'bar' might be a subclass of something. In each case we need to know the precise local name and parent name ("qualified name"). These can be extracted by parsing the scip.Symbol, but that's something the protocol could already do.
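The "list all symbols called 'vec'" scenario could look roughly like this once `display_name` is emitted. The record type is a minimal stand-in and the symbol strings in the usage note are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

@dataclass
class SymbolInformation:
    """Minimal stand-in with only the fields this example needs."""
    symbol: str
    display_name: str

def index_by_name(infos: Iterable[SymbolInformation]) -> Dict[str, List[str]]:
    """Group symbols by display name, with no symbol parsing involved."""
    index: Dict[str, List[str]] = defaultdict(list)
    for info in infos:
        index[info.display_name].append(info.symbol)
    return dict(index)
```

Given records for made-up symbols such as `demo cargo foo 1.0 bar/vec().` and `local 3`, both with display name `vec`, `index_by_name(...)["vec"]` returns both, including the local one whose symbol string carries no name at all.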
olafurpg commented on May 23, 2023
@donsbot Agreed! We have these pieces of information in SemanticDB for similar reasons. If we add these fields to SCIP, would you be interested in contributing the changes to rust-analyzer? I estimate it will be easy to update scip-java to emit the new fields. We can create a tracking issue for the remaining languages.
donsbot commented on May 23, 2023
Yep we would likely want to extend rust-analyzer immediately to support these.
Add `SymbolInformation.Kind` #156
olafurpg commented on May 23, 2023
I opened a PR adding `SymbolInformation.kind` as a starting point. Will follow up with separate PRs adding the remaining fields.
Add `SymbolInformation.display_name` #158
olafurpg commented on May 25, 2023
@donsbot can you elaborate on your motivation for `SymbolInformation.owner`? I am OK with adding this field on the condition it should only be used for locals. My concern with adding this field for global symbols is that indexers will start to emit malformed global symbols. I think it's a very desirable property that clients can implement a lot of functionality with an array of symbols (`string[]`) without having to load a global symbol table (`record<string, SymbolInformation>`). If we limit `SymbolInformation.owner` to locals then this problem is alleviated since it's cheap to load a local-only symbol table.
I think it should be unavoidable that clients need to implement a symbol parser to walk up the symbol hierarchy. The Java logic for the symbol parser is ~150 lines of code and we could aim to provide similar APIs for all language bindings.
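For a sense of what "walking up the symbol hierarchy" involves, here is a deliberately simplified Python sketch, not a port of the Java parser. It assumes a global symbol is five space-separated parts (scheme, package manager, package name, version, descriptors), that no part contains spaces, and that identifiers are plain (no backtick-escaped names). `owner_of` and `split_descriptors` are hypothetical helper names.

```python
import re
from typing import List, Optional

_NAME = r"[A-Za-z0-9_+\-$]+"
# One descriptor, in the simplified grammar assumed here:
#   method `name(disambiguator).`, type parameter `[name]`, parameter `(name)`,
#   or a name followed by one of the suffixes / # . : !
_DESCRIPTOR = re.compile(
    "|".join([
        rf"{_NAME}\((?:{_NAME})?\)\.",
        rf"\[{_NAME}\]",
        rf"\({_NAME}\)",
        rf"{_NAME}[/#.:!]",
    ])
)

def split_descriptors(descriptors: str) -> List[str]:
    """Split e.g. 'a/b/c().' into ['a/', 'b/', 'c().']."""
    parts, pos = [], 0
    while pos < len(descriptors):
        match = _DESCRIPTOR.match(descriptors, pos)
        if match is None:
            raise ValueError(f"cannot parse descriptors at offset {pos}: {descriptors!r}")
        parts.append(match.group(0))
        pos = match.end()
    return parts

def owner_of(symbol: str) -> Optional[str]:
    """Return the enclosing global symbol, or None for locals and top-level symbols."""
    if symbol.startswith("local "):
        return None  # nothing to parse; this is where an owner field would help
    fields = symbol.split(" ", 4)
    if len(fields) != 5:
        raise ValueError(f"expected 5 space-separated parts: {symbol!r}")
    descriptors = split_descriptors(fields[4])
    if len(descriptors) <= 1:
        return None
    return " ".join(fields[:4] + ["".join(descriptors[:-1])])
```

Calling `owner_of` repeatedly until it returns `None` walks from a method up through its enclosing types and namespaces using only the symbol string, which is the property the `string[]` argument above relies on. Local symbols fall out immediately, since their string encodes no hierarchy.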
donsbot commented on May 25, 2023
I was imagining most indexers usually know the "containing" symbol, which lets us build navigation by containing scope.
There's an equivalent relationship for parent by "inheritance" (e.g. `A extends B`).
So in this case I was just interested in the relationship between a symbol and its parent (at the scip.Symbol level, not the textual names).
Obviously if we have something like `a/b/c()`, we can figure out that `a/b` might be the containing parent. But then do I need to search for a scip.Symbol that matches? Or is this relationship already present somewhere else?
donsbot commented on May 25, 2023
The specific product use case is not just symbol outlines on a page, but API listings. E.g. generating cargo-like docs for a language: E.g. looking up "class C" to see its API:
Add `SymbolInformation.display_name` (#158)
Add `SymbolInformation.signature_documentation` #159
olafurpg commented on May 25, 2023
@donsbot You can parse the owner from the `symbol` fields to support those use-cases (assuming they're global symbols). The Java symbol parser is 150 lines of code and we have had a decent experience porting this logic to other languages with ChatGPT: https://github.com/sourcegraph/scip-java/blob/main/scip-semanticdb/src/main/java/com/sourcegraph/scip_semanticdb/SymbolDescriptor.java#L21
I am open to adding `SymbolInformation.owner` (or `SymbolInformation.enclosing_symbol`) to support the same functionality for local symbols.
Would that be a satisfactory solution? I'm not 100% against allowing `owner`/`enclosing_symbol` for global symbols, but I am concerned it increases the risk that indexers stop emitting well-structured symbols, as I mentioned in my last comment.
donsbot commented on May 26, 2023
Ok sounds good. Yes I will need it for locals (and can derive it for globals).
Add `SymbolInformation.signature_documentation` (#159)
Add `SymbolInformation.enclosing_symbol` #164
Add `SymbolInformation.enclosing_symbol`
Add `SymbolInformation.enclosing_symbol` (#164)
olafurpg commented on Jun 4, 2023
Closing this as fixed after #164 since all the proposed fields have been added to the spec now. However(!), none of the existing SCIP indexers have been updated yet to emit the new information. Feel free to open an issue in this repository or each SCIP indexer repository if you'd like the indexer to add support for these new fields.
`SymbolInformation` fields in the SCIP backend rust-lang/rust-analyzer#15919
`SymbolInformation` fields sourcegraph/scip-java#666