# Semantic Tokens

The [LSP](https://microsoft.github.io/language-server-protocol/specifications/specification-3-17/#textDocument_semanticTokens)
specifies semantic tokens as a way of telling clients about language-specific
properties of pieces of code in a file being edited.

The client asks for a set of semantic tokens and modifiers. This note describes which ones
gopls will return, and under what circumstances. Gopls has no control over how the client
converts semantic tokens into colors (or some other visible indication). In vscode it
is possible to modify the colors a theme uses by setting the `editor.semanticTokenColorCustomizations`
object. We provide a little [guidance](#colors) later.

There are 22 semantic tokens, with 10 possible modifiers. The protocol allows each semantic
token to be used with any of the 1024 subsets of possible modifiers, but most combinations
don't make intuitive sense (although `async documentation` has a certain appeal).

The 22 semantic tokens are `namespace`, `type`, `class`, `enum`, `interface`,
`struct`, `typeParameter`, `parameter`, `variable`, `property`, `enumMember`,
`event`, `function`, `member`, `macro`, `keyword`, `modifier`, `comment`,
`string`, `number`, `regexp`, `operator`.

The 10 modifiers are `declaration`, `definition`, `readonly`, `static`,
`deprecated`, `abstract`, `async`, `modification`, `documentation`, `defaultLibrary`.

The authoritative lists are in the [specification](https://microsoft.github.io/language-server-protocol/specifications/specification-3-17/#semanticTokenTypes).

For the implementation to work correctly, the client and server have to agree on the ordering
of the tokens and of the modifiers. Gopls, therefore, will only send tokens and modifiers
that the client has asked for. This document says what gopls would send if the client
asked for everything. By default, vscode asks for everything.

Gopls sends 11 token types for `.go` files and 1 for `.*tmpl` files.
Nothing is sent for any other kind of file.
This all could change. (When Go has generics, gopls will return `typeParameter`.)

For `.*tmpl` files gopls sends `macro`, and no modifiers, for each `{{`...`}}` scope.

## Semantic tokens for Go files

There are two contrasting guiding principles that might be used to decide what to mark
with semantic tokens. All clients already do some kind of syntax marking. E.g., vscode
uses a TextMate grammar. The minimal principle would send semantic tokens only for those
language features that cannot be reliably found without parsing Go and looking at types.
The maximal principle would attempt to convey as much as possible about the Go code,
using all available parsing and type information.

There is much to be said for returning minimal information, but the minimal principle is
not well-specified. Gopls has no way of knowing what the clients know about the Go program
being edited. Even in vscode the TextMate grammars can be more or less elaborate
and change over time. (Nonetheless, a minimal implementation would not return `keyword`,
`number`, `comment`, or `string`.)

The maximal position isn't particularly well-specified either. To choose one example, a
format string might have formatting codes (`%[4]-3.6f`), escape sequences (`\U00010604`), and regular
characters. Should these all be distinguished? One could even imagine distinguishing
different runes by their Unicode language assignment, or some other Unicode property, such as
being [confusable](http://www.unicode.org/Public/security/10.0.0/confusables.txt).

Gopls does not come close to either of these principles. Semantic tokens are returned for
identifiers, keywords, operators, comments, and literals. (Semantic tokens do not
cover the whole file. They are not returned for
white space or punctuation, and there is no semantic token for labels.)
The following describes more precisely what gopls
does, with a few notes on possible alternative choices.
The references to *object* refer to the
`types.Object` returned by the type checker. The references to *nodes* refer to the
`ast.Node` from the parser.

1. __`keyword`__ All Go [keywords](https://golang.org/ref/spec#Keywords) are marked `keyword`.
1. __`namespace`__ All package names are marked `namespace`. In an import, if there is an
alias, the alias is marked. Otherwise the last component of the import path is marked.
1. __`type`__ Objects of type `types.TypeName` are marked `type`.
If they are also `types.Basic`
the modifier is `defaultLibrary`. (And in `type B struct{C}`, `B` has modifier `definition`.)
1. __`parameter`__ The formal arguments in `ast.FuncDecl` nodes are marked `parameter`.
1. __`variable`__ Identifiers of type `types.Var` are,
not surprisingly, marked `variable`. Identifiers in the
scope of `const` are modified with `readonly`. `nil` is usually a `variable` modified with both
`readonly` and `defaultLibrary`. (`nil` is a predefined identifier; the user can redefine it,
in which case it is marked as a variable, or as whatever it was redefined to be.)
Identifiers being defined (node `ast.GenDecl`) are modified
by `definition` and, if appropriate, `readonly`. Receivers (in method declarations) are
`variable`.
1. __`member`__ Members are marked at their definition (`func (x foo) bar() {}`) or declaration
in an `interface`. Members are not marked where they are used.
In `x.bar()`, `x` will be marked
either as a `namespace` if it is a package name, or as a `variable` if it is an interface value,
so distinguishing `bar` seemed superfluous.
1. __`function`__ Builtins (`types.Builtin`) are modified with `defaultLibrary`
(e.g., `make`, `len`, `copy`). Identifiers whose
object is `types.Func` or whose node is `ast.FuncDecl` are `function`.
1. __`comment`__ Comments and struct tags. (Perhaps struct tags should be `property`?)
1. __`string`__ Strings. Could add modifiers for e.g., escapes or format codes.
1. __`number`__ Numbers. Should the `i` in `23i` be handled specially?
1. __`operator`__ Assignment operators, binary operators, ellipses (`...`), increment/decrement
operators, sends (`<-`), and unary operators.

Gopls will send the modifier `deprecated` if it finds a comment
`// deprecated` in the godoc.

The unused tokens for Go code are `class`, `enum`, `interface`,
`struct`, `typeParameter`, `property`, `enumMember`,
`event`, `macro`, `modifier`,
`regexp`.

## Colors

These comments are about vscode.

The documentation has a [helpful](https://code.visualstudio.com/api/language-extensions/semantic-highlight-guide#custom-textmate-scope-mappings)
description of which semantic tokens correspond to scopes in TextMate grammars. Themes seem
to use the TextMate scopes to decide on colors.

Some examples of color customizations are [here](https://medium.com/@danromans/how-to-customize-semantic-token-colorization-with-visual-studio-code-ac3eab96141b).

## Note

While a file is being edited it may temporarily contain either
parsing errors or type errors. In this case gopls cannot determine some (or maybe any)
of the semantic tokens. To avoid weird flickering it is the responsibility
of clients to maintain the semantic token information
in the unedited part of the file, and they do.