`#foo` is a symbol value.
Symbol values are used when you need a C-like enumerated type.

Rationale: I like the syntax.

I want it for modelling the x3d -O colour=face|vertex option.
You will type `-Ocolour=#vertex`
instead of `-Ocolour='"vertex"'` or `-Ocolour={vertex:}`.
The latter two forms are too cumbersome for the Unix command line (or cmd.exe).

This syntax is also nice in the general case where you are working with
enum types or algebraic types. In that context, we want `#foo` to work as a
pattern.

Note, #foo is intentionally similar to a Twitter hashtag.
Also, # is reserved as a prefix for future syntax.
E.g., if I need more container types, #[a,b,c] is a set,
#{key1: value1, ...} is a map. Or, #! comments.

What is the type of a symbol? How does it print?
 0) No symbol syntax. Use strings, special syntax for string-valued options.
 1) #vertex abbreviates "vertex", which prints as "vertex".
 2) #vertex abbreviates {vertex:null}, which prints as #vertex.
 3) Symbols are a new (8th) data type.

But what about a command line option X with an algebraic type, where some
variants have values, and some variants do not? This can't be a string-valued
option.
 * Without symbol syntax: '-OX="foo"' and -OX={bar:a}
 * With symbol syntax: -OX=#foo and -OX={bar:a}
 * With multipart record definitions: -OX=#foo and -OX.bar=a

Judgement:
* Not 0, doesn't work for options with algebraic types.
* Not 3, don't want extra complexity of a new type.
* 1 or 2.
* I slightly prefer 2: #foo == {foo:null}, prints as #foo, is a pattern.
  Written as #"foo bar" if the name is not a C identifier.

Symbols are used to simulate 'enumerated types'.
Symbols are used to simulate niladic variants in an algebraic type.

Symbols are used to simulate 'enumerated types' in SubCurv.
Consider a drop-down-list picker type. In a C-like language, the options in a
drop-down list would be enums, which would map to integers in GLSL. So these
should be symbols?
    parametric {s :: dropdown[#square,#circle,#triangle] = #square} ...
In SubCurv, symbols are normally illegal, and even if we get record values,
the null value in a symbol is illegal. But, if a dropdown picker is used,
then the symbols it uses become legal SubCurv values, represented as integers.
We could also use these semantics for interpreting an is_enum[#foo,#bar]
type predicate in SubCurv.
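
As a rough illustration of the integer representation described above, here is
a minimal Python sketch (not Curv or SubCurv code). The helper name
`enum_codes` is hypothetical, and numbering the symbols by their position in
the dropdown list is an assumption; the text only says the symbols are
represented as integers.

```python
def enum_codes(symbols):
    """Assign each symbol name a small integer code, in list order.

    Hypothetical helper: position-based numbering is an assumption.
    """
    if len(set(symbols)) != len(symbols):
        raise ValueError("duplicate symbol in picker list")
    return {name: code for code, name in enumerate(symbols)}


# Models: dropdown[#square,#circle,#triangle] with default #square
codes = enum_codes(["square", "circle", "triangle"])
default_code = codes["square"]
```

Under this assumption, a symbol outside the list has no code, which matches
the idea that only the symbols named by the picker become legal values.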

0. No Symbols
-------------
An alternative to this proposal is to parse enum-valued -O options differently
from other options. Enum-valued -O options are actually string-valued options,
with only a fixed set of string values being legal. We just interpret the
option value as a string, without any quotation syntax required.

We parse pure string-valued options the same way: the option string is
taken literally, you don't need to include double quote characters.
But, we do interpret '$' escape sequences, so you still have full access
to the Curv expression language in string-valued options if you need it.

Or, the option parser treats `#identifier` as a special case,
so you can type -Oname=#foo and set name to the string "foo".

Enum values are string values. This means we want "foo" to work as a pattern.
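
The `#identifier` special case could be sketched as follows. This is a
minimal Python model, not the actual Curv option parser: the function name
`parse_option`, the `("string", ...)` / `("expr", ...)` tags, and the choice of
C-style identifier rules are all assumptions made for illustration.

```python
import re

# Assumed: the name after '#' follows C identifier rules.
_SYMBOL = re.compile(r"#([A-Za-z_][A-Za-z0-9_]*)")


def parse_option(arg):
    """Split '-Oname=value' into (name, (kind, text)).

    kind is "string" when the value has the form #identifier, in which case
    text is the bare name. Otherwise kind is "expr" and text is the raw
    option text, left for the ordinary expression parser (not modelled here).
    """
    if not arg.startswith("-O") or "=" not in arg:
        raise ValueError(f"not a -O option: {arg!r}")
    name, raw = arg[2:].split("=", 1)
    match = _SYMBOL.fullmatch(raw)
    if match:
        return name, ("string", match.group(1))
    return name, ("expr", raw)


print(parse_option("-Ocolour=#vertex"))
```

With this rule, `-Ocolour=#vertex` needs no shell quoting, while any other
value still goes through the normal expression syntax.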

1. Symbols as Strings
---------------------
Pros:
* If JSON export is important, strings are the most natural translation
  for symbols. It's what you would use if modelling data directly in JSON.
  * Counterexample: in JSON-API, a picker config is {"type":data}. If no
    data is needed, I use {"type":null} instead of "type".
* Strings as the representation for symbols is simple and easy for users
  to understand.
* When I first encountered Lisp/Scheme, the distinction between symbols and
  strings seemed unnecessary. Curv doesn't use different types for integers
  and non-integral reals. So why make this distinction? If the only technical
  benefit is "faster equality", that's pretty weak.
* This proposal is the simplest, and requires the least code to implement.

If symbols are strings, why provide an alternate syntax for string literals?
* Usability on the command line, for setting enum-valued options.
  -Omethod=#sharp instead of '-Omethod="sharp"'.
* At present, strings are rarely used in Curv. If you see a string literal
  in Curv, it's probably uninterpreted text, to be printed to the console
  or rendered using the future `text` primitive. Writing #foo instead of "foo"
  is a signal to the reader that this is a semantic tag, not uninterpreted
  natural language text. Symbols are used as enum values, as nullary constructors
  for an algebraic type, and as field names for indexing a record.
* #foo is a pattern, "foo" is not a pattern due to field generator syntax.

Cons:
* Since symbols have semantics (semantic tags, not uninterpreted text),
  they ought to print as symbols, not as quoted strings.
* A string is a collection, but symbols are 'atomic' values.
  Conceptually, a non-string symbol type is a better match to the domain.
* Semantically, symbols are more closely related to records than they
  are to strings, due to the analogy with enum types and algebraic types.
* I don't want symbols to compare equal to strings.
  I want them to be disjoint from strings, so that overloaded
  functions can distinguish symbol arguments from string arguments.
* Symbol equality is traditionally faster than string equality.

JSON export: #foo -> "foo"

2. Symbols as Records
---------------------
Here's the sales pitch for this design alternative:
* Curv has 7 data types, there's no compelling reason to add an 8th.
  This preserves the ability to map all Curv non-function data to JSON.
* Symbols are closely related to records, when you think about how
  algebraic types are represented in Curv.
* Records are the general mechanism for representing new data types.
  So representing symbols as records fits this tendency.
* Record literals are patterns. Symbol literals need to be patterns.
* #foo prints as #foo, not as a string.
* Symbols and strings are distinct, which is good: they have different semantic
  roles.

Let's recall the doctrine for representing labelled options
and algebraic data types in Curv.

* A set of labelled arguments is a record.
  Examples from other languages:
  * Swift function calls
  * HTML attributes, within a tag
  * Unix command line arguments: options
  Some of these domains permit labelled arguments that have
  just a name, not a value. The underlying parameter is a
  boolean, defaulting to false, but if the name is specified,
  the parameter becomes true. E.g., --foo instead of --foo=value
  in Unix, or foo instead of foo=value as an HTML attribute.
  That's why `foo:` abbreviates `foo:true` in a Curv record.

A Haskell algebraic type is a set of named alternatives.
The constructor for an alternative is written as `Foo` or `Bar value`.
We can think of `Foo` as an abbreviation; you would otherwise need
to write `Foo ()` if all alternatives were required to have an argument.

In Curv, an algebraic type is a set of tagged values of the
form {name: value} -- tagged values are singleton records.
The abbreviation for a tagged value that only needs a name is {name:}:
the value is implicitly `true`. In this case, the choice of the value
`true` is arbitrary: `null` would make as much sense.
* I'm no longer convinced that conflating enum values with singleton boolean
  options is a good idea. Let's define #foo == {foo:null}, which prints as #foo.
  By contrast, {foo:} prints as {foo:true}.

With the introduction of symbol syntax, the idiomatic syntax for
members of an algebraic type will be #foo and {bar: value}.

Possible reasons for identifying #foo with {foo:} in an algebraic type:
* Maybe there are benefits to uniformly representing all members
  of an algebraic type as singleton records?
  * Like, an algebraic value is isomorphic to a single field of a record,
    and can be interpolated into a set of labelled arguments using `...`.
  * A uniform API for extracting the name and value components of an
    algebraic value.

JSON export: #foo -> {"foo":null}
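
To make the record encoding concrete, here is a minimal Python model of
option 2 (not Curv code). Records are modelled as `dict`, Curv `null` as
`None`, and the helper names `symbol`, `is_symbol` and `show` are
hypothetical. The sketch ignores the quoted form for names that are not C
identifiers.

```python
import json


def symbol(name):
    """Option 2: #name is the singleton record {name: null}."""
    return {name: None}


def is_symbol(value):
    """True for a record with exactly one field whose value is null."""
    return (isinstance(value, dict) and len(value) == 1
            and next(iter(value.values())) is None)


def show(value):
    """Printing rule from the text: {foo:null} prints as #foo."""
    if is_symbol(value):
        return "#" + next(iter(value))
    if isinstance(value, dict):
        fields = ",".join(f"{k}:{show(v)}" for k, v in value.items())
        return "{" + fields + "}"
    if value is True:
        return "true"
    if value is False:
        return "false"
    if value is None:
        return "null"
    return repr(value)


print(show(symbol("foo")))        # the symbol form
print(show({"foo": True}))        # the {foo:} boolean-option form
print(json.dumps(symbol("foo")))  # JSON export needs no special case
```

In this model the symbol and the boolean option are different records, so
they print differently and do not compare equal.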

3. Symbols as an Abstract Type
------------------------------
Sales pitch:
* Symbols are fully abstract, simple, scalar values.
  The only Symbol operations are construction, equality, and conversion to string
  (which are the generic operations supported by all values).
* #foo prints as #foo, is only equal to values that print as #foo.
  There's no aliasing with other types. Simple.
* (Why not use strings?) Symbols do not compare equal to strings.
  Overloaded functions can distinguish symbol arguments from string arguments.
* Symbols are more fundamental than strings or records, which are aggregates.
* Symbols are the natural representation for nullary enum constructors (instead
  of strings or integers).
* Symbols are the natural representation for field names in records
  (instead of strings; see Structure proposal).
* Define true=#true, false=#false, null=#null. Then, in conjunction with Maps,
  all data types have literals that can be used as patterns. (But, aliasing.)
* Symbols in SubCurv:
  * #foo is compiled to an enum value with an int representation.
  * `dropdown_menu[#Value_Noise,#Fractal_Noise]` is a picker value.
  * is_enum[#foo,#bar] is a type predicate supported by SubCurv?
* Symbols might be useful in the Term proposal.
* Symbols might have a use if Curv becomes homoiconic and supports macros?

Instances of algebraic types are notated as:
    #nilary
    {binary: (a, b)}
The field name `binary` is internally represented as #binary,
so in this sense the constructor name is always a symbol.

Construction:
    #foo
    #'hello world'

A conversion from String to Symbol? make_symbol "foo" == #foo.
This is in the same category as a conversion from String to Number.
It shouldn't normally be needed, given the role of strings in Curv.

Conversion to string:
    "${#foo}" becomes "foo"    or strcat[#foo]
    "$(#foo)" becomes "#foo"   or repr[#foo]
    `strcat[#'Hello world']` becomes "Hello world".
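
The two conversions above can be modelled in Python as a sketch of the
intended behaviour (not Curv code). The class name `Symbol` is an assumption:
`str()` stands in for `strcat`, `repr()` stands in for `repr`, and calling
the constructor stands in for `make_symbol`. Treating a name as needing quotes
when it is not a C-style identifier is also an assumption.

```python
import re
from dataclasses import dataclass

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


@dataclass(frozen=True)
class Symbol:
    """An abstract value whose only property is its name."""
    name: str

    def __str__(self):
        # Models strcat[#foo]: just the name.
        return self.name

    def __repr__(self):
        # Models repr[#foo]: the literal syntax, quoted if needed.
        if _IDENT.fullmatch(self.name):
            return "#" + self.name
        return "#'" + self.name + "'"


foo = Symbol("foo")
hello = Symbol("Hello world")
print(str(foo), repr(foo))
print(str(hello), repr(hello))
```

Because equality is generated per class, `Symbol("foo")` equals another
`Symbol("foo")` but never the string `"foo"`, which is the disjointness this
alternative asks for.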

Field names are represented by symbols (Structure proposal).
* `fields` returns a list of symbols.

Cons:
* Explaining to users why symbols are different from strings.
  It's doable, esp. if #true and #false are the boolean values.
* Is there any context where we need a variable that is either a string
  or a symbol? Or are the use cases disjoint? (Because then why not unify them.)

JSON export: #foo -> {"\u0000":"#foo"}
                  or "\u0000foo"
or record keys are strings, no Curv value maps to JSON null, and
             #foo -> {"foo":null}
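
For comparison, the three candidate JSON encodings listed above can be
written out with Python's standard `json` module. The function names are
hypothetical and exist only to label the alternatives; nothing here is an
existing Curv export routine.

```python
import json


def export_tagged_object(name):
    """#foo -> {"\\u0000":"#foo"}: an object with a NUL key."""
    return {"\u0000": "#" + name}


def export_prefixed_string(name):
    """#foo -> "\\u0000foo": a string with a NUL prefix."""
    return "\u0000" + name


def export_null_record(name):
    """#foo -> {"foo":null}: relies on no other value mapping to null."""
    return {name: None}


for encode in (export_tagged_object, export_prefixed_string,
               export_null_record):
    print(encode.__name__, json.dumps(encode("foo")))
```

The first two keep symbols distinguishable from every ordinary string and
record at the cost of an unusual NUL character; the third is the most
readable but only works under the stated restriction on null.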

Symbols are not Strings
-----------------------
A Symbol is an abstract value whose only property is its name.
The symbol `#foo` prints as `#foo`, and is only equal to itself.
You can compare a symbol for equality to any other value, use it as a map key,
or convert it to a string. Those are the only operations.

Symbol constants look like Twitter hashtags, and that's not a coincidence.
Symbols are abstract names that have semantic meaning within a program.

In Curv, the Boolean values are called `#true` and `#false` (they are symbols),
and this is a good example of what symbols are used for. They are used to
distinguish between several different named alternatives.

Statically typed languages like C, Rust and Swift do not have a generic
symbol type. Instead, they have user-defined `enum` types, which serve the
same purpose (Go gets a similar effect from typed integer constants).
Internally, `enum` values are represented efficiently by small
integers. When Curv programs are compiled into statically typed code (e.g., into
C++ or into GLSL), symbol values are compiled to small integers or enum values.

Other languages with a symbol type are mostly dynamically typed, for example:
* Lisp, Scheme and other languages from the Lisp family.
* Ruby.
* Erlang and Elixir (where symbols are called "atoms").
JavaScript has a `Symbol` type, but it is an unrelated concept.

When users first encounter Symbols in a language like Scheme, Elixir or Ruby,
it can be unclear how Symbols differ from Strings. In Curv, the distinction
is very clear.

Strings are meant to represent uninterpreted text that is destined to form
part of the program's output.
* Documentation/help strings (in a future language version).
* A string of text that will be rendered into an image using the future `text`
  primitive.
* A string of text that will be printed as a debug message.
* A string of text that represents the final output of a program
  (in the case where you are using Curv to convert your data to some
  text-based file format for further use outside of Curv).

You are not meant to parse strings. Curv has no way of opening and reading a
text file, so there's no input to parse. Curv isn't a text processing language,
and doesn't have regular expressions or parsing facilities.

You are not meant to use strings to encode meaning within your data structures.
* You shouldn't internally represent a compound data structure using Strings,
  because now your code has to parse that string to traverse the data structure.
  That's a code smell, because parsing is complex and error-prone compared to
  just traversing a real data structure.
* You shouldn't use strings to encode semantically meaningful names, e.g. denoting
  one of several alternatives. That's what Symbols are for. If your code
  compares two strings for equality, or uses a string as a map key,
  then you should use Symbols instead.
