`#foo` is a symbol value.
Symbol values are used when you need a C-like enumerated type.

Rationale: I like the syntax.

I want it for modelling the x3d -O colour=face|vertex option.
You will type `-Ocolour=#vertex`
instead of `-Ocolour='"vertex"'` or `-Ocolour={vertex:}`.
The latter syntax is too cumbersome for the Unix command line (or cmd.exe).

This syntax is also nice in the general case where you are working with
enum types or algebraic types. In that context, we want `#foo` to work as a
pattern.

Note, #foo is intentionally similar to a Twitter hash tag.
Also, # is reserved as a prefix for future syntax.
Eg, if I need more container types, #[a,b,c] is a set,
#{key1: value1, ...} is a map. Or, #! comments.

What is the type of a symbol? How do they print?
  0) No symbol syntax. Use strings, special syntax for string-valued options.
  1) #vertex abbreviates "vertex", which prints as "vertex".
  2) #vertex abbreviates {vertex:null}, which prints as #vertex.
  3) Symbols are a new (8th) data type.

But what about a command line option X with an algebraic type, where some
variants have values, and some variants do not? This can't be a string-valued
option.
  * Without symbol syntax: '-OX="foo"' and -OX={bar:a}
  * With symbol syntax: -OX=#foo and -OX={bar:a}
  * With multipart record definitions: -OX=#foo and -OX.bar=a

Judgement:
* Not 0, doesn't work for options with algebraic types.
* Not 3, don't want extra complexity of a new type.
* 1 or 2.
* I slightly prefer 2: #foo == {foo:null}, prints as #foo, is a pattern.
  #"foo bar" if name is not a C identifier.

Symbols are used to simulate 'enumerated types'.
Symbols are used to simulate niladic variants in an algebraic type.

Symbols are used to simulate 'enumerated types' in SubCurv.
Drop-down-list picker type. In a C-like language, the options in a drop-down
list would be enums, which would map to integers in GLSL.
So these should be symbols?
    parametric {s :: dropdown[#square,#circle,#triangle] = #square} ...
In SubCurv, symbols are normally illegal, and even if we get record values,
the null value in a symbol is illegal. But, if a dropdown picker is used,
then the symbols it uses become legal SubCurv values, represented as integers.
We could also use these semantics for interpreting an is_enum[#foo,#bar]
type predicate in SubCurv.

0. No Symbols
-------------
An alternative to this proposal is to parse enum-valued -O options differently
from other options. Enum-valued -O options are actually string-valued options,
with only a fixed set of string values being legal. We just interpret the
option value as a string, without any quotation syntax required.

We parse pure string-valued options the same way: the option string is
taken literally, you don't need to include double quote characters.
But, we do interpret '$' escape sequences, so you still have full access
to the Curv expression language in string-valued options if you need it.

Or, the option parser treats `#identifier` as a special case,
so you can type -Oname=#foo and set name to the string "foo".

Enum values are string values. This means we want "foo" to work as a pattern.

1. Symbols as Strings
---------------------
Pros:
* If JSON export is important, strings are the most natural translation
  for symbols. It's what you would use if modelling data directly in JSON.
  * Counterexample: in JSON-API, a picker config is {"type":data}. If no
    data is needed, I use {"type":null} instead of "type".
* Strings as the representation for symbols is simple and easy for users
  to understand.
* When I first encountered Lisp/Scheme, the distinction between symbols and
  strings seemed unnecessary. Curv doesn't use different types for integers
  and non-integral reals. So why make this distinction?
  If the only technical
  benefit is "faster equality", that's pretty weak.
* This proposal is the simplest, and requires the least code to implement.

If symbols are strings, why provide an alternate syntax for string literals?
* Usability on the command line, for setting enum-valued options.
  -Omethod=#sharp instead of '-Omethod="sharp"'.
* At present, strings are rarely used in Curv. If you see a string literal
  in Curv, it's probably uninterpreted text, to be printed to the console
  or rendered using the future `text` primitive. Writing #foo instead of "foo"
  is a signal to the reader that this is a semantic tag, not uninterpreted
  natural language text. Symbols are used as enum values, as nilary constructors
  for an algebraic type, as field names for indexing a record.
* #foo is a pattern, "foo" is not a pattern due to field generator syntax.

Cons:
* Since symbols have semantics (semantic tags, not uninterpreted text),
  they ought to print as symbols, not as quoted strings.
* A string is a collection, but symbols are 'atomic' values.
  Conceptually, a non-string symbol type is a better match to the domain.
* Semantically, symbols are more closely related to records than they
  are to strings, due to the analogy with enum types and algebraic types.
* I don't want symbols to compare equal to strings.
  I want them to be disjoint from strings, so that overloaded
  functions can distinguish symbol arguments from string arguments.
* Symbol equality is traditionally faster than string equality.

JSON export: #foo -> "foo"

2. Symbols as Records
---------------------
Here's the sales pitch for this design alternative:
* Curv has 7 data types, there's no compelling reason to add an 8th.
  This preserves the ability to map all Curv non-function data to JSON.
* Symbols are closely related to records, when you think about how
  algebraic types are represented in Curv.
* Records are the general mechanism for representing new data types.
  So representing symbols as records fits this tendency.
* Record literals are patterns. Symbol literals need to be patterns.
* #foo prints as #foo, not as a string.
* Symbols and strings are distinct, which is good, they have different semantic
  roles.

Let's recall the doctrine for representing labelled options
and algebraic data types in Curv.

* A set of labelled arguments is a record.
  Examples from other languages:
  * Swift function calls
  * HTML attributes, within a tag
  * Unix command line arguments: options
  Some of these domains permit labelled arguments that have
  just a name, not a value. The underlying parameter is a
  boolean, defaulting to false, but if the name is specified,
  the parameter becomes true. Eg, --foo instead of --foo=value
  in Unix, or foo instead of foo=value as an HTML attribute.
  That's why `foo:` abbreviates `foo:true` in a Curv record.

A Haskell algebraic type is a set of named alternatives.
The constructor for an alternative is written as `Foo` or `Bar value`.
We can think of `Foo` as an abbreviation; you would otherwise need
to write `Foo ()` if all alternatives were required to have an argument.

In Curv, an algebraic type is a set of tagged values of the
form {name: value} -- tagged values are singleton records.
The abbreviation for a tagged value that only needs a name is {name:}:
the value is implicitly `true`. In this case, the choice of the value
`true` is arbitrary: `null` would make as much sense.
* I'm no longer convinced that conflating enum values with singleton boolean
  options is a good idea. Let's define #foo == {foo:null}, which prints as #foo.
  By contrast, {foo:} prints as {foo:true}.

With the introduction of symbol syntax, the idiomatic syntax for
members of an algebraic type will be #foo and {bar: value}.
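
To make the record encoding concrete, here is a rough Python sketch (not Curv,
and not how Curv implements anything). It assumes Python dicts and `None` as
stand-ins for Curv records and null. The helper names `symbol`, `is_symbol` and
`show` are made up for illustration.

```python
import json
import re

def symbol(name):
    """Model #name as the singleton record {name: null}."""
    return {name: None}

def is_symbol(value):
    """True for a record with exactly one field whose value is null."""
    return (isinstance(value, dict) and len(value) == 1
            and next(iter(value.values())) is None)

def show(value):
    """Print symbols as #name (quoted if not a C identifier), else as JSON text."""
    if is_symbol(value):
        name = next(iter(value))
        if re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            return "#" + name
        return "#" + json.dumps(name)
    return json.dumps(value)
```

Under these assumptions `show(symbol("foo"))` gives `#foo`, while the boolean
option record `{"foo": True}` is not a symbol and falls through to the JSON
printer. `json.dumps(symbol("foo"))` gives the JSON export of the symbol.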

Possible reasons for identifying #foo with {foo:} in an algebraic type.
* Maybe there are benefits to uniformly representing all members
  of an algebraic type as singleton records?
  * Like, an algebraic value is isomorphic to a single field of a record,
    and can be interpolated into a set of labelled arguments using `...`.
  * A uniform API for extracting the name and value components of an
    algebraic value.

JSON export: #foo -> {"foo":null}

3. Symbols as an Abstract Type
------------------------------
Sales pitch:
* Symbols are fully abstract, simple, scalar values.
  The only Symbol operations are construction, equality, conversion to string
  (which are the generic operations supported by all values).
* #foo prints as #foo, is only equal to values that print as #foo.
  There's no aliasing with other types. Simple.
* (Why not use strings?) Symbols do not compare equal to strings.
  Overloaded functions can distinguish symbol arguments from string arguments.
* Symbols are more fundamental than strings or records, which are aggregates.
* Symbols are the natural representation for nilary enum constructors (instead
  of strings or integers).
* Symbols are the natural representation for field names in records
  (instead of strings; see Structure proposal).
* Define true=#true, false=#false, null=#null. Then, in conjunction with Maps,
  all data types have literals that can be used as patterns. (But, aliasing.)
* Symbols in SubCurv:
  * #foo is compiled to an enum value with an int representation.
  * `dropdown_menu[#Value_Noise,#Fractal_Noise]` is a picker value.
  * is_enum[#foo,#bar] is a type predicate supported by SubCurv?
* Symbols might be useful in the Term proposal.
* Symbols might have a use if Curv becomes homoiconic and supports macros?
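
The SubCurv bullets above could work roughly like the following Python sketch.
It is a guess at the idea, not SubCurv's design. The class `Dropdown` and its
methods `encode`, `decode` and `is_enum` are hypothetical names, and symbol
names are passed as plain Python strings purely for brevity.

```python
class Dropdown:
    """Hypothetical picker: an ordered list of symbol names, each given an int code."""

    def __init__(self, names):
        if len(set(names)) != len(names):
            raise ValueError("duplicate symbol in dropdown")
        self.names = list(names)
        # The position in the list is the integer used in compiled code.
        self.index = {name: i for i, name in enumerate(self.names)}

    def is_enum(self, name):
        """Type predicate: is #name one of this picker's options?"""
        return name in self.index

    def encode(self, name):
        """Return the small integer that would stand for #name."""
        if name not in self.index:
            raise ValueError(f"#{name} is not an option of this dropdown")
        return self.index[name]

    def decode(self, code):
        """Map an integer code back to the symbol name."""
        if not 0 <= code < len(self.names):
            raise ValueError(f"{code} is not a valid code for this dropdown")
        return self.names[code]
```

For `Dropdown(["square", "circle", "triangle"])`, `encode("circle")` is 1 and
`decode(2)` is `"triangle"`. A symbol outside the list is rejected, which
mirrors the rule that only the picker's own symbols become legal values.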

Instances of algebraic types are notated as:
    #nilary
    {binary: (a, b)}
The field name `binary` is internally represented as #binary,
so in this sense the constructor name is always a symbol.

Construction:
    #foo
    #'hello world'

A conversion from String to Symbol? make_symbol "foo" == #foo.
This is in the same category as a conversion from String to Number.
It shouldn't normally be needed, given the role of strings in Curv.

Conversion to string:
    "${#foo}" becomes "foo" or strcat[#foo]
    "$(#foo)" becomes "#foo" or repr[#foo]
    `strcat[#'Hello world']` becomes "Hello world".

Field names are represented by symbols (Structure proposal).
* `fields` returns a list of symbols.

Cons:
* Explaining to users why symbols are different from strings.
  It's doable, esp. if #true and #false are the boolean values.
* Is there any context where we need a variable that is either a string
  or a symbol? Or are the use cases disjoint? (Because then why not unify them.)

JSON export: #foo -> {"\u0000":"#foo"}
                  or "\u0000foo"
or record keys are strings, no Curv value maps to JSON null, and
             #foo -> {"foo":null}

Symbols are not Strings
-----------------------
A Symbol is an abstract value whose only property is its name.
The symbol `#foo` prints as `#foo`, and is only equal to itself.
You can compare a symbol for equality to any other value, use it as a map key,
or convert it to a string. Those are the only operations.

Symbol constants look like Twitter hash tags, and that's not a coincidence.
Symbols are abstract names that have semantic meaning within a program.

In Curv, the Boolean values are called `#true` and `#false` (they are symbols),
and this is a good example of what symbols are used for. They are used to
distinguish between several different named alternatives.
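
The operations just listed (equality, use as a map key, conversion to string)
can be modelled in a few lines of Python. This is only an illustration of the
abstract-type idea, not Curv code. The class name `Symbol` is my own, `str`
stands in for `strcat`, and `repr` stands in for Curv's `repr`.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True, repr=False)
class Symbol:
    """An abstract value whose only property is its name."""
    name: str

    def __str__(self):
        # Like strcat[#foo]: just the name, without the '#'.
        return self.name

    def __repr__(self):
        # Like repr[#foo]: prints as #foo, quoted if not a C identifier.
        if re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", self.name):
            return "#" + self.name
        return "#'" + self.name + "'"
```

The frozen dataclass supplies equality and hashing by name, so
`Symbol("foo") == Symbol("foo")` holds and symbols work as dictionary keys,
while `Symbol("foo") == "foo"` is false: symbols stay disjoint from strings.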

Statically typed languages like C, Rust and Swift do not have a generic
symbol type. Instead, they have user-defined `enum` types, which serve the
same purpose (Go gets the same effect with typed integer constants).
Internally, `enum` values are represented efficiently by small
integers. When Curv programs are compiled into statically typed code (eg, into
C++ or into GLSL), symbol values are compiled to small integers or enum values.

Most other languages with a symbol type are dynamically typed:
* Lisp, Scheme and other languages from the Lisp family.
* Ruby.
* Erlang and Elixir (where symbols are called "atoms").
JavaScript has a Symbol type, but it is an unrelated concept.

When users first encounter Symbols in a language like Scheme, Elixir or Ruby,
it can be unclear how Symbols differ from Strings. In Curv, the distinction
is very clear.

Strings are meant to represent uninterpreted text that is destined to form
part of the program's output.
* Documentation/help strings (in a future language version).
* A string of text that will be rendered into an image using the future `text`
  primitive.
* A string of text that will be printed as a debug message.
* A string of text that represents the final output of a program
  (in the case where you are using Curv to convert your data to some text
  based file format for further use outside of Curv).

You are not meant to parse strings. Curv has no way of opening and reading a
text file, so there's no input to parse. Curv isn't a text processing language,
and doesn't have regular expressions or parsing facilities.

You are not meant to use strings to encode meaning within your data structures.
* You shouldn't internally represent a compound data structure using Strings,
  because now your code has to parse that string to traverse the data structure.
  That's a code smell, because parsing is complex and error prone compared to
  just traversing a real data structure.
* You shouldn't use strings to encode semantically meaningful names, eg denoting
  one of several alternatives. That's what Symbols are for. If your code
  compares two strings for equality, or uses a string as a map key,
  then you should use Symbols instead.