--[[****************************************************************************
*                                                                              *
*                    Highlight Language Definition Template                    *
*                                                                              *
*                v1.1.0 (2019/03/02) | Highlight v3.49 | Lua 5.3               *
*                                                                              *
*                              by Tristano Ajmone                              *
*                                                                              *
********************************************************************************
This is a langDef template intended as a starting point for building your own
custom language definition. Just copy it, rename it, and edit it as required.
All possible syntax element definitions are provided, with some dummy values or
useful presets. Just discard what you don't need.

Guidelines and some common presets are also provided in comments, in case you
need them, in the hope they might spare you some research or copy-and-pasting.

The goal here is to provide as much useful information as possible, so that
building a custom language definition from scratch is simplified by in-file
resources, reducing the need to consult the documentation.

Hopefully, this template will help both those creating their first langDef and
experienced users.
--------------------------------------------------------------------------------
** HOW TO DISABLE SYNTAX ELEMENTS **
If you wish to suppress a syntax element, you can assign a never-matching
regular expression to its definition (thanks to André Simon for the tip):

    [=[ \A(?!x)x ]=]
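
For example, a langDef that should not highlight numbers could override the
default Digits definition like this (a sketch; the same approach should apply
to any element defined by a regex string):

    Digits = [=[ \A(?!x)x ]=]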
--------------------------------------------------------------------------------
** MANDATORY ELEMENTS **
The bare minimum definitions required for a langDef file to be valid are:
  -- Description
  -- Keywords
If a langDef file doesn't provide these definitions, Highlight will raise an
error. All other definitions are optional.
--------------------------------------------------------------------------------
** HIGHLIGHT DEFAULTS **
Highlight provides a default definition for the following syntax elements:
  -- Identifiers
  -- Digits
  -- Escape
All other definitions are empty/false by default.
--------------------------------------------------------------------------------
** HIGHLIGHT REGEX ENGINE **
Highlight uses the Boost.Xpressive library for handling regular expressions.
You can find the official documentation at:
    https://www.boost.org/doc/libs/1_46_1/doc/html/xpressive/user_s_guide.html
For tutorials and information on regular expressions, see:
    https://www.regular-expressions.info
--------------------------------------------------------------------------------
For more info on creating language definitions, see:
    https://github.com/andre-simon/highlight/blob/master/README#L486
    http://www.andre-simon.de/doku/highlight/en/highlight.php#ch3_3
--------------------------------------------------------------------------------
Written by Tristano Ajmone:
    <tajmone@gmail.com>
    https://github.com/tajmone
    https://gitlab.com/tajmone
Released into the public domain according to the Unlicense terms:
    http://unlicense.org/
------------------------------------------------------------------------------]]

Description = "Lang Name" -- Syntax description

Categories = {"source", "script"} -- The categories the syntax belongs to.

-- Common categories are: "source", "markup", "script" and "config".
-- For more info see the Wiki:
--
--    https://gitlab.com/saalen/highlight/wikis/LangDefs-Elements#categories

-- *** DON'T FORGET TO: *** ----------------------------------------------------

  -- Add your new lang's file extensions in "$HL_DIR/filetypes.conf"
  -- Add comments that might help others take on your work in the future.
  -- Test the langDef against language edge cases.
  -- Credit your reference sources.

--------------------------------------------------------------------------------

IgnoreCase = false -- Ignore keyword case? (true = case-insensitive keywords)

EnableIndentation = false -- May the syntax be reformatted and indented? (true/false)

--[[============================================================================
                                  IDENTIFIERS
================================================================================
String, Regular expression which defines identifiers (optional).

Usually the default Identifiers definition suits most languages; if not, you
can customize it to match your language's needs.                            --]]

Identifiers = [=[ [a-zA-Z_]\w* ]=] -- Highlight's default Identifiers definition

--[[============================================================================
                                    COMMENTS
================================================================================
Comments = { {Block, Nested?, Delimiter=} }

  Block:     Boolean, true if comment is a block comment
  Nested:    Boolean, true if block comments can be nested (optional)
  Delimiter: List, contains open delimiter regex (line comment) or open and
              close delimiter regexes (block comment)
------------------------------------------------------------------------------]]
Comments = {
  -- Define BLOCK-COMMENTS delimiters
  { Block  = true,
    Nested = false, -- Can block comments be nested? (optional)
    Delimiter = {
      -- C style delimiters pair: /* */
      [=[ \/\* ]=],
      [=[ \*\/ ]=]
    }
  },
  -- Define SINGLE-LINE-COMMENTS delimiter
  { Block = false,
    Delimiter = { [=[ // ]=] } -- C style delimiter: //
  }
}
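
-- Other common comment presets, shown here as a commented-out sketch; they
-- follow the same structure as the entries above, but the delimiters are
-- only illustrative and must be adapted to the language at hand:
--
--   { Block = false, Delimiter = { [=[ # ]=] } },  -- Shell style: #
--   { Block = false, Delimiter = { [=[ -- ]=] } }, -- Lua/SQL style: --
--   { Block = true, Nested = true,                 -- OCaml style: (* *),
--     Delimiter = { [=[ \(\* ]=], [=[ \*\) ]=] } } -- which can be nested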
--[[============================================================================
                                    STRINGS
================================================================================
Strings = { Delimiter|DelimiterPairs={Open, Close, Raw?}, Escape?, Interpolation?,
            RawPrefix?, AssertEqualLength? }

  Delimiter:         String, regular expression which describes string delimiters
  DelimiterPairs:    List, includes open and close delimiter expressions if not
                      equal, includes optional Raw flag as boolean which marks
                      delimiter pair to contain a raw string
  Escape:            String, regex of escape sequences (optional)
  Interpolation:     String, regex of interpolation sequences (optional)
  RawPrefix:         String, defines raw string indicator (optional)
  AssertEqualLength: Boolean, set true if delimiters must have the same length
------------------------------------------------------------------------------]]
Strings = {

--------------------------------------------------------------------------------
--                              STRING DELIMITERS
--------------------------------------------------------------------------------

  -- SYMMETRICAL STRINGS DELIMITERS
  Delimiter = [=[ "|' ]=], -- Double- and single-quote delimiters: " '

  -- ASYMMETRICAL STRINGS DELIMITERS
  -- Example: Lua style string-delimiters:
  DelimiterPairs = {
    { Open  = [=[ \[=*\[ ]=],  -- [[  [=[  [===[   etc.
      Close = [=[ \]=*\] ]=],  -- ]]  ]=]  ]===]   etc.
      Raw = true, -- Are these raw string delimiters? (true/false)
    }
  },
  AssertEqualLength = true,  -- Delimiters must have the same length?

  -- RAW-STRING PREFIX (if language supports it)
  RawPrefix = "R",           -- Raw string indicator (optional): R (C style)
--[[----------------------------------------------------------------------------
                                ESCAPE SEQUENCES
--------------------------------------------------------------------------------
If the language at hand supports escape sequences, define a RegEx pattern to
capture them.

    https://en.wikipedia.org/wiki/Escape_sequences_in_C

NOTE: Escape sequences are not restricted to occur inside strings only, they
      will be matched anywhere in the source code (some languages, like Perl
      and Bash, allow their use anywhere). Usually this doesn't constitute a
      problem, but in some languages this unconstrained behaviour might cause
      false-positive matches; in such cases you'll need to restrict escape
      sequences occurrence to inside-strings context only by implementing a
      custom hook via the OnStateChange() function --- see "Hook Preset #01"
      further down.                                                         --]]

-- Highlight's default built-in Escape definition:
  Escape = [=[ \\u[[:xdigit:]]{4}|\\\d{3}|\\x[[:xdigit:]]{2}|\\[ntvbrfa\\\?'"] ]=],
--[[----------------------------------------------------------------------------
                                INTERPOLATION
--------------------------------------------------------------------------------
String, regex of interpolation sequences (optional)

To understand interpolation, here is an example from JavaScript:

    var apples = 6;
    console.log(`There are ${apples} apples in the basket!`);

which will output:

    There are 6 apples in the basket!

References:
    https://en.wikipedia.org/wiki/String_interpolation
--]]
  Interpolation = [=[ \$\{.+?\} ]=], -- JavaScript interpolation: ${ ... }
}

--[[============================================================================
                                  PREPROCESSOR
================================================================================
PreProcessor = { Prefix, Continuation? }

  Prefix:        String, regular expression which describes open delimiter
  Continuation:  String, contains line continuation character (optional).

NOTE: This element is treated by the Highlight parser in a similar way to
      single-line comments: it swallows up everything from the matching Prefix
      up to the end of the line -- but unlike comment lines (which can't
      contain further syntax elements), the parser will still be looking for
      some syntax elements (in the current line) which might reasonably be
      found within a line of preprocessor directives (e.g. strings and
      comments); once these elements are dealt with, the parser resumes the
      PreProcessor state to carry on parsing the rest of the line.

      Furthermore, the Continuation character allows this element to span
      multiple lines (without the need for an opening and closing pair,
      unlike multiline comments).
--]]
PreProcessor = {  -- C/C++ PreProcessor example:
  Prefix = [=[ # ]=],  -- Hash char ('#') marks beginning of preprocessor line
  Continuation = "\\", -- Backslash ('\') marks continuation of preprocessor line
}

--[[============================================================================
                                  OPERATORS
============================================================================--]]
Operators = [=[ \&|<|>|\!|\||\=|\/|\*|\%|\+|\-|~ ]=] -- Match: &<>!|=/*%+-~

--[[============================================================================
                                  DIGITS
================================================================================
String, Regular expression which defines digits (optional).                 --]]

-- Highlight's default built-in Digits definition:
Digits = [=[ (?:0x|0X)[0-9a-fA-F]+|\d*[\.]?\d+(?:[eE][\-\+]\d+)?[lLuU]* ]=]

--[[============================================================================
                                  KEYWORDS
================================================================================
Keywords = { Id, List|Regex, Group? }

  Id:    Integer, keyword group id (values 1-6, can be reused for several keyword
          groups)
  List:  List, list of keywords
  Regex: String, regular expression
  Group: Integer, capturing group id of regular expression, defines part of regex
          which should be returned as keyword (optional; if not set, the match
          with the highest group number is returned (counts from left to right))

NOTE: Keyword group Ids are not limited to 6, you can create as many as you
      need; but bear in mind that most themes that ship with Highlight usually
      provide definitions only for Ids 1-6, so in order to syntax-color Keyword
      groups with Ids greater than 6 you'll need to define a theme that covers
      their definitions.                                                    --]]

Keywords = {
--------------------------------------------------------------------------------
--                               Keywords by List
--------------------------------------------------------------------------------
-- NOTE: If you've set `IgnoreCase = true` then all keywords in the list must
--       be in lowercase, otherwise they'll never match! With case-insensitive
--       languages, Highlight converts the token to lowercase before comparing
--       it to the entries of the Keywords list, but the list entries are not
--       manipulated before comparison.
  { Id = 1,
    List = {
      -- Keywords list
      "If", "Then", "Else"
    }
  },
--------------------------------------------------------------------------------
--                              Keywords by RegEx
--------------------------------------------------------------------------------
  { Id = 2,
    Regex = [=[ (\w+)\s*\:\: ]=], -- Identifier followed by "::"
    Group = 1                     -- Return only the identifier as keyword
  }
}
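
-- A further regex-based group, as a commented-out sketch: it would treat any
-- identifier followed by an opening parenthesis as a keyword of group 3,
-- which is a common way to highlight function calls. The Id and the regex are
-- illustrative assumptions; to use it, add the entry to the Keywords table:
--
--   { Id = 3,
--     Regex = [=[ (\w+)\s*\( ]=], -- Identifier followed by "("
--     Group = 1                   -- Return only the identifier
--   }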

--[=[===========================================================================
                                NESTED LANGUAGES
================================================================================
If the language at hand may contain other languages in its source code (e.g.
HTML sources might contain CSS, JavaScript or PHP code):
------------------------------------------------------------------------------
    NestedSections = {Lang, Delimiter= {} }

      Lang:      String, name of nested language
      Delimiter: List, contains open and close delimiters of the code section
------------------------------------------------------------------------------
EXAMPLE to allow HTML code to contain PHP and CSS (adapted from "html.lang"):

    NestedSections = {
      { Lang = "php",
        Delimiter = {
          [[<\?php]], -- PHP opening delimiter: <?php
          [[\?>]]     -- PHP closing delimiter: ?>
        }
      },
      { Lang = "css",
        Delimiter = {
          [[<style\s+type\=[\'\"]text\/css[\'\"]>]], -- <style type="text/css">
          [[<\/style>]]                              -- </style>
        }
      }
    }

-----------------------------------------------------------------------------]=]

--[[****************************************************************************
*                                                                              *
*                            CUSTOM HOOK-FUNCTIONS                             *
*                                                                              *
********************************************************************************
In some cases you might need to gain finer control over the Highlight parser;
you can do so by defining some custom hooks via the OnStateChange() function.

For more info, see:
    https://github.com/andre-simon/highlight/blob/master/README#L596
    https://github.com/andre-simon/highlight/blob/master/README_PLUGINS#L170
--------------------------------------------------------------------------------
                                OnStateChange()
--------------------------------------------------------------------------------
This function is a hook which is called if an internal state changes (e.g.
from HL_STANDARD to HL_KEYWORD if a keyword is found). It can be used to alter
the new state or to manipulate syntax elements like keyword lists.

    OnStateChange(oldState, newState, token, kwGroupID)

      Hook Event: Highlighting parser state change
      Parameters: oldState:  old state
                  newState:  intended new state
                  token:     the current token which triggered the new state
                  kwGroupID: if newState is HL_KEYWORD, the parameter
                              contains the keyword group ID
      Returns:    Correct state to continue OR HL_REJECT

Return HL_REJECT if the recognized token and state should be discarded; the
first character of token will be output and highlighted as "oldState".
--------------------------------------------------------------------------------
                                  STATES VARS
--------------------------------------------------------------------------------
The following integer variables, representing the internal highlighting states,
are available within a language definition (read-only):

    HL_STANDARD
    HL_STRING
    HL_NUMBER
    HL_LINE_COMMENT
    HL_BLOCK_COMMENT
    HL_ESC_SEQ
    HL_PREPROC
    HL_PREPROC_STRING
    HL_OPERATOR
    HL_INTERPOLATION
    HL_LINENUMBER
    HL_KEYWORD
    HL_STRING_END
    HL_LINE_COMMENT_END
    HL_BLOCK_COMMENT_END
    HL_ESC_SEQ_END
    HL_PREPROC_END
    HL_OPERATOR_END
    HL_INTERPOLATION_END
    HL_KEYWORD_END
    HL_EMBEDDED_CODE_BEGIN
    HL_EMBEDDED_CODE_END
    HL_IDENTIFIER_BEGIN
    HL_IDENTIFIER_END
    HL_UNKNOWN
    HL_REJECT

********************************************************************************
*                              SOME HOOK PRESETS                               *
********************************************************************************
Below you'll find some common hook examples/presets which might come in handy
in various situations. Just delete what you don't need, or adapt the code to
your own needs.                                                             --]]

-- =============================================================================
-- Hook Preset #01 -- Escape Sequences Only Inside Strings
-- =============================================================================
function OnStateChange(oldState, newState, token, kwgroup)
--  This function ensures that escape sequences outside strings are ignored.
--  Based on André Simon's reply to Issue #23:
--  -- https://github.com/andre-simon/highlight/issues/23#issuecomment-332002639
  if newState == HL_ESC_SEQ and oldState ~= HL_STRING then
    return HL_REJECT
  end
  return newState
end

--[[ ** DON'T FORGET THE CHANGELOG **

Adapt the changelog found below to your new syntax definition.
Changelogs are important for other people who might take on maintaining
your langDef from where you left it.

* Always include in each revision:
  * langDef version number
  * date of release
  * Highlight version used at the time of release

* Adopt a meaningful release scheme, like MAJOR.MINOR.PATCH, where:
  * MAJOR is incremented for backwards-incompatible changes
    (and MINOR and PATCH are set to "0")
  * MINOR is incremented for backwards-compatible added functionality
    (and PATCH is set to "0")
  * PATCH is incremented for backwards-compatible bug fixes and retouches

  For more info, see:
    http://semver.org
--]]

--[[============================================================================
                                    CHANGELOG
================================================================================
v1.1.1 (2021/03/15) | Highlight v4
  - Updated keyword default count

v1.1.0 (2019/03/02) | Highlight v3.49
  - Added new 'Categories' element.

v1.0.4 (2018/07/02) | Highlight v3.43
  - Cleaned up code and comments.
  - Some notes added.

v1.0.3 (2018/01/04) | Highlight v3.41
  - Minor tweaks.

v1.0.1 (2017/11/19) | Highlight v3.40
  - HIGHLIGHT DEFAULTS FIX: Add `Escape` to list of default definitions, as well
    as its RegEx string as proposed preset.

v1.0.0 (2017/11/18) | Highlight v3.40
  - First release.                                                          --]]
