# $Id$

Introduction
============

Text_Highlighter is a class for syntax highlighting. The main idea is to
simplify the creation of subclasses that implement syntax highlighting for a
particular language. Subclasses do not implement any new functionality; they
just provide syntax highlighting rules. The rule sources are in XML format.
To create a highlighter for a language, there is no need to code a new class
manually. Simply describe the rules in an XML file and use
Text_Highlighter_Generator to create a new class.


This document does not contain a formal description of the API. The API is
very simple, and I believe that providing some code examples is sufficient.


Highlighter XML source
======================

Basics
------

Creating a new syntax highlighter begins with describing the highlighting
rules. There are two basic elements: block and region. A block is just a
portion of text that matches a regular expression and is highlighted with a
single color. A keyword is an example of a block. A region is defined by two
regular expressions: one for the start of the region and another for the end.
The main difference from a block is that a region can contain blocks and
regions (including same-named regions). An example of a region is a group of
statements enclosed in curly brackets (this is used in many languages, for
example PHP and C). Also, the characters matching the start and end of a
region may be highlighted with their own color, and the region contents with
another.

Blocks and regions may be declared as contained. Contained blocks and regions
can only appear inside regions. If a region or a block is not declared as
contained, it can appear both on top level and inside regions. A block or
region declared as not-contained can only appear on top level.

For any region, a list of blocks and regions that can appear inside this
region can be specified.

In this document, the term "color group" is used. Chunks of text assigned to
the same color group will be highlighted with the same color. Note that in
versions prior to 0.5.0 color groups were referred to as CSS classes, but
since 0.5.0 HTML is no longer the only supported output format, so "color
group" is the more appropriate term.

Elements
--------

The top-level element is <highlight>. The attribute lang is required and
denotes the name of the language. Its value is used as a part of the generated
class name, and must only contain letters, digits and underscores. The
optional attribute case, when given the value yes, makes the language case
sensitive (the default is case insensitive). Allowed subelements are:

    * <authors>: Information about the authors of the file.
        <author>: Information about a single author of the file. (May be used
        multiple times, one per author.)
                - name="...": Author's name. Required.
                - email="...": Author's email address. Optional.

    * <default>: Default color group.
          - innerGroup="...": color group name. Required.

    * <region>: Region definition
          - name="...": Region name. Required.
          - innerGroup="...": Default color group of region contents. Required.
          - delimGroup="...": color group of start and end of region. Optional,
            defaults to value of innerGroup attribute.
          - start="...", end="...": Regular expressions matching the start and
            end of the region. Required. Regular expression delimiters are
            optional, but if you need to specify a delimiter, use /.
            Delimiters are only needed when specifying regular expression
            modifiers, such as m or U. Examples: \/\* or /$/m.
          - contained="yes": Marks region as contained.
          - never-contained="yes": Marks region as not-contained.
          - <contains>: Elements allowed inside this region.
                - all="yes" Region can contain any other region or block
                (except not-contained). May be used multiple times.
                      - <but> Do not allow certain regions or blocks.
                            - region="..." Name of region not allowed within
                              current region.
                            - block="..." Name of block not allowed within
                              current region.
                - region="..." Name of region allowed within current region.
                - block="..." Name of block allowed within current region.
          - <onlyin> Only allow this region within certain regions. May be
            used multiple times.
                - region="..." Name of parent region

    * <block>: Block definition
          - name="...": Block name. Required.
          - innerGroup="...": color group of block contents. Optional. If not
            specified, color group of parent region or default color group will be
            used. One would only want to omit this attribute if there are
            keyword groups (see below) inherited from this block, and no special
            highlighting should apply when the block does not match the keyword.
          - match="..." Regular expression matching the block. Required.
            Regular expression delimiters are optional, but if you need to
            specify a delimiter, use /. Delimiters are only needed when
            specifying regular expression modifiers, such as m or U.
            Examples: #|\/\/ or /$/m.
          - contained="yes": Marks block as contained.
          - never-contained="yes": Marks block as not-contained.
          - <onlyin> Only allow this block within certain regions. May be used
              multiple times.
                - region="..." Name of parent region
          - multiline="yes": Marks block as multi-line. By default, whole
            blocks are assumed to reside in a single line. This makes things
            faster. If you need to declare a multi-line block, use this
            attribute.
          - <partGroup>: Assigns another color group to a part of the block that
              matched a subpattern.
                - index="n": Subpattern index. Required.
                - innerGroup="...": color group name. Required.

              This is an example from the CSS highlighter: the measure is
              matched as a whole, but the measurement units are highlighted
              with a different color.

                <block name="measure"  match="\d*\.?\d+(\%|em|ex|pc|pt|px|in|mm|cm)"
                        innerGroup="number" contained="yes">
                    <onlyin region="property"/>
                    <partGroup index="1" innerGroup="string" />
                </block>

    * <keywords>: Keyword group definition. Keyword groups are useful when you
      want to highlight some words that match a condition for a block with a
      different color. Keywords are defined with literal match, not regular
      expressions. For example, you have a block named identifier matching a
      general identifier, and want to highlight reserved words (which match
      this block as well) with a different color. You inherit a keyword group
      "reserved" from the "identifier" block.
          - name="...": Keyword group. Required.
          - ifdef="...", ifndef="...": Conditional declaration. See
            "Conditions" below.
          - inherits="...": Inherited block name. Required.
          - innerGroup="...": color group of keyword group. Required.
          - case="yes|no": Overrides case-sensitivity of the language.
            Optional, defaults to global value.
          - <keyword>: Single keyword definition.
                - match="..." The keyword. Note: this is not a regular
                  expression, but literal match (possibly case insensitive).

Note that for BC reasons the element partClass is an alias for partGroup, and
the attributes innerClass and delimClass are aliases of innerGroup and
delimGroup, respectively.
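
To show how the elements above fit together, here is a minimal sketch of a
rule source for an imaginary language called "toy". The language name, the
author, the color group names and the patterns are all made up for
illustration, and the rules have not been run through the generator. The PHP
wrapper only checks that the markup is well-formed and lists the named
elements.

```php
<?php
// Hypothetical rule set for an imaginary language called "toy".
// Only elements and attributes described in this README are used.
$source = <<<'XML'
<highlight lang="toy" case="yes">
    <authors>
        <author name="Jane Doe" email="jane@example.com"/>
    </authors>
    <default innerGroup="code"/>

    <!-- A region: braces delimit a group of statements and may nest. -->
    <region name="braces" delimGroup="brackets" innerGroup="code"
            start="\{" end="\}">
        <contains all="yes"/>
    </region>

    <!-- A block: one regular expression, one color group. -->
    <block name="identifier" match="[a-z_]\w*" innerGroup="identifier"/>

    <!-- Keywords inherit from a block and recolor literal matches. -->
    <keywords name="reserved" inherits="identifier" innerGroup="reserved">
        <keyword match="if"/>
        <keyword match="else"/>
    </keywords>
</highlight>
XML;

// Parse the rules only to confirm that the XML is well-formed.
$rules = simplexml_load_string($source);

echo 'language: ', $rules['lang'], "\n";
foreach ($rules->children() as $element) {
    // <authors> and <default> carry no name attribute and are skipped.
    if (isset($element['name'])) {
        echo $element->getName(), ': ', $element['name'], "\n";
    }
}
```

Saved as an XML file, the content of $source is the kind of input that
Text_Highlighter_Generator expects.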


Conditions
----------

Conditional declarations allow enabling or disabling certain highlighting
rules at runtime. For example, the Java highlighter has a very big list of
keywords matching Java standard classes. Finding a match in this list can take
a long time. For that reason, the corresponding keyword group is declared with
the "ifdef" attribute:

  <keywords name="builtin" inherits="identifier" innerClass="builtin"
            case="yes" ifdef="java.builtins">
    <keyword match="AbstractAction" />
    <keyword match="AbstractBorder" />
    <keyword match="AbstractButton" />
    ...
    ...
    <keyword match="_Remote_Stub" />
    <keyword match="_ServantActivatorStub" />
    <keyword match="_ServantLocatorStub" />
  </keywords>

This keyword group will only be enabled when "java.builtins" is passed as an
element of the "defines" option:

    $options = array(
        'defines' => array(
            'java.builtins',
        ),
        'numbers' => HL_NUMBERS_TABLE,
    );
    $highlighter = Text_Highlighter::factory('java', $options);

The "ifndef" attribute has the reverse meaning.

Currently, the "ifdef" and "ifndef" attributes are only supported for the
<keywords> tag.
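
The rule described above can be sketched as a small standalone function. It
only models the documented behaviour of "ifdef" and "ifndef" and is not the
package's actual implementation; the function name keywordGroupEnabled is
hypothetical.

```php
<?php
/**
 * Decide whether a keyword group is active for a given "defines" list.
 * Illustrative model only, not code taken from Text_Highlighter.
 *
 * @param array       $defines values passed in the 'defines' option
 * @param string|null $ifdef   value of the ifdef attribute, if any
 * @param string|null $ifndef  value of the ifndef attribute, if any
 */
function keywordGroupEnabled(array $defines, $ifdef = null, $ifndef = null)
{
    // ifdef: the group is active only when the name was defined.
    if ($ifdef !== null && !in_array($ifdef, $defines, true)) {
        return false;
    }
    // ifndef: the group is active only when the name was NOT defined.
    if ($ifndef !== null && in_array($ifndef, $defines, true)) {
        return false;
    }
    return true;
}

$defines = array('java.builtins');
var_dump(keywordGroupEnabled($defines, 'java.builtins'));       // ifdef case
var_dump(keywordGroupEnabled($defines, null, 'java.builtins')); // ifndef case
```

A group without either attribute is always enabled in this model.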



Class generation
================

Creating the XML description of the highlighting rules is the most complicated
part of the process. To generate the class, you need just a few lines of code:

    <?php
    require_once 'Text/Highlighter/Generator.php';
    $generator = new Text_Highlighter_Generator('php.xml');
    $generator->generate();
    $generator->saveCode('PHP.php');
    ?>



Command-line class generation tool
==================================

The example from the previous section looks pretty simple, but it does not
handle any errors which may occur during parsing of the XML source. The
package provides a command-line script that makes generation of classes even
simpler and takes care of possible errors. It is called generate (on
Unix/Linux) or generate.bat (on Windows). This script is able to process
multiple files in one run, and also to process XML from standard input and
write generated code to standard output.

    Usage:
    generate options

    Options:
      -x filename, --xml=filename
            source XML file. Multiple input files can be specified, in which
            case each -x option must be followed by -p unless -d is specified
            Defaults to stdin
      -p filename, --php=filename
            destination PHP file. Defaults to stdout. If specified multiple
            times, each -p must follow -x
      -d dirname, --dir=dirname
            Default destination directory. File names will be taken from XML input
            ("lang" attribute of <highlight> tag)
      -h, --help
            This help

Examples

    Read from php.xml, write to PHP.php

        generate -x php.xml -p PHP.php

    Read from php.xml, write to standard output

        generate -x php.xml

    Read from php.xml, write to PHP.php, read from xml.xml, write to XML.php

        generate -x php.xml -p PHP.php -x xml.xml -p XML.php

    Read from php.xml, write to /some/dir/PHP.php, read from xml.xml, write to
    /some/dir/XML.php (assuming that xml.xml contains <highlight lang="xml">, and
    php.xml contains <highlight lang="php">)

        generate -x php.xml -x xml.xml -d /some/dir/



Renderers
=========

Introduction
------------

Text_Highlighter supports renderers. Using renderers, you can get output in
different formats. Two renderers are included in the package:

    - HTML renderer. Generates HTML output. A style sheet should be linked to
      the document to display colored text

    - Console renderer. Can be used to output highlighted text to
      color-capable terminals, either directly or through less -r


Renderers API
-------------

Renderers are subclasses of Text_Highlighter_Renderer. A renderer should
override at least two methods: acceptToken and getOutput. Overriding other
methods is optional, depending on the nature of the renderer's output and the
details of its implementation.

    string reset()
        resets renderer state. This method is called every time before a new
        source file is highlighted.

    string preprocess(string $code)
        preprocesses code. Can be used, for example, to normalize whitespace
        before highlighting. Returns preprocessed string.

    void acceptToken(string $group, string $content)
        the core method of the renderer. Highlighter passes chunks of text to
        this method in $content, and color group in $group

    void finalize()
        signals the renderer that no more tokens are available.

    mixed getOutput()
        returns generated output.
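
The call sequence above can be illustrated with a small standalone class. To
stay runnable without the package it does not extend
Text_Highlighter_Renderer (a real renderer would), the class name
TokenListRenderer is hypothetical, and the highlighter's calls are simulated
by hand.

```php
<?php
// Standalone sketch of the five renderer methods listed above.
// Assumption: a real renderer would extend Text_Highlighter_Renderer.
class TokenListRenderer
{
    private $tokens = array();
    private $output = '';

    // Forget everything collected for the previous source file.
    public function reset()
    {
        $this->tokens = array();
        $this->output = '';
    }

    // Normalize line endings before highlighting.
    public function preprocess($code)
    {
        return str_replace(array("\r\n", "\r"), "\n", $code);
    }

    // Collect each chunk of text together with its color group.
    public function acceptToken($group, $content)
    {
        $this->tokens[] = array($group, $content);
    }

    // No more tokens will arrive: build the final output.
    public function finalize()
    {
        foreach ($this->tokens as $token) {
            $this->output .= '[' . $token[0] . ']' . $token[1];
        }
    }

    public function getOutput()
    {
        return $this->output;
    }
}

// Simulate the order in which a highlighter would call the renderer.
$renderer = new TokenListRenderer();
$renderer->reset();
$code = $renderer->preprocess("echo 1;\r\n");
$renderer->acceptToken('reserved', 'echo');
$renderer->acceptToken('code', ' ');
$renderer->acceptToken('number', '1');
$renderer->finalize();
echo $renderer->getOutput(), "\n";
```

Collecting tokens first and formatting them in finalize() is only one
possible design; a renderer can just as well build its output incrementally
in acceptToken().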


Setting renderer options
------------------------

Renderers accept an optional argument to their constructor: an options array.
Elements of this array are renderer-specific.

HTML renderer
-------------

The HTML renderer produces HTML output with optional line numbering. The
renderer itself does not provide information about the actual colors of the
highlighted text. Instead, <span class="hl-XXX"> is used, where XXX is
replaced with the color group name (hl-var, hl-string, etc.). It is up to you
to create a CSS stylesheet. If the 'use_language' option is passed with a
value evaluating to true, class names will be formatted as "LANG-hl-XXX",
where LANG is the language name as defined in the highlighter XML source
("lang" attribute of the <highlight> tag) in lower case.
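
The naming rule can be summarized with a short standalone helper. It is only
an illustration of the class names described above, not the renderer's own
code, and the function name hlClassName is hypothetical.

```php
<?php
/**
 * Build the CSS class name for a color group.
 * Illustrative only; not taken from the HTML renderer.
 *
 * @param string      $group color group name, e.g. "string"
 * @param string|null $lang  language name when 'use_language' is in effect
 */
function hlClassName($group, $lang = null)
{
    $class = 'hl-' . $group;
    if ($lang !== null) {
        // With 'use_language', the lower-cased language name is prepended.
        $class = strtolower($lang) . '-' . $class;
    }
    return $class;
}

echo hlClassName('string'), "\n";
echo hlClassName('var', 'PHP'), "\n";
```

A stylesheet then needs one rule per class name, for example a rule for
.hl-string, or for .php-hl-string when 'use_language' is set.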

There are 3 special CSS classes:

    hl-main - this class applies to the whole output or to the right table
              column, depending on the 'numbers' option
    hl-gutter - applies to the left column in the table
    hl-table - applies to the whole table

The HTML renderer accepts the following options (each being optional):

    * numbers - line numbering style.
        0 - no numbering (default)
        HL_NUMBERS_LI - use <ol></ol> for line numbering
        HL_NUMBERS_TABLE  - create a 2-column table, with line numbers in left
                            column and highlighted text in right column

    * tabsize - tabulation size. Defaults to 4

    Example:

        require_once 'Text/Highlighter/Renderer/Html.php';
        $options = array(
            'numbers' => HL_NUMBERS_LI,
            'tabsize' => 8,
        );
        $renderer = new Text_Highlighter_Renderer_HTML($options);

Console renderer
----------------

The console renderer produces output for display on a color-capable terminal,
either directly or through less -r, using ANSI escape sequences. By default,
this renderer only highlights the most common color groups. Additional colors
can be specified using the 'colors' option. This renderer also accepts the
'numbers' option (a boolean value) and the 'tabsize' option.

    Example:

        require_once 'Text/Highlighter/Renderer/Console.php';
        $colors = array(
            'prepro' => "\033[35m",
            'types' => "\033[32m",
        );
        $options = array(
            'numbers' => true,
            'tabsize' => 8,
            'colors' => $colors,
        );
        $renderer = new Text_Highlighter_Renderer_Console($options);


ANSI color escape sequences have the following format:

    ESC[#;#;....;#m

where ESC is the character with ASCII code 27 (033 octal, 0x1B hexadecimal).
# is one of the following:

        0 for normal display
        1 for bold on
        4 underline (mono only)
        5 blink on
        7 reverse video on
        8 nondisplayed (invisible)
        30 black foreground
        31 red foreground
        32 green foreground
        33 yellow foreground
        34 blue foreground
        35 magenta foreground
        36 cyan foreground
        37 white foreground
        40 black background
        41 red background
        42 green background
        43 yellow background
        44 blue background
        45 magenta background
        46 cyan background
        47 white background
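
As a small illustration of this format, the helper below joins a list of
codes into one escape sequence. The function name ansiSequence is
hypothetical and not part of the package; its result is a plain string that
could be used as a value in the 'colors' option.

```php
<?php
/**
 * Build an ANSI escape sequence of the form ESC[#;#;...;#m.
 * Hypothetical helper, not part of Text_Highlighter.
 *
 * @param array $codes numeric codes from the table above
 */
function ansiSequence(array $codes)
{
    return "\033[" . implode(';', $codes) . 'm';
}

// Bold green for types, plain magenta for preprocessor directives.
$colors = array(
    'prepro' => ansiSequence(array(35)),
    'types'  => ansiSequence(array(1, 32)),
);

// Print a sample word in each color, then reset to normal display.
foreach ($colors as $group => $sequence) {
    echo $sequence, $group, ansiSequence(array(0)), "\n";
}
```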


How to use Text_Highlighter class
=================================

Creating a highlighter object
-----------------------------

To create a highlighter for a certain language, use the static method
Text_Highlighter::factory():

    require_once 'Text/Highlighter.php';
    $hl = Text_Highlighter::factory('php');


Setting a renderer
------------------

Actual output is produced by a renderer.

    require_once 'Text/Highlighter.php';
    require_once 'Text/Highlighter/Renderer/Html.php';
    $options = array(
        'numbers' => HL_NUMBERS_LI,
        'tabsize' => 8,
    );
    $renderer = new Text_Highlighter_Renderer_HTML($options);
    $hl = Text_Highlighter::factory('php');
    $hl->setRenderer($renderer);

Note that for BC reasons, it is possible to use a highlighter without setting
a renderer. If no renderer is set, the HTML renderer will be used by default.
In this case, you should pass the options as the second parameter to the
factory method. The following example works exactly like the previous one:

    require_once 'Text/Highlighter.php';
    $options = array(
        'numbers' => HL_NUMBERS_LI,
        'tabsize' => 8,
    );
    $hl = Text_Highlighter::factory('php', $options);


Getting output
--------------

And finally, do the highlighting and get the output:

    require_once 'Text/Highlighter.php';
    require_once 'Text/Highlighter/Renderer/Html.php';
    $options = array(
        'numbers' => HL_NUMBERS_LI,
        'tabsize' => 8,
    );
    $renderer = new Text_Highlighter_Renderer_HTML($options);
    $hl = Text_Highlighter::factory('php');
    $hl->setRenderer($renderer);
    $html = $hl->highlight(file_get_contents('example.php'));

# vim: set autoindent tabstop=4 shiftwidth=4 softtabstop=4 tw=78: */