What is this?
=============

This is an implementation of John Gruber's [markdown][] in C. It uses a
[parsing expression grammar (PEG)][] to define the syntax. This should
allow easy modification and extension. It currently supports output in
HTML, LaTeX, ODF, or groff_mm formats, and adding new formats is
relatively easy.

[parsing expression grammar (PEG)]: http://en.wikipedia.org/wiki/Parsing_expression_grammar
[markdown]: http://daringfireball.net/projects/markdown/

It is pretty fast. A 179K text file that takes 5.7 seconds for
Markdown.pl (v. 1.0.1) to parse takes less than 0.2 seconds for
peg-markdown. It does, however, use a lot of memory (up to 4M of heap space
while parsing the 179K file, and up to 80K for a 4K file). (Note that
the memory leaks in earlier versions of this program have now been
plugged.)

Both a library and a standalone program are provided.

peg-markdown is written and maintained by John MacFarlane (jgm on
GitHub), with significant contributions by Ryan Tomayko (rtomayko).
It is released under both the GPL and the MIT license; see LICENSE for
details.

Installing
==========

On a Linux or Unix-based system
-------------------------------

This program is written in portable ANSI C. It requires
[glib2](http://www.gtk.org/download/index.php). Most *nix systems will have
this installed already. The build system requires GNU make.

The other required dependency, [Ian Piumarta's peg/leg PEG parser
generator](http://piumarta.com/software/peg/), is included in the source
directory. It will be built automatically. (However, it is not as portable
as peg-markdown itself, and seems to require gcc.)

To make the 'markdown' executable:

    make

(Or, on some systems, `gmake`.) Then, for usage instructions:

    ./markdown --help

To run John Gruber's Markdown 1.0.3 test suite:

    make test

The test suite will fail on one of the list tests. Here's why.
Markdown.pl encloses "item one" in the following list in `<p>` tags:

    1. item one
       * subitem
       * subitem

    2. item two

    3. item three

peg-markdown does not enclose "item one" in `<p>` tags unless it has a
following blank line. This is consistent with the official markdown
syntax description, and lets the author of the document choose whether
`<p>` tags are desired.

Cross-compiling for Windows with MinGW on a Linux box
-----------------------------------------------------

Prerequisites:

* Linux system with MinGW cross compiler. For Ubuntu:

        sudo apt-get install mingw32

* [Windows glib-2.0 binary & development files](http://www.gtk.org/download-windows.html).
    Unzip files into cross-compiler directory tree (e.g., `/usr/i586-mingw32msvc`).

Steps:

1. Create the markdown parser using Linux-compiled `leg` from peg-0.1.4:

        ./peg-0.1.4/leg markdown_parser.leg >markdown_parser.c

    (Note: The same thing could be accomplished by cross-compiling leg,
    executing it on Windows, and copying the resulting C file to the Linux
    cross-compiler host.)

2. Run the cross compiler with include flags for the Windows glib-2.0 headers:
    for example,

        /usr/bin/i586-mingw32msvc-cc -c \
        -I/usr/i586-mingw32msvc/include/glib-2.0 \
        -I/usr/i586-mingw32msvc/lib/glib-2.0/include -Wall -O3 -ansi markdown*.c

3. Link against the Windows glib-2.0 library: for example,

        /usr/bin/i586-mingw32msvc-cc markdown*.o \
        -Wl,-L/usr/i586-mingw32msvc/lib/glib,--dy,--warn-unresolved-symbols,-lglib-2.0 \
        -o markdown.exe

The resulting executable depends on the glib DLL, so be sure to
install the glib binary on the Windows host.

Compiling with MinGW on Windows
-------------------------------

These directions assume that MinGW is installed in `c:\MinGW` and glib-2.0
is installed in the MinGW directory hierarchy (with the mingw bin directory
in the system path).

Unzip peg-markdown in a temp directory. From the directory with the
peg-markdown source, execute:

    cd peg-0.1.4
    make PKG_CONFIG=c:/path/to/glib/bin/pkg-config.exe

Extensions
==========

peg-markdown supports extensions to standard markdown syntax.
These can be turned on using the command line flag `-x` or
`--extensions`. `-x` by itself turns on all extensions. Extensions
can also be turned on selectively, using individual command-line
options. To see the available extensions:

    ./markdown --help-extensions

The `--smart` extension provides "smart quotes", dashes, and ellipses.

The `--notes` extension provides a footnote syntax like that of
Pandoc or PHP Markdown Extra.

Using the library
=================

The library exports two functions:

    GString * markdown_to_g_string(char *text, int extensions, int output_format);
    char * markdown_to_string(char *text, int extensions, int output_format);

The only difference between these is that `markdown_to_g_string` returns a
`GString` (glib's automatically resizable string), while `markdown_to_string`
returns a regular character pointer. The memory allocated for the result must
be freed by the calling program, using `g_string_free()` for a `GString` or
`free()` for a character pointer.

`text` is the markdown-formatted text to be converted. Note that tabs will
be converted to spaces, using a four-space tab stop. Character encodings are
ignored.
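
To illustrate what a four-space tab stop means, here is a minimal sketch
of tab expansion in C. It is not the code peg-markdown itself uses: the
function name `expand_tabs` and the `TABSTOP` macro are made up for this
example, and the sketch counts bytes rather than characters.

```c
#include <stdlib.h>
#include <string.h>

#define TABSTOP 4   /* hypothetical name; four columns per tab stop */

/* Return a newly allocated copy of `text` in which every tab is replaced
   by enough spaces to reach the next multiple of TABSTOP columns. The
   column counter restarts after each newline. Returns NULL if allocation
   fails. The caller releases the result with free(). */
char *expand_tabs(const char *text)
{
    size_t len = strlen(text);
    size_t i, j = 0, col = 0;
    /* Worst case: every byte is a tab that expands to TABSTOP spaces. */
    char *out = malloc(len * TABSTOP + 1);

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        char c = text[i];
        if (c == '\t') {
            /* Pad to the next tab stop; a tab always adds at least one space. */
            do {
                out[j++] = ' ';
                col++;
            } while (col % TABSTOP != 0);
        } else {
            out[j++] = c;
            col = (c == '\n') ? 0 : col + 1;
        }
    }
    out[j] = '\0';
    return out;
}
```

With this sketch, a tab after one character expands to three spaces, and a
tab at the start of a line expands to four.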

`extensions` is a bit-field specifying which syntax extensions should be used.
If `extensions` is 0, no extensions will be used. If it is `0xFFFFFF`,
all extensions will be used. To set extensions selectively, combine the
following constants with the bitwise `|` operator:

  - `EXT_SMART` turns on smart quotes, dashes, and ellipses.
  - `EXT_NOTES` turns on footnote syntax. [Pandoc's footnote syntax][] is used here.
  - `EXT_FILTER_HTML` filters out raw HTML (except for styles).
  - `EXT_FILTER_STYLES` filters out styles in HTML.

  [Pandoc's footnote syntax]: http://johnmacfarlane.net/pandoc/README.html#footnotes
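
As a rough sketch of how such a bit-field is put together, the following
C fragment builds an `extensions` value from individual on/off choices.
The constants are redefined locally so that the example compiles on its
own; their numeric values here are assumptions, and the real definitions
are the ones in `markdown_lib.h`. The helper `make_extensions` is
hypothetical and not part of the library.

```c
/* Stand-in values for illustration only. The real constants come from
   markdown_lib.h and may have different values. */
enum {
    EXT_SMART         = 1 << 0,
    EXT_NOTES         = 1 << 1,
    EXT_FILTER_HTML   = 1 << 2,
    EXT_FILTER_STYLES = 1 << 3
};

/* Hypothetical helper: OR together the flag for each nonzero argument. */
int make_extensions(int smart, int notes, int filter_html, int filter_styles)
{
    int extensions = 0;

    if (smart)
        extensions |= EXT_SMART;
    if (notes)
        extensions |= EXT_NOTES;
    if (filter_html)
        extensions |= EXT_FILTER_HTML;
    if (filter_styles)
        extensions |= EXT_FILTER_STYLES;
    return extensions;
}
```

When calling the library directly, the same effect comes from passing an
expression such as `EXT_SMART | EXT_NOTES` as the `extensions` argument.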

`output_format` is either `HTML_FORMAT`, `LATEX_FORMAT`, `ODF_FORMAT`,
or `GROFF_MM_FORMAT`.

To use the library, include `markdown_lib.h`. See `markdown.c` for an example.

Hacking
=======

It should be pretty easy to modify the program to produce other formats,
and to parse syntax extensions. A quick guide:

  * `markdown_parser.leg` contains the grammar itself.

  * `markdown_output.c` contains functions for printing the `Element`
    structure in various output formats.

  * To add an output format, add the format to `markdown_formats` in
    `markdown_lib.h`. Then modify `print_element` in `markdown_output.c`,
    and add functions `print_XXXX_string`, `print_XXXX_element`, and
    `print_XXXX_element_list`. Also add an option in the main program
    that selects the new format. Don't forget to add it to the list of
    formats in the usage message.

  * To add syntax extensions, define them in the PEG grammar
    (`markdown_parser.leg`), using existing extensions as a guide. New
    inline elements will need to be added to `Inline =`; new block
    elements will need to be added to `Block =`. (Note: the order
    of the alternatives does matter in PEG grammars.)

  * If you need to add new types of elements, modify the `keys`
    enum in `markdown_peg.h`.

  * By using `&{ }` rules one can selectively disable extensions
    depending on command-line options. For example,
    `&{ extension(EXT_SMART) }` succeeds only if the `EXT_SMART` bit
    of the global `syntax_extensions` is set. Add your option to
    `markdown_extensions` in `markdown_lib.h`, and add an option in
    `markdown.c` to turn on your extension.

  * Note: Avoid using `[^abc]` character classes in the grammar, because
    they cause problems with non-ascii input. Instead, use: `( !'a' !'b'
    !'c' . )`
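
The check behind a predicate like `&{ extension(EXT_SMART) }` can be
pictured with the sketch below. It is an assumption about the general
shape of such a test, not a copy of the project's source: the flag values
are stand-ins defined locally, and `syntax_extensions` is set by hand
here instead of from command-line options.

```c
/* Stand-in flag values for illustration only. The real ones are listed
   in markdown_extensions in markdown_lib.h. */
enum {
    EXT_SMART = 1 << 0,
    EXT_NOTES = 1 << 1
};

/* Bit-field of enabled extensions, named after the global the guide
   mentions. In this sketch nothing is enabled until it is assigned. */
int syntax_extensions = 0;

/* Return nonzero only if every bit of `ext` is set in syntax_extensions. */
int extension(int ext)
{
    return (syntax_extensions & ext) == ext;
}
```

A grammar rule guarded this way simply fails to match when the bit is
clear, so the parser falls through to the next alternative.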

Acknowledgements
================

Support for ODF output was added by Fletcher T. Penney.