Writing I-Ds and RFCs using Pandoc and xml2rfc 2.x

Writing I-Ds and RFCs using Pandoc and xml2rfc 2.x Google

123 Buckingham Palace Road London SW1W 9SH UK miek@miek.nl http://miek.nl/

General
RFC Beautification Working Group
Keywords: RFC, Request for Comments, I-D, Internet-Draft, XML, Pandoc, Extensible Markup Language

This memo presents a technique for using Pandoc syntax as a source format for documents in the Internet-Drafts (I-Ds) and Request for Comments (RFC) series. This version is adapted to work with "xml2rfc" version 2.x.

This document presents a technique for using Pandoc syntax as a source format for documents in the Internet-Drafts (I-Ds) and Request for Comments (RFC) series. This version is adapted to work with xml2rfc version 2.x. Pandoc is an "almost plain text" format and therefore particularly well suited for editing RFC-like documents. Note: this document is typeset in Pandoc and does not render completely correctly when read on GitHub.

Pandoc2rfc -- designed to do the right thing, until it doesn't.

When writing "DNSSEC Operational Practices" (listed in the references) we wrote the XML directly. Needless to say, it was tedious, even though the XML of xml2rfc is very "light". The latest version of xml2rfc version 2 can be found here.

During the last few years people have been developing markup languages that are very easy to remember and type. These languages have become known as almost plain text-markup languages. One of the first was the Markdown syntax. One that was developed later and incorporates Markdown and a number of extensions is Pandoc. The power of Pandoc also comes from the fact that it can be translated to numerous output formats, including, but not limited to: HTML, (plain) Markdown and docbook XML.

So using Pandoc for writing RFCs seems like a sane choice. As xml2rfc uses XML, the easiest way would be to create docbook XML and transform that using XSLT. Pandoc2rfc does just that. The conversions are, in some way, amusing, as we start off with (almost) plain text, use elaborate XML and end up with plain text again.

    +-------------------+   pandoc   +---------+
    | ALMOST PLAIN TEXT |   ------>  | DOCBOOK |
    +-------------------+            +---------+
                  |                       |
    non-existent  |                       | xsltproc
     faster way   |                       |
                  v                       v
          +------------+    xml2rfc  +---------+
          | PLAIN TEXT |  <--------  | XML2RFC |
          +------------+             +---------+

The XML generated (the output after the xsltproc step in the figure above) is suitable for inclusion in either the middle or back section of an RFC. The simplest way is to create a template XML file and include the appropriate XML:

    <?xml version='1.0' ?>
    <!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [
    <!ENTITY pandocMiddle PUBLIC '' 'middle.xml'>
    <!ENTITY pandocBack   PUBLIC '' 'back.xml'>
    ]>

    <rfc ipr='trust200902' docName='draft-gieben-pandoc-rfcs-02'>
     <front>
        <title>Writing I-Ds and RFCs using Pandoc v2</title>
     </front>
     <middle>
        &pandocMiddle;
     </middle>
     <back>
        &pandocBack;
     </back>
    </rfc>

See the Makefile for an example of this. In this case you need to edit 3 documents:

* middle.pdc - contains the main body of text;
* back.pdc - holds appendices and references;
* template.xml (probably a fairly static file).

The draft (draft.txt) you are reading now is automatically created when you call make. The homepage of Pandoc2rfc is this GitHub repository.

Pandoc2rfc needs xsltproc and pandoc to be installed, and of course xml2rfc version 2. See the Pandoc user manual for details on how to write in Pandoc style. When using Pandoc2rfc, consider adding the following sentence to an Acknowledgements section:

This document was produced using the Pandoc2rfc tool.

When starting a new project with Pandoc2rfc you'll need to copy the following files:

* Makefile
* transform.xslt

And the above mentioned files:

* middle.pdc
* back.pdc
* template.xml

After that you can start editing.
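The conversion chain described earlier (Pandoc to DocBook, xsltproc to xml2rfc XML, xml2rfc to plain text) can be sketched as a small Python helper that only builds the command lines. This is a minimal sketch and not the project's Makefile: the function names, the intermediate `.docbook` file name and the exact flags (`-s -t docbook`, `--nonet`, `--text`) are assumptions, so check them against your installed tools before relying on them.

```python
from typing import List


def build_commands(part: str, stylesheet: str = "transform.xslt") -> List[List[str]]:
    """Return the commands that turn <part>.pdc into <part>.xml (assumed flags)."""
    docbook = f"{part}.docbook"  # hypothetical intermediate file name
    return [
        # Step 1: Pandoc source -> DocBook XML
        ["pandoc", "-s", "-t", "docbook", "-o", docbook, f"{part}.pdc"],
        # Step 2: DocBook XML -> xml2rfc XML via the XSLT stylesheet
        ["xsltproc", "--nonet", "-o", f"{part}.xml", stylesheet, docbook],
    ]


def build_all(template: str = "template.xml") -> List[List[str]]:
    """Commands for both included parts, followed by the final xml2rfc run."""
    commands: List[List[str]] = []
    for part in ("middle", "back"):
        commands.extend(build_commands(part))
    # Step 3: template.xml pulls in middle.xml and back.xml as entities
    commands.append(["xml2rfc", template, "--text"])
    return commands


if __name__ == "__main__":
    for command in build_all():
        print(" ".join(command))
```

Building the commands as lists keeps the sketch testable without the tools installed; each list could be passed to `subprocess.run(command, check=True)` to execute it.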

Pandoc2rfc supports the following constructs:

* Sections with an anchor and title attributes;
* Lists
    * style=symbols;
    * style=numbers;
    * style=empty;
    * style=format %i, use roman lowercase numerals;
    * style=format (%d), use roman uppercase numerals;
    * style=letters (lower- and uppercase);
    * style=hanging;
* Figure/artwork with a title;
* Block quote, this is converted to a <list style="empty"> paragraph;
* References
    * external (eref);
    * internal (xref), you can refer to:
        * sections (handled by Pandoc);
        * figures (handled by XSLT);
        * tables (handled by XSLT).
* Citations, by using internal references;
* Spanx style=verb, style=emph and style=strong;
* Tables with an anchor and title;
* Indexes, by using footnotes.

The following are not supported:

* Lists inside a table (xml2rfc doesn't handle this);
* Pandoc markup in the caption for figures/artwork. Pandoc markup for table captions is supported;
* crefs: for comments (no input syntax available), use HTML comments: <!-- ... -->.

The following people have helped to make Pandoc2rfc what it is today: Benno Overeinder, Erlend Hamnaberg, Matthijs Mekking, Trygve Laugstoel. This document was prepared using Pandoc2rfc.

So, what syntax do you need to use to get the correct output? Well, it is just Pandoc. The best introduction to the Pandoc style is given in this README from Pandoc itself. For convenience we list the most important ones in the following sections.

Paragraphs are separated with an empty line.

Just use the normal sectioning commands available in Pandoc, for instance:

    # Section1 One
    Bla

Converts to: <section title="Section1 One" anchor="section1-one">

If you have another section that is also named "Section1 One", that anchor will be called "section1-one-1", but only when the sections are in the same source file!

Referencing the section is done with see [](#section1-one), as in see .

A good number of styles are supported.

A symbol list.

    * Item one;
    * Item two.

Converts to <list style="symbols">:

Item one; Item two.

A numbered list.

    1. Item one;
    1. Item two.

Converts to <list style="numbers">:

Item one; Item two.

Using the default list markers from Pandoc:

A list using the default list markers.

    #. Item one;
    #. Item two.

Converts to <list style="empty">:

Item one; Item two.

Use the supported Pandoc syntax:

    ii. Item one;
    ii. Item two.

Converts to <list style="format %i.">:

Item one; Item two.

If you use uppercase Roman numerals, they convert to a different style:

    II. Item one;
    II. Item two.

Yields <list style="format (%d) ">:

Item one; Item two.

A lettered list.

    a.  Item one;
    b.  Item two.

Converts to <list style="letters">:

Item one; Item two.

Uppercasing the letters works too (note the two spaces after the letter).

    A.  Item one;
    B.  Item two.

Becomes:

Item one; Item two.

This is more like a description list, so we need to use:

    First item that needs clarification:

    :   Explanation one
    More stuff, because item is difficult to explain.
    * item1
    * item2

    Second item that needs clarification:

    :   Explanation two

Converts to: <list style="hanging"> and <t hangText="First item that...">

If you want a newline after the hangTexts, search for the string OPTION in transform.xsl and uncomment it.

Indent the paragraph with 4 spaces.

    Like this

Converts to: <figure><artwork>...

Note that xml2rfc supports a caption with <artwork>. Pandoc does not support this, but Pandoc2rfc does. If you add a @Figure: some text as the last line, the artwork gets a title attribute with the text after @Figure:. It will also be possible to reference the artwork. If a caption is supplied the artwork will be centered. If a caption is needed but the figure should not be centered, use @figure: (lowercase) instead.

The reference anchor attribute will be: fig: + first 10 (normalized) characters from the caption. Where normalized means:

* Take the first 10 characters of the caption (i.e. this is the text after the string @Figure:);
* Spaces and single quotes (') are translated to a minus -;
* Uppercase letters translated to lowercase.

So the first artwork with a caption will get fig:a-minimal- as a reference. This anchoring is completely handled from within the XSLT. Note that duplicate anchors are an XML validation error which will make xml2rfc fail.
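The normalization rules above can be sketched in a few lines of Python, which is handy for predicting an anchor before running the XSLT. This is only an illustration of the stated rules, not the stylesheet itself: the function name `caption_anchor` is hypothetical, and stripping leading whitespace from the caption before counting characters is an assumption made so that the result matches the fig:a-minimal- example.

```python
def caption_anchor(caption: str, prefix: str = "fig:") -> str:
    """Predict the anchor generated for a caption, following the rules in the text."""
    # Assumption: whitespace right after "@Figure:" is not counted.
    head = caption.strip()[:10]            # first 10 characters of the caption
    head = head.replace(" ", "-")          # spaces become a minus
    head = head.replace("'", "-")          # single quotes become a minus
    return prefix + head.lower()           # uppercase becomes lowercase


if __name__ == "__main__":
    # A caption starting with "A minimal " gives the anchor quoted in the text.
    print(caption_anchor("A minimal template.xml"))
```

Two captions that share their first 10 characters produce the same anchor, which is exactly the duplicate-anchor situation that makes xml2rfc fail, so a check like this can be run over all captions beforehand.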

Any paragraph like:

    > quoted text

Converts to: <t><list style="empty"> ... paragraph, making it indented.

Any reference like:

    [Click here](URI)

Converts to: <ulink target="URI">Click here ...

Any reference like:

    [Click here](#localid)

Converts to: <link target="localid">Click here ...

For referring to RFCs (for which you need to add the reference source manually in the template, with an external XML entity), you can just use:

    [](#RFC2119)

And it does the right thing. Referencing sections is done with:

    See [](#pandoc-constructs)

The word 'Section' is inserted automatically: ... see ... For referencing figures/artworks see . For referencing tables see .

The verb style can be selected with backticks: `text`

Converts to: <spanx style="verb"> ...

And the emphasis style with asterisks: *text* or underscores: _text_

Converts to: <spanx style="emph"> ...

And the strong style with double asterisks: **text**

Converts to: <spanx style="strong"> ...

A table can be entered as:

      Right     Left     Center     Default
    -------     ------ ----------   -------
         12     12        12            12
        123     123       123          123
          1     1          1             1

    Table: A caption describing the table.

This is translated to a <texttable> element in xml2rfc. You can choose multiple styles as input, but they are all converted to the same style (plain <texttable>) table in xml2rfc. The column alignment is copied over to the generated XML.

The caption is always translated to a title attribute. If a table has a caption, it will also get a reference. The reference anchor attribute will be: tab- + first 10 (normalized) characters from the caption. Where normalized means:

* Take the first 10 characters of the caption (i.e. this is the text after the string Table:);
* Spaces and single quotes (') are translated to a minus -;
* Uppercase letters translated to lowercase.

So the first table with a caption will get tab-a-caption- for reference use. This anchoring is completely handled from within the XSLT. Note that duplicate anchors are an XML validation error which will make xml2rfc fail.

The footnote syntax of Pandoc is slightly abused to support an index. Footnotes are entered in two steps, you have a marker in the text, and later you give actual footnote text. Like this:

    [^1]

    [^1]: footnote text

We re-use this syntax for the <iref> tag. The above text translates to:

    <iref item="footnote text"/>

Sub items are also supported. Use an exclamation mark (!) to separate them:

    [^1]: item!sub item
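To make the item/sub item split concrete, here is a small Python sketch that builds an <iref> element from footnote text. It only illustrates the mapping described above and is not how Pandoc2rfc works internally (that is done in XSLT); the function name is hypothetical, and placing the text after the exclamation mark in a `subitem` attribute is an assumption based on the xml2rfc vocabulary.

```python
import xml.etree.ElementTree as ET


def footnote_to_iref(footnote_text: str) -> ET.Element:
    """Build an <iref> element from footnote text such as 'item!sub item'."""
    # Split on the first exclamation mark only; anything after it is the sub item.
    item, _, subitem = footnote_text.partition("!")
    attributes = {"item": item.strip()}
    if subitem:
        # Assumption: the sub item ends up in the 'subitem' attribute.
        attributes["subitem"] = subitem.strip()
    return ET.Element("iref", attributes)


if __name__ == "__main__":
    for text in ("footnote text", "item!sub item"):
        print(ET.tostring(footnote_to_iref(text), encoding="unicode"))
```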

As an author you will probably break up a draft into multiple files, each dealing with a subject or section. When doing so, sections with the same title will clash with each other. Pandoc can deal with this situation, but only if the different sections are in the same file or processed in the same Pandoc run. Concatenating the different section files before processing them is a solution to this problem. You can, for instance, amend the Makefile and add something like this:

    allsections.pdc: section.pdc.1 section.pdc.2 section.pdc.3
    	cat $^ > $@

And then process allsections.pdc in the normal way.

If you use double quotes for the document's name in the docName attribute, like:

    <rfc ipr="trust200902" docName="draft-gieben-writing-rfcs-pandoc-02">

The Makefile will pick this up automatically and make a symbolic link:

    draft-gieben-writing-rfcs-pandoc-02.txt -> draft.txt

This makes uploading the file to the I-D tracker a bit easier.
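The reason the quoting style matters can be illustrated with a short Python sketch that looks for a double-quoted docName and derives the link name from it. This is a hypothetical stand-in for whatever the Makefile actually does (the function name and the regular expression are assumptions); it only shows that a pattern written for double quotes will not match a single-quoted attribute.

```python
import re
from typing import Optional


def submission_name(template_xml: str) -> Optional[str]:
    """Return '<docName>.txt' if a double-quoted docName attribute is found."""
    # Only double quotes are matched, mirroring the requirement in the text.
    match = re.search(r'docName="([^"]+)"', template_xml)
    if match is None:
        return None
    return match.group(1) + ".txt"


if __name__ == "__main__":
    line = '<rfc ipr="trust200902" docName="draft-gieben-writing-rfcs-pandoc-02">'
    print(submission_name(line))
```

The returned name could then be used as the link name, for instance with `os.symlink("draft.txt", name)`.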

The draft.xml target will generate an XML file with all XML included, so you can upload just one file to the I-D tracker.

If you are a Vim user you might be interested in a syntax highlighting file (see the "VIM syntax file for RFCs and I-Ds" reference) that slightly brightens up your reading experience while viewing a draft.txt in Vim.

This document raises no security issues.

This document has no actions for IANA.

VIM syntax file for RFCs and I-Ds Atoom Inc.

miek@miek.nl

Key words for use in RFCs to Indicate Requirement Levels Harvard University

1350 Mass. Ave. Cambridge MA 02138 - +1 617 495 3864 sob@harvard.edu

General keyword

In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. Authors who follow these guidelines should incorporate this phrase near the beginning of their document:

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Note that the force of these words is modified by the requirement level of the document in which they are used.

DNSSEC Operational Practices

This document describes a set of practices for operating the DNS with security extensions (DNSSEC). The target audience is zone administrators deploying DNSSEC.

The document discusses operational aspects of using keys and signatures in the DNS. It discusses issues of key generation, key storage, signature generation, key rollover, and related policies.

This document obsoletes RFC 2541, as it covers more operational ground and gives more up-to-date requirements with respect to key sizes and the new DNSSEC specification. This memo provides information for the Internet community.

This appendix consists of a few tests that should all render to proper xml2rfc XML.

Test a very long title.

This is discarded by xml2rfc.

This is a blockquote, how does it look?

A verbatim code block jkasjksajassjasjsajsajkas

Refer to RFC 2119 if you will. Or maybe you want to inspect in again. Or you might want to Click here.

underscores: underscores asterisks: asterisks double asterisks: double asterisks backticks: backticks

First we do And then item 1 item 2 And the other around. First we do Then Something Another thing Description lists: It works because of herbs. More explaining. Multiple paragraphs in such a list. lists in description lists. It works because of One Two More explaining It works because of One1 Two1 Itemize list Another item More explaining list with description lists. More Explanation ... Another explanation ... Go'bye Multiple paragraphs in a list. This is the first bullet point and it needs multiple paragraphs... ... to be explained properly. This is the next bullet. New paragraphs should be indented with four spaces. Another item with some artwork, indented by 8 spaces.

Artwork Final item. xml2rfc does not allow this, so the second paragraph is faked with a

<vspace blankLines='1'> Ordered lists. First item Second item A lowercase roman list: Item 1 Item 2 An uppercase roman list. Item1 Item2 Item 3 And default list markers. Some surrounding text, to make it look better. First item. Use lot of text to get a real paragraphs sense. First item. Use lot of text to get a real paragraphs sense. First item. Use lot of text to get a real paragraphs sense. First item. Use lot of text to get a real paragraphs sense. Second item. So this is the second para in your list. Enjoy; Another item. Text at the end. Lowercase letters list. First item Second item Uppercase letters list. First item Second item And artwork in a description list. Tell something about it. Tell something about it. Tell something about it. Tell something about it. Tell something about it. Tell something about it.

miek.nl. IN NS a.miek.nl. a.miek.nl. IN A 192.0.2.1 ; <- this is glue Tell some more about it. Tell some more about it. Tell some more about it. Another description List with a sublist with a paragraph above the sublist First Item Second item Third item A paragraph that comes first But what do you know This is another list

      Right  Left    Center  Default
    -------  ------  ------  -------
         12  12        12    12
        123  123      123    123
          1  1         1     1

    Centered Header  Default Aligned  Right Aligned  Left Aligned
    ---------------  ---------------  -------------  ------------
         First       row                       12.0  Example of a row that spans multiple lines.
        Second       row                        5.0  Here's another one. Note the blank line between rows.

    Fruit    Price  Advantages
    -------  -----  ----------------
    Bananas  $1.34  built-in wrapper
    Oranges  $2.10  cures scurvy

Grid tables without a caption

    Fruit    Price  Advantages
    -------  -----  ----------------
    Bananas  $1.34  built-in wrapper
    Oranges  $2.10  cures scurvy

This table has no caption, and therefore no reference. But you can refer to some of the other tables, with for instance:

    See [](#tab-here-s-the)

Which will become "See ".

We should also be able to refer to the table numbers directly, to say things like 'Look at Tables , and .'

This is another example: Another bla bla.. as (1) shows...

This is a figure This is a figure This is a figure This is a figure

And now a figure that is not centered, due to using figure and not Figure.

This is a figure This is a figure Test the use of @title:

This is a figure with a title This is a figure with a title @title: and here it is: a title, don't mess it up *

This is a verse text This is another line