<refentry xmlns="http://docbook.org/ns/docbook"
          xmlns:xlink="http://www.w3.org/1999/xlink"
          xmlns:xi="http://www.w3.org/2001/XInclude"
          xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          version="5.0" xml:id="make.index.markup">
<refmeta>
<refentrytitle>make.index.markup</refentrytitle>
<refmiscinfo class="other" otherclass="datatype">boolean</refmiscinfo>
</refmeta>
<refnamediv>
<refname>make.index.markup</refname>
<refpurpose>Generate XML index markup in the index?</refpurpose>
</refnamediv>

<refsynopsisdiv>
<src:fragment xml:id="make.index.markup.frag">
<xsl:param name="make.index.markup" select="0"/>
</src:fragment>
</refsynopsisdiv>

<refsection><info><title>Description</title></info>

<para>This parameter enables a very neat trick for getting properly
merged, collated back-of-the-book indexes. G. Ken Holman suggested
this trick at Extreme Markup Languages 2002 and I'm indebted to him
for it.</para>

<para>Jeni Tennison's excellent code in
<filename>autoidx.xsl</filename> does a great job of merging and
sorting <tag>indexterm</tag>s in the document and building a
back-of-the-book index. However, there's one thing that it cannot
reasonably be expected to do: merge page numbers into ranges. (I would
not have thought that it could collate and suppress duplicate page
numbers, but in fact it appears to manage that task somehow.)</para>

<para>Ken's trick is to produce a document in which the index at the
back of the book is <quote>displayed</quote> in XML. Because the index
is generated by the FO processor, all of the page numbers have been resolved.
It's a bit hard to explain, but what it boils down to is that instead of having
an index at the back of the book that looks like this:</para>

<blockquote>
<formalpara><info><title>A</title></info>
<para>ap1, 1, 2, 3</para>
</formalpara>
</blockquote>

<para>you get one that looks like this:</para>

<blockquote>
<programlisting>&lt;indexdiv&gt;A&lt;/indexdiv&gt;
&lt;indexentry&gt;
&lt;primaryie&gt;ap1&lt;/primaryie&gt;,
&lt;phrase role="pageno"&gt;1&lt;/phrase&gt;,
&lt;phrase role="pageno"&gt;2&lt;/phrase&gt;,
&lt;phrase role="pageno"&gt;3&lt;/phrase&gt;
&lt;/indexentry&gt;</programlisting>
</blockquote>

<para>After building a PDF file with this sort of odd-looking index, you can
extract the text from the PDF file, and the result is a proper index expressed in
XML.</para>

<para>Now you have data that's amenable to processing, and a simple Perl script
(such as <filename>fo/pdf2index</filename>) can
merge page ranges and generate a proper index.</para>

<para>Finally, reformat your original document using this literal index instead of
an automatically generated one and <quote>bingo</quote>!</para>

</refsection>
</refentry>