README

    EntityMap -- Entity Mapping Tables

    Version 0.1

    Maintained by Ken MacLeod
    ken@bitsko.slc.ut.us


INTRODUCTION

    EntityMap is a set of look-up tables for translating SGML
    character entity names into output formats.  This release of
    EntityMap includes mappings for the ISO 8879:1986 character entity
    sets to ASCII, Latin 1, TeX, Texinfo, and RTF.

    EntityMap includes a Perl module for reading and querying the
    entity mapping tables.  The documentation is embedded as PerlDoc
    in `EntityMap' and is also installed as the man page
    `Text::EntityMap(3)'.

STATUS

    The mapping tables in this release come directly from GF (General
    Formatter) by Gary Houston.

    Upcoming releases will merge mappings from SGML Tools and Jade.

ACKNOWLEDGEMENTS

    These are Gary Houston's acknowledgements for the initial `sdata'
    files:

    * The tables for the conversion of `ISOlat1' to ``best'' ASCII
      follow a system developed by Markus Kuhn.

    * `ISOlat1.2tex' is based on a `latin1' to TeX table by (I
      think) Peter Flynn.

    * Other TeX symbols were grabbed individually from numerous
      sources.

INSTALLATION

    If you are not using the Perl module you can copy the files in the
    `sdata' directory to wherever you need them.

    If you are using the Perl module, the following commands will
    install the Perl module into your standard Perl library and
    install the `sdata' files into `$PREFIX/lib/entity-map-0.1'.

        zcat entity-map-0.1.tar.gz | tar xvf -
        cd entity-map-0.1
        ./configure
        make
        make install

FORMAT

    Each file contains one character entity per line.  Each line is
    the entity name, followed by a tab, followed by the replacement
    text for that entity.
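
    As an illustration of the line format described above, here is a
    minimal Python sketch of a reader.  It is not the API of the
    bundled Perl module; the function name `parse_entity_map' and the
    sample rows are hypothetical, and skipping blank lines is an
    assumption rather than part of the stated format.

```python
def parse_entity_map(text):
    """Parse mapping-table text into a dict of entity name -> replacement."""
    table = {}
    for line in text.splitlines():
        if not line:
            continue  # assumption: blank lines are tolerated and ignored
        # Split on the first tab only, so the replacement text may
        # itself contain tabs.
        name, sep, replacement = line.partition("\t")
        if not sep:
            raise ValueError(f"no tab separator in line: {line!r}")
        table[name] = replacement
    return table


# Hypothetical sample rows, not copied from a real `sdata' file.
sample = "amp\t\\&\ncopy\t{\\copyright}\n"
table = parse_entity_map(sample)
print(table["amp"])
```

    The replacement text is stored exactly as it appears in the file,
    since the tables are expected to carry text that is already
    escaped for the target format.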

    The replacement text should already be properly escaped for its
    output format.

    If there is no equivalent in the output format for an entity, the
    convention is to use the entity name within braces (`{name}') so
    that the braces appear in the output.
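
    The brace convention makes it possible to spot entities that have
    no real equivalent in a table.  The following Python sketch
    assumes a table has already been read into a dict of entity name
    to replacement text; the helper names and the sample entries are
    hypothetical.

```python
def placeholder(name):
    """Return the conventional stand-in for an entity with no equivalent."""
    return "{" + name + "}"


def unmapped_entities(table):
    """List the entity names whose replacement text is just `{name}'."""
    return sorted(
        name for name, text in table.items() if text == placeholder(name)
    )


# Hypothetical entries, for illustration only.
table = {"amp": "&", "hearts": "{hearts}", "lt": "<"}
print(unmapped_entities(table))
```

    A check like this could be used to report which entities will
    show up as visible braces in the formatted output.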

    NOTE: The file format may change in the future.  Other output
    formats may also require a new file format.

FILE NAMES

    The current convention is `ENTITY-SET.2FORMAT' where ENTITY-SET
    is the source entity set name (like `ISOpub') and FORMAT is an
    identifier for the output format:

        .2ab    ASCII (best approximation)
        .2as    ASCII
        .2l1b   Latin 1 (best approximation)
        .2l1s   Latin 1
        .2tex   TeX
        .2texi  Texinfo
        .2rtf   RTF
        .2tr    TROFF
        .2u8b   UTF-8
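
    The naming convention above can be taken apart mechanically.  The
    Python sketch below splits a file name into its entity set and
    format identifier; the function name `split_table_name' is
    hypothetical, and rejecting identifiers that are not in the list
    above is a design choice of this sketch, not a rule of EntityMap.

```python
import os

# Format identifiers from the list above, without the leading `.2'.
FORMATS = {
    "ab": "ASCII (best approximation)",
    "as": "ASCII",
    "l1b": "Latin 1 (best approximation)",
    "l1s": "Latin 1",
    "tex": "TeX",
    "texi": "Texinfo",
    "rtf": "RTF",
    "tr": "TROFF",
    "u8b": "UTF-8",
}


def split_table_name(path):
    """Split `ENTITY-SET.2FORMAT' into (entity_set, format_id)."""
    filename = os.path.basename(path)
    # Split at the last `.2' so the entity set name is kept whole.
    entity_set, sep, fmt = filename.rpartition(".2")
    if not sep or not entity_set or fmt not in FORMATS:
        raise ValueError(f"not an EntityMap table name: {filename!r}")
    return entity_set, fmt


entity_set, fmt = split_table_name("sdata/ISOlat1.2tex")
print(entity_set, "->", FORMATS[fmt])
```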