.. _formats:


Translation Related File Formats
********************************

This page lists the storage formats for translations, and for files associated
with translations, that the toolkit supports. See also :doc:`conformance` for
standards conformance.

The Translate Toolkit implements a set of :doc:`classes <base_classes>` for
handling translation files. These classes provide a uniform API that also
covers related issues such as :doc:`quoting and escaping
<quoting_and_escaping>` of text.
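
The idea of a uniform API over several formats can be sketched as follows.
This is a minimal, standard-library-only illustration and not the toolkit's
own code: the names ``Unit``, ``Store``, ``TabStore``, ``EqualsStore`` and
``report`` are hypothetical, and both toy formats are invented for the
example. The real classes are described in :doc:`base_classes`.

.. code-block:: python

   from dataclasses import dataclass, field


   @dataclass
   class Unit:
       """One translatable string and its translation (hypothetical)."""
       source: str
       target: str = ""


   @dataclass
   class Store:
       """Format-independent container of units (hypothetical)."""
       units: list = field(default_factory=list)

       @classmethod
       def parse(cls, text):
           raise NotImplementedError("each format supplies its own parser")

       def untranslated(self):
           """Return the units that have no translation yet."""
           return [unit for unit in self.units if not unit.target]


   class TabStore(Store):
       """Toy format: one 'source<TAB>target' pair per line."""

       @classmethod
       def parse(cls, text):
           store = cls()
           for line in text.splitlines():
               if line.strip():
                   source, _, target = line.partition("\t")
                   store.units.append(Unit(source, target))
           return store


   class EqualsStore(Store):
       """Toy format: one 'source=target' pair per line."""

       @classmethod
       def parse(cls, text):
           store = cls()
           for line in text.splitlines():
               if line.strip():
                   source, _, target = line.partition("=")
                   store.units.append(Unit(source, target))
           return store


   def report(store):
       """List untranslated sources; works for any Store subclass."""
       return [unit.source for unit in store.untranslated()]


   tab = TabStore.parse("Open\tOuvrir\nClose\t\n")
   eq = EqualsStore.parse("Open=Ouvrir\nSave=\n")
   print(report(tab))  # ['Close']
   print(report(eq))   # ['Save']

Only the ``parse`` methods know about the file layout; code such as
``report`` deals with units and never with the format itself.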

.. _formats#primary_translation_formats:

Primary translation formats
===========================

.. toctree::
   :maxdepth: 1

   xliff
   Gettext PO <po>

.. _formats#other_translation_formats:

Other translation formats
=========================

.. toctree::
   :maxdepth: 1
   :hidden:

   csv
   ini
   properties
   dtd
   gsi
   php
   ts
   rc
   strings
   flex
   catkeys
   android
   resx
   mozilla_lang

* :doc:`csv`
* :doc:`ini` (including Inno Setup .isl dialect)
* Java :doc:`properties` (also Mozilla-derived properties files)
* Mozilla :doc:`dtd`
* OpenOffice.org :doc:`gsi` (also called SDF)
* :doc:`php` translation arrays
* Qt Linguist :doc:`ts` (both 1.0 and 1.1 supported, 1.0 has a converter)
* Symbian localization files
* Windows :doc:`rc` files
* Mac OS X :doc:`strings` files (also used on the iPhone) (from version 1.8)
* Adobe :doc:`flex` files (from version 1.8)
* Haiku :doc:`catkeys` (from version 1.8)
* :doc:`android` (supports storage, not conversion)
* :doc:`resx` .NET Resource files (.resx)
* Mozilla :doc:`.lang <mozilla_lang>` files

.. _formats#translation_memory_formats:

Translation Memory formats
==========================

.. toctree::
   :maxdepth: 1
   :hidden:

   tmx
   wordfast

* :doc:`tmx`
* :doc:`wordfast`: TM
* Trados: .txt TM (from v1.9.0 -- read only)

.. _formats#glossary_formats:

Glossary formats
================

.. toctree::
   :maxdepth: 1
   :hidden:

   omegat_glossary
   qt_phrase_book
   tbx
   utx

* :doc:`omegat_glossary` (from v1.5.1)
* :doc:`qt_phrase_book`
* :doc:`tbx`
* :doc:`utx` (from v1.9.0)

.. _formats#formats_of_translatable_documents:

Formats of translatable documents
=================================

.. toctree::
   :maxdepth: 1
   :hidden:

   html
   flatxml
   ical
   json
   yaml
   odf
   text
   wiki
   subtitles

* :doc:`html`
* :doc:`flatxml` (single-level XML)
* :doc:`ical`
* :doc:`json`
* :doc:`yaml`
* :wp:`OpenDocument` -- all ODF file types
* :doc:`Text <text>` -- plain text with blocks separated by whitespace
* :doc:`Wiki <wiki>` -- :wp:`DokuWiki` and :wp:`MediaWiki` supported
* :doc:`subtitles` -- various formats (v1.4)
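
As an illustration of the plain text entry above, the following sketch splits
a text into blocks. It assumes that "separated by whitespace" means one or more
blank lines between blocks; the function name ``split_blocks`` is hypothetical
and the toolkit's own text handling, described in :doc:`text`, may differ.

.. code-block:: python

   import re


   def split_blocks(text):
       """Split plain text into blocks separated by blank lines."""
       # A separator is a newline, optional whitespace, then another newline.
       blocks = re.split(r"\n\s*\n", text.strip())
       return [block.strip() for block in blocks if block.strip()]


   sample = "First paragraph,\nstill first.\n\n\nSecond paragraph.\n   \nThird.\n"
   for block in split_blocks(sample):
       print(repr(block))

Each block would then become one translatable unit, while line breaks inside a
block are kept as part of its text.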

.. _formats#machine_readable_formats:

Machine readable formats
========================

.. toctree::
   :maxdepth: 1
   :hidden:

   mo
   qm

* Gettext :doc:`mo`
* Qt :doc:`qm` (read-only)

.. _formats#in_development:

In development
==============

.. _formats#unsupported_formats:

Unsupported formats
===================

Formats that we would like to support but don't currently support:

.. toctree::
   :maxdepth: 1
   :hidden:

   wml

* Wordfast:

  * `Glossary
    <http://www.wordfast.net/index.php?lang=engb&whichpage=specifications#glo>`_
    tab-delimited "source, target, comment" columns, i.e. like OmegaT, but it
    is unclear whether any extension is required.

* Apple:

  * `AppleGlot <ftp://ftp.apple.com/developer/tool_chest/localization_tools/appleglot/appleglot_3.2_usersguide.pdf>`_
  * .plist -- see :issue:`633` and `plistlib
    <https://docs.python.org/2/library/plistlib.html>`_ for Python

* Adobe:

  * FrameMaker's Maker Interchange Format -- `MIF
    <http://help.adobe.com/en_US/FrameMaker/8.0/mif_reference.pdf>`_ (see also
    `python-gendoc <http://lino.sourceforge.net/src/100.html>`_, and `Perl MIF
    module
    <http://search.cpan.org/~rst/FrameMaker-MifTree-0.075/lib/FrameMaker/MifTree.pm>`_)
  * FrameMaker's `Maker Markup Language
    <http://www.adobe.com/support/downloads/detail.jsp?ftpID=137>`_ (MML)

* Microsoft:

  * Word, Excel, etc. (probably through usage of OpenOffice.org)
  * :wp:`OOXML` (at least at the text level we don't have to deal with much of
    the mess inside OOXML).  See also: `Open XML SDK v1
    <http://go.microsoft.com/fwlink/?LinkId=120908>`_
  * :wp:`Rich Text Format <Rich_Text_Format>` (RTF), see also `pyrtf-ng
    <http://code.google.com/p/pyrtf-ng/>`_
  * :wp:`Open XML Paper Specification <Open_XML_Paper_Specification>`

* XML related:

  * Generic XML
  * :wp:`DocBook` (can be handled by KDE's :man:`xml2pot`)
  * `SVG <http://www.w3.org/TR/SVG/>`_

* :wp:`DITA <Darwin_Information_Typing_Architecture>`
* :wp:`PDF <Portable_Document_Format>` see `spec
  <http://www.adobe.com/devnet/pdf/pdf_reference.html>`_, `PDFedit
  <http://pdfedit.cz/en/index.html>`_
* :wp:`LaTeX` -- see `plasTeX
  <http://plastex.sourceforge.net/plastex/index.html>`_, a Python framework for
  processing LaTeX documents
* `unoconv <http://dag.wiee.rs/home-made/unoconv/>`_ -- Python bindings to
  OpenOffice.org UNO which could allow manipulation of all formats understood
  by OpenOffice.org.
* Trados:

  * TTX (`Reverse Engineered DTD
    <http://www.tracom.de/04/EN/techdoccenter/download/TRADOS_TTX-DTD.zip>`_,
    `other discussion
    <http://timsfoster.wordpress.com/2005/07/05/beds-mattresses-and-open-standards/>`_)
  * MultiTerm XML: `TSV to MultiTerm conversion script
    <http://syntax.biz.pl/multiterm.html>`_ or `XSLT
    <http://translationzone.eu/mtxml2txt.html>`_
  * .tmw
  * .txt (You can interchange using TMX) `Format explanation
    <http://translate.google.com/translate?js=y&prev=_t&hl=en&ie=UTF-8&layout=1&eotf=1&u=http%3A%2F%2Fwww.diemohrs.de%2Ftipps2_neu.html&sl=auto&tl=en>`_
    with some `examples
    <http://slaci.komarom.net/roli/Trados/TRADOS%206.5.5.439%20Freelance%20+%20TRADOS%20MultiTerm%20iX%206.0.1.209/TRADOS%206.5.5.439%20Freelance/Program%20Files/TRADOS/T65_FL/Samples/TW4Win/>`_.

* Tcl: .msg files.  `Good documentation
  <http://www.google.com/codesearch?hl=en&q=show:XvsRBDCljVk:M2kzUbm70Ts:D5EHICz0aaQ&sa=N&ct=rd&cs_p=http://www.scilab.org/download/4.0/scilab-4.0-src.tar.gz&cs_f=scilab-4.0/tcl/scipadsources/msg_files/AddingTranslations.txt>`_
* Installers:

  * NSIS installer: `Existing C++ implementation
    <http://trac.vidalia-project.net/browser/vidalia/trunk/src/tools>`_
  * WiX -- MSI (Microsoft Installer) creator.  `Localization instructions
    <http://wix.mindcapers.com/wiki/Localization>`_, `more notes on
    localisation
    <http://www.mail-archive.com/wix-users@lists.sourceforge.net/msg15489.html>`_.
    This is yet another custom XML format.

* catgets/`gencat
  <http://pubs.opengroup.org/onlinepubs/009695399/utilities/gencat.html>`_:
  predates gettext; looking in the man pages gives the best information I
  could find.  Also `LSB requires it
  <http://www.linuxbase.org/navigator/browse/cmd_single.php?cmd=list-by-name&Cname=gencat>`_.
  There is some information about the source (msgfile) format on the `GNU
  website
  <http://www.gnu.org/software/libc/manual/html_node/The-message-catalog-files.html#The-message-catalog-files>`_
* :doc:`wml`
* `GlossML <http://www.maxprograms.com/glossml/glossml.pdf>`_
* Deja Vu External View: `Instructions sent to a translator
  <http://dvx.atril.com/docs/DVX/InstructionsExternalView.pdf>`_, `Description
  of external view options and process
  <http://simmer-lossner.com/lib/presentations/External_Proofreading_for_DVX.pdf>`_
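
The .plist entry in the list above points to Python's ``plistlib``. As a rough
sketch of what extracting translatable text from a property list could look
like, the example below walks a parsed plist and collects its string values.
It is not toolkit code: the sample data, the helper name ``iter_strings`` and
the slash-separated key paths are assumptions made for illustration, and it
uses the Python 3 functions ``plistlib.dumps`` and ``plistlib.loads``.

.. code-block:: python

   import plistlib

   # Hypothetical sample data; real .plist files vary in structure.
   data = {
       "CFBundleName": "Example",
       "Menu": {"Open": "Open file"},
       "Version": 3,
   }
   raw = plistlib.dumps(data)  # XML property list as bytes


   def iter_strings(node, path=()):
       """Yield (key path, text) for every string value in a parsed plist."""
       if isinstance(node, dict):
           for key, value in node.items():
               yield from iter_strings(value, path + (key,))
       elif isinstance(node, list):
           for index, value in enumerate(node):
               yield from iter_strings(value, path + (str(index),))
       elif isinstance(node, str):
           yield "/".join(path), node


   strings = dict(iter_strings(plistlib.loads(raw)))
   for key, text in sorted(strings.items()):
       print(key, "=", text)

Non-string values such as the integer ``Version`` are skipped, since only text
is a candidate for translation.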

.. _formats#unlikely_to_be_supported:

Unlikely to be supported
========================

These formats are either too difficult to implement, undocumented, processable
through some intermediate format, or used by too few people to justify the
effort, or some combination of these issues.

.. Mentioned but we want them at the end of the TOC or to move them to developer docs

.. toctree::
   :maxdepth: 1
   :hidden:

   conformance
   base_classes
   quoting_and_escaping

272