Internationalized Domain Names in Applications (IDNA)
=====================================================

Support for the Internationalized Domain Names in Applications
(IDNA) protocol as specified in `RFC 5891 <http://tools.ietf.org/html/rfc5891>`_.
This is the latest version of the protocol and is sometimes referred to as
“IDNA 2008”.

This library also provides support for Unicode Technical Standard 46,
`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_.

This library is a suitable replacement for the “encodings.idna” module that
comes with the Python standard library, which supports only the
old, deprecated IDNA specification (`RFC 3490 <http://tools.ietf.org/html/rfc3490>`_).

Basic usage is straightforward:

.. code-block:: pycon

    # Python 3
    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

    # Python 2
    >>> import idna
    >>> idna.encode(u'ドメイン.テスト')
    'xn--eckwd4c7c.xn--zckzah'
    >>> print idna.decode('xn--eckwd4c7c.xn--zckzah')
    ドメイン.テスト

Packages
--------

The latest tagged release version is published in the PyPI repository:

.. image:: https://badge.fury.io/py/idna.svg
   :target: http://badge.fury.io/py/idna


Installation
------------

To install this library, you can use pip:

.. code-block:: bash

    $ pip install idna

Alternatively, you can install the package using the bundled setup script:

.. code-block:: bash

    $ python setup.py install

This library works with Python 2.7 and Python 3.4 or later.


Usage
-----

For typical usage, the ``encode`` and ``decode`` functions will take a domain
name argument and perform a conversion to A-labels or U-labels respectively.

.. code-block:: pycon

    # Python 3
    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

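To make the terms concrete: an A-label is the ASCII form of a label, made up of
the ``xn--`` prefix followed by the Punycode encoding of the U-label. The
minimal sketch below reproduces the A-labels above using only the ``punycode``
codec from the Python 3 standard library. The helper name
``to_alabels_unchecked`` is hypothetical, and the sketch performs none of the
IDNA 2008 validity checks that ``idna.encode`` applies, so it is an
illustration rather than a substitute.

.. code-block:: python

    def to_alabels_unchecked(domain):
        """Punycode-encode each non-ASCII label and add the 'xn--' prefix.

        Illustration only: no IDNA 2008 validation, mapping or length checks.
        """
        labels = []
        for label in domain.split('.'):
            if all(ord(char) < 128 for char in label):
                # Pure ASCII labels are passed through unchanged.
                labels.append(label)
            else:
                encoded = label.encode('punycode').decode('ascii')
                labels.append('xn--' + encoded)
        return '.'.join(labels)

    print(to_alabels_unchecked('ドメイン.テスト'))
    # prints: xn--eckwd4c7c.xn--zckzah

For real applications, prefer ``idna.encode``, which rejects labels that the
specification does not permit.
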
You may also use the codec encoding and decoding methods by importing the
``idna.codec`` module:

.. code-block:: pycon

    # Python 2
    >>> import idna.codec
    >>> print u'домена.испытание'.encode('idna')
    xn--80ahd1agd.xn--80akhbyknj4f
    >>> print 'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna')
    домена.испытание

Conversions can be applied on a per-label basis using the ``ulabel`` or ``alabel``
functions if necessary:

.. code-block:: pycon

    # Python 2
    >>> idna.alabel(u'测试')
    'xn--0zwm56d'

Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

As described in `RFC 5895 <http://tools.ietf.org/html/rfc5895>`_, the IDNA
specification no longer normalizes the different ways a user may input a
domain name. This functionality, known as a “mapping”, is now
considered by the specification to be a local user-interface issue distinct
from IDNA conversion functionality.

This library provides one such mapping, developed by the Unicode
Consortium. Known as `Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_,
it provides both a regular mapping for typical applications and
a transitional mapping to help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label because *LATIN CAPITAL
LETTER K* is not allowed (nor are capital letters in general). UTS 46 will
convert this into lower case prior to applying the IDNA conversion.

.. code-block:: pycon

    # Python 3
    >>> import idna
    >>> idna.encode(u'Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

Transitional processing provides conversions to help transition from the older
2003 standard to the current standard. For example, in the original IDNA
specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two
*LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this
conversion is not performed.

.. code-block:: pycon

    # Python 2
    >>> idna.encode(u'Königsgäßchen', uts46=True, transitional=True)
    'xn--knigsgsschen-lcb0w'

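For comparison, the standard library's built-in ``idna`` codec implements the
older IDNA 2003 rules, so on Python 3 it can be used to observe the legacy
handling of ß directly. This is a small sketch for illustration only; it does
not use this library at all.

.. code-block:: python

    # The built-in 'idna' codec follows IDNA 2003 (RFC 3490): its Nameprep
    # step lower-cases the input and maps the sharp s to "ss" before encoding.
    legacy = 'Königsgäßchen'.encode('idna')
    print(legacy)                 # b'xn--knigsgsschen-lcb0w'

    # Decoding shows that the sharp s did not survive the round trip.
    print(legacy.decode('idna'))  # königsgässchen

The A-label is the same as the transitional result above, whereas the
non-transitional encoding shown earlier preserves the ß.
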
Implementors should use transitional processing with caution, and only in the
rare cases where legacy labels must be converted to current labels
(i.e. labels produced by IDNA implementations that pre-date 2008). For typical
applications that just need to convert labels, transitional processing is
unlikely to be beneficial and could produce unexpected, incompatible results.

``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

Function calls from the Python built-in ``encodings.idna`` module are
mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
Simply change the ``import`` statement in your code to refer to the
new module name.

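As a sketch of that substitution, the import below prefers ``idna.compat`` and
falls back to the standard library module when this package is not installed.
The fallback is an assumption made for illustration, not something the library
requires. Note that the two modules implement different versions of the
specification, so their results can differ for some inputs.

.. code-block:: python

    try:
        # IDNA 2008 behaviour, provided by this library.
        from idna.compat import ToASCII, ToUnicode
    except ImportError:
        # IDNA 2003 behaviour from the standard library (assumed fallback).
        from encodings.idna import ToASCII, ToUnicode

    alabel = ToASCII('测试')
    print(alabel)             # b'xn--0zwm56d'
    print(ToUnicode(alabel))  # 测试

Both function names operate on a single label here, matching the per-label
example shown earlier.
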
Exceptions
----------

Any error arising from the specification during conversion should raise an
exception derived from the ``idna.IDNAError`` base class.

More specific exceptions may be raised: ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and right-to-left
characters in a label; ``idna.InvalidCodepoint`` when a specific codepoint is
an illegal character in an IDN label (i.e. INVALID); and ``idna.InvalidCodepointContext``
when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO
or CONTEXTJ but the contextual requirements are not satisfied).

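A minimal sketch of handling these exceptions follows. It assumes the library
is installed and that the exception classes are importable from the top-level
``idna`` package under the names given above; the ``try_encode`` helper is
hypothetical.

.. code-block:: python

    import idna

    def try_encode(domain):
        """Return the encoded domain, or None if IDNA conversion fails."""
        try:
            return idna.encode(domain)
        except idna.InvalidCodepoint as exc:
            # A specific codepoint is not permitted in an IDN label.
            print('invalid codepoint:', exc)
        except idna.IDNAError as exc:
            # Base class: covers bidi, context and other conversion errors.
            print('conversion failed:', exc)
        return None

    try_encode('Königsgäßchen')    # rejected: capital K is not allowed
    try_encode('ドメイン.テスト')   # returns the encoded A-labels

Catching the more specific class first keeps the base-class handler as a
catch-all for the remaining error types.
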
Building and Diagnostics
------------------------

The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for
performance. These tables are derived by evaluating each codepoint against the
eligibility criteria in the respective standards, and are computed using the
command-line script ``tools/idna-data``.

This tool will fetch relevant tables from the Unicode Consortium and perform the
required calculations to identify eligibility. It has three main modes:

* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
  the pre-calculated lookup tables used for IDNA and UTS 46 conversions. Implementors
  who wish to track this library against a different Unicode version may use this tool
  to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
  files.

* ``idna-data make-table``. Generates a table of the IDNA disposition
  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
  5892 and the pre-computed tables published by `IANA <http://iana.org/>`_.

* ``idna-data U+0061``. Prints debugging output on the various properties
  associated with an individual Unicode codepoint (in this case, U+0061) that are
  used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
  or analysis.

The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
the ``--version`` argument allows the specification of the version of Unicode to use
in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
will generate library data against Unicode 9.0.0.

Note that this script requires Python 3, but all generated library data will work
in Python 2.7.

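To give a feel for the kind of per-codepoint properties involved, the sketch
below uses the standard library's ``unicodedata`` module. It is not the
``idna-data`` tool and does not reproduce its output; the
``describe_codepoint`` helper and the selection of properties are assumptions
made for illustration, and the values reflect the Unicode version bundled with
the running Python interpreter.

.. code-block:: python

    import unicodedata

    def describe_codepoint(spec):
        """Return basic Unicode properties for a codepoint given as 'U+XXXX'."""
        char = chr(int(spec[2:], 16))
        return {
            'codepoint': 'U+%04X' % ord(char),
            'name': unicodedata.name(char, '<unnamed>'),
            'category': unicodedata.category(char),
            'bidi_class': unicodedata.bidirectional(char),
            'combining_class': unicodedata.combining(char),
        }

    for key, value in describe_codepoint('U+0061').items():
        print('%s: %s' % (key, value))

    # The Unicode version these properties are drawn from.
    print('unicodedata version:', unicodedata.unidata_version)

For U+0061 this reports the name *LATIN SMALL LETTER A* and the general
category ``Ll``.
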

Testing
-------

The library has a test suite based on each rule of the IDNA specification, as
well as tests that are provided as part of the Unicode Technical Standard 46,
`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_.

The tests are run automatically on each commit at Travis CI:

.. image:: https://travis-ci.org/kjd/idna.svg?branch=master
   :target: https://travis-ci.org/kjd/idna
