1****************************
2  What's New in Python 2.3
3****************************
4
5:Author: A.M. Kuchling
6
7.. |release| replace:: 1.01
8
9.. $Id: whatsnew23.tex 54631 2007-03-31 11:58:36Z georg.brandl $
10
11This article explains the new features in Python 2.3.  Python 2.3 was released
12on July 29, 2003.
13
14The main themes for Python 2.3 are polishing some of the features added in 2.2,
15adding various small but useful enhancements to the core language, and expanding
16the standard library.  The new object model introduced in the previous version
17has benefited from 18 months of bugfixes and from optimization efforts that have
18improved the performance of new-style classes.  A few new built-in functions
19have been added such as :func:`sum` and :func:`enumerate`.  The :keyword:`in`
20operator can now be used for substring searches (e.g. ``"ab" in "abc"`` returns
21:const:`True`).
22
23Some of the many new library features include Boolean, set, heap, and date/time
24data types, the ability to import modules from ZIP-format archives, metadata
25support for the long-awaited Python catalog, an updated version of IDLE, and
26modules for logging messages, wrapping text, parsing CSV files, processing
27command-line options, using BerkeleyDB databases...  the list of new and
28enhanced modules is lengthy.
29
30This article doesn't attempt to provide a complete specification of the new
31features, but instead provides a convenient overview.  For full details, you
32should refer to the documentation for Python 2.3, such as the Python Library
33Reference and the Python Reference Manual.  If you want to understand the
34complete implementation and design rationale, refer to the PEP for a particular
35new feature.
36
37.. ======================================================================
38
39
40PEP 218: A Standard Set Datatype
41================================
42
43The new :mod:`sets` module contains an implementation of a set datatype.  The
44:class:`Set` class is for mutable sets, sets that can have members added and
45removed.  The :class:`ImmutableSet` class is for sets that can't be modified,
46and instances of :class:`ImmutableSet` can therefore be used as dictionary keys.
47Sets are built on top of dictionaries, so the elements within a set must be
48hashable.
49
50Here's a simple example::
51
52   >>> import sets
53   >>> S = sets.Set([1,2,3])
54   >>> S
55   Set([1, 2, 3])
56   >>> 1 in S
57   True
58   >>> 0 in S
59   False
60   >>> S.add(5)
61   >>> S.remove(3)
62   >>> S
63   Set([1, 2, 5])
64   >>>
65
The union and intersection of sets can be computed with the :meth:`union` and
:meth:`intersection` methods; an alternative notation uses the bitwise operators
``|`` and ``&``, respectively. Mutable sets also have in-place versions of these
methods, :meth:`union_update` and :meth:`intersection_update`. ::
70
71   >>> S1 = sets.Set([1,2,3])
72   >>> S2 = sets.Set([4,5,6])
73   >>> S1.union(S2)
74   Set([1, 2, 3, 4, 5, 6])
75   >>> S1 | S2                  # Alternative notation
76   Set([1, 2, 3, 4, 5, 6])
77   >>> S1.intersection(S2)
78   Set([])
79   >>> S1 & S2                  # Alternative notation
80   Set([])
81   >>> S1.union_update(S2)
82   >>> S1
83   Set([1, 2, 3, 4, 5, 6])
84   >>>
85
86It's also possible to take the symmetric difference of two sets.  This is the
87set of all elements in the union that aren't in the intersection.  Another way
88of putting it is that the symmetric difference contains all elements that are in
89exactly one set.  Again, there's an alternative notation (``^``), and an
90in-place version with the ungainly name :meth:`symmetric_difference_update`. ::
91
92   >>> S1 = sets.Set([1,2,3,4])
93   >>> S2 = sets.Set([3,4,5,6])
94   >>> S1.symmetric_difference(S2)
95   Set([1, 2, 5, 6])
96   >>> S1 ^ S2
97   Set([1, 2, 5, 6])
98   >>>
99
100There are also :meth:`issubset` and :meth:`issuperset` methods for checking
101whether one set is a subset or superset of another::
102
103   >>> S1 = sets.Set([1,2,3])
104   >>> S2 = sets.Set([2,3])
105   >>> S2.issubset(S1)
106   True
107   >>> S1.issubset(S2)
108   False
109   >>> S1.issuperset(S2)
110   True
111   >>>
112
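Because an :class:`ImmutableSet` is hashable, it can serve as a dictionary key,
while a mutable :class:`Set` cannot.  The following is only a minimal sketch:
the ``distances`` dictionary, the city names, and the numbers are made up for
illustration, and the ``except ImportError`` fallback to the built-in ``set``
and ``frozenset`` types is an assumption added so the sketch also runs on later
interpreters that no longer ship the :mod:`sets` module::

   try:
       from sets import Set, ImmutableSet   # Python 2.3
   except ImportError:
       # Assumption: built-in stand-ins on interpreters without the sets module
       Set, ImmutableSet = set, frozenset

   distances = {}
   distances[ImmutableSet(['Amsterdam', 'Berlin'])] = 650   # made-up value

   # Element order does not matter, so an equal set finds the same entry.
   found = distances[ImmutableSet(['Berlin', 'Amsterdam'])]

   # A mutable set is not hashable and is rejected as a key.
   try:
       distances[Set(['Amsterdam', 'Paris'])] = 500
       rejected = False
   except TypeError:
       rejected = True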
113
114.. seealso::
115
116   :pep:`218` - Adding a Built-In Set Object Type
117      PEP written by Greg V. Wilson. Implemented by Greg V. Wilson, Alex Martelli, and
118      GvR.
119
120.. ======================================================================
121
122
123.. _section-generators:
124
125PEP 255: Simple Generators
126==========================
127
128In Python 2.2, generators were added as an optional feature, to be enabled by a
129``from __future__ import generators`` directive.  In 2.3 generators no longer
130need to be specially enabled, and are now always present; this means that
131:keyword:`yield` is now always a keyword.  The rest of this section is a copy of
132the description of generators from the "What's New in Python 2.2" document; if
133you read it back when Python 2.2 came out, you can skip the rest of this
134section.
135
136You're doubtless familiar with how function calls work in Python or C. When you
137call a function, it gets a private namespace where its local variables are
138created.  When the function reaches a :keyword:`return` statement, the local
139variables are destroyed and the resulting value is returned to the caller.  A
140later call to the same function will get a fresh new set of local variables.
141But, what if the local variables weren't thrown away on exiting a function?
142What if you could later resume the function where it left off?  This is what
143generators provide; they can be thought of as resumable functions.
144
145Here's the simplest example of a generator function::
146
147   def generate_ints(N):
148       for i in range(N):
149           yield i
150
151A new keyword, :keyword:`yield`, was introduced for generators.  Any function
152containing a :keyword:`!yield` statement is a generator function; this is
153detected by Python's bytecode compiler which compiles the function specially as
154a result.
155
156When you call a generator function, it doesn't return a single value; instead it
157returns a generator object that supports the iterator protocol.  On executing
158the :keyword:`yield` statement, the generator outputs the value of ``i``,
159similar to a :keyword:`return` statement.  The big difference between
160:keyword:`!yield` and a :keyword:`!return` statement is that on reaching a
161:keyword:`!yield` the generator's state of execution is suspended and local
162variables are preserved.  On the next call to the generator's ``.next()``
163method, the function will resume executing immediately after the
164:keyword:`!yield` statement.  (For complicated reasons, the :keyword:`!yield`
165statement isn't allowed inside the :keyword:`try` block of a
166:keyword:`!try`...\ :keyword:`!finally` statement; read :pep:`255` for a full
167explanation of the interaction between :keyword:`!yield` and exceptions.)
168
169Here's a sample usage of the :func:`generate_ints` generator::
170
171   >>> gen = generate_ints(3)
172   >>> gen
173   <generator object at 0x8117f90>
174   >>> gen.next()
175   0
176   >>> gen.next()
177   1
178   >>> gen.next()
179   2
180   >>> gen.next()
181   Traceback (most recent call last):
182     File "stdin", line 1, in ?
183     File "stdin", line 2, in generate_ints
184   StopIteration
185
186You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
187generate_ints(3)``.
188
189Inside a generator function, the :keyword:`return` statement can only be used
190without a value, and signals the end of the procession of values; afterwards the
191generator cannot return any further values. :keyword:`!return` with a value, such
192as ``return 5``, is a syntax error inside a generator function.  The end of the
193generator's results can also be indicated by raising :exc:`StopIteration`
194manually, or by just letting the flow of execution fall off the bottom of the
195function.
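
As a small illustration of ending a generator early, the sketch below uses a
bare :keyword:`!return` to stop producing values as soon as a negative number
is seen.  The function name ``until_negative`` and the sample data are
hypothetical::

   def until_negative(values):
       for v in values:
           if v < 0:
               return        # bare return: the generator is finished
           yield v

   result = list(until_negative([3, 1, -1, 4]))   # [3, 1]

The trailing ``4`` is never produced, because the generator has already
finished by the time the loop would reach it.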
196
197You could achieve the effect of generators manually by writing your own class
198and storing all the local variables of the generator as instance variables.  For
199example, returning a list of integers could be done by setting ``self.count`` to
2000, and having the :meth:`next` method increment ``self.count`` and return it.
201However, for a moderately complicated generator, writing a corresponding class
202would be much messier. :file:`Lib/test/test_generators.py` contains a number of
203more interesting examples.  The simplest one implements an in-order traversal of
204a tree using generators recursively. ::
205
206   # A recursive generator that generates Tree leaves in in-order.
207   def inorder(t):
208       if t:
209           for x in inorder(t.left):
210               yield x
211           yield t.label
212           for x in inorder(t.right):
213               yield x
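
To try the traversal out, some kind of tree node is needed.  The ``Tree`` class
below is a hypothetical minimal stand-in, not the class used in the test suite;
it provides only the ``left``, ``label``, and ``right`` attributes that
:func:`inorder` relies on::

   class Tree:
       # Hypothetical minimal node type with the attributes inorder() expects.
       def __init__(self, label, left=None, right=None):
           self.label = label
           self.left = left
           self.right = right

   def inorder(t):
       if t:
           for x in inorder(t.left):
               yield x
           yield t.label
           for x in inorder(t.right):
               yield x

   #       2
   #      / \
   #     1   3
   tree = Tree(2, Tree(1), Tree(3))
   labels = list(inorder(tree))   # [1, 2, 3]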
214
Two other examples in :file:`Lib/test/test_generators.py` produce solutions for
the N-Queens problem (placing *N* queens on an *N*\ x\ *N* chessboard so that no
queen threatens another) and the Knight's Tour (a route that takes a knight to
every square of an *N*\ x\ *N* chessboard without visiting any square twice).
219
220The idea of generators comes from other programming languages, especially Icon
221(https://www.cs.arizona.edu/icon/), where the idea of generators is central.  In
222Icon, every expression and function call behaves like a generator.  One example
223from "An Overview of the Icon Programming Language" at
224https://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks
225like::
226
227   sentence := "Store it in the neighboring harbor"
228   if (i := find("or", sentence)) > 5 then write(i)
229
230In Icon the :func:`find` function returns the indexes at which the substring
231"or" is found: 3, 23, 33.  In the :keyword:`if` statement, ``i`` is first
232assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon
233retries it with the second value of 23.  23 is greater than 5, so the comparison
234now succeeds, and the code prints the value 23 to the screen.
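
For comparison, a rough Python counterpart can be sketched with a generator.
The ``find`` function below is a hypothetical helper, not a built-in; it yields
1-based positions to match Icon's numbering, and the explicit :keyword:`for`
loop plays the role of Icon's automatic retrying::

   def find(sub, s):
       # Yield each 1-based position at which sub occurs in s.
       pos = s.find(sub)
       while pos != -1:
           yield pos + 1
           pos = s.find(sub, pos + 1)

   sentence = "Store it in the neighboring harbor"
   positions = list(find("or", sentence))      # [3, 23, 33]

   first = None
   for i in find("or", sentence):
       if i > 5:
           first = i     # 23, the first position greater than 5
           break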
235
236Python doesn't go nearly as far as Icon in adopting generators as a central
237concept.  Generators are considered part of the core Python language, but
238learning or using them isn't compulsory; if they don't solve any problems that
239you have, feel free to ignore them. One novel feature of Python's interface as
240compared to Icon's is that a generator's state is represented as a concrete
241object (the iterator) that can be passed around to other functions or stored in
242a data structure.
243
244
245.. seealso::
246
247   :pep:`255` - Simple Generators
248      Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland.  Implemented mostly
249      by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew.
250
251.. ======================================================================
252
253
254.. _section-encodings:
255
256PEP 263: Source Code Encodings
257==============================
258
259Python source files can now be declared as being in different character set
260encodings.  Encodings are declared by including a specially formatted comment in
261the first or second line of the source file.  For example, a UTF-8 file can be
262declared with::
263
264   #!/usr/bin/env python
265   # -*- coding: UTF-8 -*-
266
267Without such an encoding declaration, the default encoding used is 7-bit ASCII.
268Executing or importing modules that contain string literals with 8-bit
269characters and have no encoding declaration will result in a
270:exc:`DeprecationWarning` being signalled by Python 2.3; in 2.4 this will be a
271syntax error.
272
273The encoding declaration only affects Unicode string literals, which will be
274converted to Unicode using the specified encoding.  Note that Python identifiers
275are still restricted to ASCII characters, so you can't have variable names that
276use characters outside of the usual alphanumerics.
277
278
279.. seealso::
280
281   :pep:`263` - Defining Python Source Code Encodings
282      Written by Marc-André Lemburg and Martin von Löwis; implemented by Suzuki Hisao
283      and Martin von Löwis.
284
285.. ======================================================================
286
287
288PEP 273: Importing Modules from ZIP Archives
289============================================
290
291The new :mod:`zipimport` module adds support for importing modules from a
292ZIP-format archive.  You don't need to import the module explicitly; it will be
293automatically imported if a ZIP archive's filename is added to ``sys.path``.
294For example:
295
296.. code-block:: shell-session
297
298   amk@nyman:~/src/python$ unzip -l /tmp/example.zip
299   Archive:  /tmp/example.zip
300     Length     Date   Time    Name
301    --------    ----   ----    ----
302        8467  11-26-02 22:30   jwzthreading.py
303    --------                   -------
304        8467                   1 file
305   amk@nyman:~/src/python$ ./python
306   Python 2.3 (#1, Aug 1 2003, 19:54:32)
307   >>> import sys
308   >>> sys.path.insert(0, '/tmp/example.zip')  # Add .zip file to front of path
309   >>> import jwzthreading
310   >>> jwzthreading.__file__
311   '/tmp/example.zip/jwzthreading.py'
312   >>>
313
314An entry in ``sys.path`` can now be the filename of a ZIP archive. The ZIP
315archive can contain any kind of files, but only files named :file:`\*.py`,
316:file:`\*.pyc`, or :file:`\*.pyo` can be imported.  If an archive only contains
317:file:`\*.py` files, Python will not attempt to modify the archive by adding the
318corresponding :file:`\*.pyc` file, meaning that if a ZIP archive doesn't contain
319:file:`\*.pyc` files, importing may be rather slow.
320
321A path within the archive can also be specified to only import from a
322subdirectory; for example, the path :file:`/tmp/example.zip/lib/` would only
323import from the :file:`lib/` subdirectory within the archive.
324
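The subdirectory form can be tried without an existing archive.  The sketch
below builds a throwaway ZIP file and imports a module from its :file:`lib/`
subdirectory; the archive location, the module name ``zipdemo``, and its
``hello`` function are all made up for the example::

   import os
   import sys
   import tempfile
   import zipfile

   # Build a small archive holding one module inside a lib/ subdirectory.
   archive = os.path.join(tempfile.mkdtemp(), 'example.zip')
   zf = zipfile.ZipFile(archive, 'w')
   zf.writestr('lib/zipdemo.py',
               "def hello():\n    return 'hello from the archive'\n")
   zf.close()

   # Only the lib/ subdirectory of the archive is placed on the path.
   sys.path.insert(0, os.path.join(archive, 'lib'))

   import zipdemo
   greeting = zipdemo.hello()
   origin = zipdemo.__file__      # a path that points inside example.zip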
325
326.. seealso::
327
328   :pep:`273` - Import Modules from Zip Archives
      Written by James C. Ahlstrom, who also provided an implementation. Python 2.3
      follows the specification in :pep:`273`, but uses an implementation written by
      Just van Rossum that uses the import hooks described in :pep:`302`. See section
      :ref:`section-pep302` for a description of the new import hooks.
333
334.. ======================================================================
335
336
337PEP 277: Unicode file name support for Windows NT
338=================================================
339
340On Windows NT, 2000, and XP, the system stores file names as Unicode strings.
341Traditionally, Python has represented file names as byte strings, which is
342inadequate because it renders some file names inaccessible.
343
344Python now allows using arbitrary Unicode strings (within the limitations of the
345file system) for all functions that expect file names, most notably the
346:func:`open` built-in function. If a Unicode string is passed to
347:func:`os.listdir`, Python now returns a list of Unicode strings.  A new
348function, :func:`os.getcwdu`, returns the current directory as a Unicode string.
349
350Byte strings still work as file names, and on Windows Python will transparently
351convert them to Unicode using the ``mbcs`` encoding.
352
353Other systems also allow Unicode strings as file names but convert them to byte
354strings before passing them to the system, which can cause a :exc:`UnicodeError`
355to be raised. Applications can test whether arbitrary Unicode strings are
356supported as file names by checking :attr:`os.path.supports_unicode_filenames`,
357a Boolean value.
358
359Under MacOS, :func:`os.listdir` may now return Unicode filenames.
360
361
362.. seealso::
363
364   :pep:`277` - Unicode file name support for Windows NT
365      Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark
366      Hammond.
367
368.. ======================================================================
369
370
371.. index::
372   single: universal newlines; What's new
373
374PEP 278: Universal Newline Support
375==================================
376
The three major operating systems used today are Microsoft Windows, Apple's
Macintosh OS, and the various Unix derivatives.  A minor irritation of
cross-platform work is that these three platforms all use different characters to
mark the ends of lines in text files.  Unix uses the linefeed (ASCII character
10), MacOS uses the carriage return (ASCII character 13), and Windows uses a
two-character sequence of a carriage return plus a linefeed.
383
384Python's file objects can now support end of line conventions other than the
385one followed by the platform on which Python is running. Opening a file with
386the mode ``'U'`` or ``'rU'`` will open a file for reading in :term:`universal
387newlines` mode.  All three line ending conventions will be translated to a
388``'\n'`` in the strings returned by the various file methods such as
389:meth:`read` and :meth:`readline`.
390
391Universal newline support is also used when importing modules and when executing
392a file with the :func:`execfile` function.  This means that Python modules can
393be shared between all three operating systems without needing to convert the
394line-endings.
395
396This feature can be disabled when compiling Python by specifying the
397:option:`!--without-universal-newlines` switch when running Python's
398:program:`configure` script.
399
400
401.. seealso::
402
403   :pep:`278` - Universal Newline Support
404      Written and implemented by Jack Jansen.
405
406.. ======================================================================
407
408
409.. _section-enumerate:
410
411PEP 279: enumerate()
412====================
413
414A new built-in function, :func:`enumerate`, will make certain loops a bit
415clearer.  ``enumerate(thing)``, where *thing* is either an iterator or a
416sequence, returns an iterator that will return ``(0, thing[0])``, ``(1,
417thing[1])``, ``(2, thing[2])``, and so forth.
418
419A common idiom to change every element of a list looks like this::
420
421   for i in range(len(L)):
422       item = L[i]
423       # ... compute some result based on item ...
424       L[i] = result
425
426This can be rewritten using :func:`enumerate` as::
427
428   for i, item in enumerate(L):
429       # ... compute some result based on item ...
430       L[i] = result
431
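As a concrete instance of the pattern, the sketch below upper-cases every
string in a list in place and then shows :func:`enumerate` applied to an
iterator.  The list contents are arbitrary sample data::

   L = ['alpha', 'beta', 'gamma']
   for i, item in enumerate(L):
       L[i] = item.upper()
   # L is now ['ALPHA', 'BETA', 'GAMMA']

   # enumerate() also accepts an iterator, which cannot be indexed directly.
   numbered = list(enumerate(iter('ab')))   # [(0, 'a'), (1, 'b')]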
432
433.. seealso::
434
435   :pep:`279` - The enumerate() built-in function
436      Written and implemented by Raymond D. Hettinger.
437
438.. ======================================================================
439
440
441PEP 282: The logging Package
442============================
443
444A standard package for writing logs, :mod:`logging`, has been added to Python
4452.3.  It provides a powerful and flexible mechanism for generating logging
446output which can then be filtered and processed in various ways.  A
447configuration file written in a standard format can be used to control the
448logging behavior of a program.  Python includes handlers that will write log
449records to standard error or to a file or socket, send them to the system log,
450or even e-mail them to a particular address; of course, it's also possible to
451write your own handler classes.
452
453The :class:`Logger` class is the primary class. Most application code will deal
454with one or more :class:`Logger` objects, each one used by a particular
455subsystem of the application. Each :class:`Logger` is identified by a name, and
456names are organized into a hierarchy using ``.``  as the component separator.
457For example, you might have :class:`Logger` instances named ``server``,
458``server.auth`` and ``server.network``.  The latter two instances are below
459``server`` in the hierarchy.  This means that if you turn up the verbosity for
460``server`` or direct ``server`` messages to a different handler, the changes
461will also apply to records logged to ``server.auth`` and ``server.network``.
462There's also a root :class:`Logger` that's the parent of all other loggers.
463
For simple uses, the :mod:`logging` package contains some convenience functions
that always use the root logger::
466
467   import logging
468
469   logging.debug('Debugging information')
470   logging.info('Informational message')
471   logging.warning('Warning:config file %s not found', 'server.conf')
472   logging.error('Error occurred')
473   logging.critical('Critical error -- shutting down')
474
475This produces the following output::
476
477   WARNING:root:Warning:config file server.conf not found
478   ERROR:root:Error occurred
479   CRITICAL:root:Critical error -- shutting down
480
481In the default configuration, informational and debugging messages are
482suppressed and the output is sent to standard error.  You can enable the display
483of informational and debugging messages by calling the :meth:`setLevel` method
484on the root logger.
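
A minimal sketch of lowering the threshold on the root logger follows.  The
explicit ``basicConfig()`` call simply makes sure the root logger has a handler
before anything is logged, and the messages are the sample ones used earlier in
this section::

   import logging

   logging.basicConfig()            # attach a default handler to the root logger
   root = logging.getLogger()       # called without a name: the root logger
   root.setLevel(logging.DEBUG)

   logging.debug('Debugging information')    # now written to standard error
   logging.info('Informational message')     # likewise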
485
486Notice the :func:`warning` call's use of string formatting operators; all of the
487functions for logging messages take the arguments ``(msg, arg1, arg2, ...)`` and
488log the string resulting from ``msg % (arg1, arg2, ...)``.
489
490There's also an :func:`exception` function that records the most recent
491traceback.  Any of the other functions will also record the traceback if you
492specify a true value for the keyword argument *exc_info*. ::
493
494   def f():
495       try:    1/0
496       except: logging.exception('Problem recorded')
497
498   f()
499
500This produces the following output::
501
502   ERROR:root:Problem recorded
503   Traceback (most recent call last):
504     File "t.py", line 6, in f
505       1/0
506   ZeroDivisionError: integer division or modulo by zero
507
Slightly more advanced programs will use a logger other than the root logger.
The ``getLogger(name)`` function is used to get a particular logger, creating
it if it doesn't exist yet. ``getLogger(None)`` returns the root logger. ::
511
512   log = logging.getLogger('server')
513    ...
514   log.info('Listening on port %i', port)
515    ...
516   log.critical('Disk full')
517    ...
518
519Log records are usually propagated up the hierarchy, so a message logged to
520``server.auth`` is also seen by ``server`` and ``root``, but a :class:`Logger`
521can prevent this by setting its :attr:`propagate` attribute to :const:`False`.
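
Propagation can be observed with a small custom handler.  In the sketch below,
``ListHandler`` is a hypothetical handler class written for this example (it
only collects message strings in a list); the logger names are the ones used
above::

   import logging

   class ListHandler(logging.Handler):
       # Hypothetical handler that stores formatted messages in a list.
       def __init__(self):
           logging.Handler.__init__(self)
           self.messages = []

       def emit(self, record):
           self.messages.append(record.getMessage())

   server = logging.getLogger('server')
   server.setLevel(logging.INFO)
   server_handler = ListHandler()
   server.addHandler(server_handler)

   auth = logging.getLogger('server.auth')
   auth_handler = ListHandler()
   auth.addHandler(auth_handler)

   auth.info('login ok')           # handled by auth, then passed up to server
   auth.propagate = False
   auth.info('not forwarded')      # handled by auth only

After this runs, ``auth_handler.messages`` holds both messages, while
``server_handler.messages`` holds only the first one.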
522
523There are more classes provided by the :mod:`logging` package that can be
524customized.  When a :class:`Logger` instance is told to log a message, it
525creates a :class:`LogRecord` instance that is sent to any number of different
526:class:`Handler` instances.  Loggers and handlers can also have an attached list
527of filters, and each filter can cause the :class:`LogRecord` to be ignored or
528can modify the record before passing it along.  When they're finally output,
529:class:`LogRecord` instances are converted to text by a :class:`Formatter`
530class.  All of these classes can be replaced by your own specially-written
531classes.
532
533With all of these features the :mod:`logging` package should provide enough
534flexibility for even the most complicated applications.  This is only an
535incomplete overview of its features, so please see the package's reference
536documentation for all of the details.  Reading :pep:`282` will also be helpful.
537
538
539.. seealso::
540
541   :pep:`282` - A Logging System
542      Written by Vinay Sajip and Trent Mick; implemented by Vinay Sajip.
543
544.. ======================================================================
545
546
547.. _section-bool:
548
549PEP 285: A Boolean Type
550=======================
551
552A Boolean type was added to Python 2.3.  Two new constants were added to the
553:mod:`__builtin__` module, :const:`True` and :const:`False`.  (:const:`True` and
554:const:`False` constants were added to the built-ins in Python 2.2.1, but the
5552.2.1 versions are simply set to integer values of 1 and 0 and aren't a
556different type.)
557
558The type object for this new type is named :class:`bool`; the constructor for it
559takes any Python value and converts it to :const:`True` or :const:`False`. ::
560
561   >>> bool(1)
562   True
563   >>> bool(0)
564   False
565   >>> bool([])
566   False
567   >>> bool( (1,) )
568   True
569
570Most of the standard library modules and built-in functions have been changed to
571return Booleans. ::
572
573   >>> obj = []
574   >>> hasattr(obj, 'append')
575   True
576   >>> isinstance(obj, list)
577   True
578   >>> isinstance(obj, tuple)
579   False
580
581Python's Booleans were added with the primary goal of making code clearer.  For
582example, if you're reading a function and encounter the statement ``return 1``,
583you might wonder whether the ``1`` represents a Boolean truth value, an index,
584or a coefficient that multiplies some other quantity.  If the statement is
585``return True``, however, the meaning of the return value is quite clear.
586
587Python's Booleans were *not* added for the sake of strict type-checking.  A very
588strict language such as Pascal would also prevent you performing arithmetic with
589Booleans, and would require that the expression in an :keyword:`if` statement
590always evaluate to a Boolean result.  Python is not this strict and never will
591be, as :pep:`285` explicitly says.  This means you can still use any expression
592in an :keyword:`!if` statement, even ones that evaluate to a list or tuple or
593some random object.  The Boolean type is a subclass of the :class:`int` class so
594that arithmetic using a Boolean still works. ::
595
596   >>> True + 1
597   2
598   >>> False + 1
599   1
600   >>> False * 75
601   0
602   >>> True * 75
603   75
604
605To sum up :const:`True` and :const:`False` in a sentence: they're alternative
606ways to spell the integer values 1 and 0, with the single difference that
607:func:`str` and :func:`repr` return the strings ``'True'`` and ``'False'``
608instead of ``'1'`` and ``'0'``.
609
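The practical consequences of :class:`bool` being a subclass of :class:`int`
can be checked directly.  The following sketch uses arbitrary sample values; the
variable names are made up for the example::

   assert isinstance(True, int)          # bool is a subclass of int
   assert True == 1 and False == 0
   assert str(True) == 'True' and repr(False) == 'False'

   # Because Booleans are integers, they can be summed to count matches.
   values = [1, 2, 3, 4]
   big = sum([v > 2 for v in values])    # 2

   # True and 1 compare and hash equal, so they are the same dictionary key.
   names = {1: 'one'}
   same_key = names[True]                # 'one'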
610
611.. seealso::
612
613   :pep:`285` - Adding a bool type
614      Written and implemented by GvR.
615
616.. ======================================================================
617
618
619PEP 293: Codec Error Handling Callbacks
620=======================================
621
622When encoding a Unicode string into a byte string, unencodable characters may be
623encountered.  So far, Python has allowed specifying the error processing as
624either "strict" (raising :exc:`UnicodeError`), "ignore" (skipping the
625character), or "replace" (using a question mark in the output string), with
626"strict" being the default behavior. It may be desirable to specify alternative
627processing of such errors, such as inserting an XML character reference or HTML
628entity reference into the converted string.
629
630Python now has a flexible framework to add different processing strategies.  New
631error handlers can be added with :func:`codecs.register_error`, and codecs then
632can access the error handler with :func:`codecs.lookup_error`. An equivalent C
633API has been added for codecs written in C. The error handler gets the necessary
634state information such as the string being converted, the position in the string
635where the error was detected, and the target encoding.  The handler can then
636either raise an exception or return a replacement string.
637
638Two additional error handlers have been implemented using this framework:
639"backslashreplace" uses Python backslash quoting to represent unencodable
640characters and "xmlcharrefreplace" emits XML character references.
641
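The sketch below exercises the two new handlers and registers a custom one.
The handler name ``'codepoint'`` and the function ``codepoint_handler`` are
hypothetical, chosen for this example.  A handler receives the exception
object and returns a pair: the replacement string and the position at which
encoding should resume::

   import codecs

   text = u'caf\xe9'     # ends in an accented e, which ASCII cannot represent

   escaped = text.encode('ascii', 'backslashreplace')     # caf\xe9 spelled out
   as_xml = text.encode('ascii', 'xmlcharrefreplace')     # caf&#233;

   def codepoint_handler(exc):
       # Hypothetical handler: replace each unencodable character with <U+XXXX>.
       if not isinstance(exc, UnicodeEncodeError):
           raise exc
       bad = exc.object[exc.start:exc.end]
       replacement = u''.join([u'<U+%04X>' % ord(ch) for ch in bad])
       # Return the replacement and the position at which to resume encoding.
       return (replacement, exc.end)

   codecs.register_error('codepoint', codepoint_handler)
   custom = text.encode('ascii', 'codepoint')             # caf<U+00E9>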
642
643.. seealso::
644
645   :pep:`293` - Codec Error Handling Callbacks
646      Written and implemented by Walter Dörwald.
647
648.. ======================================================================
649
650
651.. _section-pep301:
652
653PEP 301: Package Index and Metadata for Distutils
654=================================================
655
656Support for the long-requested Python catalog makes its first appearance in 2.3.
657
The heart of the catalog is the new Distutils :command:`register` command.
Running ``python setup.py register`` will collect the metadata describing a
package, such as its name, version, maintainer, and description, and send it to
a central catalog server.  The resulting catalog is available from
https://pypi.org.
663
664To make the catalog a bit more useful, a new optional *classifiers* keyword
665argument has been added to the Distutils :func:`setup` function.  A list of
666`Trove <http://catb.org/~esr/trove/>`_-style strings can be supplied to help
667classify the software.
668
669Here's an example :file:`setup.py` with classifiers, written to be compatible
670with older versions of the Distutils::
671
672   from distutils import core
673   kw = {'name': "Quixote",
674         'version': "0.5.1",
675         'description': "A highly Pythonic Web application framework",
676         # ...
677         }
678
679   if (hasattr(core, 'setup_keywords') and
680       'classifiers' in core.setup_keywords):
       kw['classifiers'] = \
           ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
            'Environment :: No Input/Output (Daemon)',
            'Intended Audience :: Developers']
685
686   core.setup(**kw)
687
The full list of classifiers can be obtained by running ``python setup.py
register --list-classifiers``.
690
691
692.. seealso::
693
694   :pep:`301` - Package Index and Metadata for Distutils
695      Written and implemented by Richard Jones.
696
697.. ======================================================================
698
699
700.. _section-pep302:
701
702PEP 302: New Import Hooks
703=========================
704
While it's been possible to write custom import hooks ever since the
:mod:`ihooks` module was introduced in Python 1.3, no one has ever been really
happy with it because writing new import hooks is difficult and messy.  There
have been various proposed alternatives such as the :mod:`imputil` and :mod:`iu`
modules, but none of them has ever gained much acceptance, and none of them was
easily usable from C code.
711
:pep:`302` borrows ideas from its predecessors, especially from Gordon
McMillan's :mod:`iu` module.  Three new items are added to the :mod:`sys`
module:
715
716* ``sys.path_hooks`` is a list of callable objects; most  often they'll be
717  classes.  Each callable takes a string containing a path and either returns an
718  importer object that will handle imports from this path or raises an
719  :exc:`ImportError` exception if it can't handle this path.
720
721* ``sys.path_importer_cache`` caches importer objects for each path, so
722  ``sys.path_hooks`` will only need to be traversed once for each path.
723
724* ``sys.meta_path`` is a list of importer objects that will be traversed before
725  ``sys.path`` is checked.  This list is initially empty, but user code can add
726  objects to it.  Additional built-in and frozen modules can be imported by an
727  object added to this list.
728
729Importer objects must have a single method, ``find_module(fullname,
730path=None)``.  *fullname* will be a module or package name, e.g. ``string`` or
731``distutils.core``.  :meth:`find_module` must return a loader object that has a
732single method, ``load_module(fullname)``, that creates and returns the
733corresponding module object.
734
735Pseudo-code for Python's new import logic, therefore, looks something like this
736(simplified a bit; see :pep:`302` for the full details)::
737
   for mp in sys.meta_path:
       loader = mp.find_module(fullname)
       if loader is not None:
           <module> = loader.load_module(fullname)

   for path in sys.path:
       for hook in sys.path_hooks:
           try:
               importer = hook(path)
           except ImportError:
               # ImportError, so try the other path hooks
               pass
           else:
               loader = importer.find_module(fullname)
               if loader is not None:
                   <module> = loader.load_module(fullname)
753
754   # Not found!
755   raise ImportError
756
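As a rough illustration of the protocol, here is a minimal sketch of an importer
that is also its own loader.  The ``DictImporter`` class and the
``import_with()`` helper are hypothetical names invented for this example; the
helper only mimics the ``sys.meta_path`` loop of the pseudo-code and does not
register anything with the real import machinery.

```python
import types


class DictImporter:
    """Hypothetical importer that serves modules from a dict of source strings."""

    def __init__(self, sources):
        self.sources = sources

    def find_module(self, fullname, path=None):
        # Return a loader (here, the importer itself) or None if not handled.
        if fullname in self.sources:
            return self
        return None

    def load_module(self, fullname):
        # Create the module object and run its source in the module namespace.
        module = types.ModuleType(fullname)
        exec(self.sources[fullname], module.__dict__)
        return module


def import_with(meta_path, fullname):
    """Mimic the meta_path loop of the pseudo-code for a given importer list."""
    for candidate in meta_path:
        loader = candidate.find_module(fullname)
        if loader is not None:
            return loader.load_module(fullname)
    raise ImportError("No module named %s" % fullname)


importer = DictImporter({"greeting": "message = 'hello'"})
module = import_with([importer], "greeting")
print(module.message)
```

A real loader has further duties that this sketch skips, such as adding the new
module to ``sys.modules``; see :pep:`302` for the complete list.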
757
758.. seealso::
759
760   :pep:`302` - New Import Hooks
761      Written by Just van Rossum and Paul Moore. Implemented by Just van Rossum.
762
763.. ======================================================================
764
765
766.. _section-pep305:
767
768PEP 305: Comma-separated Files
769==============================
770
771Comma-separated files are a format frequently used for exporting data from
772databases and spreadsheets.  Python 2.3 adds a parser for comma-separated files.
773
774Comma-separated format is deceptively simple at first glance::
775
776   Costs,150,200,3.95
777
778Read a line and call ``line.split(',')``: what could be simpler? But toss in
779string data that can contain commas, and things get more complicated::
780
781   "Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"
782
A big ugly regular expression can parse this, but using the new :mod:`csv`
module is much simpler::
785
   import csv

   # Open in binary mode so the csv module handles line endings itself.
   infile = open('datafile', 'rb')
   reader = csv.reader(infile)
   for line in reader:
       # Each row is returned as a list of strings.
       print line
792
793The :func:`reader` function takes a number of different options. The field
794separator isn't limited to the comma and can be changed to any character, and so
795can the quoting and line-ending characters.
796
797Different dialects of comma-separated files can be defined and registered;
798currently there are two dialects, both used by Microsoft Excel. A separate
799:class:`csv.writer` class will generate comma-separated files from a succession
800of tuples or lists, quoting strings that contain the delimiter.
801
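The options mentioned above can be sketched as follows.  This example is
written for a current Python 3 interpreter rather than for 2.3, and it uses
``io.StringIO`` objects as stand-ins for real files; the sample data is made
up.

```python
import csv
import io

# A semicolon-delimited row that uses single quotes around a field
# containing the delimiter.
data = io.StringIO("Costs;150;200;'Includes taxes; shipping'\n")
rows = list(csv.reader(data, delimiter=';', quotechar="'"))
print(rows)

# By default csv.writer quotes only the fields that need it.
out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
writer.writerow(['Costs', 150, 'taxes, shipping'])
print(out.getvalue())
```

Note that the reader returns every field as a string; converting ``'150'`` to a
number is left to the caller.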
802
803.. seealso::
804
805   :pep:`305` - CSV File API
      Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip
      Montanaro, and Cliff Wells.
808
809.. ======================================================================
810
811
812.. _section-pep307:
813
814PEP 307: Pickle Enhancements
815============================
816
817The :mod:`pickle` and :mod:`cPickle` modules received some attention during the
8182.3 development cycle.  In 2.2, new-style classes could be pickled without
819difficulty, but they weren't pickled very compactly; :pep:`307` quotes a trivial
820example where a new-style class results in a pickled string three times longer
821than that for a classic class.
822
823The solution was to invent a new pickle protocol.  The :func:`pickle.dumps`
824function has supported a text-or-binary flag  for a long time.  In 2.3, this
825flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle
826format, 1 is the old binary format, and now 2 is a new 2.3-specific format.  A
827new constant, :const:`pickle.HIGHEST_PROTOCOL`, can be used to select the
828fanciest protocol available.
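
A short sketch of selecting a protocol follows.  The sample dictionary is made
up, and the sizes printed depend on the Python version, so no particular output
is claimed here.

```python
import pickle

data = {'name': 'spam', 'values': list(range(20))}

text_pickle = pickle.dumps(data, 0)      # old text-mode format
binary_pickle = pickle.dumps(data, 1)    # old binary format
new_pickle = pickle.dumps(data, 2)       # format added in 2.3
best_pickle = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# Every protocol round-trips to an equal object.
for pickled in (text_pickle, binary_pickle, new_pickle, best_pickle):
    assert pickle.loads(pickled) == data

# Compare how many bytes each protocol needed.
print(len(text_pickle), len(binary_pickle), len(new_pickle))
```

:func:`pickle.loads` detects the protocol on its own, so only the writing side
has to choose one.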
829
830Unpickling is no longer considered a safe operation.  2.2's :mod:`pickle`
831provided hooks for trying to prevent unsafe classes from being unpickled
832(specifically, a :attr:`__safe_for_unpickling__` attribute), but none of this
833code was ever audited and therefore it's all been ripped out in 2.3.  You should
834not unpickle untrusted data in any version of Python.
835
836To reduce the pickling overhead for new-style classes, a new interface for
837customizing pickling was added using three special methods:
838:meth:`__getstate__`, :meth:`__setstate__`, and :meth:`__getnewargs__`.  Consult
839:pep:`307` for the full semantics  of these methods.
840
841As a way to compress pickles yet further, it's now possible to use integer codes
842instead of long strings to identify pickled classes. The Python Software
843Foundation will maintain a list of standardized codes; there's also a range of
844codes for private use.  Currently no codes have been specified.
845
846
847.. seealso::
848
849   :pep:`307` - Extensions to the pickle protocol
850      Written and implemented  by Guido van Rossum and Tim Peters.
851
852.. ======================================================================
853
854
855.. _section-slices:
856
857Extended Slices
858===============
859
860Ever since Python 1.4, the slicing syntax has supported an optional third "step"
861or "stride" argument.  For example, these are all legal Python syntax:
862``L[1:10:2]``, ``L[:-1:1]``, ``L[::-1]``.  This was added to Python at the
863request of the developers of Numerical Python, which uses the third argument
864extensively.  However, Python's built-in list, tuple, and string sequence types
865have never supported this feature, raising a :exc:`TypeError` if you tried it.
866Michael Hudson contributed a patch to fix this shortcoming.
867
868For example, you can now easily extract the elements of a list that have even
869indexes::
870
871   >>> L = range(10)
872   >>> L[::2]
873   [0, 2, 4, 6, 8]
874
875Negative values also work to make a copy of the same list in reverse order::
876
877   >>> L[::-1]
878   [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
879
880This also works for tuples, arrays, and strings::
881
882   >>> s='abcd'
883   >>> s[::2]
884   'ac'
885   >>> s[::-1]
886   'dcba'
887
888If you have a mutable sequence such as a list or an array you can assign to or
889delete an extended slice, but there are some differences between assignment to
890extended and regular slices.  Assignment to a regular slice can be used to
891change the length of the sequence::
892
893   >>> a = range(3)
894   >>> a
895   [0, 1, 2]
896   >>> a[1:3] = [4, 5, 6]
897   >>> a
898   [0, 4, 5, 6]
899
900Extended slices aren't this flexible.  When assigning to an extended slice, the
901list on the right hand side of the statement must contain the same number of
902items as the slice it is replacing::
903
904   >>> a = range(4)
905   >>> a
906   [0, 1, 2, 3]
907   >>> a[::2]
908   [0, 2]
909   >>> a[::2] = [0, -1]
910   >>> a
911   [0, 1, -1, 3]
912   >>> a[::2] = [0,1,2]
913   Traceback (most recent call last):
914     File "<stdin>", line 1, in ?
915   ValueError: attempt to assign sequence of size 3 to extended slice of size 2
916
917Deletion is more straightforward::
918
919   >>> a = range(4)
920   >>> a
921   [0, 1, 2, 3]
922   >>> a[::2]
923   [0, 2]
924   >>> del a[::2]
925   >>> a
926   [1, 3]
927
928One can also now pass slice objects to the :meth:`__getitem__` methods of the
929built-in sequences::
930
931   >>> range(10).__getitem__(slice(0, 5, 2))
932   [0, 2, 4]
933
934Or use slice objects directly in subscripts::
935
936   >>> range(10)[slice(0, 5, 2)]
937   [0, 2, 4]
938
939To simplify implementing sequences that support extended slicing, slice objects
940now have a method ``indices(length)`` which, given the length of a sequence,
941returns a ``(start, stop, step)`` tuple that can be passed directly to
942:func:`range`. :meth:`indices` handles omitted and out-of-bounds indices in a
943manner consistent with regular slices (and this innocuous phrase hides a welter
944of confusing details!).  The method is intended to be used like this::
945
946   class FakeSeq:
947       ...
948       def calc_item(self, i):
949           ...
950       def __getitem__(self, item):
951           if isinstance(item, slice):
952               indices = item.indices(len(self))
953               return FakeSeq([self.calc_item(i) for i in range(*indices)])
954           else:
               return self.calc_item(item)
956
957From this example you can also see that the built-in :class:`slice` object is
958now the type object for the slice type, and is no longer a function.  This is
959consistent with Python 2.2, where :class:`int`, :class:`str`, etc., underwent
960the same change.
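
To make the behaviour of ``indices()`` more concrete, here is a small runnable
sketch.  The ``Squares`` class is a hypothetical example invented for this
purpose: it computes item *i* as ``i * i`` on demand, and for brevity it
returns a plain list for slices and does not handle negative integer indexes.

```python
class Squares:
    """Hypothetical lazy sequence whose item i is i squared."""

    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() fills in omitted values and clips out-of-range ones.
            start, stop, step = item.indices(len(self))
            return [i * i for i in range(start, stop, step)]
        if not 0 <= item < len(self):
            raise IndexError("index out of range")
        return item * item


sq = Squares(10)
reverse_bounds = slice(None, None, -2).indices(10)   # (9, -1, -2)
clipped_bounds = slice(2, 100, 3).indices(10)        # (2, 10, 3)
print(reverse_bounds)
print(clipped_bounds)
print(sq[::-3])       # [81, 36, 9, 0]
print(sq[2:100:3])    # [4, 25, 64]
```

The omitted bounds of a negative-step slice become ``9`` and ``-1``, and the
oversized stop value ``100`` is clipped to the sequence length, so the result
can always be passed straight to :func:`range`.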
961
962.. ======================================================================
963
964
965Other Language Changes
966======================
967
968Here are all of the changes that Python 2.3 makes to the core Python language.
969
970* The :keyword:`yield` statement is now always a keyword, as described in
971  section :ref:`section-generators` of this document.
972
973* A new built-in function :func:`enumerate` was added, as described in section
974  :ref:`section-enumerate` of this document.
975
* Two new constants, :const:`True` and :const:`False`, were added along with the
977  built-in :class:`bool` type, as described in section :ref:`section-bool` of this
978  document.
979
980* The :func:`int` type constructor will now return a long integer instead of
981  raising an :exc:`OverflowError` when a string or floating-point number is too
982  large to fit into an integer.  This can lead to the paradoxical result that
983  ``isinstance(int(expression), int)`` is false, but that seems unlikely to cause
984  problems in practice.
985
986* Built-in types now support the extended slicing syntax, as described in
987  section :ref:`section-slices` of this document.
988
989* A new built-in function, ``sum(iterable, start=0)``,  adds up the numeric
990  items in the iterable object and returns their sum.  :func:`sum` only accepts
991  numbers, meaning that you can't use it to concatenate a bunch of strings.
992  (Contributed by Alex Martelli.)
993
994* ``list.insert(pos, value)`` used to  insert *value* at the front of the list
995  when *pos* was negative.  The behaviour has now been changed to be consistent
996  with slice indexing, so when *pos* is -1 the value will be inserted before the
997  last element, and so forth.
998
999* ``list.index(value)``, which searches for *value*  within the list and returns
1000  its index, now takes optional  *start* and *stop* arguments to limit the search
1001  to  only part of the list.
1002
* Dictionaries have a new method, ``pop(key[, default])``, that returns
1004  the value corresponding to *key* and removes that key/value pair from the
1005  dictionary.  If the requested key isn't present in the dictionary, *default* is
1006  returned if it's specified and :exc:`KeyError` raised if it isn't. ::
1007
     >>> d = {1:2}
     >>> d
     {1: 2}
     >>> d.pop(4)
     Traceback (most recent call last):
       File "<stdin>", line 1, in ?
     KeyError: 4
     >>> d.pop(1)
     2
     >>> d.pop(1)
     Traceback (most recent call last):
       File "<stdin>", line 1, in ?
     KeyError: 'pop(): dictionary is empty'
     >>> d
     {}
1024
1025  There's also a new class method,  ``dict.fromkeys(iterable, value)``, that
1026  creates a dictionary with keys taken from the supplied iterator *iterable* and
1027  all values set to *value*, defaulting to ``None``.
1028
1029  (Patches contributed by Raymond Hettinger.)
1030
1031  Also, the :func:`dict` constructor now accepts keyword arguments to simplify
1032  creating small dictionaries::
1033
1034     >>> dict(red=1, blue=2, green=3, black=4)
1035     {'blue': 2, 'black': 4, 'green': 3, 'red': 1}
1036
1037  (Contributed by Just van Rossum.)
1038
1039* The :keyword:`assert` statement no longer checks the ``__debug__`` flag, so
1040  you can no longer disable assertions by assigning to ``__debug__``. Running
1041  Python with the :option:`-O` switch will still generate code that doesn't
1042  execute any assertions.
1043
1044* Most type objects are now callable, so you can use them to create new objects
1045  such as functions, classes, and modules.  (This means that the :mod:`new` module
1046  can be deprecated in a future Python version, because you can now use the type
1047  objects available in the :mod:`types` module.) For example, you can create a new
1048  module object with the following code:
1049
1050  ::
1051
1052     >>> import types
1053     >>> m = types.ModuleType('abc','docstring')
1054     >>> m
1055     <module 'abc' (built-in)>
1056     >>> m.__doc__
1057     'docstring'
1058
* A new warning, :exc:`PendingDeprecationWarning`, was added to indicate features
1060  which are in the process of being deprecated.  The warning will *not* be printed
1061  by default.  To check for use of features that will be deprecated in the future,
1062  supply :option:`-Walways::PendingDeprecationWarning:: <-W>` on the command line or
1063  use :func:`warnings.filterwarnings`.
1064
1065* The process of deprecating string-based exceptions, as in ``raise "Error
1066  occurred"``, has begun.  Raising a string will now trigger
1067  :exc:`PendingDeprecationWarning`.
1068
* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`.
  In a future version of Python, ``None`` may finally become a keyword.
1071
1072* The :meth:`xreadlines` method of file objects, introduced in Python 2.1, is no
1073  longer necessary because files now behave as their own iterator.
1074  :meth:`xreadlines` was originally introduced as a faster way to loop over all
1075  the lines in a file, but now you can simply write ``for line in file_obj``.
1076  File objects also have a new read-only :attr:`encoding` attribute that gives the
1077  encoding used by the file; Unicode strings written to the file will be
1078  automatically  converted to bytes using the given encoding.
1079
1080* The method resolution order used by new-style classes has changed, though
1081  you'll only notice the difference if you have a really complicated inheritance
1082  hierarchy.  Classic classes are unaffected by this change.  Python 2.2
1083  originally used a topological sort of a class's ancestors, but 2.3 now uses the
1084  C3 algorithm as described in the paper `"A Monotonic Superclass Linearization
1085  for Dylan" <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.3910>`_. To
1086  understand the motivation for this change,  read Michele Simionato's article
1087  `"Python 2.3 Method Resolution Order" <http://www.phyast.pitt.edu/~micheles/mro.html>`_, or
1088  read the thread on python-dev starting with the message at
1089  https://mail.python.org/pipermail/python-dev/2002-October/029035.html. Samuele
1090  Pedroni first pointed out the problem and also implemented the fix by coding the
1091  C3 algorithm.
1092
1093* Python runs multithreaded programs by switching between threads after
1094  executing N bytecodes.  The default value for N has been increased from 10 to
1095  100 bytecodes, speeding up single-threaded applications by reducing the
1096  switching overhead.  Some multithreaded applications may suffer slower response
1097  time, but that's easily fixed by setting the limit back to a lower number using
1098  ``sys.setcheckinterval(N)``. The limit can be retrieved with the new
1099  :func:`sys.getcheckinterval` function.
1100
1101* One minor but far-reaching change is that the names of extension types defined
1102  by the modules included with Python now contain the module and a ``'.'`` in
1103  front of the type name.  For example, in Python 2.2, if you created a socket and
1104  printed its :attr:`__class__`, you'd get this output::
1105
1106     >>> s = socket.socket()
1107     >>> s.__class__
1108     <type 'socket'>
1109
1110  In 2.3, you get this::
1111
1112     >>> s.__class__
1113     <type '_socket.socket'>
1114
1115* One of the noted incompatibilities between old- and new-style classes has been
1116  removed: you can now assign to the :attr:`~definition.__name__` and :attr:`~class.__bases__`
1117  attributes of new-style classes.  There are some restrictions on what can be
1118  assigned to :attr:`~class.__bases__` along the lines of those relating to assigning to
1119  an instance's :attr:`~instance.__class__` attribute.
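
Several of the smaller changes in the list above can be tried out together in
one short session.  The variable names and sample values below are made up for
illustration; these particular calls behave the same way in current Python
versions.

```python
# sum() adds numbers, optionally starting from a given value.
total = sum([1, 2, 3], 10)

# A negative position in list.insert() now counts from the end.
letters = ['a', 'b', 'd']
letters.insert(-1, 'c')

# list.index() can restrict the search to a range of positions.
values = [0, 1, 0, 1, 0, 1]
second_one = values.index(1, 2)
third_one = values.index(1, 4, 6)

# dict.pop() with a default never raises KeyError.
d = {1: 2}
missing = d.pop(4, 'not there')
present = d.pop(1, 'not there')

# dict.fromkeys() gives every key the same value.
flags = dict.fromkeys(['red', 'green'], False)

print(total)                    # 16
print(letters)                  # ['a', 'b', 'c', 'd']
print(second_one)               # 3
print(third_one)                # 5
print(missing)                  # not there
print(present)                  # 2
print(flags['red'])             # False
```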
1120
1121.. ======================================================================
1122
1123
1124String Changes
1125--------------
1126
1127* The :keyword:`in` operator now works differently for strings. Previously, when
1128  evaluating ``X in Y`` where *X* and *Y* are strings, *X* could only be a single
1129  character. That's now changed; *X* can be a string of any length, and ``X in Y``
1130  will return :const:`True` if *X* is a substring of *Y*.  If *X* is the empty
1131  string, the result is always :const:`True`. ::
1132
1133     >>> 'ab' in 'abcd'
1134     True
1135     >>> 'ad' in 'abcd'
1136     False
1137     >>> '' in 'abcd'
1138     True
1139
1140  Note that this doesn't tell you where the substring starts; if you need that
1141  information, use the :meth:`find` string method.
1142
1143* The :meth:`strip`, :meth:`lstrip`, and :meth:`rstrip` string methods now have
1144  an optional argument for specifying the characters to strip.  The default is
1145  still to remove all whitespace characters::
1146
1147     >>> '   abc '.strip()
1148     'abc'
1149     >>> '><><abc<><><>'.strip('<>')
1150     'abc'
1151     >>> '><><abc<><><>\n'.strip('<>')
1152     'abc<><><>\n'
1153     >>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
1154     u'\u4001abc'
1155     >>>
1156
1157  (Suggested by Simon Brunning and implemented by Walter Dörwald.)
1158
1159* The :meth:`startswith` and :meth:`endswith` string methods now accept negative
1160  numbers for the *start* and *end* parameters.
1161
1162* Another new string method is :meth:`zfill`, originally a function in the
1163  :mod:`string` module.  :meth:`zfill` pads a numeric string with zeros on the
1164  left until it's the specified width. Note that the ``%`` operator is still more
1165  flexible and powerful than :meth:`zfill`. ::
1166
1167     >>> '45'.zfill(4)
1168     '0045'
1169     >>> '12345'.zfill(4)
1170     '12345'
1171     >>> 'goofy'.zfill(6)
1172     '0goofy'
1173
1174  (Contributed by Walter Dörwald.)
1175
1176* A new type object, :class:`basestring`, has been added. Both 8-bit strings and
1177  Unicode strings inherit from this type, so ``isinstance(obj, basestring)`` will
1178  return :const:`True` for either kind of string.  It's a completely abstract
1179  type, so you can't create :class:`basestring` instances.
1180
1181* Interned strings are no longer immortal and will now be garbage-collected in
1182  the usual way when the only reference to them is from the internal dictionary of
1183  interned strings.  (Implemented by Oren Tirosh.)
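
The substring test and the negative *start* and *end* values described above
can be combined as in the following sketch.  The file name is a made-up sample
value.

```python
name = 'report-2003.txt'

has_year = '2003' in name                    # substring test: True
year_at = name.find('2003')                  # where it starts: 7

# Negative start/end values count from the end of the string.
ext_check = name.startswith('.txt', -4)      # True
stem_check = name.endswith('2003', 0, -4)    # True

trimmed = '--report--'.strip('-')            # 'report'
padded = '7'.zfill(3)                        # '007'

print(has_year)
print(year_at)
print(ext_check)
print(stem_check)
print(trimmed)
print(padded)
```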
1184
1185.. ======================================================================
1186
1187
1188Optimizations
1189-------------
1190
1191* The creation of new-style class instances has been made much faster; they're
1192  now faster than classic classes!
1193
1194* The :meth:`sort` method of list objects has been extensively rewritten by Tim
1195  Peters, and the implementation is significantly faster.
1196
1197* Multiplication of large long integers is now much faster thanks to an
1198  implementation of Karatsuba multiplication, an algorithm that scales better than
1199  the O(n\*n) required for the grade-school multiplication algorithm.  (Original
1200  patch by Christopher A. Craig, and significantly reworked by Tim Peters.)
1201
1202* The ``SET_LINENO`` opcode is now gone.  This may provide a small speed
1203  increase, depending on your compiler's idiosyncrasies. See section
1204  :ref:`23section-other` for a longer explanation. (Removed by Michael Hudson.)
1205
1206* :func:`xrange` objects now have their own iterator, making ``for i in
1207  xrange(n)`` slightly faster than ``for i in range(n)``.  (Patch by Raymond
1208  Hettinger.)
1209
1210* A number of small rearrangements have been made in various hotspots to improve
1211  performance, such as inlining a function or removing some code.  (Implemented
1212  mostly by GvR, but lots of people have contributed single changes.)
1213
1214The net result of the 2.3 optimizations is that Python 2.3 runs the  pystone
1215benchmark around 25% faster than Python 2.2.
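
The Karatsuba idea mentioned in the list above can be sketched in pure Python.
This is only an illustration for non-negative integers, not the C code that
Python actually uses.  The function name ``karatsuba`` and its *cutoff*
parameter are invented for this example, and ``int.bit_length()`` is only
available in Python versions much later than 2.3.

```python
def karatsuba(x, y, cutoff=1 << 64):
    """Multiply two non-negative integers with three recursive products."""
    # Small operands: fall back to the built-in multiplication.
    if x < cutoff or y < cutoff:
        return x * y

    # Split both numbers at about half the bit length of the larger one.
    n = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << n) - 1
    x_high, x_low = x >> n, x & mask
    y_high, y_low = y >> n, y & mask

    # Three half-size products instead of the four used by the
    # grade-school method.
    high = karatsuba(x_high, y_high, cutoff)
    low = karatsuba(x_low, y_low, cutoff)
    middle = karatsuba(x_high + x_low, y_high + y_low, cutoff) - high - low

    return (high << (2 * n)) + (middle << n) + low


a = 3 ** 500
b = 7 ** 400
print(karatsuba(a, b) == a * b)
```

Replacing four half-size multiplications with three is what lets the algorithm
scale better than the grade-school method as the operands grow.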
1216
1217.. ======================================================================
1218
1219
1220New, Improved, and Deprecated Modules
1221=====================================
1222
1223As usual, Python's standard library received a number of enhancements and bug
1224fixes.  Here's a partial list of the most notable changes, sorted alphabetically
1225by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
1226complete list of changes, or look through the CVS logs for all the details.
1227
1228* The :mod:`array` module now supports arrays of Unicode characters using the
1229  ``'u'`` format character.  Arrays also now support using the ``+=`` assignment
1230  operator to add another array's contents, and the ``*=`` assignment operator to
1231  repeat an array. (Contributed by Jason Orendorff.)
1232
1233* The :mod:`bsddb` module has been replaced by version 4.1.6 of the `PyBSDDB
1234  <http://pybsddb.sourceforge.net>`_ package, providing a more complete interface
1235  to the transactional features of the BerkeleyDB library.
1236
1237  The old version of the module has been renamed to  :mod:`bsddb185` and is no
1238  longer built automatically; you'll  have to edit :file:`Modules/Setup` to enable
1239  it.  Note that the new :mod:`bsddb` package is intended to be compatible with
1240  the  old module, so be sure to file bugs if you discover any incompatibilities.
1241  When upgrading to Python 2.3, if the new interpreter is compiled with a new
1242  version of  the underlying BerkeleyDB library, you will almost certainly have to
1243  convert your database files to the new version.  You can do this fairly easily
1244  with the new scripts :file:`db2pickle.py` and :file:`pickle2db.py` which you
1245  will find in the distribution's :file:`Tools/scripts` directory.  If you've
1246  already been using the PyBSDDB package and importing it as :mod:`bsddb3`, you
1247  will have to change your ``import`` statements to import it as :mod:`bsddb`.
1248
1249* The new :mod:`bz2` module is an interface to the bz2 data compression library.
1250  bz2-compressed data is usually smaller than  corresponding
1251  :mod:`zlib`\ -compressed data. (Contributed by Gustavo Niemeyer.)
1252
1253* A set of standard date/time types has been added in the new :mod:`datetime`
1254  module.  See the following section for more details.
1255
1256* The Distutils :class:`Extension` class now supports an extra constructor
1257  argument named *depends* for listing additional source files that an extension
1258  depends on.  This lets Distutils recompile the module if any of the dependency
1259  files are modified.  For example, if :file:`sampmodule.c` includes the header
1260  file :file:`sample.h`, you would create the :class:`Extension` object like
1261  this::
1262
1263     ext = Extension("samp",
1264                     sources=["sampmodule.c"],
1265                     depends=["sample.h"])
1266
1267  Modifying :file:`sample.h` would then cause the module to be recompiled.
1268  (Contributed by Jeremy Hylton.)
1269
1270* Other minor changes to Distutils: it now checks for the :envvar:`CC`,
1271  :envvar:`CFLAGS`, :envvar:`CPP`, :envvar:`LDFLAGS`, and :envvar:`CPPFLAGS`
1272  environment variables, using them to override the settings in Python's
1273  configuration (contributed by Robert Weber).
1274
1275* Previously the :mod:`doctest` module would only search the docstrings of
  public methods and functions for test cases, but it now examines private
1277  ones as well.  The :func:`DocTestSuite` function creates a
1278  :class:`unittest.TestSuite` object from a set of :mod:`doctest` tests.
1279
1280* The new ``gc.get_referents(object)`` function returns a list of all the
1281  objects referenced by *object*.
1282
1283* The :mod:`getopt` module gained a new function, :func:`gnu_getopt`, that
1284  supports the same arguments as the existing :func:`getopt` function but uses
1285  GNU-style scanning mode. The existing :func:`getopt` stops processing options as
1286  soon as a non-option argument is encountered, but in GNU-style mode processing
1287  continues, meaning that options and arguments can be mixed.  For example::
1288
1289     >>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
1290     ([('-f', 'filename')], ['output', '-v'])
1291     >>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
1292     ([('-f', 'filename'), ('-v', '')], ['output'])
1293
1294  (Contributed by Peter Åstrand.)
1295
* The :mod:`grp`, :mod:`pwd`, and :mod:`resource` modules now return enhanced
  tuples whose fields can also be accessed by attribute name::
1298
1299     >>> import grp
1300     >>> g = grp.getgrnam('amk')
1301     >>> g.gr_name, g.gr_gid
1302     ('amk', 500)
1303
1304* The :mod:`gzip` module can now handle files exceeding 2 GiB.
1305
1306* The new :mod:`heapq` module contains an implementation of a heap queue
1307  algorithm.  A heap is an array-like data structure that keeps items in a
1308  partially sorted order such that, for every index *k*, ``heap[k] <=
1309  heap[2*k+1]`` and ``heap[k] <= heap[2*k+2]``.  This makes it quick to remove the
1310  smallest item, and inserting a new item while maintaining the heap property is
1311  O(lg n).  (See https://xlinux.nist.gov/dads//HTML/priorityque.html for more
1312  information about the priority queue data structure.)
1313
1314  The :mod:`heapq` module provides :func:`heappush` and :func:`heappop` functions
1315  for adding and removing items while maintaining the heap property on top of some
1316  other mutable Python sequence type.  Here's an example that uses a Python list::
1317
1318     >>> import heapq
1319     >>> heap = []
1320     >>> for item in [3, 7, 5, 11, 1]:
1321     ...    heapq.heappush(heap, item)
1322     ...
1323     >>> heap
1324     [1, 3, 5, 11, 7]
1325     >>> heapq.heappop(heap)
1326     1
1327     >>> heapq.heappop(heap)
1328     3
1329     >>> heap
1330     [5, 7, 11]
1331
1332  (Contributed by Kevin O'Connor.)
1333
1334* The IDLE integrated development environment has been updated using the code
1335  from the IDLEfork project (http://idlefork.sourceforge.net).  The most notable feature is
1336  that the code being developed is now executed in a subprocess, meaning that
1337  there's no longer any need for manual ``reload()`` operations. IDLE's core code
1338  has been incorporated into the standard library as the :mod:`idlelib` package.
1339
1340* The :mod:`imaplib` module now supports IMAP over SSL. (Contributed by Piers
1341  Lauder and Tino Lange.)
1342
* The new :mod:`itertools` module contains a number of useful functions for use
  with iterators, inspired by various functions provided by the ML and Haskell
  languages.  For example, ``itertools.ifilter(predicate, iterator)`` returns all
  elements in the iterator for which the function :func:`predicate` returns
  :const:`True`, and ``itertools.repeat(obj, N)`` returns ``obj`` *N* times.
  There are a number of other functions in the module; see the module's reference
  documentation for details.
  (Contributed by Raymond Hettinger.)
1351
1352* Two new functions in the :mod:`math` module, ``degrees(rads)`` and
1353  ``radians(degs)``, convert between radians and degrees.  Other functions in
1354  the :mod:`math` module such as :func:`math.sin` and :func:`math.cos` have always
1355  required input values measured in radians.  Also, an optional *base* argument
1356  was added to :func:`math.log` to make it easier to compute logarithms for bases
1357  other than ``e`` and ``10``.  (Contributed by Raymond Hettinger.)
1358
1359* Several new POSIX functions (:func:`getpgid`, :func:`killpg`, :func:`lchown`,
  :func:`getloadavg`, :func:`major`, :func:`makedev`, :func:`minor`, and
1361  :func:`mknod`) were added to the :mod:`posix` module that underlies the
1362  :mod:`os` module. (Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S.
1363  Otkidach.)
1364
1365* In the :mod:`os` module, the :func:`\*stat` family of functions can now report
1366  fractions of a second in a timestamp.  Such time stamps are represented as
1367  floats, similar to the value returned by :func:`time.time`.
1368
1369  During testing, it was found that some applications will break if time stamps
  are floats.  For compatibility, when using the tuple interface of the
  :class:`stat_result`, time stamps will be represented as integers. When using
1372  named fields (a feature first introduced in Python 2.2), time stamps are still
1373  represented as integers, unless :func:`os.stat_float_times` is invoked to enable
1374  float return values::
1375
1376     >>> os.stat("/tmp").st_mtime
1377     1034791200
1378     >>> os.stat_float_times(True)
1379     >>> os.stat("/tmp").st_mtime
1380     1034791200.6335014
1381
1382  In Python 2.4, the default will change to always returning floats.
1383
1384  Application developers should enable this feature only if all their libraries
1385  work properly when confronted with floating point time stamps, or if they use
1386  the tuple API. If used, the feature should be activated on an application level
1387  instead of trying to enable it on a per-use basis.
1388
1389* The :mod:`optparse` module contains a new parser for command-line arguments
1390  that can convert option values to a particular Python type  and will
1391  automatically generate a usage message.  See the following section for  more
1392  details.
1393
1394* The old and never-documented :mod:`linuxaudiodev` module has been deprecated,
1395  and a new version named :mod:`ossaudiodev` has been added.  The module was
1396  renamed because the OSS sound drivers can be used on platforms other than Linux,
1397  and the interface has also been tidied and brought up to date in various ways.
1398  (Contributed by Greg Ward and Nicholas FitzRoy-Dale.)
1399
1400* The new :mod:`platform` module contains a number of functions that try to
1401  determine various properties of the platform you're running on.  There are
1402  functions for getting the architecture, CPU type, the Windows OS version, and
1403  even the Linux distribution version. (Contributed by Marc-André Lemburg.)
1404
1405* The parser objects provided by the :mod:`pyexpat` module can now optionally
1406  buffer character data, resulting in fewer calls to your character data handler
1407  and therefore faster performance.  Setting the parser object's
1408  :attr:`buffer_text` attribute to :const:`True` will enable buffering.
1409
1410* The ``sample(population, k)`` function was added to the :mod:`random`
1411  module.  *population* is a sequence or :class:`xrange` object containing the
1412  elements of a population, and :func:`sample` chooses *k* elements from the
1413  population without replacing chosen elements.  *k* can be any value up to
1414  ``len(population)``. For example::
1415
1416     >>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn']
1417     >>> random.sample(days, 3)      # Choose 3 elements
1418     ['St', 'Sn', 'Th']
1419     >>> random.sample(days, 7)      # Choose 7 elements
1420     ['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn']
1421     >>> random.sample(days, 7)      # Choose 7 again
1422     ['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th']
1423     >>> random.sample(days, 8)      # Can't choose eight
1424     Traceback (most recent call last):
1425       File "<stdin>", line 1, in ?
1426       File "random.py", line 414, in sample
1427           raise ValueError, "sample larger than population"
1428     ValueError: sample larger than population
1429     >>> random.sample(xrange(1,10000,2), 10)   # Choose ten odd nos. under 10000
1430     [3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195]
1431
1432  The :mod:`random` module now uses a new algorithm, the Mersenne Twister,
1433  implemented in C.  It's faster and more extensively studied than the previous
1434  algorithm.
1435
1436  (All changes contributed by Raymond Hettinger.)
1437
1438* The :mod:`readline` module also gained a number of new functions:
1439  :func:`get_history_item`, :func:`get_current_history_length`, and
1440  :func:`redisplay`.
1441
1442* The :mod:`rexec` and :mod:`Bastion` modules have been declared dead, and
1443  attempts to import them will fail with a :exc:`RuntimeError`.  New-style classes
1444  provide new ways to break out of the restricted execution environment provided
1445  by :mod:`rexec`, and no one has interest in fixing them or time to do so.  If
1446  you have applications using :mod:`rexec`, rewrite them to use something else.
1447
1448  (Sticking with Python 2.2 or 2.1 will not make your applications any safer
1449  because there are known bugs in the :mod:`rexec` module in those versions.  To
1450  repeat: if you're using :mod:`rexec`, stop using it immediately.)
1451
1452* The :mod:`rotor` module has been deprecated because the  algorithm it uses for
1453  encryption is not believed to be secure.  If you need encryption, use one of the
1454  several AES Python modules that are available separately.
1455
1456* The :mod:`shutil` module gained a ``move(src, dest)`` function that
1457  recursively moves a file or directory to a new location.
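
  A minimal sketch of :func:`move` follows.  The scratch directory and the file
  names are made up for illustration::

     import os
     import shutil
     import tempfile

     # Build a scratch directory holding one file (hypothetical names).
     base = tempfile.mkdtemp()
     os.mkdir(os.path.join(base, 'archive'))
     src = os.path.join(base, 'report.txt')
     dest = os.path.join(base, 'archive', 'report.txt')
     f = open(src, 'w')
     f.write('data\n')
     f.close()

     shutil.move(src, dest)      # the file now lives at the new path
     moved = os.path.exists(dest) and not os.path.exists(src)

     shutil.rmtree(base)         # clean up the scratch directory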
1458
* Support for more advanced POSIX signal handling was added to the :mod:`signal`
  module but then removed again as it proved impossible to make it work reliably
  across platforms.
1462
1463* The :mod:`socket` module now supports timeouts.  You can call the
1464  ``settimeout(t)`` method on a socket object to set a timeout of *t* seconds.
1465  Subsequent socket operations that take longer than *t* seconds to complete will
1466  abort and raise a :exc:`socket.timeout` exception.
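
  As a rough illustration, the sketch below sets a timeout on a listening socket
  and waits for a connection that never arrives.  The loopback address and the
  half-second timeout are arbitrary choices for the example::

     import socket

     server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
     server.bind(('127.0.0.1', 0))   # port 0: let the OS pick a free port
     server.listen(1)
     server.settimeout(0.5)          # give up after half a second

     try:
         try:
             server.accept()         # nobody connects, so this times out
             timed_out = False
         except socket.timeout:
             timed_out = True
     finally:
         server.close()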
1467
1468  The original timeout implementation was by Tim O'Malley.  Michael Gilfix
1469  integrated it into the Python :mod:`socket` module and shepherded it through a
1470  lengthy review.  After the code was checked in, Guido van Rossum rewrote parts
1471  of it.  (This is a good example of a collaborative development process in
1472  action.)
1473
1474* On Windows, the :mod:`socket` module now ships with Secure  Sockets Layer
1475  (SSL) support.
1476
1477* The value of the C :const:`PYTHON_API_VERSION` macro is now exposed at the
1478  Python level as ``sys.api_version``.  The current exception can be cleared by
1479  calling the new :func:`sys.exc_clear` function.
1480
1481* The new :mod:`tarfile` module  allows reading from and writing to
1482  :program:`tar`\ -format archive files. (Contributed by Lars Gustäbel.)
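
  The following sketch writes a small gzip-compressed archive and reads its
  member list back.  The file and archive names are invented for the example::

     import os
     import shutil
     import tarfile
     import tempfile

     base = tempfile.mkdtemp()
     member = os.path.join(base, 'hello.txt')
     f = open(member, 'w')
     f.write('hello\n')
     f.close()

     # Write an archive containing the one file.
     archive = os.path.join(base, 'example.tar.gz')
     tar = tarfile.open(archive, 'w:gz')
     tar.add(member, 'hello.txt')    # second argument: name inside the archive
     tar.close()

     # Reopen the archive and list its contents.
     tar = tarfile.open(archive, 'r:gz')
     names = tar.getnames()          # ['hello.txt']
     tar.close()

     shutil.rmtree(base)             # clean up the scratch directory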
1483
1484* The new :mod:`textwrap` module contains functions for wrapping strings
1485  containing paragraphs of text.  The ``wrap(text, width)`` function takes a
1486  string and returns a list containing the text split into lines of no more than
1487  the chosen width.  The ``fill(text, width)`` function returns a single
  string, reformatted to fit into lines no longer than the chosen width. (As you
  can guess, :func:`fill` is built on top of :func:`wrap`.)  For example::
1490
1491     >>> import textwrap
1492     >>> paragraph = "Not a whit, we defy augury: ... more text ..."
1493     >>> textwrap.wrap(paragraph, 60)
1494     ["Not a whit, we defy augury: there's a special providence in",
1495      "the fall of a sparrow. If it be now, 'tis not to come; if it",
1496      ...]
1497     >>> print textwrap.fill(paragraph, 35)
1498     Not a whit, we defy augury: there's
1499     a special providence in the fall of
1500     a sparrow. If it be now, 'tis not
1501     to come; if it be not to come, it
1502     will be now; if it be not now, yet
1503     it will come: the readiness is all.
1504     >>>
1505
1506  The module also contains a :class:`TextWrapper` class that actually implements
1507  the text wrapping strategy.   Both the :class:`TextWrapper` class and the
1508  :func:`wrap` and :func:`fill` functions support a number of additional keyword
1509  arguments for fine-tuning the formatting; consult the module's documentation
1510  for details. (Contributed by Greg Ward.)
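
  As a small sketch of the keyword arguments, the example below uses
  :class:`TextWrapper` to produce a bulleted paragraph with a hanging indent.
  The sample sentence and the width of 20 are arbitrary::

     import textwrap

     text = "The quick brown fox jumps over the lazy dog."
     wrapper = textwrap.TextWrapper(width=20,
                                    initial_indent='* ',
                                    subsequent_indent='  ')
     lines = wrapper.wrap(text)
     # lines is now:
     # ['* The quick brown', '  fox jumps over the', '  lazy dog.']

  The indentation strings count towards the width, so no line is longer than 20
  characters.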
1511
1512* The :mod:`thread` and :mod:`threading` modules now have companion modules,
1513  :mod:`dummy_thread` and :mod:`dummy_threading`, that provide a do-nothing
1514  implementation of the :mod:`thread` module's interface for platforms where
1515  threads are not supported.  The intention is to simplify thread-aware modules
1516  (ones that *don't* rely on threads to run) by putting the following code at the
1517  top::
1518
1519     try:
1520         import threading as _threading
1521     except ImportError:
1522         import dummy_threading as _threading
1523
1524  In this example, :mod:`_threading` is used as the module name to make it clear
1525  that the module being used is not necessarily the actual :mod:`threading`
1526  module. Code can call functions and use classes in :mod:`_threading` whether or
1527  not threads are supported, avoiding an :keyword:`if` statement and making the
1528  code slightly clearer.  This module will not magically make multithreaded code
1529  run without threads; code that waits for another thread to return or to do
1530  something will simply hang forever.
1531
1532* The :mod:`time` module's :func:`strptime` function has long been an annoyance
1533  because it uses the platform C library's :func:`strptime` implementation, and
1534  different platforms sometimes have odd bugs.  Brett Cannon contributed a
1535  portable implementation that's written in pure Python and should behave
1536  identically on all platforms.
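
  A short sketch of a call that should give the same result everywhere; the
  timestamp string is just sample data, and a purely numeric format is used so
  that the result doesn't depend on the locale::

     import time

     parsed = time.strptime('2003-07-29 14:30', '%Y-%m-%d %H:%M')
     fields = tuple(parsed[0:5])     # (2003, 7, 29, 14, 30)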
1537
1538* The new :mod:`timeit` module helps measure how long snippets of Python code
1539  take to execute.  The :file:`timeit.py` file can be run directly from the
1540  command line, or the module's :class:`Timer` class can be imported and used
1541  directly.  Here's a short example that figures out whether it's faster to
1542  convert an 8-bit string to Unicode by appending an empty Unicode string to it or
1543  by using the :func:`unicode` function::
1544
1545     import timeit
1546
1547     timer1 = timeit.Timer('unicode("abc")')
1548     timer2 = timeit.Timer('"abc" + u""')
1549
1550     # Run three trials
1551     print timer1.repeat(repeat=3, number=100000)
1552     print timer2.repeat(repeat=3, number=100000)
1553
1554     # On my laptop this outputs:
1555     # [0.36831796169281006, 0.37441694736480713, 0.35304892063140869]
1556     # [0.17574405670166016, 0.18193507194519043, 0.17565798759460449]
1557
1558* The :mod:`Tix` module has received various bug fixes and updates for the
1559  current version of the Tix package.
1560
1561* The :mod:`Tkinter` module now works with a thread-enabled  version of Tcl.
1562  Tcl's threading model requires that widgets only be accessed from the thread in
1563  which they're created; accesses from another thread can cause Tcl to panic.  For
1564  certain Tcl interfaces, :mod:`Tkinter` will now automatically avoid this  when a
1565  widget is accessed from a different thread by marshalling a command, passing it
1566  to the correct thread, and waiting for the results.  Other interfaces can't be
1567  handled automatically but :mod:`Tkinter` will now raise an exception on such an
1568  access so that you can at least find out about the problem.  See
1569  https://mail.python.org/pipermail/python-dev/2002-December/031107.html for a more
1570  detailed explanation of this change.  (Implemented by Martin von Löwis.)
1571
1572* Calling Tcl methods through :mod:`_tkinter` no longer  returns only strings.
1573  Instead, if Tcl returns other objects those objects are converted to their
1574  Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
1575  object if no Python equivalent exists. This behavior can be controlled through
1576  the :meth:`wantobjects` method of :class:`tkapp` objects.
1577
1578  When using :mod:`_tkinter` through the :mod:`Tkinter` module (as most Tkinter
1579  applications will), this feature is always activated. It should not cause
1580  compatibility problems, since Tkinter would always convert string results to
1581  Python types where possible.
1582
1583  If any incompatibilities are found, the old behavior can be restored by setting
1584  the :attr:`wantobjects` variable in the :mod:`Tkinter` module to false before
1585  creating the first :class:`tkapp` object. ::
1586
1587     import Tkinter
1588     Tkinter.wantobjects = 0
1589
1590  Any breakage caused by this change should be reported as a bug.
1591
1592* The :mod:`UserDict` module has a new :class:`DictMixin` class which defines
1593  all dictionary methods for classes that already have a minimum mapping
1594  interface.  This greatly simplifies writing classes that need to be
1595  substitutable for dictionaries, such as the classes in  the :mod:`shelve`
1596  module.
1597
1598  Adding the mix-in as a superclass provides the full dictionary interface
1599  whenever the class defines :meth:`__getitem__`, :meth:`__setitem__`,
1600  :meth:`__delitem__`, and :meth:`keys`. For example::
1601
1602     >>> import UserDict
1603     >>> class SeqDict(UserDict.DictMixin):
1604     ...     """Dictionary lookalike implemented with lists."""
1605     ...     def __init__(self):
1606     ...         self.keylist = []
1607     ...         self.valuelist = []
1608     ...     def __getitem__(self, key):
1609     ...         try:
1610     ...             i = self.keylist.index(key)
1611     ...         except ValueError:
1612     ...             raise KeyError
1613     ...         return self.valuelist[i]
1614     ...     def __setitem__(self, key, value):
1615     ...         try:
1616     ...             i = self.keylist.index(key)
1617     ...             self.valuelist[i] = value
1618     ...         except ValueError:
1619     ...             self.keylist.append(key)
1620     ...             self.valuelist.append(value)
1621     ...     def __delitem__(self, key):
1622     ...         try:
1623     ...             i = self.keylist.index(key)
1624     ...         except ValueError:
1625     ...             raise KeyError
1626     ...         self.keylist.pop(i)
1627     ...         self.valuelist.pop(i)
1628     ...     def keys(self):
1629     ...         return list(self.keylist)
1630     ...
1631     >>> s = SeqDict()
1632     >>> dir(s)      # See that other dictionary methods are implemented
1633     ['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__',
1634      '__init__', '__iter__', '__len__', '__module__', '__repr__',
1635      '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems',
1636      'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem',
1637      'setdefault', 'update', 'valuelist', 'values']
1638
1639  (Contributed by Raymond Hettinger.)
1640
1641* The DOM implementation in :mod:`xml.dom.minidom` can now generate XML output
1642  in a particular encoding by providing an optional encoding argument to the
1643  :meth:`toxml` and :meth:`toprettyxml` methods of DOM nodes.
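
  For instance, a minimal sketch that serializes a one-element document (the
  element name is made up) with an explicit encoding::

     from xml.dom import minidom

     doc = minidom.parseString('<greeting>hello</greeting>')
     xml_text = doc.toxml('utf-8')
     # The XML declaration at the start of xml_text now names the encoding.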
1644
1645* The :mod:`xmlrpclib` module now supports an XML-RPC extension for handling nil
1646  data values such as Python's ``None``.  Nil values are always supported on
1647  unmarshalling an XML-RPC response.  To generate requests containing ``None``,
1648  you must supply a true value for the *allow_none* parameter when creating a
1649  :class:`Marshaller` instance.
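
  A minimal sketch of the difference follows.  The fallback import is an
  assumption that covers later interpreters in which the module is named
  ``xmlrpc.client``::

     try:
         import xmlrpclib                      # module name in Python 2.3
     except ImportError:
         import xmlrpc.client as xmlrpclib     # assumed name on later versions

     # With allow_none enabled, None is marshalled as a <nil/> element.
     m = xmlrpclib.Marshaller(allow_none=True)
     params = m.dumps((None,))

     # Without it, marshalling None is refused.
     try:
         xmlrpclib.Marshaller().dumps((None,))
         rejected = False
     except TypeError:
         rejected = True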
1650
1651* The new :mod:`DocXMLRPCServer` module allows writing self-documenting XML-RPC
1652  servers. Run it in demo mode (as a program) to see it in action.   Pointing the
1653  Web browser to the RPC server produces pydoc-style documentation; pointing
1654  xmlrpclib to the server allows invoking the actual methods. (Contributed by
1655  Brian Quinlan.)
1656
1657* Support for internationalized domain names (RFCs 3454, 3490, 3491, and 3492)
1658  has been added. The "idna" encoding can be used to convert between a Unicode
1659  domain name and the ASCII-compatible encoding (ACE) of that name. ::
1660
     >>> u"www.Alliancefrançaise.nu".encode("idna")
1662     'www.xn--alliancefranaise-npb.nu'
1663
1664  The :mod:`socket` module has also been extended to transparently convert
1665  Unicode hostnames to the ACE version before passing them to the C library.
  Modules that deal with hostnames (such as :mod:`httplib` and :mod:`ftplib`)
1667  also support Unicode host names; :mod:`httplib` also sends HTTP ``Host``
1668  headers using the ACE version of the domain name.  :mod:`urllib` supports
1669  Unicode URLs with non-ASCII host names as long as the ``path`` part of the URL
1670  is ASCII only.
1671
1672  To implement this change, the :mod:`stringprep` module, the  ``mkstringprep``
1673  tool and the ``punycode`` encoding have been added.
1674
1675.. ======================================================================
1676
1677
1678Date/Time Type
1679--------------
1680
1681Date and time types suitable for expressing timestamps were added as the
1682:mod:`datetime` module.  The types don't support different calendars or many
1683fancy features, and just stick to the basics of representing time.
1684
1685The three primary types are: :class:`date`, representing a day, month, and year;
1686:class:`~datetime.time`, consisting of hour, minute, and second; and :class:`~datetime.datetime`,
1687which contains all the attributes of both :class:`date` and :class:`~datetime.time`.
1688There's also a :class:`timedelta` class representing differences between two
1689points in time, and time zone logic is implemented by classes inheriting from
1690the abstract :class:`tzinfo` class.
1691
1692You can create instances of :class:`date` and :class:`~datetime.time` by either supplying
1693keyword arguments to the appropriate constructor, e.g.
1694``datetime.date(year=1972, month=10, day=15)``, or by using one of a number of
1695class methods.  For example, the :meth:`date.today` class method returns the
1696current local date.
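
The different ways of constructing instances can be sketched as follows; the
particular dates and times are arbitrary sample values::

   import datetime

   d1 = datetime.date(year=1972, month=10, day=15)   # keyword arguments
   d2 = datetime.date.today()                        # current local date
   d3 = datetime.date.fromtimestamp(0)               # from a POSIX timestamp
   t = datetime.time(hour=9, minute=30)              # seconds default to 0
   dt = datetime.datetime.combine(d1, t)             # date + time -> datetime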
1697
1698Once created, instances of the date/time classes are all immutable. There are a
1699number of methods for producing formatted strings from objects::
1700
1701   >>> import datetime
1702   >>> now = datetime.datetime.now()
1703   >>> now.isoformat()
1704   '2002-12-30T21:27:03.994956'
1705   >>> now.ctime()  # Only available on date, datetime
1706   'Mon Dec 30 21:27:03 2002'
1707   >>> now.strftime('%Y %d %b')
1708   '2002 30 Dec'
1709
1710The :meth:`replace` method allows modifying one or more fields  of a
1711:class:`date` or :class:`~datetime.datetime` instance, returning a new instance::
1712
1713   >>> d = datetime.datetime.now()
1714   >>> d
1715   datetime.datetime(2002, 12, 30, 22, 15, 38, 827738)
1716   >>> d.replace(year=2001, hour = 12)
1717   datetime.datetime(2001, 12, 30, 12, 15, 38, 827738)
1718   >>>
1719
1720Instances can be compared, hashed, and converted to strings (the result is the
1721same as that of :meth:`isoformat`).  :class:`date` and :class:`~datetime.datetime`
1722instances can be subtracted from each other, and added to :class:`timedelta`
1723instances.  The largest missing feature is that there's no standard library
1724support for parsing strings and getting back a :class:`date` or
1725:class:`~datetime.datetime`.
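
As a brief sketch of the arithmetic and comparison operations, using two
arbitrary dates::

   import datetime

   start = datetime.date(2002, 12, 30)
   end = datetime.date(2003, 7, 29)

   gap = end - start                             # a timedelta; gap.days is 211
   later = end + datetime.timedelta(days=30)     # datetime.date(2003, 8, 28)
   in_order = start < end                        # True
   as_text = str(end)                            # '2003-07-29', same as isoformat()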
1726
1727For more information, refer to the module's reference documentation.
1728(Contributed by Tim Peters.)
1729
1730.. ======================================================================
1731
1732
1733The optparse Module
1734-------------------
1735
1736The :mod:`getopt` module provides simple parsing of command-line arguments.  The
1737new :mod:`optparse` module (originally named Optik) provides more elaborate
1738command-line parsing that follows the Unix conventions, automatically creates
1739the output for :option:`!--help`, and can perform different actions for different
1740options.
1741
1742You start by creating an instance of :class:`OptionParser` and telling it what
1743your program's options are. ::
1744
1745   import sys
1746   from optparse import OptionParser
1747
1748   op = OptionParser()
1749   op.add_option('-i', '--input',
1750                 action='store', type='string', dest='input',
1751                 help='set input filename')
1752   op.add_option('-l', '--length',
1753                 action='store', type='int', dest='length',
1754                 help='set maximum length of output')
1755
1756Parsing a command line is then done by calling the :meth:`parse_args` method. ::
1757
1758   options, args = op.parse_args(sys.argv[1:])
1759   print options
1760   print args
1761
1762This returns an object containing all of the option values, and a list of
1763strings containing the remaining arguments.
1764
1765Invoking the script with the various arguments now works as you'd expect it to.
1766Note that the length argument is automatically converted to an integer.
1767
1768.. code-block:: shell-session
1769
1770   $ ./python opt.py -i data arg1
1771   <Values at 0x400cad4c: {'input': 'data', 'length': None}>
1772   ['arg1']
1773   $ ./python opt.py --input=data --length=4
1774   <Values at 0x400cad2c: {'input': 'data', 'length': 4}>
1775   []
1776   $
1777
1778The help message is automatically generated for you:
1779
1780.. code-block:: shell-session
1781
1782   $ ./python opt.py --help
1783   usage: opt.py [options]
1784
1785   options:
1786     -h, --help            show this help message and exit
1787     -iINPUT, --input=INPUT
1788                           set input filename
1789     -lLENGTH, --length=LENGTH
1790                           set maximum length of output
1791   $
1792
1793See the module's documentation for more details.
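
As a sketch of how different options can perform different actions, the example
below defines a flag and a repeatable option.  The option names are hypothetical
and an explicit argument list is parsed instead of ``sys.argv[1:]``::

   from optparse import OptionParser

   op = OptionParser()
   op.add_option('-v', '--verbose',
                 action='store_true', dest='verbose', default=False,
                 help='print progress messages')
   op.add_option('-I', '--include',
                 action='append', type='string', dest='includes',
                 help='add a directory to the search path')

   options, args = op.parse_args(['-v', '-I', 'lib', '--include=src', 'main.c'])
   # options.verbose is True, options.includes is ['lib', 'src'],
   # and args is ['main.c'].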
1794
1795
1796Optik was written by Greg Ward, with suggestions from the readers of the Getopt
1797SIG.
1798
1799.. ======================================================================
1800
1801
1802.. _section-pymalloc:
1803
1804Pymalloc: A Specialized Object Allocator
1805========================================
1806
1807Pymalloc, a specialized object allocator written by Vladimir Marangozov, was a
1808feature added to Python 2.1.  Pymalloc is intended to be faster than the system
1809:c:func:`malloc` and to have less memory overhead for allocation patterns typical
1810of Python programs. The allocator uses C's :c:func:`malloc` function to get large
1811pools of memory and then fulfills smaller memory requests from these pools.
1812
1813In 2.1 and 2.2, pymalloc was an experimental feature and wasn't enabled by
1814default; you had to explicitly enable it when compiling Python by providing the
1815:option:`!--with-pymalloc` option to the :program:`configure` script.  In 2.3,
1816pymalloc has had further enhancements and is now enabled by default; you'll have
1817to supply :option:`!--without-pymalloc` to disable it.
1818
1819This change is transparent to code written in Python; however, pymalloc may
1820expose bugs in C extensions.  Authors of C extension modules should test their
1821code with pymalloc enabled, because some incorrect code may cause core dumps at
1822runtime.
1823
1824There's one particularly common error that causes problems.  There are a number
1825of memory allocation functions in Python's C API that have previously just been
1826aliases for the C library's :c:func:`malloc` and :c:func:`free`, meaning that if
1827you accidentally called mismatched functions the error wouldn't be noticeable.
1828When the object allocator is enabled, these functions aren't aliases of
1829:c:func:`malloc` and :c:func:`free` any more, and calling the wrong function to
1830free memory may get you a core dump.  For example, if memory was allocated using
1831:c:func:`PyObject_Malloc`, it has to be freed using :c:func:`PyObject_Free`, not
1832:c:func:`free`.  A few modules included with Python fell afoul of this and had to
1833be fixed; doubtless there are more third-party modules that will have the same
1834problem.
1835
1836As part of this change, the confusing multiple interfaces for allocating memory
1837have been consolidated down into two API families. Memory allocated with one
1838family must not be manipulated with functions from the other family.  There is
1839one family for allocating chunks of memory and another family of functions
1840specifically for allocating Python objects.
1841
1842* To allocate and free an undistinguished chunk of memory use the "raw memory"
1843  family: :c:func:`PyMem_Malloc`, :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free`.
1844
1845* The "object memory" family is the interface to the pymalloc facility described
1846  above and is biased towards a large number of "small" allocations:
1847  :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and :c:func:`PyObject_Free`.
1848
* To allocate and free Python objects, use the "object" family:
1850  :c:func:`PyObject_New`, :c:func:`PyObject_NewVar`, and :c:func:`PyObject_Del`.
1851
1852Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides debugging
1853features to catch memory overwrites and doubled frees in both extension modules
1854and in the interpreter itself.  To enable this support, compile a debugging
1855version of the Python interpreter by running :program:`configure` with
1856:option:`!--with-pydebug`.
1857
1858To aid extension writers, a header file :file:`Misc/pymemcompat.h` is
1859distributed with the source to Python 2.3 that allows Python extensions to use
1860the 2.3 interfaces to memory allocation while compiling against any version of
1861Python since 1.5.2.  You would copy the file from Python's source distribution
1862and bundle it with the source of your extension.
1863
1864
1865.. seealso::
1866
1867   https://hg.python.org/cpython/file/default/Objects/obmalloc.c
1868      For the full details of the pymalloc implementation, see the comments at
1869      the top of the file :file:`Objects/obmalloc.c` in the Python source code.
      The above link points to the file within the python.org source browser.
1871
1872.. ======================================================================
1873
1874
1875Build and C API Changes
1876=======================
1877
1878Changes to Python's build process and to the C API include:
1879
1880* The cycle detection implementation used by the garbage collection has proven
1881  to be stable, so it's now been made mandatory.  You can no longer compile Python
1882  without it, and the :option:`!--with-cycle-gc` switch to :program:`configure` has
1883  been removed.
1884
1885* Python can now optionally be built as a shared library
1886  (:file:`libpython2.3.so`) by supplying :option:`!--enable-shared` when running
1887  Python's :program:`configure` script.  (Contributed by Ondrej Palkovsky.)
1888
1889* The :c:macro:`DL_EXPORT` and :c:macro:`DL_IMPORT` macros are now deprecated.
1890  Initialization functions for Python extension modules should now be declared
1891  using the new macro :c:macro:`PyMODINIT_FUNC`, while the Python core will
1892  generally use the :c:macro:`PyAPI_FUNC` and :c:macro:`PyAPI_DATA` macros.
1893
1894* The interpreter can be compiled without any docstrings for the built-in
1895  functions and modules by supplying :option:`!--without-doc-strings` to the
1896  :program:`configure` script. This makes the Python executable about 10% smaller,
1897  but will also mean that you can't get help for Python's built-ins.  (Contributed
1898  by Gustavo Niemeyer.)
1899
1900* The :c:func:`PyArg_NoArgs` macro is now deprecated, and code that uses it
1901  should be changed.  For Python 2.2 and later, the method definition table can
1902  specify the :const:`METH_NOARGS` flag, signalling that there are no arguments,
1903  and the argument checking can then be removed.  If compatibility with pre-2.2
1904  versions of Python is important, the code could use ``PyArg_ParseTuple(args,
1905  "")`` instead, but this will be slower than using :const:`METH_NOARGS`.
1906
1907* :c:func:`PyArg_ParseTuple` accepts new format characters for various sizes of
1908  unsigned integers: ``B`` for :c:type:`unsigned char`, ``H`` for :c:type:`unsigned
1909  short int`,  ``I`` for :c:type:`unsigned int`,  and ``K`` for :c:type:`unsigned
1910  long long`.
1911
* A new function, ``PyObject_DelItemString(mapping, char *key)``, was added
  as shorthand for ``PyObject_DelItem(mapping, PyString_FromString(key))``.
1914
1915* File objects now manage their internal string buffer differently, increasing
1916  it exponentially when needed.  This results in the benchmark tests in
1917  :file:`Lib/test/test_bufio.py` speeding up considerably (from 57 seconds to 1.7
1918  seconds, according to one measurement).
1919
1920* It's now possible to define class and static methods for a C extension type by
1921  setting either the :const:`METH_CLASS` or :const:`METH_STATIC` flags in a
1922  method's :c:type:`PyMethodDef` structure.
1923
1924* Python now includes a copy of the Expat XML parser's source code, removing any
1925  dependence on a system version or local installation of Expat.
1926
1927* If you dynamically allocate type objects in your extension, you should be
1928  aware of a change in the rules relating to the :attr:`__module__` and
1929  :attr:`~definition.__name__` attributes.  In summary, you will want to ensure the type's
1930  dictionary contains a ``'__module__'`` key; making the module name the part of
1931  the type name leading up to the final period will no longer have the desired
1932  effect.  For more detail, read the API reference documentation or the  source.
1933
1934.. ======================================================================
1935
1936
1937Port-Specific Changes
1938---------------------
1939
1940Support for a port to IBM's OS/2 using the EMX runtime environment was merged
1941into the main Python source tree.  EMX is a POSIX emulation layer over the OS/2
1942system APIs.  The Python port for EMX tries to support all the POSIX-like
1943capability exposed by the EMX runtime, and mostly succeeds; :func:`fork` and
1944:func:`fcntl` are restricted by the limitations of the underlying emulation
1945layer.  The standard OS/2 port, which uses IBM's Visual Age compiler, also
1946gained support for case-sensitive import semantics as part of the integration of
1947the EMX port into CVS.  (Contributed by Andrew MacIntyre.)
1948
1949On MacOS, most toolbox modules have been weaklinked to improve backward
1950compatibility.  This means that modules will no longer fail to load if a single
1951routine is missing on the current OS version. Instead calling the missing
1952routine will raise an exception. (Contributed by Jack Jansen.)
1953
1954The RPM spec files, found in the :file:`Misc/RPM/` directory in the Python
1955source distribution, were updated for 2.3.  (Contributed by Sean Reifschneider.)
1956
1957Other new platforms now supported by Python include AtheOS
1958(http://www.atheos.cx/), GNU/Hurd, and OpenVMS.
1959
1960.. ======================================================================
1961
1962
1963.. _23section-other:
1964
1965Other Changes and Fixes
1966=======================
1967
1968As usual, there were a bunch of other improvements and bugfixes scattered
1969throughout the source tree.  A search through the CVS change logs finds there
1970were 523 patches applied and 514 bugs fixed between Python 2.2 and 2.3.  Both
1971figures are likely to be underestimates.
1972
1973Some of the more notable changes are:
1974
1975* If the :envvar:`PYTHONINSPECT` environment variable is set, the Python
1976  interpreter will enter the interactive prompt after running a Python program, as
1977  if Python had been invoked with the :option:`-i` option. The environment
1978  variable can be set before running the Python interpreter, or it can be set by
1979  the Python program as part of its execution.
1980
1981* The :file:`regrtest.py` script now provides a way to allow "all resources
1982  except *foo*."  A resource name passed to the :option:`!-u` option can now be
1983  prefixed with a hyphen (``'-'``) to mean "remove this resource."  For example,
1984  the option '``-uall,-bsddb``' could be used to enable the use of all resources
1985  except ``bsddb``.
1986
1987* The tools used to build the documentation now work under Cygwin as well as
1988  Unix.
1989
1990* The ``SET_LINENO`` opcode has been removed.  Back in the mists of time, this
1991  opcode was needed to produce line numbers in tracebacks and support trace
1992  functions (for, e.g., :mod:`pdb`). Since Python 1.5, the line numbers in
1993  tracebacks have been computed using a different mechanism that works with
1994  "python -O".  For Python 2.3 Michael Hudson implemented a similar scheme to
1995  determine when to call the trace function, removing the need for ``SET_LINENO``
1996  entirely.
1997
1998  It would be difficult to detect any resulting difference from Python code, apart
1999  from a slight speed up when Python is run without :option:`-O`.
2000
2001  C extensions that access the :attr:`f_lineno` field of frame objects should
2002  instead call ``PyCode_Addr2Line(f->f_code, f->f_lasti)``. This will have the
2003  added effect of making the code work as desired under "python -O" in earlier
2004  versions of Python.
2005
2006  A nifty new feature is that trace functions can now assign to the
2007  :attr:`f_lineno` attribute of frame objects, changing the line that will be
2008  executed next.  A ``jump`` command has been added to the :mod:`pdb` debugger
2009  taking advantage of this new feature. (Implemented by Richie Hindle.)
2010
2011.. ======================================================================
2012
2013
2014Porting to Python 2.3
2015=====================
2016
2017This section lists previously described changes that may require changes to your
2018code:
2019
2020* :keyword:`yield` is now always a keyword; if it's used as a variable name in
2021  your code, a different name must be chosen.
2022
2023* For strings *X* and *Y*, ``X in Y`` now works if *X* is more than one
2024  character long.
2025
2026* The :func:`int` type constructor will now return a long integer instead of
2027  raising an :exc:`OverflowError` when a string or floating-point number is too
2028  large to fit into an integer.
2029
2030* If you have Unicode strings that contain 8-bit characters, you must declare
2031  the file's encoding (UTF-8, Latin-1, or whatever) by adding a comment to the top
2032  of the file.  See section :ref:`section-encodings` for more information.
2033
2034* Calling Tcl methods through :mod:`_tkinter` no longer  returns only strings.
2035  Instead, if Tcl returns other objects those objects are converted to their
2036  Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
2037  object if no Python equivalent exists.
2038
2039* Large octal and hex literals such as ``0xffffffff`` now trigger a
2040  :exc:`FutureWarning`. Currently they're stored as 32-bit numbers and result in a
2041  negative value, but in Python 2.4 they'll become positive long integers.
2042
2043  There are a few ways to fix this warning.  If you really need a positive number,
2044  just add an ``L`` to the end of the literal.  If you're trying to get a 32-bit
2045  integer with low bits set and have previously used an expression such as ``~(1
2046  << 31)``, it's probably clearest to start with all bits set and clear the
2047  desired upper bits. For example, to clear just the top bit (bit 31), you could
2048  write ``0xffffffffL &~(1L<<31)``.
2049
2050* You can no longer disable assertions by assigning to ``__debug__``.
2051
2052* The Distutils :func:`setup` function has gained various new keyword arguments
2053  such as *depends*.  Old versions of the Distutils will abort if passed unknown
2054  keywords.  A solution is to check for the presence of the new
  :func:`get_distutil_options` function in your :file:`setup.py` and only use the
2056  new keywords with a version of the Distutils that supports them::
2057
2058     from distutils import core
2059
     kw = {'sources': ['foo.c'], ...}
     if hasattr(core, 'get_distutil_options'):
         kw['depends'] = ['foo.h']
     ext = core.Extension(**kw)
2064
* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`.
2067
2068* Names of extension types defined by the modules included with Python now
2069  contain the module and a ``'.'`` in front of the type name.
2070
2071.. ======================================================================
2072
2073
2074.. _23acks:
2075
2076Acknowledgements
2077================
2078
2079The author would like to thank the following people for offering suggestions,
2080corrections and assistance with various drafts of this article: Jeff Bauer,
2081Simon Brunning, Brett Cannon, Michael Chermside, Andrew Dalke, Scott David
2082Daniels, Fred L. Drake, Jr., David Fraser,  Kelly Gerber, Raymond Hettinger,
2083Michael Hudson, Chris Lambert, Detlef Lannert, Martin von Löwis, Andrew
2084MacIntyre, Lalo Martins, Chad Netzer, Gustavo Niemeyer, Neal Norwitz, Hans
2085Nowak, Chris Reedy, Francesco Ricciardi, Vinay Sajip, Neil Schemenauer, Roman
2086Suzi, Jason Tishler, Just van Rossum.
2087