****************************
  What's New In Python 3.1
****************************

:Author: Raymond Hettinger

.. $Id$
   Rules for maintenance:

   * Anyone can add text to this document.  Do not spend very much time
     on the wording of your changes, because your text will probably
     get rewritten to some degree.

   * The maintainer will go through Misc/NEWS periodically and add
     changes; it's therefore more important to add your changes to
     Misc/NEWS than to this file.

   * This is not a complete list of every single change; completeness
     is the purpose of Misc/NEWS.  Some changes I consider too small
     or esoteric to include.  If such a change is added to the text,
     I'll just remove it.  (This is another reason you shouldn't spend
     too much time on writing your addition.)

   * If you want to draw your new text to the attention of the
     maintainer, add 'XXX' to the beginning of the paragraph or
     section.

   * It's OK to just add a fragmentary note about a change.  For
     example: "XXX Describe the transmogrify() function added to the
     socket module."  The maintainer will research the change and
     write the necessary text.

   * You can comment out your additions if you like, but it's not
     necessary (especially when a final release is some months away).

   * Credit the author of a patch or bugfix.  Just the name is
     sufficient; the e-mail address isn't necessary.

   * It's helpful to add the bug/patch number as a comment:

     % Patch 12345
     XXX Describe the transmogrify() function added to the socket
     module.
     (Contributed by P.Y. Developer.)

     This saves the maintainer the effort of going through the SVN log
     when researching a change.

This article explains the new features in Python 3.1, compared to 3.0.


PEP 372: Ordered Dictionaries
=============================

Regular Python dictionaries iterate over key/value pairs in arbitrary order.
Over the years, a number of authors have written alternative implementations
that remember the order in which the keys were originally inserted.  Based on
the experiences from those implementations, a new
:class:`collections.OrderedDict` class has been introduced.

The OrderedDict API is substantially the same as regular dictionaries
but will iterate over keys and values in a guaranteed order depending on
when a key was first inserted.  If a new entry overwrites an existing entry,
the original insertion position is left unchanged.  Deleting an entry and
reinserting it will move it to the end.

The standard library now supports use of ordered dictionaries in several
modules.  The :mod:`configparser` module uses them by default.  This lets
configuration files be read, modified, and then written back in their original
order.  The *_asdict()* method for :func:`collections.namedtuple` now
returns an ordered dictionary with the values appearing in the same order as
the underlying tuple indices.  The :mod:`json` module has been extended with
an *object_pairs_hook* to allow OrderedDicts to be built by the decoder.
Support was also added for third-party tools like `PyYAML <http://pyyaml.org/>`_.

.. seealso::

   :pep:`372` - Ordered Dictionaries
      PEP written by Armin Ronacher and Raymond Hettinger.  Implementation
      written by Raymond Hettinger.


PEP 378: Format Specifier for Thousands Separator
=================================================

The built-in :func:`format` function and the :meth:`str.format` method use
a mini-language that now includes a simple, non-locale-aware way to format
a number with a thousands separator.
This provides a way to humanize a
program's output, improving its professional appearance and readability::

    >>> from decimal import Decimal
    >>> format(1234567, ',d')
    '1,234,567'
    >>> format(1234567.89, ',.2f')
    '1,234,567.89'
    >>> format(12345.6 + 8901234.12j, ',f')
    '12,345.600000+8,901,234.120000j'
    >>> format(Decimal('1234567.89'), ',f')
    '1,234,567.89'

The supported types are :class:`int`, :class:`float`, :class:`complex`
and :class:`decimal.Decimal`.

Discussions are underway about how to specify alternative separators
like dots, spaces, apostrophes, or underscores.  Locale-aware applications
should use the existing *n* format specifier which already has some support
for thousands separators.

.. seealso::

   :pep:`378` - Format Specifier for Thousands Separator
      PEP written by Raymond Hettinger and implemented by Eric Smith and
      Mark Dickinson.


Other Language Changes
======================

Some smaller changes made to the core Python language are:

* Directories and zip archives containing a :file:`__main__.py`
  file can now be executed directly by passing their name to the
  interpreter.  The directory/zipfile is automatically inserted as the
  first entry in ``sys.path``.  (Suggestion and initial patch by Andy Chu;
  revised patch by Phillip J. Eby and Nick Coghlan; :issue:`1739468`.)

* The :class:`int` type gained a ``bit_length`` method that returns the
  number of bits necessary to represent its argument in binary::

    >>> n = 37
    >>> bin(n)
    '0b100101'
    >>> n.bit_length()
    6
    >>> n = 2**123-1
    >>> n.bit_length()
    123
    >>> (n+1).bit_length()
    124

  (Contributed by Fredrik Johansson, Victor Stinner, Raymond Hettinger,
  and Mark Dickinson; :issue:`3439`.)

* The fields in :meth:`str.format` strings can now be automatically
  numbered::

    >>> 'Sir {} of {}'.format('Gallahad', 'Camelot')
    'Sir Gallahad of Camelot'

  Formerly, the string would have required numbered fields such as:
  ``'Sir {0} of {1}'``.

  (Contributed by Eric Smith; :issue:`5237`.)

* The :func:`string.maketrans` function is deprecated and is replaced by new
  static methods, :meth:`bytes.maketrans` and :meth:`bytearray.maketrans`.
  This change solves the confusion around which types were supported by the
  :mod:`string` module.  Now, :class:`str`, :class:`bytes`, and
  :class:`bytearray` each have their own **maketrans** and **translate**
  methods with intermediate translation tables of the appropriate type.

  (Contributed by Georg Brandl; :issue:`5675`.)

* The syntax of the :keyword:`with` statement now allows multiple context
  managers in a single statement::

    >>> with open('mylog.txt') as infile, open('a.out', 'w') as outfile:
    ...     for line in infile:
    ...         if '<critical>' in line:
    ...             outfile.write(line)

  With the new syntax, the :func:`contextlib.nested` function is no longer
  needed and is now deprecated.

  (Contributed by Georg Brandl and Mattias Brändström;
  `appspot issue 53094 <https://codereview.appspot.com/53094>`_.)

* ``round(x, n)`` now returns an integer if *x* is an integer.
  Previously it returned a float::

    >>> round(1123, -2)
    1100

  (Contributed by Mark Dickinson; :issue:`4707`.)

* Python now uses David Gay's algorithm for finding the shortest floating
  point representation that doesn't change its value.  This should help
  mitigate some of the confusion surrounding binary floating point
  numbers.

  The significance is easily seen with a number like ``1.1`` which does not
  have an exact equivalent in binary floating point.
  Since there is no exact
  equivalent, an expression like ``float('1.1')`` evaluates to the nearest
  representable value which is ``0x1.199999999999ap+0`` in hex or
  ``1.100000000000000088817841970012523233890533447265625`` in decimal.  That
  nearest value was and still is used in subsequent floating point
  calculations.

  What is new is how the number gets displayed.  Formerly, Python used a
  simple approach.  The value of ``repr(1.1)`` was computed as
  ``format(1.1, '.17g')`` which evaluated to ``'1.1000000000000001'``.  The
  advantage of using 17 digits was that it relied on IEEE-754 guarantees to
  ensure that ``eval(repr(1.1))`` would round-trip exactly to its original
  value.  The disadvantage is that many people found the output to be
  confusing (mistaking intrinsic limitations of binary floating point
  representation for a problem with Python itself).

  The new algorithm for ``repr(1.1)`` is smarter and returns ``'1.1'``.
  Effectively, it searches all equivalent string representations (ones that
  get stored with the same underlying float value) and returns the shortest
  representation.

  The new algorithm tends to emit cleaner representations when possible, but
  it does not change the underlying values.  So, it is still the case that
  ``1.1 + 2.2 != 3.3`` even though the representations may suggest otherwise.

  The new algorithm depends on certain features in the underlying floating
  point implementation.  If the required features are not found, the old
  algorithm will continue to be used.  Also, the text pickle protocols
  ensure cross-platform portability by using the old algorithm.

  (Contributed by Eric Smith and Mark Dickinson; :issue:`1580`.)


New, Improved, and Deprecated Modules
=====================================

* Added a :class:`collections.Counter` class to support convenient
  counting of unique items in a sequence or iterable::

    >>> from collections import Counter
    >>> Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
    Counter({'blue': 3, 'red': 2, 'green': 1})

  (Contributed by Raymond Hettinger; :issue:`1696199`.)

* Added a new module, :mod:`tkinter.ttk`, for access to the Tk themed widget
  set.  The basic idea of ttk is to separate, to the extent possible, the code
  implementing a widget's behavior from the code implementing its appearance.

  (Contributed by Guilherme Polo; :issue:`2983`.)

* The :class:`gzip.GzipFile` and :class:`bz2.BZ2File` classes now support
  the context management protocol::

    >>> # Automatically close file after writing
    >>> with gzip.GzipFile(filename, "wb") as f:
    ...     f.write(b"xxx")

  (Contributed by Antoine Pitrou.)

* The :mod:`decimal` module now supports methods for creating a
  decimal object from a binary :class:`float`.  The conversion is
  exact but can sometimes be surprising::

    >>> from decimal import Decimal
    >>> Decimal.from_float(1.1)
    Decimal('1.100000000000000088817841970012523233890533447265625')

  The long decimal result shows the actual binary fraction being
  stored for *1.1*.  The fraction has many digits because *1.1* cannot
  be exactly represented in binary.

  (Contributed by Raymond Hettinger and Mark Dickinson.)

* The :mod:`itertools` module grew two new functions.  The
  :func:`itertools.combinations_with_replacement` function is one of
  four combinatoric generators, alongside those for permutations and
  Cartesian products.  The :func:`itertools.compress` function mimics its
  namesake from APL.
  Also, the existing :func:`itertools.count` function now has
  an optional *step* argument and can accept any type of counting
  sequence including :class:`fractions.Fraction` and
  :class:`decimal.Decimal`::

    >>> from itertools import combinations_with_replacement, compress, count
    >>> from fractions import Fraction

    >>> [p+q for p,q in combinations_with_replacement('LOVE', 2)]
    ['LL', 'LO', 'LV', 'LE', 'OO', 'OV', 'OE', 'VV', 'VE', 'EE']

    >>> list(compress(data=range(10), selectors=[0,0,1,1,0,1,0,1,0,0]))
    [2, 3, 5, 7]

    >>> c = count(start=Fraction(1,2), step=Fraction(1,6))
    >>> [next(c), next(c), next(c), next(c)]
    [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6), Fraction(1, 1)]

  (Contributed by Raymond Hettinger.)

* :func:`collections.namedtuple` now supports a keyword argument
  *rename* which lets invalid fieldnames be automatically converted to
  positional names in the form ``_0``, ``_1``, etc.  This is useful when
  the field names are being created by an external source such as a
  CSV header, SQL field list, or user input::

    >>> query = input()
    SELECT region, dept, count(*) FROM main GROUP BY region, dept

    >>> cursor.execute(query)
    >>> query_fields = [desc[0] for desc in cursor.description]
    >>> UserQuery = namedtuple('UserQuery', query_fields, rename=True)
    >>> pprint.pprint([UserQuery(*row) for row in cursor])
    [UserQuery(region='South', dept='Shipping', _2=185),
     UserQuery(region='North', dept='Accounting', _2=37),
     UserQuery(region='West', dept='Sales', _2=419)]

  (Contributed by Raymond Hettinger; :issue:`1818`.)

* The :func:`re.sub`, :func:`re.subn` and :func:`re.split` functions now
  accept a *flags* parameter.

  (Contributed by Gregory Smith.)

* The :mod:`logging` module now implements a simple :class:`logging.NullHandler`
  class for applications that are not using logging but are calling
  library code that does.
  Setting up a null handler will suppress
  spurious warnings such as "No handlers could be found for logger foo"::

    >>> import logging
    >>> h = logging.NullHandler()
    >>> logging.getLogger("foo").addHandler(h)

  (Contributed by Vinay Sajip; :issue:`4384`.)

* The :mod:`runpy` module, which supports the ``-m`` command line switch,
  now supports the execution of packages by looking for and executing
  a ``__main__`` submodule when a package name is supplied.

  (Contributed by Andi Vajda; :issue:`4195`.)

* The :mod:`pdb` module can now access and display source code loaded via
  :mod:`zipimport` (or any other conformant :pep:`302` loader).

  (Contributed by Alexander Belopolsky; :issue:`4201`.)

* :class:`functools.partial` objects can now be pickled.

  (Suggested by Antoine Pitrou and Jesse Noller.  Implemented by
  Jack Diederich; :issue:`5228`.)

* Added :mod:`pydoc` help topics for symbols so that ``help('@')``
  works as expected in the interactive environment.

  (Contributed by David Laban; :issue:`4739`.)

* The :mod:`unittest` module now supports skipping individual tests or classes
  of tests.  It also supports marking a test as an expected failure: a test
  that is known to be broken, but shouldn't be counted as a failure on a
  TestResult::

    class TestGizmo(unittest.TestCase):

        @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
        def test_gizmo_on_windows(self):
            ...

        @unittest.expectedFailure
        def test_gizmo_without_required_library(self):
            ...

  Also, tests for exceptions have been extended to work with context managers
  using the :keyword:`with` statement::

    def test_division_by_zero(self):
        with self.assertRaises(ZeroDivisionError):
            1 / 0

  In addition, several new assertion methods were added including
  :meth:`assertSetEqual`, :meth:`assertDictEqual`,
  :meth:`assertDictContainsSubset`, :meth:`assertListEqual`,
  :meth:`assertTupleEqual`, :meth:`assertSequenceEqual`,
  :meth:`assertRaisesRegexp`, :meth:`assertIsNone`,
  and :meth:`assertIsNotNone`.

  (Contributed by Benjamin Peterson and Antoine Pitrou.)

* The :mod:`io` module has three new constants for the :meth:`seek`
  method: :data:`SEEK_SET`, :data:`SEEK_CUR`, and :data:`SEEK_END`.

* The :attr:`sys.version_info` tuple is now a named tuple::

    >>> sys.version_info
    sys.version_info(major=3, minor=1, micro=0, releaselevel='alpha', serial=2)

  (Contributed by Ross Light; :issue:`4285`.)

* The :mod:`nntplib` and :mod:`imaplib` modules now support IPv6.

  (Contributed by Derek Morr; :issue:`1655` and :issue:`1664`.)

* The :mod:`pickle` module has been adapted for better interoperability with
  Python 2.x when used with protocol 2 or lower.  The reorganization of the
  standard library changed the formal reference for many objects.  For
  example, ``__builtin__.set`` in Python 2 is called ``builtins.set`` in Python
  3.  This change confounded efforts to share data between different versions of
  Python.  But now when protocol 2 or lower is selected, the pickler will
  automatically use the old Python 2 names for both loading and dumping.  This
  remapping is turned on by default but can be disabled with the *fix_imports*
  option::

    >>> s = {1, 2, 3}
    >>> pickle.dumps(s, protocol=0)
    b'c__builtin__\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'
    >>> pickle.dumps(s, protocol=0, fix_imports=False)
    b'cbuiltins\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'

  An unfortunate but unavoidable side effect of this change is that protocol 2
  pickles produced by Python 3.1 won't be readable with Python 3.0.  The latest
  pickle protocol, protocol 3, should be used when migrating data between
  Python 3.x implementations, as it doesn't attempt to remain compatible with
  Python 2.x.

  (Contributed by Alexandre Vassalotti and Antoine Pitrou, :issue:`6137`.)

* A new module, :mod:`importlib`, was added.  It provides a complete, portable,
  pure Python reference implementation of the :keyword:`import` statement and its
  counterpart, the :func:`__import__` function.  It represents a substantial
  step forward in documenting and defining the actions that take place during
  imports.

  (Contributed by Brett Cannon.)


Optimizations
=============

Major performance enhancements have been added:

* The new I/O library (as defined in :pep:`3116`) was mostly written in
  Python and quickly proved to be a problematic bottleneck in Python 3.0.
  In Python 3.1, the I/O library has been entirely rewritten in C and is
  2 to 20 times faster depending on the task at hand.  The pure Python
  version is still available for experimentation purposes through
  the ``_pyio`` module.

  (Contributed by Amaury Forgeot d'Arc and Antoine Pitrou.)

* Added a heuristic so that tuples and dicts containing only untrackable objects
  are not tracked by the garbage collector.  This can reduce the size of
  collections and therefore the garbage collection overhead on long-running
  programs, depending on their particular use of datatypes.

  (Contributed by Antoine Pitrou, :issue:`4688`.)

* When the configure option ``--with-computed-gotos`` is enabled
  on compilers that support it (notably: gcc, SunPro, icc), the bytecode
  evaluation loop is compiled with a new dispatch mechanism which gives
  speedups of up to 20%, depending on the system, the compiler, and
  the benchmark.

  (Contributed by Antoine Pitrou along with a number of other participants,
  :issue:`4753`.)

* The decoding of UTF-8, UTF-16 and LATIN-1 is now two to four times
  faster.

  (Contributed by Antoine Pitrou and Amaury Forgeot d'Arc, :issue:`4868`.)

* The :mod:`json` module now has a C extension to substantially improve
  its performance.  In addition, the API was modified so that json works
  only with :class:`str`, not with :class:`bytes`.  That change makes the
  module closely match the `JSON specification <http://json.org/>`_
  which is defined in terms of Unicode.

  (Contributed by Bob Ippolito and converted to Python 3.1 by Antoine Pitrou
  and Benjamin Peterson; :issue:`4136`.)

* Unpickling now interns the attribute names of pickled objects.  This saves
  memory and allows pickles to be smaller.

  (Contributed by Jake McGuire and Antoine Pitrou; :issue:`5084`.)


IDLE
====

* IDLE's format menu now provides an option to strip trailing whitespace
  from a source file.

  (Contributed by Roger D. Serwy; :issue:`5150`.)


Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* Integers are now stored internally either in base ``2**15`` or in base
  ``2**30``, the base being determined at build time.  Previously, they
  were always stored in base ``2**15``.  Using base ``2**30`` gives
  significant performance improvements on 64-bit machines, but
  benchmark results on 32-bit machines have been mixed.
  Therefore,
  the default is to use base ``2**30`` on 64-bit machines and base ``2**15``
  on 32-bit machines; on Unix, there's a new configure option
  ``--enable-big-digits`` that can be used to override this default.

  Apart from the performance improvements this change should be invisible to
  end users, with one exception: for testing and debugging purposes there's a
  new :attr:`sys.int_info` that provides information about the
  internal format, giving the number of bits per digit and the size in bytes
  of the C type used to store each digit::

    >>> import sys
    >>> sys.int_info
    sys.int_info(bits_per_digit=30, sizeof_digit=4)

  (Contributed by Mark Dickinson; :issue:`4258`.)

* The :c:func:`PyLong_AsUnsignedLongLong()` function now handles a negative
  *pylong* by raising :exc:`OverflowError` instead of :exc:`TypeError`.

  (Contributed by Mark Dickinson and Lisandro Dalcrin; :issue:`5175`.)

* Deprecated :c:func:`PyNumber_Int`.  Use :c:func:`PyNumber_Long` instead.

  (Contributed by Mark Dickinson; :issue:`4910`.)

* Added a new :c:func:`PyOS_string_to_double` function to replace the
  deprecated functions :c:func:`PyOS_ascii_strtod` and :c:func:`PyOS_ascii_atof`.

  (Contributed by Mark Dickinson; :issue:`5914`.)

* Added :c:type:`PyCapsule` as a replacement for the :c:type:`PyCObject` API.
  The principal difference is that the new type has a well-defined interface
  for passing type-safety information and a less complicated signature
  for calling a destructor.  The old type had a problematic API and is now
  deprecated.

  (Contributed by Larry Hastings; :issue:`5630`.)


Porting to Python 3.1
=====================

This section lists previously described changes and other bugfixes
that may require changes to your code:

* The new floating point string representations can break existing doctests.
  For example::

    def e():
        '''Compute the base of natural logarithms.

        >>> e()
        2.7182818284590451

        '''
        return sum(1/math.factorial(x) for x in reversed(range(30)))

    doctest.testmod()

    **********************************************************************
    Failed example:
        e()
    Expected:
        2.7182818284590451
    Got:
        2.718281828459045
    **********************************************************************

* The automatic name remapping in the pickle module for protocol 2 or lower can
  make Python 3.1 pickles unreadable in Python 3.0.  One solution is to use
  protocol 3.  Another solution is to set the *fix_imports* option to ``False``.
  See the discussion above for more details.
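
The two pickle workarounds named in the last item can be sketched as follows.
This is only an illustration, assuming a small ``set`` as the data; the
variable names are invented for the example and are not part of any API.  It
checks which module name ends up in the pickle: the default protocol 2 output
carries the Python 2 name ``__builtin__``, while protocol 3, or protocol 2 with
*fix_imports* set to ``False``, carries the Python 3 name ``builtins``.

```python
import pickle

payload = {1, 2, 3}  # hypothetical sample data

# Default behavior for protocol 2: Python 3 names are remapped to
# their Python 2 equivalents (builtins.set -> __builtin__.set).
remapped = pickle.dumps(payload, protocol=2)

# Workaround 1: protocol 3 does not apply the name remapping.
proto3 = pickle.dumps(payload, protocol=3)

# Workaround 2: stay on protocol 2 but switch the remapping off.
unmapped = pickle.dumps(payload, protocol=2, fix_imports=False)

# Inspect which module name each pickle refers to.
print(b"__builtin__" in remapped)
print(b"builtins" in proto3)
print(b"builtins" in unmapped)

# All three variants load back to the original set on the
# interpreter that produced them.
restored = [pickle.loads(p) for p in (remapped, proto3, unmapped)]
print(all(r == payload for r in restored))
```

Which workaround fits depends on who reads the data: protocol 3 is the simpler
choice when only Python 3.x is involved, while ``fix_imports=False`` keeps the
protocol 2 format for tools that expect it.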