:mod:`tarfile` --- Read and write tar archive files
===================================================

.. module:: tarfile
   :synopsis: Read and write tar-format archive files.

.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>

**Source code:** :source:`Lib/tarfile.py`

--------------

The :mod:`tarfile` module makes it possible to read and write tar
archives, including those using gzip, bz2 and lzma compression.
Use the :mod:`zipfile` module to read or write :file:`.zip` files, or the
higher-level functions in :ref:`shutil <archiving-operations>`.

Some facts and figures:

* reads and writes :mod:`gzip`, :mod:`bz2` and :mod:`lzma` compressed archives
  if the respective modules are available.

* read/write support for the POSIX.1-1988 (ustar) format.

* read/write support for the GNU tar format including *longname* and *longlink*
  extensions, read-only support for all variants of the *sparse* extension
  including restoration of sparse files.

* read/write support for the POSIX.1-2001 (pax) format.

* handles directories, regular files, hardlinks, symbolic links, fifos,
  character devices and block devices and is able to acquire and restore file
  information like timestamp, access permissions and owner.

.. versionchanged:: 3.3
   Added support for :mod:`lzma` compression.


.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)

   Return a :class:`TarFile` object for the pathname *name*. For detailed
   information on :class:`TarFile` objects and the keyword arguments that are
   allowed, see :ref:`tarfile-objects`.

   *mode* has to be a string of the form ``'filemode[:compression]'``, it defaults
   to ``'r'``.
   Here is a full list of mode combinations:

   +------------------+---------------------------------------------+
   | mode             | action                                      |
   +==================+=============================================+
   | ``'r' or 'r:*'`` | Open for reading with transparent           |
   |                  | compression (recommended).                  |
   +------------------+---------------------------------------------+
   | ``'r:'``         | Open for reading exclusively without        |
   |                  | compression.                                |
   +------------------+---------------------------------------------+
   | ``'r:gz'``       | Open for reading with gzip compression.     |
   +------------------+---------------------------------------------+
   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
   +------------------+---------------------------------------------+
   | ``'r:xz'``       | Open for reading with lzma compression.     |
   +------------------+---------------------------------------------+
   | ``'x'`` or       | Create a tarfile exclusively without        |
   | ``'x:'``         | compression.                                |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:gz'``       | Create a tarfile with gzip compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:bz2'``      | Create a tarfile with bzip2 compression.    |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:xz'``       | Create a tarfile with lzma compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'a' or 'a:'``  | Open for appending with no compression. The |
   |                  | file is created if it does not exist.       |
   +------------------+---------------------------------------------+
   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
   +------------------+---------------------------------------------+
   | ``'w:gz'``       | Open for gzip compressed writing.           |
   +------------------+---------------------------------------------+
   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
   +------------------+---------------------------------------------+
   | ``'w:xz'``       | Open for lzma compressed writing.           |
   +------------------+---------------------------------------------+

   Note that appending to a compressed archive (``'a:gz'``, ``'a:bz2'`` or
   ``'a:xz'``) is not possible. If *mode* is not suitable to open a certain
   (compressed) file for reading, :exc:`ReadError` is raised. Use *mode*
   ``'r'`` to avoid this. If a compression method is not supported,
   :exc:`CompressionError` is raised.

   If *fileobj* is specified, it is used as an alternative to a :term:`file object`
   opened in binary mode for *name*. It is supposed to be at position 0.

   For modes ``'w:gz'``, ``'r:gz'``, ``'w:bz2'``, ``'r:bz2'``, ``'x:gz'``,
   ``'x:bz2'``, :func:`tarfile.open` accepts the keyword argument
   *compresslevel* (default ``9``) to specify the compression level of the file.

   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``. :func:`tarfile.open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks. No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket :term:`file object` or a tape
   device. However, such a :class:`TarFile` object is limited in that it does
   not allow random access, see :ref:`tar-examples`.
   The currently possible modes:

   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|xz'``  | Open an lzma compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|xz'``  | Open an lzma compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. class:: TarFile
   :noindex:

   Class for reading and writing tar archives. Do not use this class directly:
   use :func:`tarfile.open` instead. See :ref:`tarfile-objects`.


.. function:: is_tarfile(name)

   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read.


The :mod:`tarfile` module defines the following exceptions:


.. exception:: TarError

   Base class for all :mod:`tarfile` exceptions.


.. exception:: ReadError

   Is raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.


.. exception:: CompressionError

   Is raised when a compression method is not supported or when the data cannot be
   decoded properly.


.. exception:: StreamError

   Is raised for the limitations that are typical for stream-like :class:`TarFile`
   objects.


.. exception:: ExtractError

   Is raised for *non-fatal* errors when using :meth:`TarFile.extract`, but only if
   :attr:`TarFile.errorlevel`\ ``== 2``.


.. exception:: HeaderError

   Is raised by :meth:`TarInfo.frombuf` if the buffer it gets is invalid.


The following constants are available at the module level:

.. data:: ENCODING

   The default character encoding: ``'utf-8'`` on Windows, the value returned by
   :func:`sys.getfilesystemencoding` otherwise.


Each of the following constants defines a tar archive format that the
:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
details.


.. data:: USTAR_FORMAT

   POSIX.1-1988 (ustar) format.


.. data:: GNU_FORMAT

   GNU tar format.


.. data:: PAX_FORMAT

   POSIX.1-2001 (pax) format.


.. data:: DEFAULT_FORMAT

   The default format for creating archives. This is currently :const:`PAX_FORMAT`.

   .. versionchanged:: 3.8
      The default format for new archives was changed to
      :const:`PAX_FORMAT` from :const:`GNU_FORMAT`.


.. seealso::

   Module :mod:`zipfile`
      Documentation of the :mod:`zipfile` standard module.

   :ref:`archiving-operations`
      Documentation of the higher-level archiving facilities provided by the
      standard :mod:`shutil` module.

   `GNU tar manual, Basic Tar Format <https://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
      Documentation for tar archive files, including GNU tar extensions.


.. _tarfile-objects:

TarFile Objects
---------------

The :class:`TarFile` object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a :class:`TarInfo`
object, see :ref:`tarinfo-objects` for details.

A :class:`TarFile` object can be used as a context manager in a :keyword:`with`
statement. It will automatically be closed when the block is completed. Please
note that in the event of an exception an archive opened for writing will not
be finalized; only the internally used file object will be closed. See the
:ref:`tar-examples` section for a use case.

.. versionadded:: 3.2
   Added support for the context management protocol.

.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors='surrogateescape', pax_headers=None, debug=0, errorlevel=1)

   All following arguments are optional and can be accessed as instance attributes
   as well.

   *name* is the pathname of the archive. *name* may be a :term:`path-like object`.
   It can be omitted if *fileobj* is given.
   In this case, the file object's :attr:`name` attribute is used if it exists.
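
   As a minimal sketch of the *fileobj* case (the in-memory buffer and the
   member name ``greeting.txt`` are illustrative assumptions, not part of the
   module), an archive can be written to and read back from an
   :class:`io.BytesIO` object without passing *name*::

      import io
      import tarfile

      payload = b"hello"
      buffer = io.BytesIO()

      # Write an archive into the buffer; no *name* is passed.
      with tarfile.open(fileobj=buffer, mode="w") as tar:
          info = tarfile.TarInfo(name="greeting.txt")
          info.size = len(payload)
          tar.addfile(info, io.BytesIO(payload))
          # BytesIO has no name attribute, so the archive has no name either.
          archive_name = tar.name

      # Rewind before reading: the file object must be at the archive start.
      buffer.seek(0)
      with tarfile.open(fileobj=buffer, mode="r") as tar:
          member_names = tar.getnames()
          content = tar.extractfile("greeting.txt").read()

   With a real file opened through :func:`open`, the archive name would
   instead be taken from that file object.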

   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
   data to an existing file, ``'w'`` to create a new file overwriting an existing
   one, or ``'x'`` to create a new file only if it does not already exist.

   If *fileobj* is given, it is used for reading or writing data. If it can be
   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
   from position 0.

   .. note::

      *fileobj* is not closed when :class:`TarFile` is closed.

   *format* controls the archive format for writing. It must be one of the constants
   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
   defined at module level. When reading, the format is detected automatically, even
   if different formats are present in a single archive.

   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
   with a different one.

   If *dereference* is :const:`False`, add symbolic and hard links to the archive. If it
   is :const:`True`, add the content of the target files to the archive. This has no
   effect on systems that do not support symbolic links.

   If *ignore_zeros* is :const:`False`, treat an empty block as the end of the archive.
   If it is :const:`True`, skip empty (and invalid) blocks and try to get as many members
   as possible. This is only useful for reading concatenated or damaged archives.

   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
   messages). The messages are written to ``sys.stderr``.

   If *errorlevel* is ``0``, all errors are ignored when using :meth:`TarFile.extract`.
   Nevertheless, they appear as error messages in the debug output when debugging
   is enabled. If ``1``, all *fatal* errors are raised as :exc:`OSError`
   exceptions. If ``2``, all *non-fatal* errors are raised as :exc:`TarError`
   exceptions as well.
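
   The effect of *ignore_zeros* on concatenated archives can be sketched as
   follows. The helper ``make_archive`` and the member names are hypothetical
   and exist only for this illustration; the two archives are built in memory
   and joined byte for byte::

      import io
      import tarfile

      def make_archive(member_name, payload):
          """Return the bytes of an uncompressed archive with one member."""
          buffer = io.BytesIO()
          with tarfile.open(fileobj=buffer, mode="w") as tar:
              info = tarfile.TarInfo(name=member_name)
              info.size = len(payload)
              tar.addfile(info, io.BytesIO(payload))
          return buffer.getvalue()

      # Two complete archives, one after the other.
      combined = (make_archive("first.txt", b"one")
                  + make_archive("second.txt", b"two"))

      # Default: reading stops at the empty block ending the first archive.
      with tarfile.open(fileobj=io.BytesIO(combined), mode="r:") as tar:
          default_names = tar.getnames()

      # ignore_zeros=True: empty blocks are skipped and reading continues.
      with tarfile.open(fileobj=io.BytesIO(combined), mode="r:",
                        ignore_zeros=True) as tar:
          all_names = tar.getnames()

   Here ``default_names`` holds only the member of the first archive, while
   ``all_names`` holds the members of both.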

   The *encoding* and *errors* arguments define the character encoding to be
   used for reading or writing the archive and how conversion errors are going
   to be handled. The default settings will work for most users.
   See section :ref:`tar-unicode` for in-depth information.

   The *pax_headers* argument is an optional dictionary of strings which
   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. classmethod:: TarFile.open(...)

   Alternative constructor. The :func:`tarfile.open` function is actually a
   shortcut to this classmethod.


.. method:: TarFile.getmember(name)

   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.

   .. note::

      If a member occurs more than once in the archive, its last occurrence is assumed
      to be the most up-to-date version.


.. method:: TarFile.getmembers()

   Return the members of the archive as a list of :class:`TarInfo` objects. The
   list has the same order as the members in the archive.


.. method:: TarFile.getnames()

   Return the members as a list of their names. It has the same order as the list
   returned by :meth:`getmembers`.


.. method:: TarFile.list(verbose=True, *, members=None)

   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
   only the names of the members are printed. If it is :const:`True`, output
   similar to that of :program:`ls -l` is produced. If optional *members* is
   given, it must be a subset of the list returned by :meth:`getmembers`.

   .. versionchanged:: 3.5
      Added the *members* parameter.


.. method:: TarFile.next()

   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return :const:`None` if there are no
   more members available.


.. method:: TarFile.extractall(path=".", members=None, *, numeric_owner=False)

   Extract all members from the archive to the current working directory or
   directory *path*. If optional *members* is given, it must be a subset of the
   list returned by :meth:`getmembers`. Directory information like owner,
   modification time and permissions are set after all members have been extracted.
   This is done to work around two problems: A directory's modification time is
   reset each time a file is created in it. And, if a directory's permissions do
   not allow writing, extracting files to it will fail.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   .. warning::

      Never extract archives from untrusted sources without prior inspection.
      It is possible that files are created outside of *path*, e.g. members
      that have absolute filenames starting with ``"/"`` or filenames with two
      dots ``".."``.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.


.. method:: TarFile.extract(member, path="", set_attrs=True, *, numeric_owner=False)

   Extract a member from the archive to the current working directory, using its
   full name. Its file information is extracted as accurately as possible. *member*
   may be a filename or a :class:`TarInfo` object. You can specify a different
   directory using *path*. *path* may be a :term:`path-like object`.
   File attributes (owner, mtime, mode) are set unless *set_attrs* is false.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   .. note::

      The :meth:`extract` method does not take care of several extraction issues.
      In most cases you should consider using the :meth:`extractall` method.

   .. warning::

      See the warning for :meth:`extractall`.

   .. versionchanged:: 3.2
      Added the *set_attrs* parameter.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.


.. method:: TarFile.extractfile(member)

   Extract a member from the archive as a file object. *member* may be a filename
   or a :class:`TarInfo` object. If *member* is a regular file or a link, an
   :class:`io.BufferedReader` object is returned. Otherwise, :const:`None` is
   returned.

   .. versionchanged:: 3.3
      Return an :class:`io.BufferedReader` object.


.. method:: TarFile.add(name, arcname=None, recursive=True, *, filter=None)

   Add the file *name* to the archive. *name* may be any type of file
   (directory, fifo, symbolic link, etc.). If given, *arcname* specifies an
   alternative name for the file in the archive. Directories are added
   recursively by default. This can be avoided by setting *recursive* to
   :const:`False`. Recursion adds entries in sorted order.
   If *filter* is given, it
   should be a function that takes a :class:`TarInfo` object argument and
   returns the changed :class:`TarInfo` object. If it instead returns
   :const:`None`, the :class:`TarInfo` object will be excluded from the
   archive. See :ref:`tar-examples` for an example.

   .. versionchanged:: 3.2
      Added the *filter* parameter.

   .. versionchanged:: 3.7
      Recursion adds entries in sorted order.


.. method:: TarFile.addfile(tarinfo, fileobj=None)

   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
   it should be a :term:`binary file`, and
   ``tarinfo.size`` bytes are read from it and added to the archive. You can
   create :class:`TarInfo` objects directly, or by using :meth:`gettarinfo`.


.. method:: TarFile.gettarinfo(name=None, arcname=None, fileobj=None)

   Create a :class:`TarInfo` object from the result of :func:`os.stat` or
   equivalent on an existing file. The file is either named by *name*, or
   specified as a :term:`file object` *fileobj* with a file descriptor.
   *name* may be a :term:`path-like object`. If
   given, *arcname* specifies an alternative name for the file in the
   archive; otherwise, the name is taken from *fileobj*’s
   :attr:`~io.FileIO.name` attribute, or the *name* argument. The name
   should be a text string.

   You can modify
   some of the :class:`TarInfo`’s attributes before you add it using :meth:`addfile`.
   If the file object is not an ordinary file object positioned at the
   beginning of the file, attributes such as :attr:`~TarInfo.size` may need
   modifying. This is the case for objects such as :class:`~gzip.GzipFile`.
   The :attr:`~TarInfo.name` may also be modified, in which case *arcname*
   could be a dummy string.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. method:: TarFile.close()

   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
   appended to the archive.


.. attribute:: TarFile.pax_headers

   A dictionary containing key-value pairs of pax global headers.



.. _tarinfo-objects:

TarInfo Objects
---------------

A :class:`TarInfo` object represents one member in a :class:`TarFile`.
Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does *not* contain the file's data itself.

:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
:meth:`getmember`, :meth:`getmembers` and :meth:`gettarinfo`.


.. class:: TarInfo(name="")

   Create a :class:`TarInfo` object.


.. classmethod:: TarInfo.frombuf(buf, encoding, errors)

   Create and return a :class:`TarInfo` object from string buffer *buf*.

   Raises :exc:`HeaderError` if the buffer is invalid.


.. classmethod:: TarInfo.fromtarfile(tarfile)

   Read the next member from the :class:`TarFile` object *tarfile* and return it as
   a :class:`TarInfo` object.


.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='surrogateescape')

   Create a string buffer from a :class:`TarInfo` object. For information on the
   arguments see the constructor of the :class:`TarFile` class.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.


A ``TarInfo`` object has the following public data attributes:


.. attribute:: TarInfo.name

   Name of the archive member.


.. attribute:: TarInfo.size

   Size in bytes.


.. attribute:: TarInfo.mtime

   Time of last modification.


.. attribute:: TarInfo.mode

   Permission bits.


.. attribute:: TarInfo.type

   File type. *type* is usually one of these constants: :const:`REGTYPE`,
   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
   :const:`GNUTYPE_SPARSE`. To determine the type of a :class:`TarInfo` object
   more conveniently, use the ``is*()`` methods below.


.. attribute:: TarInfo.linkname

   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.


.. attribute:: TarInfo.uid

   User ID of the user who originally stored this member.


.. attribute:: TarInfo.gid

   Group ID of the user who originally stored this member.


.. attribute:: TarInfo.uname

   User name.


.. attribute:: TarInfo.gname

   Group name.


.. attribute:: TarInfo.pax_headers

   A dictionary containing key-value pairs of an associated pax extended header.


A :class:`TarInfo` object also provides some convenient query methods:


.. method:: TarInfo.isfile()

   Return :const:`True` if the :class:`TarInfo` object is a regular file.


.. method:: TarInfo.isreg()

   Same as :meth:`isfile`.


.. method:: TarInfo.isdir()

   Return :const:`True` if it is a directory.


.. method:: TarInfo.issym()

   Return :const:`True` if it is a symbolic link.


.. method:: TarInfo.islnk()

   Return :const:`True` if it is a hard link.


.. method:: TarInfo.ischr()

   Return :const:`True` if it is a character device.


.. method:: TarInfo.isblk()

   Return :const:`True` if it is a block device.


.. method:: TarInfo.isfifo()

   Return :const:`True` if it is a FIFO.


.. method:: TarInfo.isdev()

   Return :const:`True` if it is one of character device, block device or FIFO.


.. _tarfile-commandline:
.. program:: tarfile

Command-Line Interface
----------------------

.. versionadded:: 3.4

The :mod:`tarfile` module provides a simple command-line interface to interact
with tar archives.

If you want to create a new tar archive, specify its name after the :option:`-c`
option and then list the filename(s) that should be included:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar spam.txt eggs.txt

Passing a directory is also acceptable:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar life-of-brian_1979/

If you want to extract a tar archive into the current directory, use
the :option:`-e` option:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar

You can also extract a tar archive into a different directory by passing the
directory's name:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar other-dir/

For a list of the files in a tar archive, use the :option:`-l` option:

.. code-block:: shell-session

    $ python -m tarfile -l monty.tar


Command-line options
~~~~~~~~~~~~~~~~~~~~

.. cmdoption:: -l <tarfile>
               --list <tarfile>

   List files in a tarfile.

.. cmdoption:: -c <tarfile> <source1> ... <sourceN>
               --create <tarfile> <source1> ... <sourceN>

   Create tarfile from source files.

.. cmdoption:: -e <tarfile> [<output_dir>]
               --extract <tarfile> [<output_dir>]

   Extract tarfile into the current directory if *output_dir* is not specified.

.. cmdoption:: -t <tarfile>
               --test <tarfile>

   Test whether the tarfile is valid or not.

.. cmdoption:: -v, --verbose

   Verbose output.

.. _tar-examples:

Examples
--------

How to extract an entire tar archive to the current working directory::

   import tarfile
   tar = tarfile.open("sample.tar.gz")
   tar.extractall()
   tar.close()

How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
a generator function instead of a list::

   import os
   import tarfile

   def py_files(members):
       for tarinfo in members:
           if os.path.splitext(tarinfo.name)[1] == ".py":
               yield tarinfo

   tar = tarfile.open("sample.tar.gz")
   tar.extractall(members=py_files(tar))
   tar.close()

How to create an uncompressed tar archive from a list of filenames::

   import tarfile
   tar = tarfile.open("sample.tar", "w")
   for name in ["foo", "bar", "quux"]:
       tar.add(name)
   tar.close()

The same example using the :keyword:`with` statement::

    import tarfile
    with tarfile.open("sample.tar", "w") as tar:
        for name in ["foo", "bar", "quux"]:
            tar.add(name)

How to read a gzip compressed tar archive and display some member information::

   import tarfile
   tar = tarfile.open("sample.tar.gz", "r:gz")
   for tarinfo in tar:
       print(tarinfo.name, "is", tarinfo.size, "bytes in size and is ", end="")
       if tarinfo.isreg():
           print("a regular file.")
       elif tarinfo.isdir():
           print("a directory.")
       else:
           print("something else.")
   tar.close()

How to create an archive and reset the user information using the *filter*
parameter in :meth:`TarFile.add`::

    import tarfile
    def reset(tarinfo):
        tarinfo.uid = tarinfo.gid = 0
        tarinfo.uname = tarinfo.gname = "root"
        return tarinfo
    tar = tarfile.open("sample.tar.gz", "w:gz")
    tar.add("foo", filter=reset)
    tar.close()


.. _tar-formats:

Supported tar formats
---------------------

There are three tar formats that can be created with the :mod:`tarfile` module:

* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
  up to a length of at best 256 characters and linknames up to 100 characters.
  The maximum file size is 8 GiB. This is an old and limited but widely
  supported format.

* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
  linknames, files bigger than 8 GiB and sparse files. It is the de facto
  standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
  extensions for long names; sparse file support is read-only.

* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
  format with virtually no limits. It supports long filenames and linknames, large
  files and stores pathnames in a portable way. Modern tar implementations,
  including GNU tar, bsdtar/libarchive and star, fully support extended *pax*
  features; some old or unmaintained libraries may not, but should treat
  *pax* archives as if they were in the universally-supported *ustar* format.
  It is the current default format for new archives.

  It extends the existing *ustar* format with extra headers for information
  that cannot be stored otherwise. There are two flavours of pax headers:
  Extended headers only affect the subsequent file header, global
  headers are valid for the complete archive and affect all following files.
  All the data in a pax header is encoded in *UTF-8* for portability reasons.

There are some more variants of the tar format which can be read, but not
created:

* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
  storing only regular files and directories. Names must not be longer than 100
  characters, there is no user/group name information.
  Some archives have
  miscalculated header checksums in case of fields with non-ASCII characters.

* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
  pax format, but is not compatible.

.. _tar-unicode:

Unicode issues
--------------

The tar format was originally conceived to make backups on tape drives with the
main focus on preserving file system information. Nowadays tar archives are
commonly used for file distribution and exchanging archives over networks. One
problem of the original format (which is the basis of all other formats) is
that there is no concept of supporting different character encodings. For
example, an ordinary tar archive created on a *UTF-8* system cannot be read
correctly on a *Latin-1* system if it contains non-*ASCII* characters. Textual
metadata (like filenames, linknames, user/group names) will appear damaged.
Unfortunately, there is no way to autodetect the encoding of an archive. The
pax format was designed to solve this problem. It stores non-ASCII metadata
using the universal character encoding *UTF-8*.

The details of character conversion in :mod:`tarfile` are controlled by the
*encoding* and *errors* keyword arguments of the :class:`TarFile` class.

*encoding* defines the character encoding to use for the metadata in the
archive. The default value is :func:`sys.getfilesystemencoding` or ``'ascii'``
as a fallback. Depending on whether the archive is read or written, the
metadata must be either decoded or encoded. If *encoding* is not set
appropriately, this conversion may fail.

The *errors* argument defines how characters are treated that cannot be
converted. Possible values are listed in section :ref:`error-handlers`.
The default scheme is ``'surrogateescape'`` which Python also uses for its
file system calls, see :ref:`os-filenames`.
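
A minimal sketch of how *encoding* and the default ``'surrogateescape'``
handler interact, assuming a :const:`GNU_FORMAT` archive so that the name is
stored as raw bytes in the chosen encoding. The member name and the helper
``read_names`` are hypothetical and serve only this illustration::

   import io
   import tarfile

   name = "caf\u00e9.txt"  # contains one non-ASCII character

   # Store the member name as Latin-1 bytes in a GNU-format archive.
   buffer = io.BytesIO()
   with tarfile.open(fileobj=buffer, mode="w", format=tarfile.GNU_FORMAT,
                     encoding="iso8859-1") as tar:
       tar.addfile(tarfile.TarInfo(name=name))

   def read_names(encoding):
       """Return the member names decoded with the given encoding."""
       with tarfile.open(fileobj=io.BytesIO(buffer.getvalue()), mode="r:",
                         encoding=encoding) as tar:
           return tar.getnames()

   matching = read_names("iso8859-1")   # same encoding as when writing
   mismatched = read_names("utf-8")     # wrong encoding for these bytes

With the matching encoding the original name comes back. With the
mismatched one, the undecodable byte is kept as a lone surrogate
(``'\udce9'``) instead of raising an error, so the name looks damaged but the
original bytes can still be recovered.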

For :const:`PAX_FORMAT` archives (the default), *encoding* is generally not needed
because all the metadata is stored using *UTF-8*. *encoding* is only used in
the rare cases when binary pax headers are decoded or when strings with
surrogate characters are stored.
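
As a rough illustration of this point (the member name is a made-up example),
a non-ASCII name written to a :const:`PAX_FORMAT` archive can be read back
intact even when the reader passes a deliberately restrictive *encoding*::

   import io
   import tarfile

   name = "caf\u00e9.txt"  # contains one non-ASCII character

   # Write the member to a pax archive; the name goes into a pax header.
   buffer = io.BytesIO()
   with tarfile.open(fileobj=buffer, mode="w",
                     format=tarfile.PAX_FORMAT) as tar:
       tar.addfile(tarfile.TarInfo(name=name))

   # Read it back with an encoding that cannot represent the name.
   with tarfile.open(fileobj=io.BytesIO(buffer.getvalue()), mode="r:",
                     encoding="ascii") as tar:
       member = tar.getmembers()[0]
       stored_name = member.name
       pax_path = member.pax_headers.get("path")

Both ``stored_name`` and ``pax_path`` hold the original name, because the
value is taken from the UTF-8 encoded pax extended header and not from the
fixed-width header field.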