:mod:`gzip` --- Support for :program:`gzip` files
=================================================

.. module:: gzip
   :synopsis: Interfaces for gzip compression and decompression using file objects.

**Source code:** :source:`Lib/gzip.py`

--------------

This module provides a simple interface to compress and decompress files just
like the GNU programs :program:`gzip` and :program:`gunzip` would.

The data compression is provided by the :mod:`zlib` module.

The :mod:`gzip` module provides the :class:`GzipFile` class, as well as the
:func:`.open`, :func:`compress` and :func:`decompress` convenience functions.
The :class:`GzipFile` class reads and writes :program:`gzip`\ -format files,
automatically compressing or decompressing the data so that it looks like an
ordinary :term:`file object`.

Note that additional file formats which can be decompressed by the
:program:`gzip` and :program:`gunzip` programs, such as those produced by
:program:`compress` and :program:`pack`, are not supported by this module.

The module defines the following items:


.. function:: open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

   Open a gzip-compressed file in binary or text mode, returning a :term:`file
   object`.

   The *filename* argument can be an actual filename (a :class:`str` or
   :class:`bytes` object), or an existing file object to read from or write to.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``,
   ``'w'``, ``'wb'``, ``'x'`` or ``'xb'`` for binary mode, or ``'rt'``,
   ``'at'``, ``'wt'``, or ``'xt'`` for text mode. The default is ``'rb'``.

   The *compresslevel* argument is an integer from 0 to 9, as for the
   :class:`GzipFile` constructor.

   For binary mode, this function is equivalent to the :class:`GzipFile`
   constructor: ``GzipFile(filename, mode, compresslevel)``. In this case, the
   *encoding*, *errors* and *newline* arguments must not be provided.

   For text mode, a :class:`GzipFile` object is created, and wrapped in an
   :class:`io.TextIOWrapper` instance with the specified encoding, error
   handling behavior, and line ending(s).

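   As a minimal sketch of the two modes (the temporary directory and the file
   name ``example.txt.gz`` are illustrative assumptions, not part of the
   module), the same file can be written in text mode and read back in either
   mode::

      import gzip
      import os
      import tempfile

      with tempfile.TemporaryDirectory() as tmp_dir:
          path = os.path.join(tmp_dir, "example.txt.gz")

          # Text mode: str in, encoded as UTF-8 and compressed on the way out.
          with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
              f.write("café\n")

          # Text mode: decompressed and decoded back to str.
          with gzip.open(path, "rt", encoding="utf-8") as f:
              text = f.read()

          # Binary mode: decompressed bytes, with no decoding applied.
          with gzip.open(path, "rb") as f:
              raw = f.read()
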
   .. versionchanged:: 3.3
      Added support for *filename* being a file object, support for text mode,
      and the *encoding*, *errors* and *newline* arguments.

   .. versionchanged:: 3.4
      Added support for the ``'x'``, ``'xb'`` and ``'xt'`` modes.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.

.. exception:: BadGzipFile

   An exception raised for invalid gzip files.  It inherits from :exc:`OSError`.
   :exc:`EOFError` and :exc:`zlib.error` can also be raised for invalid gzip
   files.

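   A minimal sketch of handling these errors together; the helper name
   ``try_decompress`` is hypothetical and exists only for illustration::

      import gzip
      import zlib

      def try_decompress(data):
          """Return the decompressed bytes, or None if *data* is not valid gzip."""
          try:
              return gzip.decompress(data)
          except (gzip.BadGzipFile, EOFError, zlib.error):
              # BadGzipFile is an OSError subclass, so ``except OSError`` would
              # also catch it, along with unrelated I/O errors.
              return None

      valid = gzip.compress(b"Lots of content here")
      ok = try_decompress(valid)
      bad_magic = try_decompress(b"this is not gzip data")
      truncated = try_decompress(valid[:-4])  # part of the trailer is missing
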
   .. versionadded:: 3.8

.. class:: GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)

   Constructor for the :class:`GzipFile` class, which simulates most of the
   methods of a :term:`file object`, with the exception of the :meth:`truncate`
   method.  At least one of *fileobj* and *filename* must be given a non-trivial
   value.

   The new class instance is based on *fileobj*, which can be a regular file, an
   :class:`io.BytesIO` object, or any other object which simulates a file.  It
   defaults to ``None``, in which case *filename* is opened to provide a file
   object.

   When *fileobj* is not ``None``, the *filename* argument is used only for
   inclusion in the :program:`gzip` file header, which may include the original
   filename of the uncompressed file.  It defaults to the filename of *fileobj*, if
   discernible; otherwise, it defaults to the empty string, and in this case the
   original filename is not included in the header.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``,
   ``'wb'``, ``'x'``, or ``'xb'``, depending on whether the file will be read or
   written.  The default is the mode of *fileobj* if discernible; otherwise, the
   default is ``'rb'``.

   Note that the file is always opened in binary mode. To open a compressed file
   in text mode, use :func:`.open` (or wrap your :class:`GzipFile` with an
   :class:`io.TextIOWrapper`).

   The *compresslevel* argument is an integer from ``0`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least
   compression, and ``9`` is slowest and produces the most compression. ``0``
   is no compression. The default is ``9``.

   The *mtime* argument is an optional numeric timestamp to be written to
   the last modification time field in the stream when compressing.  It
   should only be provided in compression mode.  If omitted or ``None``, the
   current time is used.  See the :attr:`mtime` attribute for more details.

   Calling a :class:`GzipFile` object's :meth:`close` method does not close
   *fileobj*, since you might wish to append more material after the compressed
   data.  This also allows you to pass an :class:`io.BytesIO` object opened for
   writing as *fileobj*, and retrieve the resulting memory buffer using the
   :class:`io.BytesIO` object's :meth:`~io.BytesIO.getvalue` method.

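   The in-memory pattern described above might look like the following
   sketch (the variable names are illustrative)::

      import gzip
      import io

      buffer = io.BytesIO()

      # Compress into the in-memory buffer instead of a file on disk.
      with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
          gz.write(b"Lots of content here")

      # Closing the GzipFile leaves the underlying BytesIO open, so the
      # compressed bytes can still be retrieved.
      still_open = not buffer.closed
      compressed = buffer.getvalue()
      restored = gzip.decompress(compressed)
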
   :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface,
   including iteration and the :keyword:`with` statement.  Only the
   :meth:`truncate` method isn't implemented.

   :class:`GzipFile` also provides the following method and attribute:

   .. method:: peek(n)

      Read *n* uncompressed bytes without advancing the file position.
      At most one read on the compressed stream is done to satisfy
      the call.  The number of bytes returned may be more or less than
      requested.

      .. note:: While calling :meth:`peek` does not change the file position of
         the :class:`GzipFile`, it may change the position of the underlying
         file object (e.g. if the :class:`GzipFile` was constructed with the
         *fileobj* parameter).

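      A short sketch of peeking at the start of a stream before reading it
      in full (the payload is an arbitrary example)::

         import gzip
         import io

         payload = b"header: rest of the payload"
         compressed = gzip.compress(payload)

         with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as gz:
             # The result may be longer or shorter than the 6 bytes requested.
             preview = gz.peek(6)
             # The file position is unchanged, so read() still returns everything.
             everything = gz.read()
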
      .. versionadded:: 3.2

   .. attribute:: mtime

      When decompressing, the value of the last modification time field in
      the most recently read header may be read from this attribute, as an
      integer.  The initial value before reading any headers is ``None``.

      All :program:`gzip` compressed streams are required to contain this
      timestamp field.  Some programs, such as :program:`gunzip`\ , make use
      of the timestamp.  The format is the same as the return value of
      :func:`time.time` and the :attr:`~os.stat_result.st_mtime` attribute of
      the object returned by :func:`os.stat`.

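      As an illustrative sketch, a timestamp written through the *mtime*
      constructor argument can be read back from this attribute once the
      header has been consumed (the timestamp value is arbitrary)::

         import gzip
         import io

         buffer = io.BytesIO()
         with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=1_700_000_000) as gz:
             gz.write(b"payload")

         buffer.seek(0)
         with gzip.GzipFile(fileobj=buffer, mode="rb") as gz:
             before = gz.mtime  # no header has been read yet
             gz.read()
             after = gz.mtime   # timestamp from the header just read
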
   .. versionchanged:: 3.1
      Support for the :keyword:`with` statement was added, along with the
      *mtime* constructor argument and :attr:`mtime` attribute.

   .. versionchanged:: 3.2
      Support for zero-padded and unseekable files was added.

   .. versionchanged:: 3.3
      The :meth:`io.BufferedIOBase.read1` method is now implemented.

   .. versionchanged:: 3.4
      Added support for the ``'x'`` and ``'xb'`` modes.

   .. versionchanged:: 3.5
      Added support for writing arbitrary
      :term:`bytes-like objects <bytes-like object>`.
      The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
      ``None``.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.


.. function:: compress(data, compresslevel=9, *, mtime=None)

   Compress the *data*, returning a :class:`bytes` object containing
   the compressed data.  *compresslevel* and *mtime* have the same meaning as in
   the :class:`GzipFile` constructor above.

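   As a small sketch of the *mtime* parameter (the sample payload is
   arbitrary), fixing the timestamp makes repeated calls produce
   byte-identical output::

      import gzip

      data = b"Lots of content here" * 100

      # With a fixed mtime the header no longer depends on the current time.
      first = gzip.compress(data, compresslevel=6, mtime=0)
      second = gzip.compress(data, compresslevel=6, mtime=0)
      reproducible = first == second
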
   .. versionadded:: 3.2
   .. versionchanged:: 3.8
      Added the *mtime* parameter for reproducible output.

.. function:: decompress(data)

   Decompress the *data*, returning a :class:`bytes` object containing the
   uncompressed data.

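   Because both functions work on :class:`bytes`, text has to be encoded
   before compression and decoded afterwards.  A minimal round-trip sketch,
   assuming UTF-8 as the encoding::

      import gzip

      original = "Lots of content here"

      # str -> bytes -> compressed bytes
      compressed = gzip.compress(original.encode("utf-8"))

      # compressed bytes -> bytes -> str
      restored = gzip.decompress(compressed).decode("utf-8")
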
   .. versionadded:: 3.2


.. _gzip-usage-examples:

Examples of usage
-----------------

Example of how to read a compressed file::

   import gzip
   with gzip.open('/home/joe/file.txt.gz', 'rb') as f:
       file_content = f.read()

Example of how to create a compressed GZIP file::

   import gzip
   content = b"Lots of content here"
   with gzip.open('/home/joe/file.txt.gz', 'wb') as f:
       f.write(content)

Example of how to GZIP compress an existing file::

   import gzip
   import shutil
   with open('/home/joe/file.txt', 'rb') as f_in:
       with gzip.open('/home/joe/file.txt.gz', 'wb') as f_out:
           shutil.copyfileobj(f_in, f_out)

Example of how to GZIP compress a binary string::

   import gzip
   s_in = b"Lots of content here"
   s_out = gzip.compress(s_in)

.. seealso::

   Module :mod:`zlib`
      The basic data compression module needed to support the :program:`gzip` file
      format.


.. program:: gzip

Command Line Interface
----------------------

The :mod:`gzip` module provides a simple command line interface to compress or
decompress files.

Once executed, the :mod:`gzip` module keeps the input file(s).

.. versionchanged:: 3.8

   Added a new command line interface with a usage message.
   By default, the CLI uses compression level 6.

Command line options
^^^^^^^^^^^^^^^^^^^^

.. cmdoption:: file

   If *file* is not specified, read from :attr:`sys.stdin`.

.. cmdoption:: --fast

   Indicates the fastest compression method (less compression).

.. cmdoption:: --best

   Indicates the slowest compression method (best compression).

.. cmdoption:: -d, --decompress

   Decompress the given file.

.. cmdoption:: -h, --help

   Show the help message.

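As a hedged sketch, the command line interface can also be driven from Python
with :mod:`subprocess`.  The example assumes Python 3.8 or later, that the
compressed output is written next to the input with a ``.gz`` suffix, and an
illustrative file name ``example.txt`` in a temporary directory::

   import gzip
   import subprocess
   import sys
   import tempfile
   from pathlib import Path

   content = b"Lots of content here\n" * 100

   with tempfile.TemporaryDirectory() as tmp_dir:
       source = Path(tmp_dir) / "example.txt"
       source.write_bytes(content)

       # Equivalent to running: python -m gzip --best example.txt
       subprocess.run(
           [sys.executable, "-m", "gzip", "--best", str(source)],
           check=True,
       )

       # Assumption: the output file is the input name plus ".gz".
       compressed = source.with_name(source.name + ".gz")
       input_kept = source.exists()
       with gzip.open(compressed, "rb") as f:
           round_trip = f.read()
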