<!-- Creator     : groff version 1.22.4 -->
<!-- CreationDate: Sun Aug 22 23:03:27 2021 -->
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta name="generator" content="groff -Thtml, see www.gnu.org">
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<meta name="Content-Style" content="text/css">
<style type="text/css">
       p       { margin-top: 0; margin-bottom: 0; vertical-align: top }
       pre     { margin-top: 0; margin-bottom: 0; vertical-align: top }
       table   { margin-top: 0; margin-bottom: 0; vertical-align: top }
       h1      { text-align: center }
</style>
<title></title>
</head>
<body>

<hr>


<p>LIBARCHIVE-FORMATS(5) BSD File Formats Manual
LIBARCHIVE-FORMATS(5)</p>

<p style="margin-top: 1em"><b>NAME</b></p>

<p style="margin-left:6%;"><b>libarchive-formats</b>
&mdash; archive formats supported by the libarchive
library</p>

<p style="margin-top: 1em"><b>DESCRIPTION</b></p>

<p style="margin-left:6%;">The libarchive(3) library reads
and writes a variety of streaming archive formats. Generally
speaking, all of these archive formats consist of a series
of &ldquo;entries&rdquo;. Each entry stores a single file
system object, such as a file, directory, or symbolic
link.</p>

<p style="margin-left:6%; margin-top: 1em">The following
sections briefly describe each format supported by
libarchive, with some information about recognized
extensions and limitations of the current library support.
Note that libarchive&rsquo;s support for a format does not
imply that a program using libarchive supports that format.
Applications that use libarchive specify which formats they
wish to support, though many programs use libarchive
convenience functions to enable all supported formats.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Tar
Formats</b> <br>
The libarchive(3) library can read most tar archives. It can
write POSIX-standard &ldquo;ustar&rdquo; and &ldquo;pax
interchange&rdquo; formats as well as v7 tar format and a
subset of the legacy GNU tar format.</p>

<p style="margin-left:6%; margin-top: 1em">All tar formats
store each entry in one or more 512-byte records. The first
record is used for file metadata, including filename,
timestamp, and mode information, and the file data is stored
in subsequent records. Later variants have extended this by
either appropriating undefined areas of the header record,
extending the header to multiple records, or by storing
special entries that modify the interpretation of subsequent
entries.</p>
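
<p style="margin-left:6%; margin-top: 1em">The record
layout described above can be inspected with a short Python
sketch. It uses the tarfile module from the Python standard
library rather than libarchive itself, and the field offsets
(name at 0, mode at 100, size at 124, magic at 257) are
those of the ustar header. The file name and payload are
made up for illustration.</p>

```python
import io
import tarfile

BLOCK = 512  # tar record size in bytes

# Build a one-entry ustar archive in memory (hypothetical file name).
payload = b"hello, tar\n"
info = tarfile.TarInfo(name="greeting.txt")
info.size = len(payload)
info.mtime = 0
info.mode = 0o644

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    tar.addfile(info, io.BytesIO(payload))
raw = buf.getvalue()

header = raw[:BLOCK]          # first record: metadata
data = raw[BLOCK:2 * BLOCK]   # second record: file data, NUL-padded

# Header fields are fixed-width ASCII; numbers are stored in octal.
name = header[0:100].rstrip(b"\0").decode("ascii")
mode = int(header[100:107].decode("ascii"), 8)
size = int(header[124:135].decode("ascii"), 8)
magic = header[257:262]

print(name, oct(mode), size, magic)
```

<p style="margin-left:6%; margin-top: 1em">The archive
length is always a multiple of 512 bytes, which is why the
11-byte payload occupies a full record.</p>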

<p style="margin-top: 1em"><b>gnutar</b></p>

<p style="margin-left:17%; margin-top: 1em">The
libarchive(3) library can read most GNU-format tar archives.
It currently supports the most popular GNU extensions,
including modern long filename and linkname support, as well
as atime and ctime data. The libarchive library does not
support multi-volume archives, nor the old GNU long filename
format. It can read GNU sparse file entries, including the
new POSIX-based formats.</p>

<p style="margin-left:17%; margin-top: 1em">The
libarchive(3) library can write GNU tar format, including
long filename and linkname support, as well as atime and
ctime data.</p>

<p style="margin-top: 1em"><b>pax</b></p>

<p style="margin-left:17%; margin-top: 1em">The
libarchive(3) library can read and write POSIX-compliant pax
interchange format archives. Pax interchange format archives
are an extension of the older ustar format that adds a
separate entry with additional attributes stored as
key/value pairs immediately before each regular entry. The
presence of these additional entries is the only difference
between pax interchange format and the older ustar format.
The extended attributes are of unlimited length and are
stored as UTF-8 Unicode strings. Keywords defined in the
standard are in all lowercase; vendors are allowed to define
custom keys by preceding them with the vendor name in all
uppercase. When writing pax archives, libarchive uses many
of the SCHILY keys defined by Joerg Schilling&rsquo;s
&ldquo;star&rdquo; archiver and a few LIBARCHIVE keys. The
libarchive library can read most of the SCHILY keys and most
of the GNU keys introduced by GNU tar. It silently ignores
any keywords that it does not understand.</p>

<p style="margin-left:17%; margin-top: 1em">The pax
interchange format converts filenames to Unicode and stores
them using the UTF-8 encoding. Prior to libarchive 3.0,
libarchive erroneously assumed that the system
wide-character routines natively supported Unicode. This
caused it to mishandle non-ASCII filenames on systems that
did not satisfy this assumption.</p>
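
<p style="margin-left:17%; margin-top: 1em">As a rough
illustration of the extended attribute entry, the following
Python sketch writes one pax entry with the standard-library
tarfile module (not libarchive) and reads it back. The key
EXAMPLE.note is a made-up vendor key used only for this
example. The non-ASCII file name forces a
&ldquo;path&rdquo; record, which is stored as UTF-8.</p>

```python
import io
import tarfile

info = tarfile.TarInfo(name="caf\u00e9.txt")   # non-ASCII file name
info.size = 0
# Hypothetical vendor key: vendor name in uppercase, then the keyword.
info.pax_headers = {"EXAMPLE.note": "written by a sketch"}

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    tar.addfile(info)

raw = buf.getvalue()
typeflag = raw[156:157]        # type flag of the first header record
records = raw[512:1024]        # body of the extended attribute entry

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmembers()[0]
    recovered = dict(member.pax_headers)

print(typeflag, member.name, recovered)
```

<p style="margin-left:17%; margin-top: 1em">The first
header carries the type flag &ldquo;x&rdquo;, marking the
extra entry that precedes the regular one. A reader that
does not know the EXAMPLE.note key can skip it without
affecting the entry itself.</p>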

<p style="margin-top: 1em"><b>restricted pax</b></p>

<p style="margin-left:17%;">The libarchive library can also
write pax archives in which it attempts to suppress the
extended attributes entry whenever possible. The result will
be identical to a ustar archive unless the extended
attributes entry is required to store a long file name, long
linkname, extended ACL, or file flags, or if any of the
standard ustar data (user name, group name, UID, GID, etc.)
cannot be fully represented in the ustar header. In all
cases, the result can be dearchived by any program that can
read POSIX-compliant pax interchange format archives.
Programs that correctly read ustar format (see below) will
also be able to read this format; any extended attributes
will be extracted as separate files stored in
<i>PaxHeader</i> directories.</p>

<p style="margin-top: 1em"><b>ustar</b></p>

<p style="margin-left:17%; margin-top: 1em">The libarchive
library can both read and write this format. This format has
the following limitations:</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Device major and minor numbers
are limited to 21 bits. Nodes with larger numbers will not
be added to the archive.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Path names in the archive are
limited to 255 bytes. (Shorter if there is no / character in
exactly the right place.)</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Symbolic links and hard links
are stored in the archive with the name of the referenced
file. This name is limited to 100 bytes.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Extended attributes, file
flags, and other extended security information cannot be
stored.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Archive entries are limited to
8 gigabytes in size.</p>

<p style="margin-left:17%;">Note that the pax interchange
format has none of these restrictions. The ustar format is
old and widely supported. It is recommended when
compatibility is the primary concern.</p>
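
<p style="margin-left:17%; margin-top: 1em">The limits
listed above follow from the fixed field widths of the ustar
header. The Python sketch below works through the
arithmetic and shows why a long path needs a / character in
a suitable place: the header stores a name field of up to
100 bytes and a separate prefix field of up to 155 bytes,
and the path must split cleanly between the two. The
function names are illustrative only and are not part of
libarchive.</p>

```python
def octal_field_max(digits):
    """Largest value that fits in an octal ASCII field of this many digits."""
    return 8 ** digits - 1

# 8-byte field: 7 octal digits plus a terminator, i.e. 21 bits.
DEVICE_MAX = octal_field_max(7)
# 12-byte field: 11 octal digits plus a terminator, i.e. 33 bits (8 GiB - 1).
SIZE_MAX = octal_field_max(11)

NAME_LEN = 100     # bytes available in the ustar name field
PREFIX_LEN = 155   # bytes available in the ustar prefix field

def split_ustar_path(path):
    """Return (prefix, name) for a ustar header, or None if the path cannot fit."""
    raw = path.encode("utf-8")
    if NAME_LEN >= len(raw):
        return b"", raw
    # Try each "/" as the split point; the slash itself is not stored.
    for i in range(len(raw)):
        if raw[i:i + 1] != b"/":
            continue
        prefix, name = raw[:i], raw[i + 1:]
        if prefix and name and PREFIX_LEN >= len(prefix) and NAME_LEN >= len(name):
            return prefix, name
    return None

print(DEVICE_MAX, SIZE_MAX)
print(split_ustar_path("d" * 120 + "/" + "f" * 50) is not None)
print(split_ustar_path("x" * 150))
```

<p style="margin-left:17%; margin-top: 1em">A 150-byte
name with no / cannot be stored at all, even though it is
well under the overall path limit.</p>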

<p style="margin-top: 1em"><b>v7</b></p>

<p style="margin-left:17%; margin-top: 1em">The libarchive
library can read and write the legacy v7 tar format. This
format has the following limitations:</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Only regular files,
directories, and symbolic links can be archived. Block and
character device nodes, FIFOs, and sockets cannot be
archived.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Path names in the archive are
limited to 100 bytes.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Symbolic links and hard links
are stored in the archive with the name of the referenced
file. This name is limited to 100 bytes.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">User and group information are
stored as numeric IDs; there is no provision for storing
user or group names.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Extended attributes, file
flags, and other extended security information cannot be
stored.</p>

<p><b>&bull;</b></p>

<p style="margin-left:22%;">Archive entries are limited to
8 gigabytes in size.</p>

<p style="margin-left:17%;">Generally, users should prefer
the ustar format for portability as the v7 tar format is
both less useful and less portable.</p>

<p style="margin-left:6%; margin-top: 1em">The libarchive
library also reads a variety of commonly-used extensions to
the basic tar format. These extensions are recognized
automatically whenever they appear.</p>

<p style="margin-top: 1em">Numeric extensions</p>

<p style="margin-left:17%;">The POSIX standards require
fixed-length numeric fields to be written with some
character positions reserved for terminators. Libarchive
allows these fields to be written without terminator
characters. This extends the allowable range; in particular,
ustar archives with this extension can support entries up to
64 gigabytes in size. Libarchive also recognizes base-256
values in most numeric fields. This essentially removes all
limitations on file size, modification time, and device
numbers.</p>
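
<p style="margin-left:17%; margin-top: 1em">A minimal
Python sketch of how such fields might be decoded is shown
below. It is not libarchive&rsquo;s parser. It assumes the
common base-256 convention in which the first byte of the
field is 0x80 and the remaining bytes hold a big-endian
binary integer, and it handles non-negative values only.</p>

```python
def parse_tar_number(field):
    """Decode a tar numeric header field (sketch; non-negative values only)."""
    if field[0] >= 0x80:
        if field[0] != 0x80:
            raise ValueError("this sketch handles only a 0x80 marker byte")
        # Base-256: marker byte, then a big-endian binary integer.
        return int.from_bytes(field[1:], "big")
    # Octal ASCII; a trailing NUL or space terminator is optional here.
    text = field.decode("ascii").strip(" \0")
    return int(text, 8) if text else 0

standard = parse_tar_number(b"00000000013\0")      # 11 digits plus terminator
unterminated = parse_tar_number(b"7" * 12)         # all 12 bytes used as digits
binary = parse_tar_number(b"\x80" + (2 ** 40).to_bytes(11, "big"))

print(standard, unterminated, binary)
```

<p style="margin-left:17%; margin-top: 1em">Using all 12
bytes of the size field as octal digits gives 36 bits, which
is where the 64 gigabyte figure comes from. The base-256
form can hold values far beyond that.</p>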

<p style="margin-top: 1em">Solaris extensions</p>

<p style="margin-left:17%;">Libarchive recognizes ACL and
extended attribute records written by Solaris tar.</p>

<p style="margin-left:6%; margin-top: 1em">The first tar
program appeared in Seventh Edition Unix in 1979. The first
official standard for the tar file format was the
&ldquo;ustar&rdquo; (Unix Standard Tar) format defined by
POSIX in 1988. POSIX.1-2001 extended the ustar format to
create the &ldquo;pax interchange&rdquo; format.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Cpio
Formats</b> <br>
The libarchive library can read and write a number of common
cpio variants. A cpio archive stores each entry as a
fixed-size header followed by a variable-length filename and
variable-length data. Unlike the tar format, the cpio format
does only minimal padding of the header or file data. There
are several cpio variants, which differ primarily in how
they store the initial header: some store the values as
octal or hexadecimal numbers in ASCII, others as binary
values of varying byte order and length.</p>

<p style="margin-top: 1em"><b>binary</b></p>

<p style="margin-left:17%; margin-top: 1em">The libarchive
library transparently reads both big-endian and
little-endian variants of the two binary cpio formats:
the original one from PWB/UNIX, and the later, more widely
used, variant. This format used 32-bit binary values for
file size and mtime, and 16-bit binary values for the other
fields. The formats support only the file types present in
UNIX at the time of their creation. File sizes are limited
to 24 bits in the PWB format, because of the limits of the
file system, and to 31 bits in the newer binary format,
where signed 32-bit longs were used.</p>
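
<p style="margin-left:17%; margin-top: 1em">The byte-order
detection can be sketched in Python as follows. This is an
illustration for the newer binary format only, not
libarchive&rsquo;s code. It assumes the 26-byte header of
thirteen 16-bit words that begins with the magic number
070707 (octal), with each 32-bit value stored as two 16-bit
words, most significant word first.</p>

```python
CPIO_BINARY_MAGIC = 0o70707
FIELDS = ("magic", "dev", "ino", "mode", "uid", "gid", "nlink", "rdev",
          "mtime_hi", "mtime_lo", "namesize", "filesize_hi", "filesize_lo")

def parse_binary_cpio_header(header):
    """Parse a 26-byte binary cpio header, detecting its byte order."""
    if len(header) != 26:
        raise ValueError("binary cpio header is 26 bytes")
    # Decode the words in each byte order until the magic number matches.
    for order in ("little", "big"):
        words = [int.from_bytes(header[i:i + 2], order) for i in range(0, 26, 2)]
        if words[0] == CPIO_BINARY_MAGIC:
            break
    else:
        raise ValueError("not a binary cpio header")
    f = dict(zip(FIELDS, words))
    return {
        "byte_order": order,
        "mode": f["mode"],
        "namesize": f["namesize"],
        # 32-bit values: most significant 16-bit word first.
        "mtime": f["mtime_hi"] * 65536 + f["mtime_lo"],
        "filesize": f["filesize_hi"] * 65536 + f["filesize_lo"],
    }

# Hypothetical header values, serialized in both byte orders.
sample = [0o70707, 1, 2, 0o100644, 1000, 1000, 1, 0, 0x1234, 0x5678, 6, 1, 2]
little = parse_binary_cpio_header(b"".join(w.to_bytes(2, "little") for w in sample))
big = parse_binary_cpio_header(b"".join(w.to_bytes(2, "big") for w in sample))
print(little["byte_order"], big["byte_order"], little["filesize"])
```

<p style="margin-left:17%; margin-top: 1em">Because the
magic number is not symmetric under byte swapping, a single
comparison is enough to tell the two variants apart.</p>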

<p style="margin-top: 1em"><b>odc</b></p>

<p style="margin-left:17%; margin-top: 1em">This is the
POSIX standardized format, which is officially known as the
&ldquo;cpio interchange format&rdquo; or the
&ldquo;octet-oriented cpio archive format&rdquo; and
sometimes unofficially referred to as the &ldquo;old
character format&rdquo;. This format stores the header
contents as octal values in ASCII. It is standard, portable,
and immune from byte-order confusion. File sizes and mtime
are limited to 33 bits (8GB file size); other fields are
limited to 18 bits.</p>
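
<p style="margin-left:17%; margin-top: 1em">The 18-bit and
33-bit limits correspond to fields of 6 and 11 octal digits.
A minimal Python sketch of a header parser, assuming the
76-byte layout with the ASCII magic &ldquo;070707&rdquo;,
might look like this. The sample header values are invented
for the example.</p>

```python
ODC_FIELDS = (("magic", 6), ("dev", 6), ("ino", 6), ("mode", 6), ("uid", 6),
              ("gid", 6), ("nlink", 6), ("rdev", 6), ("mtime", 11),
              ("namesize", 6), ("filesize", 11))
ODC_HEADER_LEN = sum(width for _, width in ODC_FIELDS)   # 76 bytes

def parse_odc_header(header):
    """Parse an odc cpio header; every field is octal ASCII."""
    if len(header) != ODC_HEADER_LEN or header[:6] != b"070707":
        raise ValueError("not an odc cpio header")
    fields, pos = {}, 0
    for name, width in ODC_FIELDS:
        fields[name] = int(header[pos:pos + width].decode("ascii"), 8)
        pos += width
    return fields

# Hypothetical header with the largest representable file size.
sample = (b"070707" + b"000001" * 7 + b"00000000000"
          + b"000006" + b"77777777777")
parsed = parse_odc_header(sample)
print(ODC_HEADER_LEN, parsed["filesize"], parsed["namesize"])
```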

<p style="margin-top: 1em"><b>SVR4/newc</b></p>

<p style="margin-left:17%;">The libarchive library can read
both CRC and non-CRC variants of this format. The SVR4
format uses eight-digit hexadecimal values for all header
fields. This limits file size to 4GB, and also limits the
mtime and other fields to 32 bits. The SVR4 format can
optionally include a CRC of the file contents, although
libarchive does not currently verify this CRC.</p>
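
<p style="margin-left:17%; margin-top: 1em">For
comparison, a Python sketch of the SVR4 header follows. It
assumes the 110-byte layout: a six-character magic
(&ldquo;070701&rdquo; without a checksum,
&ldquo;070702&rdquo; with one) followed by thirteen
eight-digit hexadecimal fields. The checksum helper assumes
the value is the low 32 bits of the sum of the data bytes
rather than a true CRC; it is shown only to illustrate what
a verifying reader would compute.</p>

```python
NEWC_FIELDS = ("ino", "mode", "uid", "gid", "nlink", "mtime", "filesize",
               "devmajor", "devminor", "rdevmajor", "rdevminor",
               "namesize", "check")
NEWC_HEADER_LEN = 6 + 8 * len(NEWC_FIELDS)   # 110 bytes

def parse_newc_header(header):
    """Parse an SVR4 cpio header; every field is 8 hexadecimal digits."""
    magic = header[:6]
    if len(header) != NEWC_HEADER_LEN or magic not in (b"070701", b"070702"):
        raise ValueError("not an SVR4 cpio header")
    values = [int(header[6 + 8 * i:14 + 8 * i].decode("ascii"), 16)
              for i in range(len(NEWC_FIELDS))]
    fields = dict(zip(NEWC_FIELDS, values))
    fields["has_crc"] = magic == b"070702"
    return fields

def svr4_checksum(data):
    """Assumed checksum: sum of all data bytes, truncated to 32 bits."""
    return sum(data) % 2 ** 32

# Hypothetical header values; the file size is the largest representable.
values = [1, 0o100644, 0, 0, 1, 0, 0xFFFFFFFF, 0, 0, 0, 0, 6, 0]
sample = b"070702" + b"".join(b"%08X" % v for v in values)
parsed = parse_newc_header(sample)
print(parsed["filesize"], parsed["has_crc"], svr4_checksum(b"\x01\x02\xff"))
```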

<p style="margin-left:6%; margin-top: 1em">Cpio first
appeared in PWB/UNIX 1.0, which was released within AT&amp;T
in 1977. PWB/UNIX 1.0 formed the basis of System III Unix,
released outside of AT&amp;T in 1981. This makes cpio older
than tar, although cpio was not included in Version 7
AT&amp;T Unix. As a result, the tar command became much
better known in universities and research groups that used
Version 7. The combination of the <b>find</b> and
<b>cpio</b> utilities provided very precise control over
file selection. Unfortunately, the format has many
limitations that make it unsuitable for widespread use. Only
the POSIX format permits files over 4GB, and its 18-bit
limit for most other fields makes it unsuitable for modern
systems. In addition, cpio formats only store numeric
UID/GID values (not usernames and group names), which can
make it very difficult to correctly transfer archives across
systems with dissimilar user numbering.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Shar
Formats</b> <br>
A &ldquo;shell archive&rdquo; is a shell script that, when
executed on a POSIX-compliant system, will recreate a
collection of file system objects. The libarchive library
can write two different kinds of shar archives:</p>

<p style="margin-top: 1em"><b>shar</b></p>

<p style="margin-left:17%; margin-top: 1em">The traditional
shar format uses a limited set of POSIX commands, including
echo(1), mkdir(1), and sed(1). It is suitable for portably
archiving small collections of plain text files. However, it
is not generally well-suited for large archives (many
implementations of sh(1) have limits on the size of a
script) nor should it be used with non-text files.</p>

<p style="margin-top: 1em"><b>shardump</b></p>

<p style="margin-left:17%;">This format is similar to shar
but encodes files using uuencode(1) so that the result will
be a plain text file regardless of the file contents. It
also includes additional shell commands that attempt to
reproduce as many file attributes as possible, including
owner, mode, and flags. The additional commands used to
restore file attributes make shardump archives less portable
than plain shar archives.</p>
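
<p style="margin-left:17%; margin-top: 1em">The effect of
uuencoding can be seen with a small Python sketch that uses
the standard-library binascii module. It shows only the
encoding step, not the surrounding shell commands that
libarchive emits, and the input bytes are arbitrary.</p>

```python
import binascii

def uuencode_lines(data):
    """Yield uuencoded text lines for data, 45 input bytes per line."""
    for start in range(0, len(data), 45):
        yield binascii.b2a_uu(data[start:start + 45]).decode("ascii")

blob = bytes(range(256))              # arbitrary binary content
encoded = "".join(uuencode_lines(blob))

# Decoding line by line recovers the original bytes.
decoded = b"".join(binascii.a2b_uu(line)
                   for line in encoded.splitlines(keepends=True))

print(encoded.splitlines()[0])
print(decoded == blob)
```

<p style="margin-left:17%; margin-top: 1em">Every output
character is printable ASCII, so the encoded data can be
embedded in a shell script regardless of the file
contents.</p>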

<p style="margin-left:6%; margin-top: 1em"><b>ISO9660
format</b> <br>
Libarchive can read and extract from files containing
ISO9660-compliant CDROM images. In many cases, this can
remove the need to burn a physical CDROM just to
read the files contained in an ISO9660 image. It also avoids
security and complexity issues that come with virtual mounts
and loopback devices. Libarchive supports the most common
Rockridge extensions and has partial support for Joliet
extensions. If both extensions are present, the Joliet
extensions will be used and the Rockridge extensions will be
ignored. In particular, this can create problems with
hardlinks and symlinks, which are supported by Rockridge but
not by Joliet.</p>

<p style="margin-left:6%; margin-top: 1em">Libarchive reads
ISO9660 images using a streaming strategy. This allows it to
read compressed images directly (decompressing on the fly)
and allows it to read images directly from network sockets,
pipes, and other non-seekable data sources. This strategy
works well for optimized ISO9660 images created by many
popular programs. Such programs collect all directory
information at the beginning of the ISO9660 image so it can
be read from a physical disk with a minimum of seeking.
However, not all ISO9660 images can be read in this
fashion.</p>
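
<p style="margin-left:6%; margin-top: 1em">As a small
illustration of forward-only reading, the Python sketch
below checks a stream for an ISO9660 volume descriptor
without ever seeking. It assumes 2048-byte sectors, a
16-sector system area, and the identifier
&ldquo;CD001&rdquo; at byte 1 of the first descriptor. It
also assumes that read() returns the full amount requested,
which holds for in-memory streams but may need a loop for
pipes or sockets. It is not how libarchive itself is
implemented.</p>

```python
import io

SECTOR = 2048
SYSTEM_AREA = 16 * SECTOR   # skipped before the first volume descriptor

def looks_like_iso9660(stream):
    """Check a forward-only binary stream for an ISO9660 volume descriptor."""
    skipped = stream.read(SYSTEM_AREA)     # read and discard; no seek()
    descriptor = stream.read(SECTOR)
    return len(skipped) == SYSTEM_AREA and descriptor[1:6] == b"CD001"

# A fake image: empty system area followed by a minimal descriptor.
fake_image = bytes(SYSTEM_AREA) + b"\x01CD001\x01" + bytes(SECTOR - 7)
print(looks_like_iso9660(io.BytesIO(fake_image)))
print(looks_like_iso9660(io.BytesIO(b"not an image")))
```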

<p style="margin-left:6%; margin-top: 1em">Libarchive can
also write ISO9660 images. Such images are fully optimized
with the directory information preceding all file data. This
is done by storing all file data to a temporary file while
collecting directory information in memory. When the image
is finished, libarchive writes out the directory structure
followed by the file data. The location used for the
temporary file can be changed by the usual environment
variables.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Zip
format</b> <br>
Libarchive can read and write zip format archives that have
uncompressed entries and entries compressed with the
&ldquo;deflate&rdquo; algorithm. Other zip compression
algorithms are not supported. It can extract jar archives,
archives that use Zip64 extensions, and self-extracting zip
archives. Libarchive can use either of two different
strategies for reading Zip archives: a streaming strategy
which is fast and can handle extremely large archives, and a
seeking strategy which can correctly process self-extracting
Zip archives and archives with deleted members or other
in-place modifications.</p>

<p style="margin-left:6%; margin-top: 1em">The streaming
reader processes Zip archives as they are read. It can read
archives of arbitrary size from tape or network sockets, and
can decode Zip archives that have been separately compressed
or encoded. However, self-extracting Zip archives and
archives with certain types of modifications cannot be
correctly handled. Such archives require that the reader
first process the Central Directory, which is ordinarily
located at the end of a Zip archive and is thus inaccessible
to the streaming reader. If the program using libarchive has
enabled seek support, then libarchive will use this to
process the Central Directory first.</p>

<p style="margin-left:6%; margin-top: 1em">In particular,
the seeking reader must be used to correctly handle
self-extracting archives. Such archives consist of a program
followed by a regular Zip archive. The streaming reader
cannot parse the initial program portion, but the seeking
reader starts by reading the Central Directory from the end
of the archive. Similarly, Zip archives that have been
modified in-place can have deleted entries or other garbage
data that can only be accurately detected by first reading
the Central Directory.</p>
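
<p style="margin-left:6%; margin-top: 1em">The difference
between the two strategies can be demonstrated with a Python
sketch built on the standard-library zipfile module, which
always locates the Central Directory from the end of the
file and so behaves like a seeking reader. The stub below is
a made-up stand-in for a self-extractor program, and the
streamable() helper is a deliberately naive check written
for this example.</p>

```python
import io
import zipfile

# Build a small Zip archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello, zip\n")
plain = buf.getvalue()

# Hypothetical self-extracting archive: a program followed by the archive.
stub = b"#!/bin/sh\n# stand-in for a self-extractor program\n"
sfx = stub + plain

LOCAL_HEADER = b"PK\x03\x04"       # signature of a local file header
END_OF_CENTRAL_DIR = b"PK\x05\x06" # signature of the end-of-directory record

def streamable(data):
    """Naive check: a streaming reader expects an entry header at offset 0."""
    return data.startswith(LOCAL_HEADER)

# A seeking reader finds the Central Directory from the end of the data.
with zipfile.ZipFile(io.BytesIO(sfx)) as zf:
    names = zf.namelist()
    text = zf.read("hello.txt")

print(streamable(plain), streamable(sfx), names)
```

<p style="margin-left:6%; margin-top: 1em">The plain
archive starts with an entry header and can be processed
front to back. The prefixed one cannot, yet it is read
correctly once the directory at the end is consulted
first.</p>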

<p style="margin-left:6%; margin-top: 1em"><b>Archive
(library) file format</b> <br>
The Unix archive format (commonly created by the ar(1)
archiver) is a general-purpose format which is used almost
exclusively for object files to be read by the link editor
ld(1). The ar format has never been standardised. There are
two common variants: the GNU format derived from SVR4, and
the BSD format, which first appeared in 4.4BSD. The two
differ primarily in their handling of filenames longer than
15 characters: the GNU/SVR4 variant writes a filename table
at the beginning of the archive; the BSD format stores each
long filename in an extension area adjacent to the entry.
Libarchive can read both extensions, including archives that
may include both types of long filenames. Programs using
libarchive can write GNU/SVR4 format if they provide an
entry called <i>//</i> containing a filename table to be
written into the archive before any of the entries. Any
entries whose names are not in the filename table will be
written using BSD-style long filenames. This can cause
problems for programs such as GNU ld that do not support the
BSD-style long filenames.</p>
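
<p style="margin-left:6%; margin-top: 1em">The two long
filename conventions can be told apart from the 16-byte name
field of each entry header. The Python sketch below
classifies that field. It assumes the usual conventions: a
GNU/SVR4 reference is a slash followed by a decimal offset
into the filename table, a BSD long name is marked by #1/
followed by the name length, and GNU short names end in a
slash. The labels it returns are invented for this
example.</p>

```python
def classify_ar_name(field):
    """Classify the 16-byte, space-padded name field of an ar entry header."""
    name = field.decode("ascii").rstrip(" ")
    if name == "//":
        return ("gnu-filename-table", None)
    if name == "/":
        return ("symbol-table", None)
    if name.startswith("#1/") and name[3:].isdigit():
        # BSD: the real name is stored just before the entry data.
        return ("bsd-long-name", int(name[3:]))
    if name.startswith("/") and name[1:].isdigit():
        # GNU/SVR4: decimal offset into the filename table.
        return ("gnu-long-name", int(name[1:]))
    # Short name; GNU writers append a trailing slash.
    return ("short-name", name.rstrip("/"))

for sample in (b"//              ", b"/42             ",
               b"#1/20           ", b"hello.o/        "):
    print(classify_ar_name(sample))
```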

<p style="margin-left:6%; margin-top: 1em"><b>mtree</b>
<br>
Libarchive can read and write files in mtree(5) format. This
format is not a true archive format, but rather a textual
description of a file hierarchy in which each line specifies
the name of a file and provides specific metadata about that
file. Libarchive can read all of the keywords supported by
both the NetBSD and FreeBSD versions of mtree(8), although
many of the keywords cannot currently be stored in an
archive_entry object. When writing, libarchive supports use
of the archive_write_set_options(3) interface to specify
which keywords should be included in the output. If
libarchive was compiled with access to suitable
cryptographic libraries (such as the OpenSSL libraries), it
can compute hash entries such as <b>sha512</b> or <b>md5</b>
from file data being written to the mtree writer.</p>

<p style="margin-left:6%; margin-top: 1em">When reading an
mtree file, libarchive will locate the corresponding files
on disk using the <b>contents</b> keyword if present or the
regular filename. If it can locate and open the file on
disk, it will use that to fill in any metadata that is
missing from the mtree file and will read the file contents
and return those to the program using libarchive. If it
cannot locate and open the file on disk, libarchive will
return an error for any attempt to read the entry body.</p>
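
<p style="margin-left:6%; margin-top: 1em">A simplified
Python sketch of the lookup described above follows. It
parses a single entry line into a path and its keywords and
then picks the on-disk source, preferring the
<b>contents</b> keyword. It ignores comments, /set lines,
line continuations, and escaped characters, all of which a
real reader must handle, and the paths shown are
invented.</p>

```python
def parse_mtree_line(line):
    """Split one mtree entry line into (path, {keyword: value})."""
    words = line.split()
    path, keywords = words[0], {}
    for word in words[1:]:
        key, sep, value = word.partition("=")
        # Keywords without "=" (such as "optional") are stored as True.
        keywords[key] = value if sep else True
    return path, keywords

# Hypothetical entry; the staging path is made up for the example.
line = "./etc/motd type=file mode=0644 size=42 contents=/tmp/staging/motd"
path, keywords = parse_mtree_line(line)

# Prefer the contents keyword, falling back to the entry's own path.
source = keywords.get("contents", path)
print(path, keywords["mode"], source)
```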

<p style="margin-left:6%; margin-top: 1em"><b>7-Zip</b>
<br>
Libarchive can read and write 7-Zip format archives. TODO:
Need more information</p>

<p style="margin-left:6%; margin-top: 1em"><b>CAB</b> <br>
Libarchive can read Microsoft Cabinet (&ldquo;CAB&rdquo;)
format archives. TODO: Need more information.</p>

<p style="margin-left:6%; margin-top: 1em"><b>LHA</b> <br>
TODO: Information about libarchive&rsquo;s LHA support</p>

<p style="margin-left:6%; margin-top: 1em"><b>RAR</b> <br>
Libarchive has limited support for reading RAR format
archives. Currently, libarchive can read RARv3 format
archives which have been either created uncompressed, or
compressed using any of the compression methods supported by
the RARv3 format. Libarchive can also read self-extracting
RAR archives.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Warc</b> <br>
Libarchive can read and write &ldquo;web archives&rdquo;.
TODO: Need more information</p>

<p style="margin-left:6%; margin-top: 1em"><b>XAR</b> <br>
Libarchive can read and write the XAR format used by many
Apple tools. TODO: Need more information</p>

<p style="margin-top: 1em"><b>SEE ALSO</b></p>

<p style="margin-left:6%;">ar(1), cpio(1), mkisofs(1),
shar(1), tar(1), zip(1), zlib(3), cpio(5), mtree(5),
tar(5)</p>

<p style="margin-left:6%; margin-top: 1em">BSD
December&nbsp;27, 2016 BSD</p>
<hr>
</body>
</html>