<!-- Creator : groff version 1.22.4 -->
<!-- CreationDate: Sun Aug 22 23:03:27 2021 -->
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta name="generator" content="groff -Thtml, see www.gnu.org">
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<meta name="Content-Style" content="text/css">
<style type="text/css">
       p       { margin-top: 0; margin-bottom: 0; vertical-align: top }
       pre     { margin-top: 0; margin-bottom: 0; vertical-align: top }
       table   { margin-top: 0; margin-bottom: 0; vertical-align: top }
       h1      { text-align: center }
</style>
<title></title>
</head>
<body>

<hr>


<p>LIBARCHIVE-FORMATS(5) BSD File Formats Manual
LIBARCHIVE-FORMATS(5)</p>

<p style="margin-top: 1em"><b>NAME</b></p>

<p style="margin-left:6%;"><b>libarchive-formats</b>
&mdash; archive formats supported by the libarchive
library</p>

<p style="margin-top: 1em"><b>DESCRIPTION</b></p>

<p style="margin-left:6%;">The libarchive(3) library reads
and writes a variety of streaming archive formats. Generally
speaking, all of these archive formats consist of a series
of “entries”. Each entry stores a single file
system object, such as a file, directory, or symbolic
link.</p>

<p style="margin-left:6%; margin-top: 1em">The following
provides a brief description of each format supported by
libarchive, with some information about recognized
extensions or limitations of the current library support.
Note that libarchive support for a format does not imply
that a program using libarchive will support that format.
Applications that use libarchive
specify which formats they wish to support, though many
programs do use libarchive convenience functions to enable
all supported formats.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Tar
Formats</b> <br>
The libarchive(3) library can read most tar archives.
It can
write POSIX-standard “ustar” and “pax
interchange” formats as well as v7 tar format and a
subset of the legacy GNU tar format.</p>

<p style="margin-left:6%; margin-top: 1em">All tar formats
store each entry in one or more 512-byte records. The first
record is used for file metadata, including filename,
timestamp, and mode information, and the file data is stored
in subsequent records. Later variants have extended this by
appropriating undefined areas of the header record,
extending the header to multiple records, or storing
special entries that modify the interpretation of subsequent
entries.</p>

<p style="margin-top: 1em"><b>gnutar</b></p>

<p style="margin-left:17%; margin-top: 1em">The
libarchive(3) library can read most GNU-format tar archives.
It currently supports the most popular GNU extensions,
including modern long filename and linkname support, as well
as atime and ctime data. The libarchive library does not
support multi-volume archives, nor the old GNU long filename
format. It can read GNU sparse file entries, including the
new POSIX-based formats.</p>

<p style="margin-left:17%; margin-top: 1em">The
libarchive(3) library can write GNU tar format, including
long filename and linkname support, as well as atime and
ctime data.</p>

<p style="margin-top: 1em"><b>pax</b></p>

<p style="margin-left:17%; margin-top: 1em">The
libarchive(3) library can read and write POSIX-compliant pax
interchange format archives. Pax interchange format archives
are an extension of the older ustar format that adds a
separate entry with additional attributes stored as
key/value pairs immediately before each regular entry. The
presence of these additional entries is the only difference
between pax interchange format and the older ustar format.
The extended attributes are of unlimited length and are
stored as UTF-8 Unicode strings.
Keywords defined in the
standard are in all lowercase; vendors are allowed to define
custom keys by preceding them with the vendor name in all
uppercase. When writing pax archives, libarchive uses many
of the SCHILY keys defined by Joerg Schilling’s
“star” archiver and a few LIBARCHIVE keys. The
libarchive library can read most of the SCHILY keys and most
of the GNU keys introduced by GNU tar. It silently ignores
any keywords that it does not understand.</p>

<p style="margin-left:17%; margin-top: 1em">The pax
interchange format converts filenames to Unicode and stores
them using the UTF-8 encoding. Prior to libarchive 3.0,
libarchive erroneously assumed that the system
wide-character routines natively supported Unicode. This
caused it to mishandle non-ASCII filenames on systems that
did not satisfy this assumption.</p>

<p style="margin-top: 1em"><b>restricted pax</b></p>

<p style="margin-left:17%;">The libarchive library can also
write pax archives in which it attempts to suppress the
extended attributes entry whenever possible. The result will
be identical to a ustar archive unless the extended
attributes entry is required to store a long file name, long
linkname, extended ACL, or file flags, or when any of the
standard ustar data (user name, group name, UID, GID, etc.)
cannot be fully represented in the ustar header. In all
cases, the result can be dearchived by any program that can
read POSIX-compliant pax interchange format archives.
Programs that correctly read ustar format (see below) will
also be able to read this format; any extended attributes
will be extracted as separate files stored in
<i>PaxHeader</i> directories.</p>

<p style="margin-top: 1em"><b>ustar</b></p>

<p style="margin-left:17%; margin-top: 1em">The libarchive
library can both read and write this format.
This format has
the following limitations:</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Device major and minor numbers
are limited to 21 bits. Nodes with larger numbers will not
be added to the archive.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Path names in the archive are
limited to 255 bytes. (The limit is shorter unless a /
character falls where the name can be split between the
155-byte prefix field and the 100-byte name field.)</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Symbolic links and hard links
are stored in the archive with the name of the referenced
file. This name is limited to 100 bytes.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Extended attributes, file
flags, and other extended security information cannot be
stored.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Archive entries are limited to
8 gigabytes in size.</p>

<p style="margin-left:17%;">Note that the pax interchange
format has none of these restrictions. The ustar format is
old and widely supported. It is recommended when
compatibility is the primary concern.</p>

<p style="margin-top: 1em"><b>v7</b></p>

<p style="margin-left:17%; margin-top: 1em">The libarchive
library can read and write the legacy v7 tar format. This
format has the following limitations:</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Only regular files,
directories, and symbolic links can be archived. Block and
character device nodes, FIFOs, and sockets cannot be
archived.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Path names in the archive are
limited to 100 bytes.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Symbolic links and hard links
are stored in the archive with the name of the referenced
file.
This name is limited to 100 bytes.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">User and group information is
stored as numeric IDs; there is no provision for storing
user or group names.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Extended attributes, file
flags, and other extended security information cannot be
stored.</p>

<p><b>•</b></p>

<p style="margin-left:22%;">Archive entries are limited to
8 gigabytes in size.</p>

<p style="margin-left:17%;">Generally, users should prefer
the ustar format for portability as the v7 tar format is
both less useful and less portable.</p>

<p style="margin-left:6%; margin-top: 1em">The libarchive
library also reads a variety of commonly used extensions to
the basic tar format. These extensions are recognized
automatically whenever they appear.</p>

<p style="margin-top: 1em">Numeric extensions</p>

<p style="margin-left:17%;">The POSIX standards require
fixed-length numeric fields to be written with some
character positions reserved for terminators. Libarchive
allows these fields to be written without terminator
characters. This extends the allowable range; in particular,
ustar archives with this extension can support entries up to
64 gigabytes in size. Libarchive also recognizes base-256
values in most numeric fields. This essentially removes all
limitations on file size, modification time, and device
numbers.</p>

<p style="margin-top: 1em">Solaris extensions</p>

<p style="margin-left:17%;">Libarchive recognizes ACL and
extended attribute records written by Solaris tar.</p>

<p style="margin-left:6%; margin-top: 1em">The first tar
program appeared in Seventh Edition Unix in 1979. The first
official standard for the tar file format was the
“ustar” (Unix Standard Tar) format defined by
POSIX in 1988.
POSIX.1-2001 extended the ustar format to
create the “pax interchange” format.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Cpio
Formats</b> <br>
The libarchive library can read and write a number of common
cpio variants. A cpio archive stores each entry as a
fixed-size header followed by a variable-length filename and
variable-length data. Unlike the tar format, the cpio format
does only minimal padding of the header or file data. There
are several cpio variants, which differ primarily in how
they store the initial header: some store the values as
octal or hexadecimal numbers in ASCII, others as binary
values of varying byte order and length.</p>

<p style="margin-top: 1em"><b>binary</b></p>

<p style="margin-left:17%; margin-top: 1em">The libarchive
library transparently reads both big-endian and
little-endian variants of the two binary cpio formats:
the original one from PWB/UNIX, and the later, more widely
used, variant. These formats use 32-bit binary values for
file size and mtime, and 16-bit binary values for the other
fields. The formats support only the file types present in
UNIX at the time of their creation. File sizes are limited
to 24 bits in the PWB format, because of the limits of the
file system, and to 31 bits in the newer binary format,
where signed 32-bit longs were used.</p>

<p style="margin-top: 1em"><b>odc</b></p>

<p style="margin-left:17%; margin-top: 1em">This is the
POSIX standardized format, which is officially known as the
“cpio interchange format” or the
“octet-oriented cpio archive format” and
sometimes unofficially referred to as the “old
character format”. This format stores the header
contents as octal values in ASCII. It is standard, portable,
and immune to byte-order confusion.
File sizes and mtime
are limited to 33 bits (8GB file size); other fields are
limited to 18 bits.</p>

<p style="margin-top: 1em"><b>SVR4/newc</b></p>

<p style="margin-left:17%;">The libarchive library can read
both CRC and non-CRC variants of this format. The SVR4
format uses eight-digit hexadecimal values for all header
fields. This limits file size to 4GB, and also limits the
mtime and other fields to 32 bits. The SVR4 format can
optionally include a CRC of the file contents, although
libarchive does not currently verify this CRC.</p>

<p style="margin-left:6%; margin-top: 1em">Cpio first
appeared in PWB/UNIX 1.0, which was released within AT&amp;T
in 1977. PWB/UNIX 1.0 formed the basis of System III Unix,
released outside of AT&amp;T in 1981. This makes cpio older
than tar, although cpio was not included in Version 7
AT&amp;T Unix. As a result, the tar command became much
better known in universities and research groups that used
Version 7. The combination of the <b>find</b> and
<b>cpio</b> utilities provided very precise control over
file selection. Unfortunately, the format has many
limitations that make it unsuitable for widespread use. Only
the POSIX format permits files over 4GB, and its 18-bit
limit for most other fields makes it unsuitable for modern
systems. In addition, cpio formats store only numeric
UID/GID values (not usernames and group names), which can
make it very difficult to correctly transfer archives across
systems with dissimilar user numbering.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Shar
Formats</b> <br>
A “shell archive” is a shell script that, when
executed on a POSIX-compliant system, will recreate a
collection of file system objects.
The libarchive library
can write two different kinds of shar archives:</p>

<p style="margin-top: 1em"><b>shar</b></p>

<p style="margin-left:17%; margin-top: 1em">The traditional
shar format uses a limited set of POSIX commands, including
echo(1), mkdir(1), and sed(1). It is suitable for portably
archiving small collections of plain text files. However, it
is not generally well-suited for large archives (many
implementations of sh(1) have limits on the size of a
script) and should not be used with non-text files.</p>

<p style="margin-top: 1em"><b>shardump</b></p>

<p style="margin-left:17%;">This format is similar to shar
but encodes files using uuencode(1) so that the result will
be a plain text file regardless of the file contents. It
also includes additional shell commands that attempt to
reproduce as many file attributes as possible, including
owner, mode, and flags. The additional commands used to
restore file attributes make shardump archives less portable
than plain shar archives.</p>

<p style="margin-left:6%; margin-top: 1em"><b>ISO9660
format</b> <br>
Libarchive can read and extract from files containing
ISO9660-compliant CDROM images. In many cases, this can
remove the need to burn a physical CDROM just to
read the files contained in an ISO9660 image. It also avoids
security and complexity issues that come with virtual mounts
and loopback devices. Libarchive supports the most common
Rockridge extensions and has partial support for Joliet
extensions. If both extensions are present, the Joliet
extensions will be used and the Rockridge extensions will be
ignored. In particular, this can create problems with
hardlinks and symlinks, which are supported by Rockridge but
not by Joliet.</p>

<p style="margin-left:6%; margin-top: 1em">Libarchive reads
ISO9660 images using a streaming strategy.
This allows it to
read compressed images directly (decompressing on the fly)
and allows it to read images directly from network sockets,
pipes, and other non-seekable data sources. This strategy
works well for optimized ISO9660 images created by many
popular programs. Such programs collect all directory
information at the beginning of the ISO9660 image so it can
be read from a physical disk with a minimum of seeking.
However, not all ISO9660 images can be read in this
fashion.</p>

<p style="margin-left:6%; margin-top: 1em">Libarchive can
also write ISO9660 images. Such images are fully optimized
with the directory information preceding all file data. This
is done by storing all file data to a temporary file while
collecting directory information in memory. When the image
is finished, libarchive writes out the directory structure
followed by the file data. The location used for the
temporary file can be changed by the usual environment
variables.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Zip
format</b> <br>
Libarchive can read and write zip format archives that have
uncompressed entries and entries compressed with the
“deflate” algorithm. Other zip compression
algorithms are not supported. It can extract jar archives,
archives that use Zip64 extensions, and self-extracting zip
archives. Libarchive can use either of two different
strategies for reading Zip archives: a streaming strategy
which is fast and can handle extremely large archives, and a
seeking strategy which can correctly process self-extracting
Zip archives and archives with deleted members or other
in-place modifications.</p>

<p style="margin-left:6%; margin-top: 1em">The streaming
reader processes Zip archives as they are read. It can read
archives of arbitrary size from tape or network sockets, and
can decode Zip archives that have been separately compressed
or encoded.
However, self-extracting Zip archives and
archives with certain types of modifications cannot be
correctly handled. Such archives require that the reader
first process the Central Directory, which is ordinarily
located at the end of a Zip archive and is thus inaccessible
to the streaming reader. If the program using libarchive has
enabled seek support, then libarchive will use this to
process the Central Directory first.</p>

<p style="margin-left:6%; margin-top: 1em">In particular,
the seeking reader must be used to correctly handle
self-extracting archives. Such archives consist of a program
followed by a regular Zip archive. The streaming reader
cannot parse the initial program portion, but the seeking
reader starts by reading the Central Directory from the end
of the archive. Similarly, Zip archives that have been
modified in-place can have deleted entries or other garbage
data that can only be accurately detected by first reading
the Central Directory.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Archive
(library) file format</b> <br>
The Unix archive format (commonly created by the ar(1)
archiver) is a general-purpose format which is used almost
exclusively for object files to be read by the link editor
ld(1). The ar format has never been standardised. There are
two common variants: the GNU format derived from SVR4, and
the BSD format, which first appeared in 4.4BSD. The two
differ primarily in their handling of filenames longer than
15 characters: the GNU/SVR4 variant writes a filename table
at the beginning of the archive; the BSD format stores each
long filename in an extension area adjacent to the entry.
Libarchive can read both extensions, including archives that
may include both types of long filenames.
Programs using
libarchive can write GNU/SVR4 format if they provide an
entry called <i>//</i> containing a filename table to be
written into the archive before any of the entries. Any
entries whose names are not in the filename table will be
written using BSD-style long filenames. This can cause
problems for programs such as GNU ld that do not support the
BSD-style long filenames.</p>

<p style="margin-left:6%; margin-top: 1em"><b>mtree</b>
<br>
Libarchive can read and write files in mtree(5) format. This
format is not a true archive format, but rather a textual
description of a file hierarchy in which each line specifies
the name of a file and provides specific metadata about that
file. Libarchive can read all of the keywords supported by
both the NetBSD and FreeBSD versions of mtree(8), although
many of the keywords cannot currently be stored in an
archive_entry object. When writing, libarchive supports use
of the archive_write_set_options(3) interface to specify
which keywords should be included in the output. If
libarchive was compiled with access to suitable
cryptographic libraries (such as the OpenSSL libraries), it
can compute hash entries such as <b>sha512</b> or <b>md5</b>
from file data being written to the mtree writer.</p>

<p style="margin-left:6%; margin-top: 1em">When reading an
mtree file, libarchive will locate the corresponding files
on disk using the <b>contents</b> keyword if present or the
regular filename. If it can locate and open the file on
disk, it will use that to fill in any metadata that is
missing from the mtree file and will read the file contents
and return those to the program using libarchive.
If it
cannot locate and open the file on disk, libarchive will
return an error for any attempt to read the entry body.</p>

<p style="margin-left:6%; margin-top: 1em"><b>7-Zip</b>
<br>
Libarchive can read and write 7-Zip format archives. TODO:
Need more information</p>

<p style="margin-left:6%; margin-top: 1em"><b>CAB</b> <br>
Libarchive can read Microsoft Cabinet (“CAB”)
format archives. TODO: Need more information.</p>

<p style="margin-left:6%; margin-top: 1em"><b>LHA</b> <br>
TODO: Information about libarchive’s LHA support</p>

<p style="margin-left:6%; margin-top: 1em"><b>RAR</b> <br>
Libarchive has limited support for reading RAR format
archives. Currently, libarchive can read RARv3 format
archives which have been either created uncompressed, or
compressed using any of the compression methods supported by
the RARv3 format. Libarchive can also read self-extracting
RAR archives.</p>

<p style="margin-left:6%; margin-top: 1em"><b>Warc</b> <br>
Libarchive can read and write “web archives”.
TODO: Need more information</p>

<p style="margin-left:6%; margin-top: 1em"><b>XAR</b> <br>
Libarchive can read and write the XAR format used by many
Apple tools. TODO: Need more information</p>

<p style="margin-top: 1em"><b>SEE ALSO</b></p>

<p style="margin-left:6%;">ar(1), cpio(1), mkisofs(1),
shar(1), tar(1), zip(1), zlib(3), cpio(5), mtree(5),
tar(5)</p>

<p style="margin-left:6%; margin-top: 1em">BSD
December 27, 2016 BSD</p>
<hr>
</body>
</html>