.\" Copyright (c) 2003-2007 Tim Kientzle
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libarchive/libarchive.3,v 1.11 2007/01/09 08:05:56 kientzle Exp $
.\"
.Dd August 19, 2006
.Dt LIBARCHIVE 3
.Os
.Sh NAME
.Nm libarchive
.Nd functions for reading and writing streaming archives
.Sh LIBRARY
.Lb libarchive
.Sh OVERVIEW
The
.Nm
library provides a flexible interface for reading and writing
streaming archive files such as tar and cpio.
The library is inherently stream-oriented; readers serially iterate through
the archive, writers serially add things to the archive.
In particular, note that there is no built-in support for
random access nor for in-place modification.
.Pp
When reading an archive, the library automatically detects the
format and the compression.
The library currently has read support for:
.Bl -bullet -compact
.It
old-style tar archives,
.It
most variants of the POSIX
.Dq ustar
format,
.It
the POSIX
.Dq pax interchange
format,
.It
GNU-format tar archives,
.It
most common cpio archive formats,
.It
ISO9660 CD images (with or without RockRidge extensions),
.It
Zip archives.
.El
The library automatically detects archives compressed with
.Xr gzip 1 ,
.Xr bzip2 1 ,
or
.Xr compress 1
and decompresses them transparently.
.Pp
When writing an archive, you can specify the compression
to be used and the format to use.
The library can write
.Bl -bullet -compact
.It
POSIX-standard
.Dq ustar
archives,
.It
POSIX
.Dq pax interchange format
archives,
.It
POSIX octet-oriented cpio archives,
.It
two different variants of shar archives.
.El
Pax interchange format is an extension of the tar archive format that
eliminates essentially all of the limitations of historic tar formats
in a standard fashion that is supported
by POSIX-compliant
.Xr pax 1
implementations on many systems as well as several newer implementations of
.Xr tar 1 .
Note that the default write format will suppress the pax extended
attributes for most entries; explicitly requesting pax format will
enable those attributes for all entries.
.Pp
The read and write APIs are accessed through the
.Fn archive_read_XXX
functions and the
.Fn archive_write_XXX
functions, respectively, and either can be used independently
of the other.
.Pp
The rest of this manual page provides an overview of the library
operation.
More detailed information can be found in the individual manual
pages for each API or utility function.
.Sh READING AN ARCHIVE
To read an archive, you must first obtain an initialized
.Tn struct archive
object from
.Fn archive_read_new .
You can then modify this object for the desired operations with the
various
.Fn archive_read_set_XXX
and
.Fn archive_read_support_XXX
functions.
In particular, you will need to invoke appropriate
.Fn archive_read_support_XXX
functions to enable the corresponding compression and format
support.
Note that these latter functions perform two distinct operations:
they cause the corresponding support code to be linked into your
program, and they enable the corresponding auto-detect code.
Unless you have specific constraints, you will generally want
to invoke
.Fn archive_read_support_compression_all
and
.Fn archive_read_support_format_all
to enable auto-detect for all formats and compression types
currently supported by the library.
.Pp
Once you have prepared the
.Tn struct archive
object, you call
.Fn archive_read_open
to actually open the archive and prepare it for reading.
There are several variants of this function;
the most basic expects you to provide pointers to several
functions that can provide blocks of bytes from the archive.
There are convenience forms that allow you to
specify a filename, file descriptor,
.Ft "FILE *"
object, or a block of memory from which to read the archive data.
Note that the core library makes no assumptions about the
size of the blocks read;
callback functions are free to read whatever block size is
most appropriate for the medium.
.Pp
Each archive entry consists of a header followed by a certain
amount of data.
You can obtain the next header with
.Fn archive_read_next_header ,
which provides a pointer to a
.Tn struct archive_entry
structure with information about the current archive element.
If the entry is a regular file, then the header will be followed
by the file data.
You can use
.Fn archive_read_data
(which works much like the
.Xr read 2
system call)
to read this data from the archive.
You may prefer to use the higher-level
.Fn archive_read_data_skip ,
which reads and discards the data for this entry,
.Fn archive_read_data_into_buffer ,
which reads the data into an in-memory buffer,
.Fn archive_read_data_into_fd ,
which copies the data to the provided file descriptor, or
.Fn archive_read_extract ,
which recreates the specified entry on disk and copies data
from the archive.
Note that
.Fn archive_read_extract
uses the
.Tn struct archive_entry
structure that you provide it, which may differ from the
entry just read from the archive.
In particular, many applications will want to override the
pathname, file permissions, or ownership.
.Pp
Once you have finished reading data from the archive, you
should call
.Fn archive_read_close
to close the archive, then call
.Fn archive_read_finish
to release all resources, including all memory allocated by the library.
.Pp
The
.Xr archive_read 3
manual page provides more detailed calling information for this API.
.Sh WRITING AN ARCHIVE
You use a similar process to write an archive.
The
.Fn archive_write_new
function creates an archive object useful for writing,
the various
.Fn archive_write_set_XXX
functions are used to set parameters for writing the archive, and
.Fn archive_write_open
completes the setup and opens the archive for writing.
.Pp
Individual archive entries are written in a three-step
process:
You first initialize a
.Tn struct archive_entry
structure with information about the new entry.
At a minimum, you should set the pathname of the
entry and provide a
.Va struct stat
with a valid
.Va st_mode
field, which specifies the type of object, and a valid
.Va st_size
field, which specifies the size of the data portion of the object.
The
.Fn archive_write_header
function actually writes the header data to the archive.
You can then use
.Fn archive_write_data
to write the actual data.
.Pp
After all entries have been written, use the
.Fn archive_write_finish
function to release all resources.
.Pp
The
.Xr archive_write 3
manual page provides more detailed calling information for this API.
.Sh DESCRIPTION
Detailed descriptions of each function are provided by the
corresponding manual pages.
.Pp
All of the functions utilize an opaque
.Tn struct archive
datatype that provides access to the archive contents.
.Pp
The
.Tn struct archive_entry
structure contains a complete description of a single archive
entry.
It uses an opaque interface that is fully documented in
.Xr archive_entry 3 .
.Pp
Users familiar with historic formats should be aware that the newer
variants have eliminated most restrictions on the length of textual fields.
Clients should not assume that filenames, link names, user names, or
group names are limited in length.
In particular, pax interchange format can easily accommodate pathnames
in arbitrary character sets that exceed
.Va PATH_MAX .
.Sh RETURN VALUES
Most functions return zero on success, non-zero on error.
The return value indicates the general severity of the error, ranging
from
.Cm ARCHIVE_WARN ,
which indicates a minor problem that should probably be reported
to the user, to
.Cm ARCHIVE_FATAL ,
which indicates a serious problem that will prevent any further
operations on this archive.
On error, the
.Fn archive_errno
function can be used to retrieve a numeric error code (see
.Xr errno 2 ) .
The
.Fn archive_error_string
function returns a textual error message suitable for display.
.Pp
.Fn archive_read_new
and
.Fn archive_write_new
return pointers to an allocated and initialized
.Tn struct archive
object.
.Pp
.Fn archive_read_data
and
.Fn archive_write_data
return a count of the number of bytes actually read or written.
A value of zero indicates the end of the data for this entry.
A negative value indicates an error, in which case the
.Fn archive_errno
and
.Fn archive_error_string
functions can be used to obtain more information.
.Sh ENVIRONMENT
There are character set conversions within the
.Xr archive_entry 3
functions that are impacted by the currently-selected locale.
.Sh SEE ALSO
.Xr tar 1 ,
.Xr archive_entry 3 ,
.Xr archive_read 3 ,
.Xr archive_util 3 ,
.Xr archive_write 3 ,
.Xr tar 5
.Sh HISTORY
The
.Nm libarchive
library first appeared in
.Fx 5.3 .
.Sh AUTHORS
.An -nosplit
The
.Nm libarchive
library was written by
.An Tim Kientzle Aq kientzle@acm.org .
.Sh BUGS
Some archive formats support information that is not supported by
.Tn struct archive_entry .
Such information cannot be fully archived or restored using this library.
This includes, for example, comments, character sets,
or the arbitrary key/value pairs that can appear in
pax interchange format archives.
.Pp
Conversely, of course, not all of the information that can be
stored in a
.Tn struct archive_entry
is supported by all formats.
For example, cpio formats do not support nanosecond timestamps;
old tar formats do not support large device numbers.
332