$OpenBSD: Ustar.pod,v 1.3 2023/05/16 10:52:58 espie Exp $

=head1 NAME

OpenBSD::Ustar - simple access to Ustar C<tar(1)> archives

=head1 SYNOPSIS

    use OpenBSD::Ustar;
    # for reading

    open(my $in, "<", $arcnameforreading) or die;
    $rdarc = OpenBSD::Ustar->new($in, $state, $destdir);
    $rdarc->set_description($arcnameforreading);
    while (my $o = $rdarc->next) {
        # decide whether we want to extract it, change object attributes
        $o->create;
    }
    $rdarc->close;

    # for writing
    open(my $out, ">", $arcnameforwriting) or die;
    $wrarc = OpenBSD::Ustar->new($out, $state, $destdir);
    # loop
        my $o = $wrarc->prepare($filename);
        # tweak some entry parameters
        $o->write;

    $wrarc->close;

    # for copying
    open(my $in, "<", $arcnameforreading) or die;
    $rdarc = OpenBSD::Ustar->new($in, $state, $destdir);
    open(my $out, ">", $arcnameforwriting) or die;
    $wrarc = OpenBSD::Ustar->new($out, $state, $destdir);
    while (my $o = $rdarc->next) {
        $o->copy($wrarc);
    }
    $rdarc->close;
    $wrarc->close;

=head1 DESCRIPTION

C<OpenBSD::Ustar> provides an API to read, write and copy archives compatible
with C<tar(1)>.

For the time being, it can only handle the USTAR archive format,
but it supports the C<XHDR> (x blocktype) extension for accurately
representing long hard links and symbolic links.
It also accurately recognizes some common extensions that it doesn't process.

A filehandle C<$fh> is associated with an C<OpenBSD::Ustar> object through
C<new>. For archive reading, the filehandle should support
C<read>. C<OpenBSD::Ustar> does not rely on C<seek> or C<rewind>, so that
it remains usable on pipe outputs. For archive writing, the filehandle should
support C<print>.

Error messages and fatal errors will be handled through the C<$state> object,
which should conform to C<OpenBSD::BaseState(3p)> (uses C<errsay> and C<fatal>).

Note that read and write support are mutually exclusive, though there is
no need to specify the mode used at creation time; it is implicitly
provided by the underlying filehandle.
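
As an illustration of the pipe-friendly design described above, the following
sketch lists the entries of a gzip-compressed archive by reading the output
of C<gzip(1)>. The archive name is hypothetical, and obtaining C<$state>
through C<OpenBSD::State-E<gt>new> is an assumption made for this example;
any object conforming to C<OpenBSD::BaseState(3p)> should do.

    use strict;
    use warnings;
    use OpenBSD::State;
    use OpenBSD::Ustar;

    my $file = $ARGV[0] // 'archive.tgz';      # hypothetical archive name
    my $state = OpenBSD::State->new('lstgz');  # assumed way to get a state

    # list-form pipe open: the handle supports read, but not seek
    open(my $in, "-|", "gzip", "-dc", $file) or die "can't run gzip: $!";
    my $arc = OpenBSD::Ustar->new($in, $state, '');
    $arc->set_description($file);
    while (my $o = $arc->next) {
        print $o->{name}, "\n";
    }
    # closing the pipe ourselves lets us notice a failing gzip
    close($in) or warn "gzip failed on $file (status $?)\n";

Nothing gets extracted here, since C<create> is never called on the entries.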

Read access to an archive object C<$rdarc> occurs through a loop that
repeatedly calls C<$o = $rdarc-E<gt>next> to obtain the next archive entry.
It returns an archive entry object C<$o> that can be
queried to decide whether to extract this entry or not.

Write access to an archive object C<$wrarc> occurs through a user-directed
loop: obtain an archive entry through C<$o = $wrarc-E<gt>prepare($filename)>,
which can be tweaked manually and then written to the archive.

C<prepare> takes an optional C<$destdir> parameter that will override the
archive destination directory.
This can be used to prepare an archive entry from a temporary file, which
will be used for the real checks and contents of the archive, then set
the name to save before writing the actual entry:

    $o = $wrarc->prepare($tempfile, '');
    $o->set_name("othername");
    $o->write;

Most client software will specialize C<OpenBSD::Ustar> to its own needs.
Note however that C<OpenBSD::Ustar> is not designed for inheritance.
Composition (putting a C<OpenBSD::Ustar> object inside your class) and
forwarding methods (writing C<create> or C<next> methods that call the
corresponding C<OpenBSD::Ustar> method) are the correct way to use this API.
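
A minimal sketch of the composition approach described above could look like
this. The C<My::Archive> class name and its C<next_file> method are
hypothetical; the wrapper only relies on the C<next>, C<close> and C<isFile>
calls documented in this page.

    package My::Archive;
    use strict;
    use warnings;

    # wrap an existing archive object instead of inheriting from its class
    sub new {
        my ($class, $arc) = @_;
        return bless { arc => $arc }, $class;
    }

    # plain forwarding methods
    sub next  { my $self = shift; return $self->{arc}->next; }
    sub close { my $self = shift; return $self->{arc}->close; }

    # specialized behavior: skip anything that is not a regular file
    sub next_file {
        my $self = shift;
        while (my $o = $self->next) {
            return $o if $o->isFile;
        }
        return;
    }

    1;

Client code would then build the wrapper with
C<My::Archive-E<gt>new(OpenBSD::Ustar-E<gt>new($in, $state, $destdir))>
and loop on C<next_file> instead of C<next>.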

Note that C<OpenBSD::Ustar> does not do any caching. The client
code is responsible for retrieving and storing archives if it
needs to scan through them multiple times in a row.

Actual extraction is performed through C<$o-E<gt>create> and is not
mandatory. Thus, client code can control whether it wants to extract archive
elements or not.

In case of errors, the archive will call C<$state-E<gt>fatal> with a suitable
error message that contains the name of the last entry processed.

The archive object can take an optional description through
C<$arc-E<gt>set_description>, which will be used in error messages related to
archive extraction or creation.

The archive object can be imbued with a C<$callback> through
C<$arc-E<gt>set_callback>, which will be called regularly while
extracting large objects, as C<&$callback($donesize)>,
with C<$donesize> the number of bytes already extracted, for use in
progressmeter-style user interactions.
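
For instance, a progress callback could be built as follows. This is only a
sketch: the C<progress_callback> helper and the C<$total> size are
hypothetical, and the only interface assumed is that the callback receives
the number of bytes extracted so far, as stated above.

    use strict;
    use warnings;

    # build a closure that reports progress as a percentage of $total bytes
    sub progress_callback {
        my ($total, $out) = @_;
        $out //= \*STDERR;
        return sub {
            my $donesize = shift;
            my $pct = $total > 0 ? int(100 * $donesize / $total) : 100;
            $pct = 100 if $pct > 100;
            printf {$out} "\r%3d%%", $pct;
        };
    }

It would be installed with
C<$arc-E<gt>set_callback(progress_callback($total))> before calling
C<$o-E<gt>create> on a large entry.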

Small files can also be directly extracted to a scalar using
C<$v = $o-E<gt>contents>.

Actual file objects can also be directly extracted to a temporary file using
C<$o-E<gt>extract_to_fh($fh)>.
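
As a sketch of how C<contents> might be used, the hypothetical helper below
scans an archive for one named entry and returns its data as a scalar.
It assumes C<$arc> was created for reading as shown in the SYNOPSIS.

    use strict;
    use warnings;

    # return the contents of the regular file $wanted, or undef if the
    # archive has no such entry; other entries are skipped, not extracted
    sub slurp_entry {
        my ($arc, $wanted) = @_;
        while (my $o = $arc->next) {
            next unless $o->isFile && $o->{name} eq $wanted;
            return $o->contents;
        }
        return;
    }

Since no caching takes place, the entries located before C<$wanted> have been
consumed once the helper returns.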

Actual writing is performed through C<$o-E<gt>write> and is not mandatory
either.

Archives should be closed using C<$wrarc-E<gt>close>, which will
pad the archive as needed and close the underlying file handle.
In particular, this is mandatory for write access, since valid archives
must end with blank-filled blocks.

This is equivalent to calling C<$wrarc-E<gt>pad>, which will
complete the archive with blank-filled blocks, then closing the
associated file handle manually.
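
The following sketch shows the manual variant, which can be handy when the
caller wants to check the result of closing the filehandle itself.
The C<write_files> helper is hypothetical; C<$wrarc> is assumed to have been
created on C<$out> as in the SYNOPSIS.

    use strict;
    use warnings;

    # write each named file to the archive, then finish the archive
    # without letting it close the filehandle
    sub write_files {
        my ($wrarc, $out, @files) = @_;
        for my $filename (@files) {
            my $o = $wrarc->prepare($filename);
            $o->write;
        }
        $wrarc->pad;    # blank-filled blocks that end the archive
        close($out) or die "can't close archive: $!";
    }

With a pipe as C<$out>, this C<close> also waits for the child process, whose
exit status ends up in C<$?>.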

Client code may decide to abort archive extraction early, or to run it through
until C<$arc-E<gt>next> returns false.  The C<OpenBSD::Ustar> object doesn't
hold any hidden resources and doesn't need any specific clean-up.

Client code is only responsible for closing the underlying filehandle and
terminating any associated pipe process.

An object C<$o> returned through C<next> or through C<prepare> holds all
the characteristics of the archive header:

=over 20

=item C<$o-E<gt>isDir>

true if archive entry is a directory

=item C<$o-E<gt>isFile>

true if archive entry is a file

=item C<$o-E<gt>isLink>

true if archive entry is any kind of link

=item C<$o-E<gt>isSymLink>

true if archive entry is a symbolic link

=item C<$o-E<gt>isHardLink>

true if archive entry is a hard link

=item C<$o-E<gt>{name}>

filename

=item C<$o-E<gt>{mode}>

C<chmod(2)> mode

=item C<$o-E<gt>{atime}>

C<utime(2)> access time

=item C<$o-E<gt>{mtime}>

C<utime(2)> modification time

=item C<$o-E<gt>{uid}>

owner user ID

=item C<$o-E<gt>{gid}>

owner group ID

=item C<$o-E<gt>{uname}>

owner user name

=item C<$o-E<gt>{gname}>

owner group name

=item C<$o-E<gt>{linkname}>

name of the source link, if applicable

=back

The fields C<name>, C<mode>, C<atime>, C<mtime>, C<uid>, C<gid> and C<linkname>
can be altered before calling C<$o-E<gt>create> or C<$o-E<gt>write>,
and will properly influence the resulting file.
C<atime> and C<mtime> can be undef to set those to the current time.
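
For example, an extraction loop could sanitize entries before creating them.
This sketch is an illustration only: the C<extract_sanitized> name and the
choice of what to sanitize are assumptions, as is treating C<mode> as
a number of the kind taken by C<chmod(2)>.

    use strict;
    use warnings;

    # extract every entry, dropping setuid/setgid/sticky bits and
    # using the current time as the modification time
    sub extract_sanitized {
        my ($rdarc) = @_;
        while (my $o = $rdarc->next) {
            $o->{mode} &= 0777;     # keep only the permission bits
            $o->{mtime} = undef;    # undef means "current time"
            $o->create;
        }
    }

The archive itself is left untouched; only the files created on disk are
affected by these changes.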

The relationship between C<uid> and C<uname>, and C<gid> and C<gname>
conforms to the usual behavior of the USTAR format.

In addition, client code may define C<$o-E<gt>{cwd}> in a way similar
to C<tar(1)>'s C<-C> option to affect the creation of hard links.

All creation commands happen relative to the current destdir of
the C<$arc> C<OpenBSD::Ustar> object.  This is set at creation, and can
later be changed through C<$arc-E<gt>destdir($value)>.

During writing, hard link status is determined according to already written
archive entries: a name that references a file which has already been written
will be granted hard link status.

Hard links cannot be copied from one archive to another unless the original
file has also been copied.  Calling C<$o-E<gt>alias($arc, $name)> will trick
the destination archive C<$arc> into believing C<$o> has been copied under the
given C<$name>, so that further hard links will be copied over.

Archives can be copied by creating separate archives for reading and writing.
Calling C<$o = $rdarc-E<gt>next> and C<$o-E<gt>copy($wrarc)> will copy
an entry obtained from C<$rdarc> to C<$wrarc>.
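
Copying does not have to be exhaustive. As a sketch, the hypothetical
C<copy_filtered> helper below copies only the entries whose name does not
match a pattern; C<$rdarc> and C<$wrarc> are assumed to have been created as
in the SYNOPSIS.

    use strict;
    use warnings;

    # copy entries from $rdarc to $wrarc, leaving out those whose name
    # matches $skip (a compiled regular expression)
    sub copy_filtered {
        my ($rdarc, $wrarc, $skip) = @_;
        my $copied = 0;
        while (my $o = $rdarc->next) {
            next if $o->{name} =~ $skip;
            $o->copy($wrarc);
            $copied++;
        }
        return $copied;
    }

Called as C<copy_filtered($rdarc, $wrarc, qr{^share/doc/})>, it would leave
out documentation files; both archives still have to be closed by the caller.
As noted above, a hard link whose original file was filtered out cannot be
copied, which this sketch does not handle.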