$OpenBSD: Ustar.pod,v 1.3 2023/05/16 10:52:58 espie Exp $

=head1 NAME

OpenBSD::Ustar - simple access to Ustar C<tar(1)> archives

=head1 SYNOPSIS

    use OpenBSD::Ustar;

    # for reading
    open(my $in, "<", $arcnameforreading) or die;
    $rdarc = OpenBSD::Ustar->new($in, $state, $destdir);
    $rdarc->set_description($arcnameforreading);
    while (my $o = $rdarc->next) {
        # decide whether we want to extract it, change object attributes
        $o->create;
    }
    $rdarc->close;

    # for writing
    open(my $out, ">", $arcnameforwriting) or die;
    $wrarc = OpenBSD::Ustar->new($out, $state, $destdir);
    # loop
        my $o = $wrarc->prepare($filename);
        # tweak some entry parameters
        $o->write;

    $wrarc->close;

    # for copying
    open(my $in, "<", $arcnameforreading) or die;
    $rdarc = OpenBSD::Ustar->new($in, $state, $destdir);
    open(my $out, ">", $arcnameforwriting) or die;
    $wrarc = OpenBSD::Ustar->new($out, $state, $destdir);
    while (my $o = $rdarc->next) {
        $o->copy($wrarc);
    }
    $rdarc->close;
    $wrarc->close;

=head1 DESCRIPTION

C<OpenBSD::Ustar> provides an API to read, write and copy archives compatible
with C<tar(1)>.

For the time being, it can only handle the USTAR archive format,
but it supports the C<XHDR> (x blocktype) extension for accurately
representing long hard links and symbolic links.
It also accurately recognizes some common extensions that it doesn't process.

A filehandle C<$fh> is associated with an C<OpenBSD::Ustar> object through
C<new>. For archive reading, the filehandle should support
C<read>. C<OpenBSD::Ustar> does not rely on C<seek> or C<rewind>, so that
it remains usable on pipe outputs. For archive writing, the filehandle should
support C<print>.

Error messages and fatal errors will be handled through the C<$state> object,
which should conform to C<OpenBSD::BaseState(3p)> (uses C<errsay> and C<fatal>).

Note that read and write support are mutually exclusive, though there is
no need to specify the mode used at creation time; it is implicitly
provided by the underlying filehandle.

Read access to an archive object C<$rdarc> occurs through a loop that
repeatedly calls C<$o = $rdarc-E<gt>next> to obtain the next archive entry.
It returns an archive entry object C<$o> that can be
queried to decide whether to extract this entry or not.

Write access to an archive object C<$wrarc> occurs through a user-directed
loop: obtain an archive entry through C<$o = $wrarc-E<gt>prepare($filename)>,
which can be tweaked manually and then written to the archive.

C<prepare> takes an optional C<$destdir> parameter that overrides the
archive destination directory.
This can be used to prepare an archive entry from a temporary file, which
supplies the actual checks and contents for the archive, and then set
the name to store before writing the entry:

    $o = $wrarc->prepare($tempfile, '');
    $o->set_name("othername");
    $o->write;

Most client software will specialize C<OpenBSD::Ustar> to its own needs.
Note however that C<OpenBSD::Ustar> is not designed for inheritance.
Composition (putting an C<OpenBSD::Ustar> object inside your class) and
forwarding methods (writing C<create> or C<next> methods that call the
corresponding C<OpenBSD::Ustar> method) are the correct way to use this API.

Note that C<OpenBSD::Ustar> does not do any caching. The client
code is responsible for retrieving and storing archives if it
needs to scan through them multiple times in a row.

Actual extraction is performed through C<$o-E<gt>create> and is not
mandatory. Thus, client code can control whether it wants to extract archive
elements or not.

In case of errors, the archive will call C<$state-E<gt>fatal> with a suitable
error message that contains the last index name processed.
The user may set an optional archive description with C<set_description>.

The archive object can take a description through C<$arc-E<gt>set_description>
which will be used in error messages related to archive extraction or creation.

The archive object can be given a C<$callback> through
C<$arc-E<gt>set_callback>, which will be called regularly while
extracting large objects, as C<&$callback($donesize)>,
with C<$donesize> the number of bytes already extracted, for use in
progressmeter-style user interactions.

Small files can also be directly extracted to a scalar using
C<$v = $o-E<gt>contents>.

Actual file objects can also be directly extracted to a temporary file using
C<$o-E<gt>extract_to_fh($fh)>.

Actual writing is performed through C<$o-E<gt>write> and is not mandatory
either.

Archives should be closed using C<$wrarc-E<gt>close>, which will
pad the archive as needed and close the underlying file handle.
In particular, this is mandatory for write access, since valid archives
require blank-filled blocks.

This is equivalent to calling C<$wrarc-E<gt>pad>, which will
complete the archive with blank-filled blocks, then closing the
associated file handle manually.

Client code may decide to abort archive extraction early, or to run it through
until C<$arc-E<gt>next> returns false. The C<OpenBSD::Ustar> object doesn't
hold any hidden resources and doesn't need any specific clean-up.

Client code is only responsible for closing the underlying filehandle and
terminating any associated pipe process.
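
As an illustration of the callback interface described above, the following
sketch builds a closure that could be handed to C<set_callback>.
The helper name C<make_progress_callback> and its C<$total> and C<$report>
parameters are hypothetical and not part of C<OpenBSD::Ustar>; only the
calling convention C<&$callback($donesize)> comes from this page.
The archive is simulated by a plain loop so that the example runs on its own.

```perl

    use strict;
    use warnings;

    # Hypothetical helper, not part of OpenBSD::Ustar: returns a closure
    # following the &$callback($donesize) convention.
    # $total is the expected size in bytes; $report is a code reference
    # that receives an integer percentage whenever that percentage changes.
    sub make_progress_callback
    {
        my ($total, $report) = @_;
        my $last = -1;
        return sub {
            my $donesize = shift;
            return if $total <= 0;
            my $percent = int(100 * $donesize / $total);
            $percent = 100 if $percent > 100;
            if ($percent != $last) {
                $last = $percent;
                &$report($percent);
            }
        };
    }

    # Stand-alone demonstration: simulate the calls an archive would make.
    # With a real archive object, the closure would instead be installed
    # through $rdarc->set_callback($callback) before extracting.
    my @seen;
    my $callback = make_progress_callback(1000, sub { push(@seen, shift); });
    for my $donesize (0, 250, 255, 500, 1000) {
        &$callback($donesize);
    }
    print join(",", @seen), "\n";

```

Reporting only when the percentage changes keeps a progress meter from
being redrawn on every call.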

An object C<$o> returned through C<next> or through C<prepare> holds all
the characteristics of the archive header:

=over 20

=item C<$o-E<gt>isDir>

true if archive entry is a directory

=item C<$o-E<gt>isFile>

true if archive entry is a file

=item C<$o-E<gt>isLink>

true if archive entry is any kind of link

=item C<$o-E<gt>isSymLink>

true if archive entry is a symbolic link

=item C<$o-E<gt>isHardLink>

true if archive entry is a hard link

=item C<$o-E<gt>{name}>

filename

=item C<$o-E<gt>{mode}>

C<chmod(2)> mode

=item C<$o-E<gt>{atime}>

C<utime(2)> access time

=item C<$o-E<gt>{mtime}>

C<utime(2)> modification time

=item C<$o-E<gt>{uid}>

owner user ID

=item C<$o-E<gt>{gid}>

owner group ID

=item C<$o-E<gt>{uname}>

owner user name

=item C<$o-E<gt>{gname}>

owner group name

=item C<$o-E<gt>{linkname}>

name of the source link, if applicable

=back

The fields C<name>, C<mode>, C<atime>, C<mtime>, C<uid>, C<gid> and C<linkname>
can be altered before calling C<$o-E<gt>create> or C<$o-E<gt>write>,
and will properly influence the resulting file.
C<atime> and C<mtime> can be undef to set those to the current time.

The relationship between C<uid> and C<uname>, and between C<gid> and C<gname>,
follows the usual behavior of the USTAR format.

In addition, client code may define C<$o-E<gt>{cwd}> in a way similar
to C<tar(1)>'s C<-C> option to affect the creation of hard links.

All creation commands happen relative to the current destdir of
the C<$arc> C<OpenBSD::Ustar> object. This is set at creation, and can
later be changed through C<$arc-E<gt>destdir($value)>.

During writing, hard link status is determined according to already written
archive entries: a name that references a file which has already been written
will be granted hard link status.

Hard links cannot be copied from one archive to another unless the original
file has also been copied. Calling C<$o-E<gt>alias($arc, $name)> will trick
the destination archive C<$arc> into believing C<$o> has been copied under the
given C<$name>, so that further hard links will be copied over.

Archives can be copied by creating separate archives for reading and writing.
Calling C<$o = $rdarc-E<gt>next> and C<$o-E<gt>copy($wrarc)> will copy
an entry obtained from C<$rdarc> to C<$wrarc>.
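
The copy loop described above can be wrapped in a small helper that copies
only selected entries. In this sketch, C<copy_matching> and its C<$wanted>
predicate are hypothetical names that are not part of C<OpenBSD::Ustar>;
the helper relies only on the C<next>, C<copy> and C<close> methods
documented on this page. The caveat about hard links whose original file
has not been copied still applies to any entry the predicate rejects.

```perl

    use strict;
    use warnings;

    # Hypothetical helper, not part of OpenBSD::Ustar.
    # Copies every entry of $rdarc accepted by the $wanted predicate
    # into $wrarc, closes both archives, and returns the number of
    # entries copied.
    sub copy_matching
    {
        my ($rdarc, $wrarc, $wanted) = @_;
        my $copied = 0;
        while (my $o = $rdarc->next) {
            # rejected entries are simply never copied
            next unless &$wanted($o);
            $o->copy($wrarc);
            $copied++;
        }
        $rdarc->close;
        $wrarc->close;
        return $copied;
    }

```

A call such as C<copy_matching($rdarc, $wrarc, sub { $_[0]-E<gt>isFile })>
would, under these assumptions, copy only the regular files.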