1package Convert::BinHex; 2 3 4=head1 NAME 5 6Convert::BinHex - extract data from Macintosh BinHex files 7 8I<ALPHA WARNING: this code is currently in its Alpha release. 9Things may change drastically until the interface is hammered out: 10if you have suggestions or objections, please speak up now!> 11 12 13=head1 SYNOPSIS 14 15B<Simple functions:> 16 17 use Convert::BinHex qw(binhex_crc macbinary_crc); 18 19 # Compute HQX7-style CRC for data, pumping in old CRC if desired: 20 $crc = binhex_crc($data, $crc); 21 22 # Compute the MacBinary-II-style CRC for the data: 23 $crc = macbinary_crc($data, $crc); 24 25B<Hex to bin, low-level interface.> 26Conversion is actually done via an object (L<"Convert::BinHex::Hex2Bin">) 27which keeps internal conversion state: 28 29 # Create and use a "translator" object: 30 my $H2B = Convert::BinHex->hex2bin; # get a converter object 31 while (<STDIN>) { 32 print $STDOUT $H2B->next($_); # convert some more input 33 } 34 print $STDOUT $H2B->done; # no more input: finish up 35 36B<Hex to bin, OO interface.> 37The following operations I<must> be done in the order shown! 38 39 # Read data in piecemeal: 40 $HQX = Convert::BinHex->open(FH=>\*STDIN) || die "open: $!"; 41 $HQX->read_header; # read header info 42 @data = $HQX->read_data; # read in all the data 43 @rsrc = $HQX->read_resource; # read in all the resource 44 45B<Bin to hex, low-level interface.> 46Conversion is actually done via an object (L<"Convert::BinHex::Bin2Hex">) 47which keeps internal conversion state: 48 49 # Create and use a "translator" object: 50 my $B2H = Convert::BinHex->bin2hex; # get a converter object 51 while (<STDIN>) { 52 print $STDOUT $B2H->next($_); # convert some more input 53 } 54 print $STDOUT $B2H->done; # no more input: finish up 55 56B<Bin to hex, file interface.> Yes, you can convert I<to> BinHex 57as well as from it! 58 59 # Create new, empty object: 60 my $HQX = Convert::BinHex->new; 61 62 # Set header attributes: 63 $HQX->filename("logo.gif"); 64 $HQX->type("GIFA"); 65 $HQX->creator("CNVS"); 66 67 # Give it the data and resource forks (either can be absent): 68 $HQX->data(Path => "/path/to/data"); # here, data is on disk 69 $HQX->resource(Data => $resourcefork); # here, resource is in core 70 71 # Output as a BinHex stream, complete with leading comment: 72 $HQX->encode(\*STDOUT); 73 74B<PLANNED!!!! Bin to hex, "CAP" interface.> 75I<Thanks to Ken Lunde for suggesting this>. 76 77 # Create new, empty object from CAP tree: 78 my $HQX = Convert::BinHex->from_cap("/path/to/root/file"); 79 $HQX->encode(\*STDOUT); 80 81 82=head1 DESCRIPTION 83 84B<BinHex> is a format used by Macintosh for transporting Mac files 85safely through electronic mail, as short-lined, 7-bit, semi-compressed 86data streams. Ths module provides a means of converting those 87data streams back into into binary data. 88 89 90=head1 FORMAT 91 92I<(Some text taken from RFC-1741.)> 93Files on the Macintosh consist of two parts, called I<forks>: 94 95=over 4 96 97=item Data fork 98 99The actual data included in the file. The Data fork is typically the 100only meaningful part of a Macintosh file on a non-Macintosh computer system. 101For example, if a Macintosh user wants to send a file of data to a 102user on an IBM-PC, she would only send the Data fork. 103 104=item Resource fork 105 106Contains a collection of arbitrary attribute/value pairs, including 107program segments, icon bitmaps, and parametric values. 108 109=back 110 111Additional information regarding Macintosh files is stored by the 112Finder in a hidden file, called the "Desktop Database". 113 114Because of the complications in storing different parts of a 115Macintosh file in a non-Macintosh filesystem that only handles 116consecutive data in one part, it is common to convert the Macintosh 117file into some other format before transferring it over the network. 118The BinHex format squashes that data into transmittable ASCII as follows: 119 120=over 4 121 122=item 1. 123 124The file is output as a B<byte stream> consisting of some basic header 125information (filename, type, creator), then the data fork, then the 126resource fork. 127 128=item 2. 129 130The byte stream is B<compressed> by looking for series of duplicated 131bytes and representing them using a special binary escape sequence 132(of course, any occurences of the escape character must also be escaped). 133 134=item 3. 135 136The compressed stream is B<encoded> via the "6/8 hemiola" common 137to I<base64> and I<uuencode>: each group of three 8-bit bytes (24 bits) 138is chopped into four 6-bit numbers, which are used as indexes into 139an ASCII "alphabet". 140(I assume that leftover bytes are zero-padded; documentation is thin). 141 142=back 143 144=cut 145 146use strict; 147use warnings; 148use vars qw(@ISA @EXPORT_OK $VERSION $QUIET); 149use integer; 150 151use Carp; 152use Exporter; 153use FileHandle; 154 155@ISA = qw(Exporter); 156@EXPORT_OK = qw( 157 macbinary_crc 158 binhex_crc 159 ); 160 161 162 163our $VERSION = '1.125'; # VERSION 164 165# My identity: 166my $I = 'binhex:'; 167 168# Utility function: 169sub min { 170 my ($a, $b) = @_; 171 ($a < $b) ? $a : $b; 172} 173 174# An array useful for CRC calculations that use 0x1021 as the "seed": 175my @MAGIC = ( 176 0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50a5, 0x60c6, 0x70e7, 177 0x8108, 0x9129, 0xa14a, 0xb16b, 0xc18c, 0xd1ad, 0xe1ce, 0xf1ef, 178 0x1231, 0x0210, 0x3273, 0x2252, 0x52b5, 0x4294, 0x72f7, 0x62d6, 179 0x9339, 0x8318, 0xb37b, 0xa35a, 0xd3bd, 0xc39c, 0xf3ff, 0xe3de, 180 0x2462, 0x3443, 0x0420, 0x1401, 0x64e6, 0x74c7, 0x44a4, 0x5485, 181 0xa56a, 0xb54b, 0x8528, 0x9509, 0xe5ee, 0xf5cf, 0xc5ac, 0xd58d, 182 0x3653, 0x2672, 0x1611, 0x0630, 0x76d7, 0x66f6, 0x5695, 0x46b4, 183 0xb75b, 0xa77a, 0x9719, 0x8738, 0xf7df, 0xe7fe, 0xd79d, 0xc7bc, 184 0x48c4, 0x58e5, 0x6886, 0x78a7, 0x0840, 0x1861, 0x2802, 0x3823, 185 0xc9cc, 0xd9ed, 0xe98e, 0xf9af, 0x8948, 0x9969, 0xa90a, 0xb92b, 186 0x5af5, 0x4ad4, 0x7ab7, 0x6a96, 0x1a71, 0x0a50, 0x3a33, 0x2a12, 187 0xdbfd, 0xcbdc, 0xfbbf, 0xeb9e, 0x9b79, 0x8b58, 0xbb3b, 0xab1a, 188 0x6ca6, 0x7c87, 0x4ce4, 0x5cc5, 0x2c22, 0x3c03, 0x0c60, 0x1c41, 189 0xedae, 0xfd8f, 0xcdec, 0xddcd, 0xad2a, 0xbd0b, 0x8d68, 0x9d49, 190 0x7e97, 0x6eb6, 0x5ed5, 0x4ef4, 0x3e13, 0x2e32, 0x1e51, 0x0e70, 191 0xff9f, 0xefbe, 0xdfdd, 0xcffc, 0xbf1b, 0xaf3a, 0x9f59, 0x8f78, 192 0x9188, 0x81a9, 0xb1ca, 0xa1eb, 0xd10c, 0xc12d, 0xf14e, 0xe16f, 193 0x1080, 0x00a1, 0x30c2, 0x20e3, 0x5004, 0x4025, 0x7046, 0x6067, 194 0x83b9, 0x9398, 0xa3fb, 0xb3da, 0xc33d, 0xd31c, 0xe37f, 0xf35e, 195 0x02b1, 0x1290, 0x22f3, 0x32d2, 0x4235, 0x5214, 0x6277, 0x7256, 196 0xb5ea, 0xa5cb, 0x95a8, 0x8589, 0xf56e, 0xe54f, 0xd52c, 0xc50d, 197 0x34e2, 0x24c3, 0x14a0, 0x0481, 0x7466, 0x6447, 0x5424, 0x4405, 198 0xa7db, 0xb7fa, 0x8799, 0x97b8, 0xe75f, 0xf77e, 0xc71d, 0xd73c, 199 0x26d3, 0x36f2, 0x0691, 0x16b0, 0x6657, 0x7676, 0x4615, 0x5634, 200 0xd94c, 0xc96d, 0xf90e, 0xe92f, 0x99c8, 0x89e9, 0xb98a, 0xa9ab, 201 0x5844, 0x4865, 0x7806, 0x6827, 0x18c0, 0x08e1, 0x3882, 0x28a3, 202 0xcb7d, 0xdb5c, 0xeb3f, 0xfb1e, 0x8bf9, 0x9bd8, 0xabbb, 0xbb9a, 203 0x4a75, 0x5a54, 0x6a37, 0x7a16, 0x0af1, 0x1ad0, 0x2ab3, 0x3a92, 204 0xfd2e, 0xed0f, 0xdd6c, 0xcd4d, 0xbdaa, 0xad8b, 0x9de8, 0x8dc9, 205 0x7c26, 0x6c07, 0x5c64, 0x4c45, 0x3ca2, 0x2c83, 0x1ce0, 0x0cc1, 206 0xef1f, 0xff3e, 0xcf5d, 0xdf7c, 0xaf9b, 0xbfba, 0x8fd9, 0x9ff8, 207 0x6e17, 0x7e36, 0x4e55, 0x5e74, 0x2e93, 0x3eb2, 0x0ed1, 0x1ef0 208); 209 210# Ssssssssssshhhhhhhhhh: 211$QUIET = 0; 212 213 214 215#============================== 216 217=head1 FUNCTIONS 218 219=head2 CRC computation 220 221=over 4 222 223=cut 224 225#------------------------------------------------------------ 226 227=item macbinary_crc DATA, SEED 228 229Compute the MacBinary-II-style CRC for the given DATA, with the CRC 230seeded to SEED. Normally, you start with a SEED of 0, and you pump in 231the previous CRC as the SEED if you're handling a lot of data one chunk 232at a time. That is: 233 234 $crc = 0; 235 while (<STDIN>) { 236 $crc = macbinary_crc($_, $crc); 237 } 238 239I<Note:> Extracted from the I<mcvert> utility (Doug Moore, April '87), 240using a "magic array" algorithm by Jim Van Verth for efficiency. 241Converted to Perl5 by Eryq. B<Untested.> 242 243=cut 244 245sub macbinary_crc { 246 my $len = length($_[0]); 247 my $crc = $_[1]; 248 my $i; 249 for ($i = 0; $i < $len; $i++) { 250 ($crc ^= (vec($_[0], $i, 8) << 8)) &= 0xFFFF; 251 $crc = ($crc << 8) ^ $MAGIC[$crc >> 8]; 252 } 253 $crc; 254} 255 256#------------------------------------------------------------ 257 258=item binhex_crc DATA, SEED 259 260Compute the HQX-style CRC for the given DATA, with the CRC seeded to SEED. 261Normally, you start with a SEED of 0, and you pump in the previous CRC as 262the SEED if you're handling a lot of data one chunk at a time. That is: 263 264 $crc = 0; 265 while (<STDIN>) { 266 $crc = binhex_crc($_, $crc); 267 } 268 269I<Note:> Extracted from the I<mcvert> utility (Doug Moore, April '87), 270using a "magic array" algorithm by Jim Van Verth for efficiency. 271Converted to Perl5 by Eryq. 272 273=cut 274 275sub binhex_crc { 276 my $len = length($_[0]); 277 my $crc = $_[1]; 278 if (! defined $crc) { 279 $crc = 0; 280 } 281 my $i; 282 for ($i = 0; $i < $len; $i++) { 283 my $ocrc = $crc; 284 $crc = (((($crc & 0xFF) << 8) | vec($_[0], $i, 8)) 285 ^ $MAGIC[$crc >> 8]) & 0xFFFF; 286 ## printf "CRCin = %04x, char = %02x (%c), CRCout = %04x\n", 287 ## $ocrc, vec($_[0], $i, 8), ord(substr($_[0], $i, 1)), $crc; 288 } 289 $crc; 290} 291 292 293=back 294 295=cut 296 297 298 299#============================== 300 301=head1 OO INTERFACE 302 303=head2 Conversion 304 305=over 4 306 307=cut 308 309#------------------------------------------------------------ 310 311=item bin2hex 312 313I<Class method, constructor.> 314Return a converter object. Just creates a new instance of 315L<"Convert::BinHex::Bin2Hex">; see that class for details. 316 317=cut 318 319sub bin2hex { 320 return Convert::BinHex::Bin2Hex->new; 321} 322 323#------------------------------------------------------------ 324 325=item hex2bin 326 327I<Class method, constructor.> 328Return a converter object. Just creates a new instance of 329L<"Convert::BinHex::Hex2Bin">; see that class for details. 330 331=cut 332 333sub hex2bin { 334 return Convert::BinHex::Hex2Bin->new; 335} 336 337=back 338 339=cut 340 341 342 343#============================== 344 345=head2 Construction 346 347=over 4 348 349=cut 350 351#------------------------------------------------------------ 352 353=item new PARAMHASH 354 355I<Class method, constructor.> 356Return a handle on a BinHex'able entity. In general, the data and resource 357forks for such an entity are stored in native format (binary) format. 358 359Parameters in the PARAMHASH are the same as header-oriented method names, 360and may be used to set attributes: 361 362 $HQX = new Convert::BinHex filename => "icon.gif", 363 type => "GIFB", 364 creator => "CNVS"; 365 366=cut 367 368sub new { 369 my ($class, %params) = @_; 370 371 # Create object: 372 my $self = bless { 373 Data => new Convert::BinHex::Fork, # data fork 374 Rsrc => new Convert::BinHex::Fork, # resource fork 375 }, $class; # basic object 376 377 # Process params: 378 my $method; 379 foreach $method (qw(creator filename flags requires type version 380 software_version)){ 381 $self->$method($params{$method}) if exists($params{$method}); 382 } 383 $self; 384} 385 386#------------------------------------------------------------ 387 388=item open PARAMHASH 389 390I<Class method, constructor.> 391Return a handle on a new BinHex'ed stream, for parsing. 392Params are: 393 394=over 4 395 396=item Data 397 398Input a HEX stream from the given data. This can be a scalar, or a 399reference to an array of scalars. 400 401=item Expr 402 403Input a HEX stream from any open()able expression. It will be opened and 404binmode'd, and the filehandle will be closed either on a C<close()> 405or when the object is destructed. 406 407=item FH 408 409Input a HEX stream from the given filehandle. 410 411=item NoComment 412 413If true, the parser should not attempt to skip a leading "(This file...)" 414comment. That means that the first nonwhite characters encountered 415must be the binhex'ed data. 416 417=back 418 419=cut 420 421sub open { 422 my $self = shift; 423 my %params = @_; 424 425 # Create object: 426 ref($self) or $self = $self->new; 427 428 # Set up input: 429 my $data; 430 if ($params{FH}) { 431 $self->{FH} = Convert::BinHex::IO_Handle->wrap($params{FH}); 432 } 433 elsif ($params{Expr}) { 434 $self->{FH} = FileHandle->new($params{Expr}) or 435 croak "$I can't open $params{Expr}: $!\n"; 436 $self->{FH} = Convert::BinHex::IO_Handle->wrap($self->{FH}); 437 } 438 elsif ($params{Data}) { 439 if (!ref($data = $params{Data})) { # scalar 440 $self->{FH} = Convert::BinHex::IO_Scalar->wrap(\$data); 441 } 442 elsif (ref($data) eq 'ARRAY') { 443 $data = join('', @$data); 444 $self->{FH} = Convert::BinHex::IO_Scalar->wrap(\$data); 445 } 446 } 447 $self->{FH} or croak "$I missing a valid input source\n"; 448 449 # Comments? 450 $self->{CommentRead} = $params{NoComment}; 451 452 # Reset the converter! 453 $self->{H2B} = Convert::BinHex::Hex2Bin->new; 454 $self; 455} 456 457 458=back 459 460=cut 461 462 463 464 465#============================== 466 467=head2 Get/set header information 468 469=over 4 470 471=cut 472 473#------------------------------ 474 475=item creator [VALUE] 476 477I<Instance method.> 478Get/set the creator of the file. This is a four-character 479string (though I don't know if it's guaranteed to be printable ASCII!) 480that serves as part of the Macintosh's version of a MIME "content-type". 481 482For example, a document created by "Canvas" might have 483creator C<"CNVS">. 484 485=cut 486 487sub creator { (@_ > 1) ? ($_[0]->{Creator} = $_[1]) : $_[0]->{Creator} } 488 489#------------------------------ 490 491=item data [PARAMHASH] 492 493I<Instance method.> 494Get/set the data fork. Any arguments are passed into the 495new() method of L<"Convert::BinHex::Fork">. 496 497=cut 498 499sub data { 500 my $self = shift; 501 @_ ? $self->{Data} = Convert::BinHex::Fork->new(@_) : $self->{Data}; 502} 503 504#------------------------------ 505 506=item filename [VALUE] 507 508I<Instance method.> 509Get/set the name of the file. 510 511=cut 512 513sub filename { (@_ > 1) ? ($_[0]->{Filename} = $_[1]) : $_[0]->{Filename} } 514 515#------------------------------ 516 517=item flags [VALUE] 518 519I<Instance method.> 520Return the flags, as an integer. Use bitmasking to get as the values 521you need. 522 523=cut 524 525sub flags { (@_ > 1) ? ($_[0]->{Flags} = $_[1]) : $_[0]->{Flags} } 526 527#------------------------------ 528 529=item header_as_string 530 531Return a stringified version of the header that you might 532use for logging/debugging purposes. It looks like this: 533 534 X-HQX-Software: BinHex 4.0 (Convert::BinHex 1.102) 535 X-HQX-Filename: Something_new.eps 536 X-HQX-Version: 0 537 X-HQX-Type: EPSF 538 X-HQX-Creator: ART5 539 X-HQX-Data-Length: 49731 540 X-HQX-Rsrc-Length: 23096 541 542As some of you might have guessed, this is RFC-822-style, and 543may be easily plunked down into the middle of a mail header, or 544split into lines, etc. 545 546=cut 547 548sub header_as_string { 549 my $self = shift; 550 my @h; 551 push @h, "X-HQX-Software: " . 552 "BinHex " . ($self->requires || '4.0') . 553 " (Convert::BinHex $VERSION)"; 554 push @h, "X-HQX-Filename: " . $self->filename; 555 push @h, "X-HQX-Version: " . $self->version; 556 push @h, "X-HQX-Type: " . $self->type; 557 push @h, "X-HQX-Creator: " . $self->creator; 558 push @h, "X-HQX-Flags: " . sprintf("%x", $self->flags); 559 push @h, "X-HQX-Data-Length: " . $self->data->length; 560 push @h, "X-HQX-Rsrc-Length: " . $self->resource->length; 561 push @h, "X-HQX-CRC: " . sprintf("%x", $self->{HdrCRC}); 562 return join("\n", @h) . "\n"; 563} 564 565#------------------------------ 566 567=item requires [VALUE] 568 569I<Instance method.> 570Get/set the software version required to convert this file, as 571extracted from the comment that preceded the actual binhex'ed 572data; e.g.: 573 574 (This file must be converted with BinHex 4.0) 575 576In this case, after parsing in the comment, the code: 577 578 $HQX->requires; 579 580would get back "4.0". 581 582=cut 583 584sub requires { 585 (@_ > 1) ? ($_[0]->{Requires} = $_[1]) : $_[0]->{Requires} 586} 587*software_version = \&requires; 588 589#------------------------------ 590 591=item resource [PARAMHASH] 592 593I<Instance method.> 594Get/set the resource fork. Any arguments are passed into the 595new() method of L<"Convert::BinHex::Fork">. 596 597=cut 598 599sub resource { 600 my $self = shift; 601 @_ ? $self->{Rsrc} = Convert::BinHex::Fork->new(@_) : $self->{Rsrc}; 602} 603 604#------------------------------ 605 606=item type [VALUE] 607 608I<Instance method.> 609Get/set the type of the file. This is a four-character 610string (though I don't know if it's guaranteed to be printable ASCII!) 611that serves as part of the Macintosh's version of a MIME "content-type". 612 613For example, a GIF89a file might have type C<"GF89">. 614 615=cut 616 617sub type { (@_ > 1) ? ($_[0]->{Type} = $_[1]) : $_[0]->{Type} } 618 619#------------------------------ 620 621=item version [VALUE] 622 623I<Instance method.> 624Get/set the version, as an integer. 625 626=cut 627 628sub version { (@_ > 1) ? ($_[0]->{Version} = $_[1]) : $_[0]->{Version} } 629 630 631=back 632 633=cut 634 635### OBSOLETE!!! 636sub data_length { shift->data->length(@_) } 637sub resource_length { shift->resource->length(@_) } 638 639 640 641 642#============================== 643 644=head2 Decode, high-level 645 646=over 4 647 648=cut 649 650#------------------------------------------------------------ 651 652=item read_comment 653 654I<Instance method.> 655Skip past the opening comment in the file, which is of the form: 656 657 (This file must be converted with BinHex 4.0) 658 659As per RFC-1741, I<this comment must immediately precede the BinHex data,> 660and any text before it will be ignored. 661 662I<You don't need to invoke this method yourself;> C<read_header()> will 663do it for you. After the call, the version number in the comment is 664accessible via the C<requires()> method. 665 666=cut 667 668sub read_comment { 669 my $self = shift; 670 return 1 if ($self->{CommentRead}); # prevent accidents 671 local($_); 672 while (defined($_ = $self->{FH}->getline)) { 673 chomp; 674 if (/^\(This file must be converted with BinHex ([\d\.]+).*\)\s*$/i) { 675 $self->requires($1); 676 return $self->{CommentRead} = 1; 677 } 678 } 679 croak "$I comment line (This file must be converted with BinHex...) ". 680 "not found\n"; 681} 682 683#------------------------------------------------------------ 684 685=item read_header 686 687I<Instance method.> 688Read in the BinHex file header. You must do this first! 689 690=cut 691 692sub read_header { 693 my $self = shift; 694 return 1 if ($self->{HeaderRead}); # prevent accidents 695 696 # Skip comment: 697 $self->read_comment; 698 699 # Get header info: 700 $self->filename ($self->read_str($self->read_byte)); 701 $self->version ($self->read_byte); 702 $self->type ($self->read_str(4)); 703 $self->creator ($self->read_str(4)); 704 $self->flags ($self->read_short); 705 $self->data_length ($self->read_long); 706 $self->resource_length ($self->read_long); 707 $self->{HdrCRC} = $self->read_short; 708 $self->{HeaderRead} = 1; 709} 710 711#------------------------------------------------------------ 712# 713# _read_fork 714# 715# I<Instance method, private.> 716# Read in a fork. 717# 718 719sub _read_fork { 720 my $self = shift; 721 722 # Pass in call if array context: 723 if (wantarray) { 724 local($_); 725 my @all; 726 push @all, $_ while (defined($_ = $self->_read_fork(@_))); 727 return @all; 728 } 729 730 # Get args: 731 my ($fork, $n) = @_; 732 if($self->{$fork}->length == 0) { 733 $self->{$fork}->crc($self->read_short); 734 return undef; 735 } 736 defined($n) or $n = 2048; 737 738 # Reset pointer into fork if necessary: 739 if (!defined($self->{$fork}{Ptr})) { 740 $self->{$fork}{Ptr} = 0; 741 $self->{CompCRC} = 0; 742 } 743 744 # Check for EOF: 745 return undef if ($self->{$fork}{Ptr} >= $self->{$fork}->length); 746 747 # Read up to, but not exceeding, the number of bytes left in the fork: 748 my $n2read = min($n, ($self->{$fork}->length - $self->{$fork}{Ptr})); 749 my $data = $self->read_str($n2read); 750 $self->{$fork}{Ptr} += length($data); 751 752 # If we just read the last byte, read the CRC also: 753 if (($self->{$fork}{Ptr} == $self->{$fork}->length) && # last byte 754 !defined($self->{$fork}->crc)) { # no CRC 755 my $comp_CRC; 756 757 # Move computed CRC forward by two zero bytes, and grab the value: 758 if ($self->{CheckCRC}) { 759 $self->{CompCRC} = binhex_crc("\000\000", $self->{CompCRC}); 760 } 761 762 # Get CRC as stored in file: 763 $self->{$fork}->crc($self->read_short); # get stored CRC 764 765 # Compare, and note corruption if detected: 766 if ($self->{CheckCRC} and ($self->{$fork}->crc != $comp_CRC)) { 767 &Carp::carp("CRCs do not match: corrupted data?") unless $QUIET; 768 $self->{Corrupted} = 1; 769 } 770 } 771 772 # Return the bytes: 773 $data; 774} 775 776#------------------------------------------------------------ 777 778=item read_data [NBYTES] 779 780I<Instance method.> 781Read information from the data fork. Use it in an array context to 782slurp all the data into an array of scalars: 783 784 @data = $HQX->read_data; 785 786Or use it in a scalar context to get the data piecemeal: 787 788 while (defined($data = $HQX->read_data)) { 789 # do stuff with $data 790 } 791 792The NBYTES to read defaults to 2048. 793 794=cut 795 796sub read_data { 797 shift->_read_fork('Data',@_); 798} 799 800#------------------------------------------------------------ 801 802=item read_resource [NBYTES] 803 804I<Instance method.> 805Read in all/some of the resource fork. 806See C<read_data()> for usage. 807 808=cut 809 810sub read_resource { 811 shift->_read_fork('Rsrc',@_); 812} 813 814=back 815 816=cut 817 818 819 820#------------------------------------------------------------ 821# 822# read BUFFER, NBYTES 823# 824# Read the next NBYTES (decompressed) bytes from the input stream 825# into BUFFER. Returns the number of bytes actually read, and 826# undef on end of file. 827# 828# I<Note:> the calling style mirrors the IO::Handle read() function. 829 830my $READBUF = ''; 831sub read { 832 my ($self, $n) = ($_[0], $_[2]); 833 $_[1] = ''; # just in case 834 my $FH = $self->{FH}; 835 local($^W) = 0; 836 837 # Get more BIN bytes until enough or EOF: 838 my $bin; 839 while (length($self->{BIN_QUEUE}) < $n) { 840 $FH->read($READBUF, 4096) or last; 841 $self->{BIN_QUEUE} .= $self->{H2B}->next($READBUF); # save BIN 842 } 843 844 # We've got as many bytes as we're gonna get: 845 $_[1] = substr($self->{BIN_QUEUE}, 0, $n); 846 $self->{BIN_QUEUE} = substr($self->{BIN_QUEUE}, $n); 847 848 # Advance the CRC: 849 if ($self->{CheckCRC}) { 850 $self->{CompCRC} = binhex_crc($_[1], $self->{CompCRC}); 851 } 852 return length($_[1]); 853} 854 855#------------------------------------------------------------ 856# 857# read_str NBYTES 858# 859# Read and return the next NBYTES bytes, or die with "unexpected end of file" 860 861sub read_str { 862 my ($self, $n) = @_; 863 my $buf = ''; 864 $self->read($buf, $n); 865 croak "$I unexpected end of file (wanted $n, got " . length($buf) . ")\n" 866 if ($n and (length($buf) < $n)); 867 return $buf; 868} 869 870#------------------------------------------------------------ 871# 872# read_byte 873# read_short 874# read_long 875# 876# Read 1, 2, or 4 bytes, and return the value read as an unsigned integer. 877# If not that many bytes remain, die with "unexpected end of file"; 878 879sub read_byte { 880 ord($_[0]->read_str(1)); 881} 882 883sub read_short { 884 unpack("n", $_[0]->read_str(2)); 885} 886 887sub read_long { 888 unpack("N", $_[0]->read_str(4)); 889} 890 891 892 893 894 895 896 897 898 899#============================== 900 901=head2 Encode, high-level 902 903=over 4 904 905=cut 906 907#------------------------------------------------------------ 908 909=item encode OUT 910 911Encode the object as a BinHex stream to the given output handle OUT. 912OUT can be a filehandle, or any blessed object that responds to a 913C<print()> message. 914 915The leading comment is output, using the C<requires()> attribute. 916 917=cut 918 919sub encode { 920 my $self = shift; 921 922 # Get output handle: 923 my $OUT = shift; $OUT = wrap Convert::BinHex::IO_Handle $OUT; 924 925 # Get a new converter: 926 my $B2H = $self->bin2hex; 927 928 # Comment: 929 $OUT->print("(This file must be converted with BinHex ", 930 ($self->requires || '4.0'), 931 ")\n"); 932 933 # Build header in core: 934 my @hdrs; 935 my $flen = length($self->filename); 936 push @hdrs, pack("C", $flen); 937 push @hdrs, pack("a$flen", $self->filename); 938 push @hdrs, pack('C', $self->version); 939 push @hdrs, pack('a4', $self->type || '????'); 940 push @hdrs, pack('a4', $self->creator || '????'); 941 push @hdrs, pack('n', $self->flags || 0); 942 push @hdrs, pack('N', $self->data->length || 0); 943 push @hdrs, pack('N', $self->resource->length || 0); 944 my $hdr = join '', @hdrs; 945 946 # Compute the header CRC: 947 my $crc = binhex_crc("\000\000", binhex_crc($hdr, 0)); 948 949 # Output the header (plus its CRC): 950 $OUT->print($B2H->next($hdr . pack('n', $crc))); 951 952 # Output the data fork: 953 $self->data->encode($OUT, $B2H); 954 955 # Output the resource fork: 956 $self->resource->encode($OUT, $B2H); 957 958 # Finish: 959 $OUT->print($B2H->done); 960 1; 961} 962 963=back 964 965=cut 966 967 968 969#============================== 970 971=head1 SUBMODULES 972 973=cut 974 975#============================================================ 976# 977package Convert::BinHex::Bin2Hex; 978# 979#============================================================ 980 981=head2 Convert::BinHex::Bin2Hex 982 983A BINary-to-HEX converter. This kind of conversion requires 984a certain amount of state information; it cannot be done by 985just calling a simple function repeatedly. Use it like this: 986 987 # Create and use a "translator" object: 988 my $B2H = Convert::BinHex->bin2hex; # get a converter object 989 while (<STDIN>) { 990 print STDOUT $B2H->next($_); # convert some more input 991 } 992 print STDOUT $B2H->done; # no more input: finish up 993 994 # Re-use the object: 995 $B2H->rewind; # ready for more action! 996 while (<MOREIN>) { ... 997 998On each iteration, C<next()> (and C<done()>) may return either 999a decent-sized non-empty string (indicating that more converted data 1000is ready for you) or an empty string (indicating that the converter 1001is waiting to amass more input in its private buffers before handing 1002you more stuff to output. 1003 1004Note that C<done()> I<always> converts and hands you whatever is left. 1005 1006This may have been a good approach. It may not. Someday, the converter 1007may also allow you give it an object that responds to read(), or 1008a FileHandle, and it will do all the nasty buffer-filling on its own, 1009serving you stuff line by line: 1010 1011 # Someday, maybe... 1012 my $B2H = Convert::BinHex->bin2hex(\*STDIN); 1013 while (defined($_ = $B2H->getline)) { 1014 print STDOUT $_; 1015 } 1016 1017Someday, maybe. Feel free to voice your opinions. 1018 1019=cut 1020 1021#------------------------------ 1022# 1023# new 1024 1025sub new { 1026 my $self = bless {}, shift; 1027 return $self->rewind; 1028} 1029 1030#------------------------------ 1031# 1032# rewind 1033 1034sub rewind { 1035 my $self = shift; 1036 $self->{CBIN} = ' ' x 2048; $self->{CBIN} = ''; # BIN waiting for xlation 1037 $self->{HEX} = ' ' x 2048; $self->{HEX} = ''; # HEX waiting for output 1038 $self->{LINE} = 0; # current line of output 1039 $self->{EOL} = "\n"; 1040 $self; 1041} 1042 1043#------------------------------ 1044# 1045# next MOREDATA 1046 1047sub next { shift->_next(0, @_) } 1048 1049#------------------------------ 1050# 1051# done 1052 1053sub done { shift->_next(1) } 1054 1055#------------------------------ 1056# 1057# _next ATEOF, [MOREDATA] 1058# 1059# Instance method, private. Supply more data, and get any more output. 1060# Returns the empty string often, if not enough output has accumulated. 1061 1062sub _next { 1063 my $self = shift; 1064 my $eof = shift; 1065 1066 # Get the BINary data to process this time round, re-queueing the rest: 1067 # Handle EOF and non-EOF conditions separately: 1068 my $new_bin; 1069 if ($eof) { # No more BINary input... 1070 # Pad the queue with nuls to exactly 3n characters: 1071 $self->{CBIN} .= ("\x00" x ((3 - length($self->{CBIN}) % 3) % 3)) 1072 } 1073 else { # More BINary input... 1074 # "Compress" new stuff, and add it to the queue: 1075 ($new_bin = $_[0]) =~ s/\x90/\x90\x00/g; 1076 $self->{CBIN} .= $new_bin; 1077 1078 # Return if not enough to bother with: 1079 return '' if (length($self->{CBIN}) < 2048); 1080 } 1081 1082 # ...At this point, QUEUE holds compressed binary which we will attempt 1083 # to convert to some HEX characters... 1084 1085 # Trim QUEUE to exactly 3n characters, saving the excess: 1086 my $requeue = ''; 1087 $requeue .= chop($self->{CBIN}) while (length($self->{CBIN}) % 3); 1088 1089 # Uuencode, adding stuff to hex: 1090 my $hex = ' ' x 2048; $hex = ''; 1091 pos($self->{CBIN}) = 0; 1092 while ($self->{CBIN} =~ /(.{1,45})/gs) { 1093 $hex .= substr(pack('u', $1), 1); 1094 chop($hex); 1095 } 1096 $self->{CBIN} = reverse($requeue); # put the excess back on the queue 1097 1098 # Switch to BinHex alphabet: 1099 $hex =~ tr 1100 {` -_} 1101 {!!"#$%&'()*+,\x2D012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr}; 1102 1103 # Prepend any HEX we have queued from the last time: 1104 $hex = (($self->{LINE}++ ? '' : ':') . # start with ":" pad? 1105 $self->{HEX} . # any output in the queue? 1106 $hex); 1107 1108 # Break off largest chunk of 64n characters, put remainder back in queue: 1109 my $rem = length($hex) % 64; 1110 $self->{HEX} = ($rem ? substr($hex, -$rem) : ''); 1111 $hex = substr($hex, 0, (length($hex)-$rem)); 1112 1113 # Put in an EOL every 64'th character: 1114 $hex =~ s{(.{64})}{$1$self->{EOL}}sg; 1115 1116 # No more input? Then tack on the remainder now: 1117 if ($eof) { 1118 $hex .= $self->{HEX} . ":" . ($self->{EOL} ? $self->{EOL} : ''); 1119 } 1120 1121 # Done! 1122 $hex; 1123} 1124 1125 1126 1127 1128#============================================================ 1129# 1130package Convert::BinHex::Hex2Bin; 1131# 1132#============================================================ 1133 1134=head2 Convert::BinHex::Hex2Bin 1135 1136A HEX-to-BINary converter. This kind of conversion requires 1137a certain amount of state information; it cannot be done by 1138just calling a simple function repeatedly. Use it like this: 1139 1140 # Create and use a "translator" object: 1141 my $H2B = Convert::BinHex->hex2bin; # get a converter object 1142 while (<STDIN>) { 1143 print STDOUT $H2B->next($_); # convert some more input 1144 } 1145 print STDOUT $H2B->done; # no more input: finish up 1146 1147 # Re-use the object: 1148 $H2B->rewind; # ready for more action! 1149 while (<MOREIN>) { ... 1150 1151On each iteration, C<next()> (and C<done()>) may return either 1152a decent-sized non-empty string (indicating that more converted data 1153is ready for you) or an empty string (indicating that the converter 1154is waiting to amass more input in its private buffers before handing 1155you more stuff to output. 1156 1157Note that C<done()> I<always> converts and hands you whatever is left. 1158 1159Note that this converter does I<not> find the initial 1160"BinHex version" comment. You have to skip that yourself. It 1161only handles data between the opening and closing C<":">. 1162 1163=cut 1164 1165#------------------------------ 1166# 1167# new 1168 1169sub new { 1170 my $self = bless {}, shift; 1171 return $self->rewind; 1172} 1173 1174#------------------------------ 1175# 1176# rewind 1177 1178sub rewind { 1179 my $self = shift; 1180 $self->hex2comp_rewind; 1181 $self->comp2bin_rewind; 1182 $self; 1183} 1184 1185#------------------------------ 1186# 1187# next MOREDATA 1188 1189sub next { 1190 my $self = shift; 1191 $_[0] =~ s/\s//g if (defined($_[0])); # more input 1192 return $self->comp2bin_next($self->hex2comp_next($_[0])); 1193} 1194 1195#------------------------------ 1196# 1197# done 1198 1199sub done { 1200 return ""; 1201} 1202 1203#------------------------------ 1204# 1205# hex2comp_rewind 1206 1207sub hex2comp_rewind { 1208 my $self = shift; 1209 $self->{HEX} = ''; 1210} 1211 1212#------------------------------ 1213# 1214# hex2comp_next HEX 1215# 1216# WARNING: argument is modified destructively for efficiency!!!! 1217 1218sub hex2comp_next { 1219 my $self = shift; 1220 ### print "hex2comp: newhex = $newhex\n"; 1221 1222 # Concat new with queue, and kill any padding: 1223 my $hex = $self->{HEX} . (defined($_[0]) ? $_[0] : ''); 1224 if (index($hex, ':') >= 0) { 1225 $hex =~ s/^://; # start of input 1226 if ($hex =~ s/:\s*\Z//) { # end of input 1227 my $leftover = (length($hex) % 4); # need to pad! 1228 $hex .= "\000" x (4 - $leftover) if $leftover; # zero pad 1229 } 1230 } 1231 1232 # Get longest substring of length 4n possible; put rest back on queue: 1233 my $rem = length($hex) % 4; 1234 $self->{HEX} = ($rem ? substr($hex, -$rem) : ''); 1235 for (; $rem; --$rem) { chop $hex }; 1236 return undef if ($hex eq ''); # nothing to do! 1237 1238 # Convert to uuencoded format: 1239 $hex =~ tr 1240 {!"#$%&'()*+,\x2D012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr} 1241 { -_}; 1242 1243 # Now, uudecode: 1244 my $comp = ''; 1245 my $len; 1246 my $up; 1247 local($^W) = 0; ### KLUDGE 1248 while ($hex =~ /\G(.{1,60})/gs) { 1249 $len = chr(32 + ((length($1)*3)>>2)); # compute length byte 1250 $comp .= unpack("u", $len . $1 ); # uudecode 1251 } 1252 1253 # We now have the compressed binary... expand it: 1254 ### print "hex2comp: comp = $comp\n"; 1255 $comp; 1256} 1257 1258#------------------------------ 1259# 1260# comp2bin_rewind 1261 1262sub comp2bin_rewind { 1263 my $self = shift; 1264 $self->{COMP} = ''; 1265 $self->{LASTC} = ''; 1266} 1267 1268#------------------------------ 1269# 1270# comp2bin_next COMP 1271# 1272# WARNING: argument is modified destructively for efficiency!!!! 1273 1274sub comp2bin_next { 1275 my $self = shift; 1276 1277 # Concat new with queue... anything to do? 1278 my $comp = $self->{COMP} . (defined($_[0]) ? $_[0] : ''); 1279 return undef if ($comp eq ''); 1280 1281 # For each character in compressed string... 1282 $self->{COMP} = ''; 1283 my $lastc = $self->{LASTC}; # speed hack 1284 my $exp = ''; # expanded string 1285 my $i; 1286 my ($c, $n); 1287 for ($i = 0; $i < length($comp); $i++) { 1288 if (($c = substr($comp, $i, 1)) eq "\x90") { # MARK 1289 ### print "c = MARK\n"; 1290 unless (length($n = substr($comp, ++$i, 1))) { 1291 $self->{COMP} = "\x90"; 1292 last; 1293 } 1294 ### print "n = ", ord($n), "; lastc = ", ord($lastc), "\n"; 1295 $exp .= ((ord($n) ? ($lastc x (ord($n)-1)) # repeat last char 1296 : ($lastc = "\x90"))); # literal MARK 1297 } 1298 else { # other CHAR 1299 ### print "c = ", ord($c), "\n"; 1300 $exp .= ($lastc = $c); 1301 } 1302 ### print "exp is now $exp\n"; 1303 } 1304 1305 # Either hit EOS, or there's a MARK char at the very end: 1306 $self->{LASTC} = $lastc; 1307 ### print "leaving with lastc=$lastc and comp=$self->{COMP}\n"; 1308 ### print "comp2bin: exp = $exp\n"; 1309 $exp; 1310} 1311 1312 1313 1314 1315 1316 1317#============================================================ 1318# 1319package Convert::BinHex::Fork; 1320# 1321#============================================================ 1322 1323=head2 Convert::BinHex::Fork 1324 1325A fork in a Macintosh file. 1326 1327 # How to get them... 1328 $data_fork = $HQX->data; # get the data fork 1329 $rsrc_fork = $HQX->resource; # get the resource fork 1330 1331 # Make a new fork: 1332 $FORK = Convert::BinHex::Fork->new(Path => "/tmp/file.data"); 1333 $FORK = Convert::BinHex::Fork->new(Data => $scalar); 1334 $FORK = Convert::BinHex::Fork->new(Data => \@array_of_scalars); 1335 1336 # Get/set the length of the data fork: 1337 $len = $FORK->length; 1338 $FORK->length(170); # this overrides the REAL value: be careful! 1339 1340 # Get/set the path to the underlying data (if in a disk file): 1341 $path = $FORK->path; 1342 $FORK->path("/tmp/file.data"); 1343 1344 # Get/set the in-core data itself, which may be a scalar or an arrayref: 1345 $data = $FORK->data; 1346 $FORK->data($scalar); 1347 $FORK->data(\@array_of_scalars); 1348 1349 # Get/set the CRC: 1350 $crc = $FORK->crc; 1351 $FORK->crc($crc); 1352 1353=cut 1354 1355 1356# Import some stuff into our namespace: 1357*binhex_crc = \&Convert::BinHex::binhex_crc; 1358 1359#------------------------------ 1360# 1361# new PARAMHASH 1362 1363sub new { 1364 my ($class, %params) = @_; 1365 bless \%params, $class; 1366} 1367 1368#------------------------------ 1369# 1370# length [VALUE] 1371 1372sub length { 1373 my $self = shift; 1374 1375 # Set length? 1376 $self->{Length} = shift if @_; 1377 1378 # Return explicit length, if any 1379 return $self->{Length} if defined($self->{Length}); 1380 1381 # Compute it: 1382 if (defined($self->{Path})) { 1383 return (-s $self->{Path}); 1384 } 1385 elsif (!ref($self->{Data})) { 1386 return length($self->{Data}); 1387 } 1388 elsif (ref($self->{Data} eq 'ARRAY')) { 1389 my $n = 0; 1390 foreach (@{$self->{Data}}) { $n += length($_) } 1391 return $n; 1392 } 1393 return undef; # unknown! 1394} 1395 1396#------------------------------ 1397# 1398# path [VALUE] 1399 1400sub path { 1401 my $self = shift; 1402 if (@_) { $self->{Path} = shift; delete $self->{Data} } 1403 $self->{Path}; 1404} 1405 1406#------------------------------ 1407# 1408# data [VALUE] 1409 1410sub data { 1411 my $self = shift; 1412 if (@_) { $self->{Data} = shift; delete $self->{Path} } 1413 $self->{Data}; 1414} 1415 1416#------------------------------ 1417# 1418# crc [VALUE] 1419 1420sub crc { 1421 my $self = shift; 1422 @_ ? $self->{CRC} = shift : $self->{CRC}; 1423} 1424 1425#------------------------------ 1426# 1427# encode OUT, B2H 1428# 1429# Instance method, private. Encode this fork as part of a BinHex stream. 1430# It will be printed to handle OUT using the binhexer B2H. 1431 1432sub encode { 1433 my ($self, $OUT, $B2H) = @_; 1434 my $buf = ''; 1435 require POSIX if $^O||'' eq "MacOS"; 1436 require Fcntl if $^O||'' eq "MacOS"; 1437 my $fd; 1438 1439 # Reset the CRC: 1440 $self->{CRC} = 0; 1441 1442 # Output the data, calculating the CRC as we go: 1443 if (defined($self->{Path})) { # path to fork file 1444 if ($^O||'' eq "MacOS" and $self->{Fork} eq "RSRC") { 1445 $fd = POSIX::open($self->{Path},&POSIX::O_RDONLY | &Fcntl::O_RSRC); 1446 while (POSIX::read($fd, $buf, 2048) > 0) { 1447 $self->{CRC} = binhex_crc($buf, $self->{CRC}); 1448 $OUT->print($B2H->next($buf)); 1449 } 1450 POSIX::close($fd); 1451 } 1452 else { 1453 open FORK, $self->{Path} or die "$self->{Path}: $!"; 1454 while (read(\*FORK, $buf, 2048)) { 1455 $self->{CRC} = binhex_crc($buf, $self->{CRC}); 1456 $OUT->print($B2H->next($buf)); 1457 } 1458 close FORK; 1459 } 1460 } 1461 elsif (!defined($self->{Data})) { # nothing! 1462 &Carp::carp("no data in fork!") unless $Convert::BinHex::QUIET; 1463 } 1464 elsif (!ref($self->{Data})) { # scalar 1465 $self->{CRC} = binhex_crc($self->{Data}, $self->{CRC}); 1466 $OUT->print($B2H->next($self->{Data})); 1467 } 1468 elsif (ref($self->{Data}) eq 'ARRAY') { # array of scalars 1469 foreach $buf (@{$self->{Data}}) { 1470 $self->{CRC} = binhex_crc($buf, $self->{CRC}); 1471 $OUT->print($B2H->next($buf)); 1472 } 1473 } 1474 else { 1475 &Carp::croak("bad/unsupported data in fork"); 1476 } 1477 1478 # Finish the CRC, and output it: 1479 $self->{CRC} = binhex_crc("\000\000", $self->{CRC}); 1480 $OUT->print($B2H->next(pack("n", $self->{CRC}))); 1481 1; 1482} 1483 1484 1485 1486 1487#============================================================ 1488# 1489package Convert::BinHex::IO_Handle; 1490# 1491#============================================================ 1492 1493# Wrap a non-object filehandle inside a blessed, printable interface: 1494# Does nothing if the given $fh is already a blessed object. 1495sub wrap { 1496 my ($class, $fh) = @_; 1497 no strict 'refs'; 1498 $fh or $fh = select; # no filehandle means selected one 1499 ref($fh) or $fh = \*$fh; # scalar becomes a globref 1500 return $fh if (ref($fh) and (ref($fh) !~ /^(GLOB|FileHandle)$/)); 1501 bless \$fh, $class; # wrap it in a printable interface 1502} 1503sub print { 1504 my $FH = ${shift(@_)}; 1505 print $FH @_; 1506} 1507sub getline { 1508 my $FH = ${shift(@_)}; 1509 scalar(<$FH>); 1510} 1511sub read { 1512 read ${$_[0]}, $_[1], $_[2]; 1513} 1514 1515 1516 1517#============================================================ 1518# 1519package Convert::BinHex::IO_Scalar; 1520# 1521#============================================================ 1522 1523# Wrap a scalar inside a blessed, printable interface: 1524sub wrap { 1525 my ($class, $scalarref) = @_; 1526 defined($scalarref) or $scalarref = \""; 1527 pos($$scalarref) = 0; 1528 bless $scalarref, $class; 1529} 1530sub print { 1531 my $self = shift; 1532 $$self .= join('', @_); 1533 1; 1534} 1535sub getline { 1536 my $self = shift; 1537 ($$self =~ /\G(.*?\n?)/g) or return undef; 1538 return $1; 1539} 1540sub read { 1541 my $self = shift; 1542 $_[0] = substr($$self, pos($$self), $_[1]); 1543 pos($$self) += $_[1]; 1544 return length($_[0]); 1545} 1546 1547 1548 1549#============================== 1550 1551=head1 UNDER THE HOOD 1552 1553=head2 Design issues 1554 1555=over 4 1556 1557=item BinHex needs a stateful parser 1558 1559Unlike its cousins I<base64> and I<uuencode>, BinHex format is not 1560amenable to being parsed line-by-line. There appears to be no 1561guarantee that lines contain 4n encoded characters... and even if there 1562is one, the BinHex compression algorithm interferes: even when you 1563can I<decode> one line at a time, you can't necessarily 1564I<decompress> a line at a time. 1565 1566For example: a decoded line ending with the byte C<\x90> (the escape 1567or "mark" character) is ambiguous: depending on the next decoded byte, 1568it could mean a literal C<\x90> (if the next byte is a C<\x00>), or 1569it could mean n-1 more repetitions of the previous character (if 1570the next byte is some nonzero C<n>). 1571 1572For this reason, a BinHex parser has to be somewhat stateful: you 1573cannot have code like this: 1574 1575 #### NO! #### NO! #### NO! #### NO! #### NO! #### 1576 while (<STDIN>) { # read HEX 1577 print hexbin($_); # convert and write BIN 1578 } 1579 1580unless something is happening "behind the scenes" to keep track of 1581what was last done. I<The dangerous thing, however, is that this 1582approach will B<seem> to work, if you only test it on BinHex files 1583which do not use compression and which have 4n HEX characters 1584on each line.> 1585 1586Since we have to be stateful anyway, we use the parser object to 1587keep our state. 1588 1589 1590=item We need to be handle large input files 1591 1592Solutions that demand reading everything into core don't cut 1593it in my book. The first MPEG file that comes along can louse 1594up your whole day. So, there are no size limitations in this 1595module: the data is read on-demand, and filehandles are always 1596an option. 1597 1598 1599=item Boy, is this slow! 1600 1601A lot of the byte-level manipulation that has to go on, particularly 1602the CRC computing (which involves intensive bit-shifting and masking) 1603slows this module down significantly. What is needed perhaps is an 1604I<optional> extension library where the slow pieces can be done more 1605quickly... a Convert::BinHex::CRC, if you will. Volunteers, anyone? 1606 1607Even considering that, however, it's slower than I'd like. I'm 1608sure many improvements can be made in the HEX-to-BIN end of things. 1609No doubt I'll attempt some as time goes on... 1610 1611=back 1612 1613 1614 1615=head2 How it works 1616 1617Since BinHex is a layered format, consisting of... 1618 1619 A Macintosh file [the "BIN"]... 1620 Encoded as a structured 8-bit bytestream, then... 1621 Compressed to reduce duplicate bytes, then... 1622 Encoded as 7-bit ASCII [the "HEX"] 1623 1624...there is a layered parsing algorithm to reverse the process. 1625Basically, it works in a similar fashion to stdio's fread(): 1626 1627 0. There is an internal buffer of decompressed (BIN) data, 1628 initially empty. 1629 1. Application asks to read() n bytes of data from object 1630 2. If the buffer is not full enough to accommodate the request: 1631 2a. The read() method grabs the next available chunk of input 1632 data (the HEX). 1633 2b. HEX data is converted and decompressed into as many BIN 1634 bytes as possible. 1635 2c. BIN bytes are added to the read() buffer. 1636 2d. Go back to step 2a. until the buffer is full enough 1637 or we hit end-of-input. 1638 1639The conversion-and-decompression algorithms need their own internal 1640buffers and state (since the next input chunk may not contain all the 1641data needed for a complete conversion/decompression operation). 1642These are maintained in the object, so parsing two different 1643input streams simultaneously is possible. 1644 1645 1646=head1 WARNINGS 1647 1648Only handles C<Hqx7> files, as per RFC-1741. 1649 1650Remember that Macintosh text files use C<"\r"> as end-of-line: 1651this means that if you want a textual file to look normal on 1652a non-Mac system, you probably want to do this to the data: 1653 1654 # Get the data, and output it according to normal conventions: 1655 foreach ($HQX->read_data) { s/\r/\n/g; print } 1656 1657 1658=head1 AUTHOR AND CREDITS 1659 1660Maintained by Stephen Nelson <stephenenelson@mac.com> 1661 1662Written by Eryq, F<http://www.enteract.com/~eryq> / F<eryq@enteract.com> 1663 1664Support for native-Mac conversion, I<plus> invaluable contributions in 1665Alpha Testing, I<plus> a few patches, I<plus> the baseline binhex/debinhex 1666programs, were provided by Paul J. Schinder (NASA/GSFC). 1667 1668Ken Lunde (Adobe) suggested incorporating the CAP file representation. 1669 1670 1671=head1 LICENSE 1672 1673Copyright (c) 1997 by Eryq. All rights reserved. This program is free 1674software; you can redistribute it and/or modify it under the same terms as 1675Perl itself. 1676 1677This software comes with B<NO WARRANTY> of any kind. 1678See the COPYING file in the distribution for details. 1679 1680=cut 1681 16821; 1683 1684__END__ 1685 1686my $HQX = new Convert::BinHex 1687 version => 0, 1688 filename=>"s.gif", 1689 type => "GIF8", 1690 creator => "PCBH", 1691 flags => 0xFFFF 1692 ; 1693 1694$HQX->data(Path=>"/home/eryq/s.gif"); 1695$HQX->resource(Path=>"/etc/issue"); 1696 1697#$HQX->data(Data=>"123456789"); 1698#$HQX->resource(Data=>''); 1699 1700$HQX->encode(\*STDOUT); 1701 17021; 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712