1package Convert::BinHex;
2
3
4=head1 NAME
5
6Convert::BinHex - extract data from Macintosh BinHex files
7
8I<ALPHA WARNING: this code is currently in its Alpha release.
9Things may change drastically until the interface is hammered out:
10if you have suggestions or objections, please speak up now!>
11
12
13=head1 SYNOPSIS
14
15B<Simple functions:>
16
17    use Convert::BinHex qw(binhex_crc macbinary_crc);
18
19    # Compute HQX7-style CRC for data, pumping in old CRC if desired:
20    $crc = binhex_crc($data, $crc);
21
22    # Compute the MacBinary-II-style CRC for the data:
23    $crc = macbinary_crc($data, $crc);
24
25B<Hex to bin, low-level interface.>
26Conversion is actually done via an object (L<"Convert::BinHex::Hex2Bin">)
27which keeps internal conversion state:
28
29    # Create and use a "translator" object:
30    my $H2B = Convert::BinHex->hex2bin;    # get a converter object
31    while (<STDIN>) {
32	print $STDOUT $H2B->next($_);        # convert some more input
33    }
34    print $STDOUT $H2B->done;              # no more input: finish up
35
36B<Hex to bin, OO interface.>
37The following operations I<must> be done in the order shown!
38
39    # Read data in piecemeal:
40    $HQX = Convert::BinHex->open(FH=>\*STDIN) || die "open: $!";
41    $HQX->read_header;                  # read header info
42    @data = $HQX->read_data;            # read in all the data
43    @rsrc = $HQX->read_resource;        # read in all the resource
44
45B<Bin to hex, low-level interface.>
46Conversion is actually done via an object (L<"Convert::BinHex::Bin2Hex">)
47which keeps internal conversion state:
48
49    # Create and use a "translator" object:
50    my $B2H = Convert::BinHex->bin2hex;    # get a converter object
51    while (<STDIN>) {
52	print $STDOUT $B2H->next($_);        # convert some more input
53    }
54    print $STDOUT $B2H->done;              # no more input: finish up
55
56B<Bin to hex, file interface.>  Yes, you can convert I<to> BinHex
57as well as from it!
58
59    # Create new, empty object:
60    my $HQX = Convert::BinHex->new;
61
62    # Set header attributes:
63    $HQX->filename("logo.gif");
64    $HQX->type("GIFA");
65    $HQX->creator("CNVS");
66
67    # Give it the data and resource forks (either can be absent):
68    $HQX->data(Path => "/path/to/data");       # here, data is on disk
69    $HQX->resource(Data => $resourcefork);     # here, resource is in core
70
71    # Output as a BinHex stream, complete with leading comment:
72    $HQX->encode(\*STDOUT);
73
74B<PLANNED!!!! Bin to hex, "CAP" interface.>
75I<Thanks to Ken Lunde for suggesting this>.
76
77    # Create new, empty object from CAP tree:
78    my $HQX = Convert::BinHex->from_cap("/path/to/root/file");
79    $HQX->encode(\*STDOUT);
80
81
82=head1 DESCRIPTION
83
84B<BinHex> is a format used by Macintosh for transporting Mac files
85safely through electronic mail, as short-lined, 7-bit, semi-compressed
86data streams.  Ths module provides a means of converting those
87data streams back into into binary data.
88
89
90=head1 FORMAT
91
92I<(Some text taken from RFC-1741.)>
93Files on the Macintosh consist of two parts, called I<forks>:
94
95=over 4
96
97=item Data fork
98
99The actual data included in the file.  The Data fork is typically the
100only meaningful part of a Macintosh file on a non-Macintosh computer system.
101For example, if a Macintosh user wants to send a file of data to a
102user on an IBM-PC, she would only send the Data fork.
103
104=item Resource fork
105
106Contains a collection of arbitrary attribute/value pairs, including
107program segments, icon bitmaps, and parametric values.
108
109=back
110
111Additional information regarding Macintosh files is stored by the
112Finder in a hidden file, called the "Desktop Database".
113
114Because of the complications in storing different parts of a
115Macintosh file in a non-Macintosh filesystem that only handles
116consecutive data in one part, it is common to convert the Macintosh
117file into some other format before transferring it over the network.
118The BinHex format squashes that data into transmittable ASCII as follows:
119
120=over 4
121
122=item 1.
123
124The file is output as a B<byte stream> consisting of some basic header
125information (filename, type, creator), then the data fork, then the
126resource fork.
127
128=item 2.
129
130The byte stream is B<compressed> by looking for series of duplicated
131bytes and representing them using a special binary escape sequence
132(of course, any occurences of the escape character must also be escaped).
133
134=item 3.
135
136The compressed stream is B<encoded> via the "6/8 hemiola" common
137to I<base64> and I<uuencode>: each group of three 8-bit bytes (24 bits)
138is chopped into four 6-bit numbers, which are used as indexes into
139an ASCII "alphabet".
140(I assume that leftover bytes are zero-padded; documentation is thin).
141
142=back
143
144=cut
145
146use strict;
147use warnings;
148use vars qw(@ISA @EXPORT_OK $VERSION $QUIET);
149use integer;
150
151use Carp;
152use Exporter;
153use FileHandle;
154
155@ISA = qw(Exporter);
156@EXPORT_OK = qw(
157		macbinary_crc
158		binhex_crc
159		);
160
161
162
163our $VERSION = '1.125'; # VERSION
164
165# My identity:
166my $I = 'binhex:';
167
168# Utility function:
169sub min {
170    my ($a, $b) = @_;
171    ($a < $b) ? $a : $b;
172}
173
174# An array useful for CRC calculations that use 0x1021 as the "seed":
175my @MAGIC = (
176    0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50a5, 0x60c6, 0x70e7,
177    0x8108, 0x9129, 0xa14a, 0xb16b, 0xc18c, 0xd1ad, 0xe1ce, 0xf1ef,
178    0x1231, 0x0210, 0x3273, 0x2252, 0x52b5, 0x4294, 0x72f7, 0x62d6,
179    0x9339, 0x8318, 0xb37b, 0xa35a, 0xd3bd, 0xc39c, 0xf3ff, 0xe3de,
180    0x2462, 0x3443, 0x0420, 0x1401, 0x64e6, 0x74c7, 0x44a4, 0x5485,
181    0xa56a, 0xb54b, 0x8528, 0x9509, 0xe5ee, 0xf5cf, 0xc5ac, 0xd58d,
182    0x3653, 0x2672, 0x1611, 0x0630, 0x76d7, 0x66f6, 0x5695, 0x46b4,
183    0xb75b, 0xa77a, 0x9719, 0x8738, 0xf7df, 0xe7fe, 0xd79d, 0xc7bc,
184    0x48c4, 0x58e5, 0x6886, 0x78a7, 0x0840, 0x1861, 0x2802, 0x3823,
185    0xc9cc, 0xd9ed, 0xe98e, 0xf9af, 0x8948, 0x9969, 0xa90a, 0xb92b,
186    0x5af5, 0x4ad4, 0x7ab7, 0x6a96, 0x1a71, 0x0a50, 0x3a33, 0x2a12,
187    0xdbfd, 0xcbdc, 0xfbbf, 0xeb9e, 0x9b79, 0x8b58, 0xbb3b, 0xab1a,
188    0x6ca6, 0x7c87, 0x4ce4, 0x5cc5, 0x2c22, 0x3c03, 0x0c60, 0x1c41,
189    0xedae, 0xfd8f, 0xcdec, 0xddcd, 0xad2a, 0xbd0b, 0x8d68, 0x9d49,
190    0x7e97, 0x6eb6, 0x5ed5, 0x4ef4, 0x3e13, 0x2e32, 0x1e51, 0x0e70,
191    0xff9f, 0xefbe, 0xdfdd, 0xcffc, 0xbf1b, 0xaf3a, 0x9f59, 0x8f78,
192    0x9188, 0x81a9, 0xb1ca, 0xa1eb, 0xd10c, 0xc12d, 0xf14e, 0xe16f,
193    0x1080, 0x00a1, 0x30c2, 0x20e3, 0x5004, 0x4025, 0x7046, 0x6067,
194    0x83b9, 0x9398, 0xa3fb, 0xb3da, 0xc33d, 0xd31c, 0xe37f, 0xf35e,
195    0x02b1, 0x1290, 0x22f3, 0x32d2, 0x4235, 0x5214, 0x6277, 0x7256,
196    0xb5ea, 0xa5cb, 0x95a8, 0x8589, 0xf56e, 0xe54f, 0xd52c, 0xc50d,
197    0x34e2, 0x24c3, 0x14a0, 0x0481, 0x7466, 0x6447, 0x5424, 0x4405,
198    0xa7db, 0xb7fa, 0x8799, 0x97b8, 0xe75f, 0xf77e, 0xc71d, 0xd73c,
199    0x26d3, 0x36f2, 0x0691, 0x16b0, 0x6657, 0x7676, 0x4615, 0x5634,
200    0xd94c, 0xc96d, 0xf90e, 0xe92f, 0x99c8, 0x89e9, 0xb98a, 0xa9ab,
201    0x5844, 0x4865, 0x7806, 0x6827, 0x18c0, 0x08e1, 0x3882, 0x28a3,
202    0xcb7d, 0xdb5c, 0xeb3f, 0xfb1e, 0x8bf9, 0x9bd8, 0xabbb, 0xbb9a,
203    0x4a75, 0x5a54, 0x6a37, 0x7a16, 0x0af1, 0x1ad0, 0x2ab3, 0x3a92,
204    0xfd2e, 0xed0f, 0xdd6c, 0xcd4d, 0xbdaa, 0xad8b, 0x9de8, 0x8dc9,
205    0x7c26, 0x6c07, 0x5c64, 0x4c45, 0x3ca2, 0x2c83, 0x1ce0, 0x0cc1,
206    0xef1f, 0xff3e, 0xcf5d, 0xdf7c, 0xaf9b, 0xbfba, 0x8fd9, 0x9ff8,
207    0x6e17, 0x7e36, 0x4e55, 0x5e74, 0x2e93, 0x3eb2, 0x0ed1, 0x1ef0
208);
209
210# Ssssssssssshhhhhhhhhh:
211$QUIET = 0;
212
213
214
215#==============================
216
217=head1 FUNCTIONS
218
219=head2 CRC computation
220
221=over 4
222
223=cut
224
225#------------------------------------------------------------
226
227=item macbinary_crc DATA, SEED
228
229Compute the MacBinary-II-style CRC for the given DATA, with the CRC
230seeded to SEED.  Normally, you start with a SEED of 0, and you pump in
231the previous CRC as the SEED if you're handling a lot of data one chunk
232at a time.  That is:
233
234    $crc = 0;
235    while (<STDIN>) {
236        $crc = macbinary_crc($_, $crc);
237    }
238
239I<Note:> Extracted from the I<mcvert> utility (Doug Moore, April '87),
240using a "magic array" algorithm by Jim Van Verth for efficiency.
241Converted to Perl5 by Eryq.  B<Untested.>
242
243=cut
244
245sub macbinary_crc {
246    my $len = length($_[0]);
247    my $crc = $_[1];
248    my $i;
249    for ($i = 0; $i < $len; $i++) {
250	($crc ^= (vec($_[0], $i, 8) << 8)) &= 0xFFFF;
251	$crc = ($crc << 8) ^ $MAGIC[$crc >> 8];
252    }
253    $crc;
254}
255
256#------------------------------------------------------------
257
258=item binhex_crc DATA, SEED
259
260Compute the HQX-style CRC for the given DATA, with the CRC seeded to SEED.
261Normally, you start with a SEED of 0, and you pump in the previous CRC as
262the SEED if you're handling a lot of data one chunk at a time.  That is:
263
264    $crc = 0;
265    while (<STDIN>) {
266        $crc = binhex_crc($_, $crc);
267    }
268
269I<Note:> Extracted from the I<mcvert> utility (Doug Moore, April '87),
270using a "magic array" algorithm by Jim Van Verth for efficiency.
271Converted to Perl5 by Eryq.
272
273=cut
274
275sub binhex_crc {
276    my $len = length($_[0]);
277    my $crc = $_[1];
278    if (! defined $crc) {
279    	$crc = 0;
280    }
281    my $i;
282    for ($i = 0; $i < $len; $i++) {
283	my $ocrc = $crc;
284	$crc = (((($crc & 0xFF) << 8) | vec($_[0], $i, 8))
285		^ $MAGIC[$crc >> 8]) & 0xFFFF;
286	## printf "CRCin = %04x, char = %02x (%c), CRCout = %04x\n",
287	##        $ocrc, vec($_[0], $i, 8), ord(substr($_[0], $i, 1)), $crc;
288    }
289    $crc;
290}
291
292
293=back
294
295=cut
296
297
298
299#==============================
300
301=head1 OO INTERFACE
302
303=head2 Conversion
304
305=over 4
306
307=cut
308
309#------------------------------------------------------------
310
311=item bin2hex
312
313I<Class method, constructor.>
314Return a converter object.  Just creates a new instance of
315L<"Convert::BinHex::Bin2Hex">; see that class for details.
316
317=cut
318
319sub bin2hex {
320    return Convert::BinHex::Bin2Hex->new;
321}
322
323#------------------------------------------------------------
324
325=item hex2bin
326
327I<Class method, constructor.>
328Return a converter object.  Just creates a new instance of
329L<"Convert::BinHex::Hex2Bin">; see that class for details.
330
331=cut
332
333sub hex2bin {
334    return Convert::BinHex::Hex2Bin->new;
335}
336
337=back
338
339=cut
340
341
342
343#==============================
344
345=head2 Construction
346
347=over 4
348
349=cut
350
351#------------------------------------------------------------
352
353=item new PARAMHASH
354
355I<Class method, constructor.>
356Return a handle on a BinHex'able entity.  In general, the data and resource
357forks for such an entity are stored in native format (binary) format.
358
359Parameters in the PARAMHASH are the same as header-oriented method names,
360and may be used to set attributes:
361
362    $HQX = new Convert::BinHex filename => "icon.gif",
363                               type    => "GIFB",
364                               creator => "CNVS";
365
366=cut
367
368sub new {
369    my ($class, %params) = @_;
370
371    # Create object:
372    my $self = bless {
373	Data => new Convert::BinHex::Fork,      # data fork
374	Rsrc => new Convert::BinHex::Fork,      # resource fork
375    }, $class;   # basic object
376
377    # Process params:
378    my $method;
379    foreach $method (qw(creator	filename flags requires type version
380			software_version)){
381	$self->$method($params{$method}) if exists($params{$method});
382    }
383    $self;
384}
385
386#------------------------------------------------------------
387
388=item open PARAMHASH
389
390I<Class method, constructor.>
391Return a handle on a new BinHex'ed stream, for parsing.
392Params are:
393
394=over 4
395
396=item Data
397
398Input a HEX stream from the given data.  This can be a scalar, or a
399reference to an array of scalars.
400
401=item Expr
402
403Input a HEX stream from any open()able expression.  It will be opened and
404binmode'd, and the filehandle will be closed either on a C<close()>
405or when the object is destructed.
406
407=item FH
408
409Input a HEX stream from the given filehandle.
410
411=item NoComment
412
413If true, the parser should not attempt to skip a leading "(This file...)"
414comment.  That means that the first nonwhite characters encountered
415must be the binhex'ed data.
416
417=back
418
419=cut
420
421sub open {
422    my $self = shift;
423    my %params = @_;
424
425    # Create object:
426    ref($self) or $self = $self->new;
427
428    # Set up input:
429    my $data;
430    if ($params{FH}) {
431	$self->{FH} = Convert::BinHex::IO_Handle->wrap($params{FH});
432    }
433    elsif ($params{Expr}) {
434	$self->{FH} = FileHandle->new($params{Expr}) or
435	    croak "$I can't open $params{Expr}: $!\n";
436	$self->{FH} = Convert::BinHex::IO_Handle->wrap($self->{FH});
437    }
438    elsif ($params{Data}) {
439	if (!ref($data = $params{Data})) {   # scalar
440	    $self->{FH} = Convert::BinHex::IO_Scalar->wrap(\$data);
441	}
442	elsif (ref($data) eq 'ARRAY') {
443	    $data = join('', @$data);
444	    $self->{FH} = Convert::BinHex::IO_Scalar->wrap(\$data);
445	}
446    }
447    $self->{FH} or croak "$I missing a valid input source\n";
448
449    # Comments?
450    $self->{CommentRead} = $params{NoComment};
451
452    # Reset the converter!
453    $self->{H2B} = Convert::BinHex::Hex2Bin->new;
454    $self;
455}
456
457
458=back
459
460=cut
461
462
463
464
465#==============================
466
467=head2 Get/set header information
468
469=over 4
470
471=cut
472
473#------------------------------
474
475=item creator [VALUE]
476
477I<Instance method.>
478Get/set the creator of the file.  This is a four-character
479string (though I don't know if it's guaranteed to be printable ASCII!)
480that serves as part of the Macintosh's version of a MIME "content-type".
481
482For example, a document created by "Canvas" might have
483creator C<"CNVS">.
484
485=cut
486
487sub creator  { (@_ > 1) ? ($_[0]->{Creator}  = $_[1]) : $_[0]->{Creator} }
488
489#------------------------------
490
491=item data [PARAMHASH]
492
493I<Instance method.>
494Get/set the data fork.  Any arguments are passed into the
495new() method of L<"Convert::BinHex::Fork">.
496
497=cut
498
499sub data {
500    my $self = shift;
501    @_ ? $self->{Data} = Convert::BinHex::Fork->new(@_) : $self->{Data};
502}
503
504#------------------------------
505
506=item filename [VALUE]
507
508I<Instance method.>
509Get/set the name of the file.
510
511=cut
512
513sub filename { (@_ > 1) ? ($_[0]->{Filename} = $_[1]) : $_[0]->{Filename} }
514
515#------------------------------
516
517=item flags [VALUE]
518
519I<Instance method.>
520Return the flags, as an integer.  Use bitmasking to get as the values
521you need.
522
523=cut
524
525sub flags    { (@_ > 1) ? ($_[0]->{Flags}    = $_[1]) : $_[0]->{Flags} }
526
527#------------------------------
528
529=item header_as_string
530
531Return a stringified version of the header that you might
532use for logging/debugging purposes.  It looks like this:
533
534    X-HQX-Software: BinHex 4.0 (Convert::BinHex 1.102)
535    X-HQX-Filename: Something_new.eps
536    X-HQX-Version: 0
537    X-HQX-Type: EPSF
538    X-HQX-Creator: ART5
539    X-HQX-Data-Length: 49731
540    X-HQX-Rsrc-Length: 23096
541
542As some of you might have guessed, this is RFC-822-style, and
543may be easily plunked down into the middle of a mail header, or
544split into lines, etc.
545
546=cut
547
548sub header_as_string {
549    my $self = shift;
550    my @h;
551    push @h, "X-HQX-Software: " .
552	     "BinHex " . ($self->requires || '4.0') .
553	     " (Convert::BinHex $VERSION)";
554    push @h, "X-HQX-Filename: " . $self->filename;
555    push @h, "X-HQX-Version: "  . $self->version;
556    push @h, "X-HQX-Type: "     . $self->type;
557    push @h, "X-HQX-Creator: "  . $self->creator;
558    push @h, "X-HQX-Flags: "    . sprintf("%x", $self->flags);
559    push @h, "X-HQX-Data-Length: " . $self->data->length;
560    push @h, "X-HQX-Rsrc-Length: " . $self->resource->length;
561    push @h, "X-HQX-CRC: "      . sprintf("%x", $self->{HdrCRC});
562    return join("\n", @h) . "\n";
563}
564
565#------------------------------
566
567=item requires [VALUE]
568
569I<Instance method.>
570Get/set the software version required to convert this file, as
571extracted from the comment that preceded the actual binhex'ed
572data; e.g.:
573
574    (This file must be converted with BinHex 4.0)
575
576In this case, after parsing in the comment, the code:
577
578    $HQX->requires;
579
580would get back "4.0".
581
582=cut
583
584sub requires  {
585    (@_ > 1) ? ($_[0]->{Requires}  = $_[1]) : $_[0]->{Requires}
586}
587*software_version = \&requires;
588
589#------------------------------
590
591=item resource [PARAMHASH]
592
593I<Instance method.>
594Get/set the resource fork.  Any arguments are passed into the
595new() method of L<"Convert::BinHex::Fork">.
596
597=cut
598
599sub resource {
600    my $self = shift;
601    @_ ? $self->{Rsrc} = Convert::BinHex::Fork->new(@_) : $self->{Rsrc};
602}
603
604#------------------------------
605
606=item type [VALUE]
607
608I<Instance method.>
609Get/set the type of the file.  This is a four-character
610string (though I don't know if it's guaranteed to be printable ASCII!)
611that serves as part of the Macintosh's version of a MIME "content-type".
612
613For example, a GIF89a file might have type C<"GF89">.
614
615=cut
616
617sub type  { (@_ > 1) ? ($_[0]->{Type}  = $_[1]) : $_[0]->{Type} }
618
619#------------------------------
620
621=item version [VALUE]
622
623I<Instance method.>
624Get/set the version, as an integer.
625
626=cut
627
628sub version  { (@_ > 1) ? ($_[0]->{Version}  = $_[1]) : $_[0]->{Version} }
629
630
631=back
632
633=cut
634
635### OBSOLETE!!!
636sub data_length     { shift->data->length(@_) }
637sub resource_length { shift->resource->length(@_) }
638
639
640
641
642#==============================
643
644=head2 Decode, high-level
645
646=over 4
647
648=cut
649
650#------------------------------------------------------------
651
652=item read_comment
653
654I<Instance method.>
655Skip past the opening comment in the file, which is of the form:
656
657   (This file must be converted with BinHex 4.0)
658
659As per RFC-1741, I<this comment must immediately precede the BinHex data,>
660and any text before it will be ignored.
661
662I<You don't need to invoke this method yourself;> C<read_header()> will
663do it for you.  After the call, the version number in the comment is
664accessible via the C<requires()> method.
665
666=cut
667
668sub read_comment {
669    my $self = shift;
670    return 1 if ($self->{CommentRead});   # prevent accidents
671    local($_);
672    while (defined($_ = $self->{FH}->getline)) {
673	chomp;
674	if (/^\(This file must be converted with BinHex ([\d\.]+).*\)\s*$/i) {
675	    $self->requires($1);
676	    return $self->{CommentRead} = 1;
677	}
678    }
679    croak "$I comment line (This file must be converted with BinHex...) ".
680	  "not found\n";
681}
682
683#------------------------------------------------------------
684
685=item read_header
686
687I<Instance method.>
688Read in the BinHex file header.  You must do this first!
689
690=cut
691
692sub read_header {
693    my $self = shift;
694    return 1 if ($self->{HeaderRead});   # prevent accidents
695
696    # Skip comment:
697    $self->read_comment;
698
699    # Get header info:
700    $self->filename ($self->read_str($self->read_byte));
701    $self->version  ($self->read_byte);
702    $self->type     ($self->read_str(4));
703    $self->creator  ($self->read_str(4));
704    $self->flags    ($self->read_short);
705    $self->data_length     ($self->read_long);
706    $self->resource_length ($self->read_long);
707    $self->{HdrCRC}   = $self->read_short;
708    $self->{HeaderRead} = 1;
709}
710
711#------------------------------------------------------------
712#
713# _read_fork
714#
715# I<Instance method, private.>
716# Read in a fork.
717#
718
719sub _read_fork {
720    my $self = shift;
721
722    # Pass in call if array context:
723    if (wantarray) {
724	local($_);
725	my @all;
726	push @all, $_ while (defined($_ = $self->_read_fork(@_)));
727	return @all;
728    }
729
730    # Get args:
731    my ($fork, $n) = @_;
732    if($self->{$fork}->length == 0) {
733    	$self->{$fork}->crc($self->read_short);
734    	return undef;
735    }
736    defined($n) or $n = 2048;
737
738    # Reset pointer into fork if necessary:
739    if (!defined($self->{$fork}{Ptr})) {
740	$self->{$fork}{Ptr} = 0;
741	$self->{CompCRC} = 0;
742    }
743
744    # Check for EOF:
745    return undef if ($self->{$fork}{Ptr} >= $self->{$fork}->length);
746
747    # Read up to, but not exceeding, the number of bytes left in the fork:
748    my $n2read = min($n, ($self->{$fork}->length - $self->{$fork}{Ptr}));
749    my $data = $self->read_str($n2read);
750    $self->{$fork}{Ptr} += length($data);
751
752    # If we just read the last byte, read the CRC also:
753    if (($self->{$fork}{Ptr} == $self->{$fork}->length) &&    # last byte
754	!defined($self->{$fork}->crc)) {                   # no CRC
755	my $comp_CRC;
756
757	# Move computed CRC forward by two zero bytes, and grab the value:
758	if ($self->{CheckCRC}) {
759	    $self->{CompCRC} = binhex_crc("\000\000", $self->{CompCRC});
760	}
761
762	# Get CRC as stored in file:
763	$self->{$fork}->crc($self->read_short);          # get stored CRC
764
765	# Compare, and note corruption if detected:
766	if ($self->{CheckCRC} and ($self->{$fork}->crc != $comp_CRC)) {
767	    &Carp::carp("CRCs do not match: corrupted data?") unless $QUIET;
768	    $self->{Corrupted} = 1;
769	}
770    }
771
772    # Return the bytes:
773    $data;
774}
775
776#------------------------------------------------------------
777
778=item read_data [NBYTES]
779
780I<Instance method.>
781Read information from the data fork.  Use it in an array context to
782slurp all the data into an array of scalars:
783
784    @data = $HQX->read_data;
785
786Or use it in a scalar context to get the data piecemeal:
787
788    while (defined($data = $HQX->read_data)) {
789       # do stuff with $data
790    }
791
792The NBYTES to read defaults to 2048.
793
794=cut
795
796sub read_data {
797    shift->_read_fork('Data',@_);
798}
799
800#------------------------------------------------------------
801
802=item read_resource [NBYTES]
803
804I<Instance method.>
805Read in all/some of the resource fork.
806See C<read_data()> for usage.
807
808=cut
809
810sub read_resource {
811    shift->_read_fork('Rsrc',@_);
812}
813
814=back
815
816=cut
817
818
819
820#------------------------------------------------------------
821#
822# read BUFFER, NBYTES
823#
824# Read the next NBYTES (decompressed) bytes from the input stream
825# into BUFFER.  Returns the number of bytes actually read, and
826# undef on end of file.
827#
828# I<Note:> the calling style mirrors the IO::Handle read() function.
829
830my $READBUF = '';
831sub read {
832    my ($self, $n) = ($_[0], $_[2]);
833    $_[1] = '';            # just in case
834    my $FH = $self->{FH};
835    local($^W) = 0;
836
837    # Get more BIN bytes until enough or EOF:
838    my $bin;
839    while (length($self->{BIN_QUEUE}) < $n) {
840	$FH->read($READBUF, 4096) or last;
841	$self->{BIN_QUEUE} .= $self->{H2B}->next($READBUF);   # save BIN
842    }
843
844    # We've got as many bytes as we're gonna get:
845    $_[1] = substr($self->{BIN_QUEUE}, 0, $n);
846    $self->{BIN_QUEUE} = substr($self->{BIN_QUEUE}, $n);
847
848    # Advance the CRC:
849    if ($self->{CheckCRC}) {
850	$self->{CompCRC} = binhex_crc($_[1], $self->{CompCRC});
851    }
852    return length($_[1]);
853}
854
855#------------------------------------------------------------
856#
857# read_str NBYTES
858#
859# Read and return the next NBYTES bytes, or die with "unexpected end of file"
860
861sub read_str {
862    my ($self, $n) = @_;
863    my $buf = '';
864    $self->read($buf, $n);
865    croak "$I unexpected end of file (wanted $n, got " . length($buf) . ")\n"
866	if ($n and (length($buf) < $n));
867    return $buf;
868}
869
870#------------------------------------------------------------
871#
872# read_byte
873# read_short
874# read_long
875#
876# Read 1, 2, or 4 bytes, and return the value read as an unsigned integer.
877# If not that many bytes remain, die with "unexpected end of file";
878
879sub read_byte {
880    ord($_[0]->read_str(1));
881}
882
883sub read_short {
884    unpack("n", $_[0]->read_str(2));
885}
886
887sub read_long {
888    unpack("N", $_[0]->read_str(4));
889}
890
891
892
893
894
895
896
897
898
899#==============================
900
901=head2 Encode, high-level
902
903=over 4
904
905=cut
906
907#------------------------------------------------------------
908
909=item encode OUT
910
911Encode the object as a BinHex stream to the given output handle OUT.
912OUT can be a filehandle, or any blessed object that responds to a
913C<print()> message.
914
915The leading comment is output, using the C<requires()> attribute.
916
917=cut
918
919sub encode {
920    my $self = shift;
921
922    # Get output handle:
923    my $OUT = shift; $OUT = wrap Convert::BinHex::IO_Handle $OUT;
924
925    # Get a new converter:
926    my $B2H = $self->bin2hex;
927
928    # Comment:
929    $OUT->print("(This file must be converted with BinHex ",
930		($self->requires || '4.0'),
931		")\n");
932
933    # Build header in core:
934    my @hdrs;
935    my $flen = length($self->filename);
936    push @hdrs, pack("C", $flen);
937    push @hdrs, pack("a$flen", $self->filename);
938    push @hdrs, pack('C', $self->version);
939    push @hdrs, pack('a4', $self->type    || '????');
940    push @hdrs, pack('a4', $self->creator || '????');
941    push @hdrs, pack('n',  $self->flags   || 0);
942    push @hdrs, pack('N',  $self->data->length        || 0);
943    push @hdrs, pack('N',  $self->resource->length    || 0);
944    my $hdr = join '', @hdrs;
945
946    # Compute the header CRC:
947    my $crc = binhex_crc("\000\000", binhex_crc($hdr, 0));
948
949    # Output the header (plus its CRC):
950    $OUT->print($B2H->next($hdr . pack('n', $crc)));
951
952    # Output the data fork:
953    $self->data->encode($OUT, $B2H);
954
955    # Output the resource fork:
956    $self->resource->encode($OUT, $B2H);
957
958    # Finish:
959    $OUT->print($B2H->done);
960    1;
961}
962
963=back
964
965=cut
966
967
968
969#==============================
970
971=head1 SUBMODULES
972
973=cut
974
975#============================================================
976#
977package Convert::BinHex::Bin2Hex;
978#
979#============================================================
980
981=head2 Convert::BinHex::Bin2Hex
982
983A BINary-to-HEX converter.  This kind of conversion requires
984a certain amount of state information; it cannot be done by
985just calling a simple function repeatedly.  Use it like this:
986
987    # Create and use a "translator" object:
988    my $B2H = Convert::BinHex->bin2hex;    # get a converter object
989    while (<STDIN>) {
990	print STDOUT $B2H->next($_);          # convert some more input
991    }
992    print STDOUT $B2H->done;               # no more input: finish up
993
994    # Re-use the object:
995    $B2H->rewind;                 # ready for more action!
996    while (<MOREIN>) { ...
997
998On each iteration, C<next()> (and C<done()>) may return either
999a decent-sized non-empty string (indicating that more converted data
1000is ready for you) or an empty string (indicating that the converter
1001is waiting to amass more input in its private buffers before handing
1002you more stuff to output.
1003
1004Note that C<done()> I<always> converts and hands you whatever is left.
1005
1006This may have been a good approach.  It may not.  Someday, the converter
1007may also allow you give it an object that responds to read(), or
1008a FileHandle, and it will do all the nasty buffer-filling on its own,
1009serving you stuff line by line:
1010
1011    # Someday, maybe...
1012    my $B2H = Convert::BinHex->bin2hex(\*STDIN);
1013    while (defined($_ = $B2H->getline)) {
1014	print STDOUT $_;
1015    }
1016
1017Someday, maybe.  Feel free to voice your opinions.
1018
1019=cut
1020
1021#------------------------------
1022#
1023# new
1024
1025sub new {
1026    my $self = bless {}, shift;
1027    return $self->rewind;
1028}
1029
1030#------------------------------
1031#
1032# rewind
1033
1034sub rewind {
1035    my $self = shift;
1036    $self->{CBIN} = ' ' x 2048; $self->{CBIN} = ''; # BIN waiting for xlation
1037    $self->{HEX}  = ' ' x 2048; $self->{HEX}  = ''; # HEX waiting for output
1038    $self->{LINE} = 0;       # current line of output
1039    $self->{EOL} = "\n";
1040    $self;
1041}
1042
1043#------------------------------
1044#
1045# next MOREDATA
1046
1047sub next { shift->_next(0, @_) }
1048
1049#------------------------------
1050#
1051# done
1052
1053sub done { shift->_next(1) }
1054
1055#------------------------------
1056#
1057# _next ATEOF, [MOREDATA]
1058#
1059# Instance method, private.  Supply more data, and get any more output.
1060# Returns the empty string often, if not enough output has accumulated.
1061
1062sub _next {
1063    my $self = shift;
1064    my $eof = shift;
1065
1066    # Get the BINary data to process this time round, re-queueing the rest:
1067    # Handle EOF and non-EOF conditions separately:
1068    my $new_bin;
1069    if ($eof) {                      # No more BINary input...
1070	# Pad the queue with nuls to exactly 3n characters:
1071	$self->{CBIN} .= ("\x00" x ((3 - length($self->{CBIN}) % 3) % 3))
1072    }
1073    else {                           # More BINary input...
1074	# "Compress" new stuff, and add it to the queue:
1075	($new_bin = $_[0]) =~ s/\x90/\x90\x00/g;
1076	$self->{CBIN} .= $new_bin;
1077
1078	# Return if not enough to bother with:
1079	return '' if (length($self->{CBIN}) < 2048);
1080    }
1081
1082    # ...At this point, QUEUE holds compressed binary which we will attempt
1083    # to convert to some HEX characters...
1084
1085    # Trim QUEUE to exactly 3n characters, saving the excess:
1086    my $requeue = '';
1087    $requeue .= chop($self->{CBIN}) while (length($self->{CBIN}) % 3);
1088
1089    # Uuencode, adding stuff to hex:
1090    my $hex = ' ' x 2048; $hex = '';
1091    pos($self->{CBIN}) = 0;
1092    while ($self->{CBIN} =~ /(.{1,45})/gs) {
1093	$hex .= substr(pack('u', $1), 1);
1094	chop($hex);
1095    }
1096    $self->{CBIN} = reverse($requeue);     # put the excess back on the queue
1097
1098    # Switch to BinHex alphabet:
1099    $hex =~ tr
1100        {` -_}
1101        {!!"#$%&'()*+,\x2D012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr};
1102
1103    # Prepend any HEX we have queued from the last time:
1104    $hex = (($self->{LINE}++ ? '' : ':') .   # start with ":" pad?
1105	    $self->{HEX} .              # any output in the queue?
1106	    $hex);
1107
1108    # Break off largest chunk of 64n characters, put remainder back in queue:
1109    my $rem = length($hex) % 64;
1110    $self->{HEX} = ($rem ? substr($hex, -$rem) : '');
1111    $hex = substr($hex, 0, (length($hex)-$rem));
1112
1113    # Put in an EOL every 64'th character:
1114    $hex =~ s{(.{64})}{$1$self->{EOL}}sg;
1115
1116    # No more input?  Then tack on the remainder now:
1117    if ($eof) {
1118        $hex .= $self->{HEX} . ":" . ($self->{EOL} ? $self->{EOL} : '');
1119    }
1120
1121    # Done!
1122    $hex;
1123}
1124
1125
1126
1127
1128#============================================================
1129#
1130package Convert::BinHex::Hex2Bin;
1131#
1132#============================================================
1133
1134=head2 Convert::BinHex::Hex2Bin
1135
1136A HEX-to-BINary converter. This kind of conversion requires
1137a certain amount of state information; it cannot be done by
1138just calling a simple function repeatedly.  Use it like this:
1139
1140    # Create and use a "translator" object:
1141    my $H2B = Convert::BinHex->hex2bin;    # get a converter object
1142    while (<STDIN>) {
1143	print STDOUT $H2B->next($_);          # convert some more input
1144    }
1145    print STDOUT $H2B->done;               # no more input: finish up
1146
1147    # Re-use the object:
1148    $H2B->rewind;                 # ready for more action!
1149    while (<MOREIN>) { ...
1150
1151On each iteration, C<next()> (and C<done()>) may return either
1152a decent-sized non-empty string (indicating that more converted data
1153is ready for you) or an empty string (indicating that the converter
1154is waiting to amass more input in its private buffers before handing
1155you more stuff to output.
1156
1157Note that C<done()> I<always> converts and hands you whatever is left.
1158
1159Note that this converter does I<not> find the initial
1160"BinHex version" comment.  You have to skip that yourself.  It
1161only handles data between the opening and closing C<":">.
1162
1163=cut
1164
1165#------------------------------
1166#
1167# new
1168
1169sub new {
1170    my $self = bless {}, shift;
1171    return $self->rewind;
1172}
1173
1174#------------------------------
1175#
1176# rewind
1177
1178sub rewind {
1179    my $self = shift;
1180    $self->hex2comp_rewind;
1181    $self->comp2bin_rewind;
1182    $self;
1183}
1184
1185#------------------------------
1186#
1187# next MOREDATA
1188
1189sub next {
1190    my $self = shift;
1191    $_[0] =~ s/\s//g if (defined($_[0]));      # more input
1192    return $self->comp2bin_next($self->hex2comp_next($_[0]));
1193}
1194
1195#------------------------------
1196#
1197# done
1198
1199sub done {
1200    return "";
1201}
1202
1203#------------------------------
1204#
1205# hex2comp_rewind
1206
1207sub hex2comp_rewind {
1208    my $self = shift;
1209    $self->{HEX} = '';
1210}
1211
1212#------------------------------
1213#
1214# hex2comp_next HEX
1215#
1216# WARNING: argument is modified destructively for efficiency!!!!
1217
1218sub hex2comp_next {
1219    my $self = shift;
1220    ### print "hex2comp: newhex = $newhex\n";
1221
1222    # Concat new with queue, and kill any padding:
1223    my $hex = $self->{HEX} . (defined($_[0]) ? $_[0] : '');
1224    if (index($hex, ':') >= 0) {
1225	$hex =~ s/^://;                                 # start of input
1226	if ($hex =~ s/:\s*\Z//) {                       # end of input
1227	    my $leftover = (length($hex) % 4);                # need to pad!
1228	    $hex .= "\000" x (4 - $leftover)  if $leftover;   # zero pad
1229	}
1230    }
1231
1232    # Get longest substring of length 4n possible; put rest back on queue:
1233    my $rem = length($hex) % 4;
1234    $self->{HEX} = ($rem ? substr($hex, -$rem) : '');
1235    for (; $rem; --$rem) { chop $hex };
1236    return undef if ($hex eq '');            # nothing to do!
1237
1238    # Convert to uuencoded format:
1239    $hex =~ tr
1240        {!"#$%&'()*+,\x2D012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr}
1241        { -_};
1242
1243    # Now, uudecode:
1244    my $comp = '';
1245    my $len;
1246    my $up;
1247    local($^W) = 0;       ### KLUDGE
1248    while ($hex =~ /\G(.{1,60})/gs) {
1249	$len = chr(32 + ((length($1)*3)>>2));  # compute length byte
1250	$comp .= unpack("u", $len . $1 );      # uudecode
1251    }
1252
1253    # We now have the compressed binary... expand it:
1254    ### print "hex2comp: comp = $comp\n";
1255    $comp;
1256}
1257
1258#------------------------------
1259#
1260# comp2bin_rewind
1261
1262sub comp2bin_rewind {
1263    my $self = shift;
1264    $self->{COMP} = '';
1265    $self->{LASTC} = '';
1266}
1267
1268#------------------------------
1269#
1270# comp2bin_next COMP
1271#
1272# WARNING: argument is modified destructively for efficiency!!!!
1273
1274sub comp2bin_next {
1275    my $self = shift;
1276
1277    # Concat new with queue... anything to do?
1278    my $comp = $self->{COMP} . (defined($_[0]) ? $_[0] : '');
1279    return undef if ($comp eq '');
1280
1281    # For each character in compressed string...
1282    $self->{COMP} = '';
1283    my $lastc = $self->{LASTC};      # speed hack
1284    my $exp = '';       # expanded string
1285    my $i;
1286    my ($c, $n);
1287    for ($i = 0; $i < length($comp); $i++) {
1288	if (($c = substr($comp, $i, 1)) eq "\x90") {    # MARK
1289	    ### print "c = MARK\n";
1290	    unless (length($n = substr($comp, ++$i, 1))) {
1291		$self->{COMP} = "\x90";
1292		last;
1293	    }
1294	    ### print "n = ", ord($n), "; lastc = ", ord($lastc), "\n";
1295	    $exp .= ((ord($n) ? ($lastc x (ord($n)-1))  # repeat last char
1296		              : ($lastc = "\x90")));    # literal MARK
1297	}
1298	else {                                          # other CHAR
1299	    ### print "c = ", ord($c), "\n";
1300	    $exp .= ($lastc = $c);
1301	}
1302	### print "exp is now $exp\n";
1303    }
1304
1305    # Either hit EOS, or there's a MARK char at the very end:
1306    $self->{LASTC} = $lastc;
1307    ### print "leaving with lastc=$lastc and comp=$self->{COMP}\n";
1308    ### print "comp2bin: exp = $exp\n";
1309    $exp;
1310}
1311
1312
1313
1314
1315
1316
1317#============================================================
1318#
1319package Convert::BinHex::Fork;
1320#
1321#============================================================
1322
1323=head2 Convert::BinHex::Fork
1324
1325A fork in a Macintosh file.
1326
1327    # How to get them...
1328    $data_fork = $HQX->data;      # get the data fork
1329    $rsrc_fork = $HQX->resource;  # get the resource fork
1330
1331    # Make a new fork:
1332    $FORK = Convert::BinHex::Fork->new(Path => "/tmp/file.data");
1333    $FORK = Convert::BinHex::Fork->new(Data => $scalar);
1334    $FORK = Convert::BinHex::Fork->new(Data => \@array_of_scalars);
1335
1336    # Get/set the length of the data fork:
1337    $len = $FORK->length;
1338    $FORK->length(170);        # this overrides the REAL value: be careful!
1339
1340    # Get/set the path to the underlying data (if in a disk file):
1341    $path = $FORK->path;
1342    $FORK->path("/tmp/file.data");
1343
1344    # Get/set the in-core data itself, which may be a scalar or an arrayref:
1345    $data = $FORK->data;
1346    $FORK->data($scalar);
1347    $FORK->data(\@array_of_scalars);
1348
1349    # Get/set the CRC:
1350    $crc = $FORK->crc;
1351    $FORK->crc($crc);
1352
1353=cut
1354
1355
1356# Import some stuff into our namespace:
1357*binhex_crc = \&Convert::BinHex::binhex_crc;
1358
1359#------------------------------
1360#
1361# new PARAMHASH
1362
1363sub new {
1364    my ($class, %params) = @_;
1365    bless \%params, $class;
1366}
1367
1368#------------------------------
1369#
1370# length [VALUE]
1371
1372sub length {
1373    my $self = shift;
1374
1375    # Set length?
1376    $self->{Length} = shift if @_;
1377
1378    # Return explicit length, if any
1379    return $self->{Length} if defined($self->{Length});
1380
1381    # Compute it:
1382    if (defined($self->{Path})) {
1383	return (-s $self->{Path});
1384    }
1385    elsif (!ref($self->{Data})) {
1386	return length($self->{Data});
1387    }
1388    elsif (ref($self->{Data} eq 'ARRAY')) {
1389	my $n = 0;
1390	foreach (@{$self->{Data}}) { $n += length($_) }
1391	return $n;
1392    }
1393    return undef;          # unknown!
1394}
1395
1396#------------------------------
1397#
1398# path [VALUE]
1399
1400sub path {
1401    my $self = shift;
1402    if (@_) { $self->{Path} = shift; delete $self->{Data} }
1403    $self->{Path};
1404}
1405
1406#------------------------------
1407#
1408# data [VALUE]
1409
1410sub data {
1411    my $self = shift;
1412    if (@_) { $self->{Data} = shift; delete $self->{Path} }
1413    $self->{Data};
1414}
1415
1416#------------------------------
1417#
1418# crc [VALUE]
1419
1420sub crc {
1421    my $self = shift;
1422    @_ ? $self->{CRC} = shift : $self->{CRC};
1423}
1424
1425#------------------------------
1426#
1427# encode OUT, B2H
1428#
1429# Instance method, private.  Encode this fork as part of a BinHex stream.
1430# It will be printed to handle OUT using the binhexer B2H.
1431
1432sub encode {
1433    my ($self, $OUT, $B2H) = @_;
1434    my $buf = '';
1435    require POSIX if $^O||'' eq "MacOS";
1436    require Fcntl if $^O||'' eq "MacOS";
1437    my $fd;
1438
1439    # Reset the CRC:
1440    $self->{CRC} = 0;
1441
1442    # Output the data, calculating the CRC as we go:
1443    if (defined($self->{Path})) { # path to fork file
1444        if ($^O||'' eq "MacOS" and $self->{Fork} eq "RSRC") {
1445    	    $fd = POSIX::open($self->{Path},&POSIX::O_RDONLY | &Fcntl::O_RSRC);
1446	    while (POSIX::read($fd, $buf, 2048) > 0) {
1447		$self->{CRC} = binhex_crc($buf, $self->{CRC});
1448		$OUT->print($B2H->next($buf));
1449	    }
1450	    POSIX::close($fd);
1451        }
1452	else {
1453	    open FORK, $self->{Path} or die "$self->{Path}: $!";
1454	    while (read(\*FORK, $buf, 2048)) {
1455		$self->{CRC} = binhex_crc($buf, $self->{CRC});
1456		$OUT->print($B2H->next($buf));
1457	    }
1458	    close FORK;
1459	}
1460    }
1461    elsif (!defined($self->{Data})) {        # nothing!
1462	&Carp::carp("no data in fork!") unless $Convert::BinHex::QUIET;
1463    }
1464    elsif (!ref($self->{Data})) {            # scalar
1465	$self->{CRC} = binhex_crc($self->{Data}, $self->{CRC});
1466	$OUT->print($B2H->next($self->{Data}));
1467    }
1468    elsif (ref($self->{Data}) eq 'ARRAY') {  # array of scalars
1469	foreach $buf (@{$self->{Data}}) {
1470	    $self->{CRC} = binhex_crc($buf, $self->{CRC});
1471	    $OUT->print($B2H->next($buf));
1472	}
1473    }
1474    else {
1475	&Carp::croak("bad/unsupported data in fork");
1476    }
1477
1478    # Finish the CRC, and output it:
1479    $self->{CRC} = binhex_crc("\000\000", $self->{CRC});
1480    $OUT->print($B2H->next(pack("n", $self->{CRC})));
1481    1;
1482}
1483
1484
1485
1486
1487#============================================================
1488#
1489package Convert::BinHex::IO_Handle;
1490#
1491#============================================================
1492
1493# Wrap a non-object filehandle inside a blessed, printable interface:
1494# Does nothing if the given $fh is already a blessed object.
1495sub wrap {
1496    my ($class, $fh) = @_;
1497    no strict 'refs';
1498    $fh or $fh = select;        # no filehandle means selected one
1499    ref($fh) or $fh = \*$fh;    # scalar becomes a globref
1500    return $fh if (ref($fh) and (ref($fh) !~ /^(GLOB|FileHandle)$/));
1501    bless \$fh, $class;         # wrap it in a printable interface
1502}
1503sub print {
1504    my $FH = ${shift(@_)};
1505    print $FH @_;
1506}
1507sub getline {
1508    my $FH = ${shift(@_)};
1509    scalar(<$FH>);
1510}
1511sub read {
1512    read ${$_[0]}, $_[1], $_[2];
1513}
1514
1515
1516
1517#============================================================
1518#
1519package Convert::BinHex::IO_Scalar;
1520#
1521#============================================================
1522
1523# Wrap a scalar inside a blessed, printable interface:
1524sub wrap {
1525    my ($class, $scalarref) = @_;
1526    defined($scalarref) or $scalarref = \"";
1527    pos($$scalarref) = 0;
1528    bless $scalarref, $class;
1529}
1530sub print {
1531    my $self = shift;
1532    $$self .= join('', @_);
1533    1;
1534}
1535sub getline {
1536    my $self = shift;
1537    ($$self =~ /\G(.*?\n?)/g) or return undef;
1538    return $1;
1539}
1540sub read {
1541    my $self = shift;
1542    $_[0] = substr($$self, pos($$self), $_[1]);
1543    pos($$self) += $_[1];
1544    return length($_[0]);
1545}
1546
1547
1548
1549#==============================
1550
1551=head1 UNDER THE HOOD
1552
1553=head2 Design issues
1554
1555=over 4
1556
1557=item BinHex needs a stateful parser
1558
1559Unlike its cousins I<base64> and I<uuencode>, BinHex format is not
1560amenable to being parsed line-by-line.  There appears to be no
1561guarantee that lines contain 4n encoded characters... and even if there
1562is one, the BinHex compression algorithm interferes: even when you
1563can I<decode> one line at a time, you can't necessarily
1564I<decompress> a line at a time.
1565
1566For example: a decoded line ending with the byte C<\x90> (the escape
1567or "mark" character) is ambiguous: depending on the next decoded byte,
1568it could mean a literal C<\x90> (if the next byte is a C<\x00>), or
1569it could mean n-1 more repetitions of the previous character (if
1570the next byte is some nonzero C<n>).
1571
1572For this reason, a BinHex parser has to be somewhat stateful: you
1573cannot have code like this:
1574
1575    #### NO! #### NO! #### NO! #### NO! #### NO! ####
1576    while (<STDIN>) {            # read HEX
1577        print hexbin($_);          # convert and write BIN
1578    }
1579
1580unless something is happening "behind the scenes" to keep track of
1581what was last done.  I<The dangerous thing, however, is that this
1582approach will B<seem> to work, if you only test it on BinHex files
1583which do not use compression and which have 4n HEX characters
1584on each line.>
1585
1586Since we have to be stateful anyway, we use the parser object to
1587keep our state.
1588
1589
1590=item We need to be handle large input files
1591
1592Solutions that demand reading everything into core don't cut
1593it in my book.  The first MPEG file that comes along can louse
1594up your whole day.  So, there are no size limitations in this
1595module: the data is read on-demand, and filehandles are always
1596an option.
1597
1598
1599=item Boy, is this slow!
1600
1601A lot of the byte-level manipulation that has to go on, particularly
1602the CRC computing (which involves intensive bit-shifting and masking)
1603slows this module down significantly.  What is needed perhaps is an
1604I<optional> extension library where the slow pieces can be done more
1605quickly... a Convert::BinHex::CRC, if you will.  Volunteers, anyone?
1606
1607Even considering that, however, it's slower than I'd like.  I'm
1608sure many improvements can be made in the HEX-to-BIN end of things.
1609No doubt I'll attempt some as time goes on...
1610
1611=back
1612
1613
1614
1615=head2 How it works
1616
1617Since BinHex is a layered format, consisting of...
1618
1619      A Macintosh file [the "BIN"]...
1620         Encoded as a structured 8-bit bytestream, then...
1621            Compressed to reduce duplicate bytes, then...
1622               Encoded as 7-bit ASCII [the "HEX"]
1623
1624...there is a layered parsing algorithm to reverse the process.
1625Basically, it works in a similar fashion to stdio's fread():
1626
1627       0. There is an internal buffer of decompressed (BIN) data,
1628          initially empty.
1629       1. Application asks to read() n bytes of data from object
1630       2. If the buffer is not full enough to accommodate the request:
1631            2a. The read() method grabs the next available chunk of input
1632                data (the HEX).
1633            2b. HEX data is converted and decompressed into as many BIN
1634                bytes as possible.
1635            2c. BIN bytes are added to the read() buffer.
1636            2d. Go back to step 2a. until the buffer is full enough
1637                or we hit end-of-input.
1638
1639The conversion-and-decompression algorithms need their own internal
1640buffers and state (since the next input chunk may not contain all the
1641data needed for a complete conversion/decompression operation).
1642These are maintained in the object, so parsing two different
1643input streams simultaneously is possible.
1644
1645
1646=head1 WARNINGS
1647
1648Only handles C<Hqx7> files, as per RFC-1741.
1649
1650Remember that Macintosh text files use C<"\r"> as end-of-line:
1651this means that if you want a textual file to look normal on
1652a non-Mac system, you probably want to do this to the data:
1653
1654    # Get the data, and output it according to normal conventions:
1655    foreach ($HQX->read_data) { s/\r/\n/g; print }
1656
1657
1658=head1 AUTHOR AND CREDITS
1659
1660Maintained by Stephen Nelson <stephenenelson@mac.com>
1661
1662Written by Eryq, F<http://www.enteract.com/~eryq> / F<eryq@enteract.com>
1663
1664Support for native-Mac conversion, I<plus> invaluable contributions in
1665Alpha Testing, I<plus> a few patches, I<plus> the baseline binhex/debinhex
1666programs, were provided by Paul J. Schinder (NASA/GSFC).
1667
1668Ken Lunde (Adobe) suggested incorporating the CAP file representation.
1669
1670
1671=head1 LICENSE
1672
1673Copyright (c) 1997 by Eryq.  All rights reserved.  This program is free
1674software; you can redistribute it and/or modify it under the same terms as
1675Perl itself.
1676
1677This software comes with B<NO WARRANTY> of any kind.
1678See the COPYING file in the distribution for details.
1679
1680=cut
1681
16821;
1683
1684__END__
1685
1686my $HQX = new Convert::BinHex
1687    version => 0,
1688    filename=>"s.gif",
1689    type    => "GIF8",
1690    creator => "PCBH",
1691    flags => 0xFFFF
1692    ;
1693
1694$HQX->data(Path=>"/home/eryq/s.gif");
1695$HQX->resource(Path=>"/etc/issue");
1696
1697#$HQX->data(Data=>"123456789");
1698#$HQX->resource(Data=>'');
1699
1700$HQX->encode(\*STDOUT);
1701
17021;
1703
1704
1705
1706
1707
1708
1709
1710
1711
1712