1=pod
2
3=head1 NAME
4
5Pod::Simple::XHTML -- format Pod as validating XHTML
6
7=head1 SYNOPSIS
8
9  use Pod::Simple::XHTML;
10
11  my $parser = Pod::Simple::XHTML->new();
12
13  ...
14
15  $parser->parse_file('path/to/file.pod');
16
17=head1 DESCRIPTION
18
19This class is a formatter that takes Pod and renders it as XHTML
20validating HTML.
21
22This is a subclass of L<Pod::Simple::Methody> and inherits all its
23methods. The implementation is entirely different than
24L<Pod::Simple::HTML>, but it largely preserves the same interface.
25
26=head2 Minimal code
27
28  use Pod::Simple::XHTML;
29  my $psx = Pod::Simple::XHTML->new;
30  $psx->output_string(\my $html);
31  $psx->parse_file('path/to/Module/Name.pm');
32  open my $out, '>', 'out.html' or die "Cannot open 'out.html': $!\n";
33  print $out $html;
34
35You can also control the character encoding and entities. For example, if
36you're sure that the POD is properly encoded (using the C<=encoding> command),
37you can prevent high-bit characters from being encoded as HTML entities and
38declare the output character set as UTF-8 before parsing, like so:
39
40  $psx->html_charset('UTF-8');
41  $psx->html_encode_chars(q{&<>'"});
42
43=cut
44
45package Pod::Simple::XHTML;
46use strict;
47use vars qw( $VERSION @ISA $HAS_HTML_ENTITIES );
48$VERSION = '3.43';
49use Pod::Simple::Methody ();
50@ISA = ('Pod::Simple::Methody');
51
52BEGIN {
53  $HAS_HTML_ENTITIES = eval "require HTML::Entities; 1";
54}
55
56my %entities = (
57  q{>} => 'gt',
58  q{<} => 'lt',
59  q{'} => '#39',
60  q{"} => 'quot',
61  q{&} => 'amp',
62);
63
64sub encode_entities {
65  my $self = shift;
66  my $ents = $self->html_encode_chars;
67  return HTML::Entities::encode_entities( $_[0], $ents ) if $HAS_HTML_ENTITIES;
68  if (defined $ents) {
69      $ents =~ s,(?<!\\)([]/]),\\$1,g;
70      $ents =~ s,(?<!\\)\\\z,\\\\,;
71  } else {
72      $ents = join '', keys %entities;
73  }
74  my $str = $_[0];
75  $str =~ s/([$ents])/'&' . ($entities{$1} || sprintf '#x%X', ord $1) . ';'/ge;
76  return $str;
77}
78
79#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80
81=head1 METHODS
82
83Pod::Simple::XHTML offers a number of methods that modify the format of
84the HTML output. Call these after creating the parser object, but before
85the call to C<parse_file>:
86
87  my $parser = Pod::PseudoPod::HTML->new();
88  $parser->set_optional_param("value");
89  $parser->parse_file($file);
90
91=head2 perldoc_url_prefix
92
93In turning L<Foo::Bar> into http://whatever/Foo%3a%3aBar, what
94to put before the "Foo%3a%3aBar". The default value is
95"https://metacpan.org/pod/".
96
97=head2 perldoc_url_postfix
98
99What to put after "Foo%3a%3aBar" in the URL. This option is not set by
100default.
101
102=head2 man_url_prefix
103
104In turning C<< L<crontab(5)> >> into http://whatever/man/1/crontab, what
105to put before the "1/crontab". The default value is
106"http://man.he.net/man".
107
108=head2 man_url_postfix
109
110What to put after "1/crontab" in the URL. This option is not set by default.
111
112=head2 title_prefix, title_postfix
113
114What to put before and after the title in the head. The values should
115already be &-escaped.
116
117=head2 html_css
118
119  $parser->html_css('path/to/style.css');
120
121The URL or relative path of a CSS file to include. This option is not
122set by default.
123
124=head2 html_javascript
125
126The URL or relative path of a JavaScript file to pull in. This option is
127not set by default.
128
129=head2 html_doctype
130
131A document type tag for the file. This option is not set by default.
132
133=head2 html_charset
134
135The character set to declare in the Content-Type meta tag created by default
136for C<html_header_tags>. Note that this option will be ignored if the value of
137C<html_header_tags> is changed. Defaults to "ISO-8859-1".
138
139=head2 html_header_tags
140
141Additional arbitrary HTML tags for the header of the document. The
142default value is just a content type header tag:
143
144  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
145
146Add additional meta tags here, or blocks of inline CSS or JavaScript
147(wrapped in the appropriate tags).
148
149=head3 html_encode_chars
150
151A string containing all characters that should be encoded as HTML entities,
152specified using the regular expression character class syntax (what you find
153within brackets in regular expressions). This value will be passed as the
154second argument to the C<encode_entities> function of L<HTML::Entities>. If
155L<HTML::Entities> is not installed, then any characters other than C<&<>"'>
156will be encoded numerically.
157
158=head2 html_h_level
159
160This is the level of HTML "Hn" element to which a Pod "head1" corresponds.  For
161example, if C<html_h_level> is set to 2, a head1 will produce an H2, a head2
162will produce an H3, and so on.
163
164=head2 default_title
165
166Set a default title for the page if no title can be determined from the
167content. The value of this string should already be &-escaped.
168
169=head2 force_title
170
171Force a title for the page (don't try to determine it from the content).
172The value of this string should already be &-escaped.
173
174=head2 html_header, html_footer
175
176Set the HTML output at the beginning and end of each file. The default
177header includes a title, a doctype tag (if C<html_doctype> is set), a
178content tag (customized by C<html_header_tags>), a tag for a CSS file
179(if C<html_css> is set), and a tag for a Javascript file (if
180C<html_javascript> is set). The default footer simply closes the C<html>
181and C<body> tags.
182
183The options listed above customize parts of the default header, but
184setting C<html_header> or C<html_footer> completely overrides the
185built-in header or footer. These may be useful if you want to use
186template tags instead of literal HTML headers and footers or are
187integrating converted POD pages in a larger website.
188
189If you want no headers or footers output in the HTML, set these options
190to the empty string.
191
192=head2 index
193
194Whether to add a table-of-contents at the top of each page (called an
195index for the sake of tradition).
196
197=head2 anchor_items
198
199Whether to anchor every definition C<=item> directive. This needs to be
200enabled if you want to be able to link to specific C<=item> directives, which
201are output as C<< <dt> >> elements. Disabled by default.
202
203=head2 backlink
204
205Whether to turn every =head1 directive into a link pointing to the top
206of the page (specifically, the opening body tag).
207
208=cut
209
210__PACKAGE__->_accessorize(
211 'perldoc_url_prefix',
212 'perldoc_url_postfix',
213 'man_url_prefix',
214 'man_url_postfix',
215 'title_prefix',  'title_postfix',
216 'html_css',
217 'html_javascript',
218 'html_doctype',
219 'html_charset',
220 'html_encode_chars',
221 'html_h_level',
222 'title', # Used internally for the title extracted from the content
223 'default_title',
224 'force_title',
225 'html_header',
226 'html_footer',
227 'index',
228 'anchor_items',
229 'backlink',
230 'batch_mode', # whether we're in batch mode
231 'batch_mode_current_level',
232    # When in batch mode, how deep the current module is: 1 for "LWP",
233    #  2 for "LWP::Procotol", 3 for "LWP::Protocol::GHTTP", etc
234);
235
236#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
237
238=head1 SUBCLASSING
239
240If the standard options aren't enough, you may want to subclass
241Pod::Simple::XHMTL. These are the most likely candidates for methods
242you'll want to override when subclassing.
243
244=cut
245
246sub new {
247  my $self = shift;
248  my $new = $self->SUPER::new(@_);
249  $new->{'output_fh'} ||= *STDOUT{IO};
250  $new->perldoc_url_prefix('https://metacpan.org/pod/');
251  $new->man_url_prefix('http://man.he.net/man');
252  $new->html_charset('ISO-8859-1');
253  $new->nix_X_codes(1);
254  $new->{'scratch'} = '';
255  $new->{'to_index'} = [];
256  $new->{'output'} = [];
257  $new->{'saved'} = [];
258  $new->{'ids'} = { '_podtop_' => 1 }; # used in <body>
259  $new->{'in_li'} = [];
260
261  $new->{'__region_targets'}  = [];
262  $new->{'__literal_targets'} = {};
263  $new->accept_targets_as_html( 'html', 'HTML' );
264
265  return $new;
266}
267
268sub html_header_tags {
269    my $self = shift;
270    return $self->{html_header_tags} = shift if @_;
271    return $self->{html_header_tags}
272        ||= '<meta http-equiv="Content-Type" content="text/html; charset='
273            . $self->html_charset . '" />';
274}
275
276#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
277
278=head2 handle_text
279
280This method handles the body of text within any element: it's the body
281of a paragraph, or everything between a "=begin" tag and the
282corresponding "=end" tag, or the text within an L entity, etc. You would
283want to override this if you are adding a custom element type that does
284more than just display formatted text. Perhaps adding a way to generate
285HTML tables from an extended version of POD.
286
287So, let's say you want to add a custom element called 'foo'. In your
288subclass's C<new> method, after calling C<SUPER::new> you'd call:
289
290  $new->accept_targets_as_text( 'foo' );
291
292Then override the C<start_for> method in the subclass to check for when
293"$flags->{'target'}" is equal to 'foo' and set a flag that marks that
294you're in a foo block (maybe "$self->{'in_foo'} = 1"). Then override the
295C<handle_text> method to check for the flag, and pass $text to your
296custom subroutine to construct the HTML output for 'foo' elements,
297something like:
298
299  sub handle_text {
300      my ($self, $text) = @_;
301      if ($self->{'in_foo'}) {
302          $self->{'scratch'} .= build_foo_html($text);
303          return;
304      }
305      $self->SUPER::handle_text($text);
306  }
307
308=head2 handle_code
309
310This method handles the body of text that is marked up to be code.
311You might for instance override this to plug in a syntax highlighter.
312The base implementation just escapes the text.
313
314The callback methods C<start_code> and C<end_code> emits the C<code> tags
315before and after C<handle_code> is invoked, so you might want to override these
316together with C<handle_code> if this wrapping isn't suitable.
317
318Note that the code might be broken into multiple segments if there are
319nested formatting codes inside a C<< CE<lt>...> >> sequence.  In between the
320calls to C<handle_code> other markup tags might have been emitted in that
321case.  The same is true for verbatim sections if the C<codes_in_verbatim>
322option is turned on.
323
324=head2 accept_targets_as_html
325
326This method behaves like C<accept_targets_as_text>, but also marks the region
327as one whose content should be emitted literally, without HTML entity escaping
328or wrapping in a C<div> element.
329
330=cut
331
332sub __in_literal_xhtml_region {
333    return unless @{ $_[0]{__region_targets} };
334    my $target = $_[0]{__region_targets}[-1];
335    return $_[0]{__literal_targets}{ $target };
336}
337
338sub accept_targets_as_html {
339    my ($self, @targets) = @_;
340    $self->accept_targets(@targets);
341    $self->{__literal_targets}{$_} = 1 for @targets;
342}
343
344sub handle_text {
345    # escape special characters in HTML (<, >, &, etc)
346    my $text = $_[0]->__in_literal_xhtml_region
347        ? $_[1]
348        : $_[0]->encode_entities( $_[1] );
349
350    if ($_[0]{'in_code'} && @{$_[0]{'in_code'}}) {
351        # Intentionally use the raw text in $_[1], even if we're not in a
352        # literal xhtml region, since handle_code calls encode_entities.
353        $_[0]->handle_code( $_[1], $_[0]{'in_code'}[-1] );
354    } else {
355        if ($_[0]->{in_for}) {
356            my $newlines = $_[0]->__in_literal_xhtml_region ? "\n\n" : '';
357            if ($_[0]->{started_for}) {
358                if ($text =~ /\S/) {
359                    delete $_[0]->{started_for};
360                    $_[0]{'scratch'} .= $text . $newlines;
361                }
362                # Otherwise, append nothing until we have something to append.
363            } else {
364                # The parser sometimes preserves newlines and sometimes doesn't!
365                $text =~ s/\n\z//;
366                $_[0]{'scratch'} .= $text . $newlines;
367            }
368        } else {
369            # Just plain text.
370            $_[0]{'scratch'} .= $text;
371        }
372    }
373
374    $_[0]{htext} .= $text if $_[0]{'in_head'};
375}
376
377sub start_code {
378    $_[0]{'scratch'} .= '<code>';
379}
380
381sub end_code {
382    $_[0]{'scratch'} .= '</code>';
383}
384
385sub handle_code {
386    $_[0]{'scratch'} .= $_[0]->encode_entities( $_[1] );
387}
388
389sub start_Para {
390    $_[0]{'scratch'} .= '<p>';
391}
392
393sub start_Verbatim {
394    $_[0]{'scratch'} = '<pre>';
395    push(@{$_[0]{'in_code'}}, 'Verbatim');
396    $_[0]->start_code($_[0]{'in_code'}[-1]);
397}
398
399sub start_head1 {  $_[0]{'in_head'} = 1; $_[0]{htext} = ''; }
400sub start_head2 {  $_[0]{'in_head'} = 2; $_[0]{htext} = ''; }
401sub start_head3 {  $_[0]{'in_head'} = 3; $_[0]{htext} = ''; }
402sub start_head4 {  $_[0]{'in_head'} = 4; $_[0]{htext} = ''; }
403sub start_head5 {  $_[0]{'in_head'} = 5; $_[0]{htext} = ''; }
404sub start_head6 {  $_[0]{'in_head'} = 6; $_[0]{htext} = ''; }
405
406sub start_item_number {
407    $_[0]{'scratch'} = "</li>\n" if ($_[0]{'in_li'}->[-1] && pop @{$_[0]{'in_li'}});
408    $_[0]{'scratch'} .= '<li><p>';
409    push @{$_[0]{'in_li'}}, 1;
410}
411
412sub start_item_bullet {
413    $_[0]{'scratch'} = "</li>\n" if ($_[0]{'in_li'}->[-1] && pop @{$_[0]{'in_li'}});
414    $_[0]{'scratch'} .= '<li><p>';
415    push @{$_[0]{'in_li'}}, 1;
416}
417
418sub start_item_text   {
419    # see end_item_text
420}
421
422sub start_over_bullet { $_[0]{'scratch'} = '<ul>'; push @{$_[0]{'in_li'}}, 0; $_[0]->emit }
423sub start_over_block  { $_[0]{'scratch'} = '<ul>'; $_[0]->emit }
424sub start_over_number { $_[0]{'scratch'} = '<ol>'; push @{$_[0]{'in_li'}}, 0; $_[0]->emit }
425sub start_over_text   {
426    $_[0]{'scratch'} = '<dl>';
427    $_[0]{'dl_level'}++;
428    $_[0]{'in_dd'} ||= [];
429    $_[0]->emit
430}
431
432sub end_over_block  { $_[0]{'scratch'} .= '</ul>'; $_[0]->emit }
433
434sub end_over_number   {
435    $_[0]{'scratch'} = "</li>\n" if ( pop @{$_[0]{'in_li'}} );
436    $_[0]{'scratch'} .= '</ol>';
437    pop @{$_[0]{'in_li'}};
438    $_[0]->emit;
439}
440
441sub end_over_bullet   {
442    $_[0]{'scratch'} = "</li>\n" if ( pop @{$_[0]{'in_li'}} );
443    $_[0]{'scratch'} .= '</ul>';
444    pop @{$_[0]{'in_li'}};
445    $_[0]->emit;
446}
447
448sub end_over_text   {
449    if ($_[0]{'in_dd'}[ $_[0]{'dl_level'} ]) {
450        $_[0]{'scratch'} = "</dd>\n";
451        $_[0]{'in_dd'}[ $_[0]{'dl_level'} ] = 0;
452    }
453    $_[0]{'scratch'} .= '</dl>';
454    $_[0]{'dl_level'}--;
455    $_[0]->emit;
456}
457
458# . . . . . Now the actual formatters:
459
460sub end_Para     { $_[0]{'scratch'} .= '</p>'; $_[0]->emit }
461sub end_Verbatim {
462    $_[0]->end_code(pop(@{$_[0]->{'in_code'}}));
463    $_[0]{'scratch'} .= '</pre>';
464    $_[0]->emit;
465}
466
467sub _end_head {
468    my $h = delete $_[0]{in_head};
469
470    my $add = $_[0]->html_h_level;
471    $add = 1 unless defined $add;
472    $h += $add - 1;
473
474    my $id = $_[0]->idify($_[0]{htext});
475    my $text = $_[0]{scratch};
476    $_[0]{'scratch'} = $_[0]->backlink && ($h - $add == 0)
477                         # backlinks enabled && =head1
478                         ? qq{<a href="#_podtop_"><h$h id="$id">$text</h$h></a>}
479                         : qq{<h$h id="$id">$text</h$h>};
480    $_[0]->emit;
481    push @{ $_[0]{'to_index'} }, [$h, $id, delete $_[0]{'htext'}];
482}
483
484sub end_head1       { shift->_end_head(@_); }
485sub end_head2       { shift->_end_head(@_); }
486sub end_head3       { shift->_end_head(@_); }
487sub end_head4       { shift->_end_head(@_); }
488sub end_head5       { shift->_end_head(@_); }
489sub end_head6       { shift->_end_head(@_); }
490
491sub end_item_bullet { $_[0]{'scratch'} .= '</p>'; $_[0]->emit }
492sub end_item_number { $_[0]{'scratch'} .= '</p>'; $_[0]->emit }
493
494sub end_item_text   {
495    # idify and anchor =item content if wanted
496    my $dt_id = $_[0]{'anchor_items'}
497                 ? ' id="'. $_[0]->idify($_[0]{'scratch'}) .'"'
498                 : '';
499
500    # reset scratch
501    my $text = $_[0]{scratch};
502    $_[0]{'scratch'} = '';
503
504    if ($_[0]{'in_dd'}[ $_[0]{'dl_level'} ]) {
505        $_[0]{'scratch'} = "</dd>\n";
506        $_[0]{'in_dd'}[ $_[0]{'dl_level'} ] = 0;
507    }
508
509    $_[0]{'scratch'} .= qq{<dt$dt_id>$text</dt>\n<dd>};
510    $_[0]{'in_dd'}[ $_[0]{'dl_level'} ] = 1;
511    $_[0]->emit;
512}
513
514# This handles =begin and =for blocks of all kinds.
515sub start_for {
516  my ($self, $flags) = @_;
517
518  push @{ $self->{__region_targets} }, $flags->{target_matching};
519  $self->{started_for} = 1;
520  $self->{in_for} = 1;
521
522  unless ($self->__in_literal_xhtml_region) {
523    $self->{scratch} .= '<div';
524    $self->{scratch} .= qq( class="$flags->{target}") if $flags->{target};
525    $self->{scratch} .= ">\n\n";
526  }
527}
528
529sub end_for {
530  my ($self) = @_;
531  delete $self->{started_for};
532  delete $self->{in_for};
533
534  if ($self->__in_literal_xhtml_region) {
535    # Remove trailine newlines.
536    $self->{'scratch'} =~ s/\s+\z//s;
537  } else {
538    $self->{'scratch'} .= '</div>';
539  }
540
541  pop @{ $self->{__region_targets} };
542  $self->emit;
543}
544
545sub start_Document {
546  my ($self) = @_;
547  if (defined $self->html_header) {
548    $self->{'scratch'} .= $self->html_header;
549    $self->emit unless $self->html_header eq "";
550  } else {
551    my ($doctype, $title, $metatags, $bodyid);
552    $doctype = $self->html_doctype || '';
553    $title = $self->force_title || $self->title || $self->default_title || '';
554    $metatags = $self->html_header_tags || '';
555    if (my $css = $self->html_css) {
556        if ($css !~ /<link/) {
557            # this is required to be compatible with Pod::Simple::BatchHTML
558            $metatags .= '<link rel="stylesheet" href="'
559                . $self->encode_entities($css) . '" type="text/css" />';
560        } else {
561            $metatags .= $css;
562        }
563    }
564    if ($self->html_javascript) {
565      $metatags .= qq{\n<script type="text/javascript" src="} .
566                    $self->html_javascript . '"></script>';
567    }
568    $bodyid = $self->backlink ? ' id="_podtop_"' : '';
569    $self->{'scratch'} .= <<"HTML";
570$doctype
571<html>
572<head>
573<title>$title</title>
574$metatags
575</head>
576<body$bodyid>
577HTML
578    $self->emit;
579  }
580}
581
582sub end_Document   {
583  my ($self) = @_;
584  my $to_index = $self->{'to_index'};
585  if ($self->index && @{ $to_index } ) {
586      my @out;
587      my $level  = 0;
588      my $indent = -1;
589      my $space  = '';
590      my $id     = ' id="index"';
591
592      for my $h (@{ $to_index }, [0]) {
593          my $target_level = $h->[0];
594          # Get to target_level by opening or closing ULs
595          if ($level == $target_level) {
596              $out[-1] .= '</li>';
597          } elsif ($level > $target_level) {
598              $out[-1] .= '</li>' if $out[-1] =~ /^\s+<li>/;
599              while ($level > $target_level) {
600                  --$level;
601                  push @out, ('  ' x --$indent) . '</li>' if @out && $out[-1] =~ m{^\s+<\/ul};
602                  push @out, ('  ' x --$indent) . '</ul>';
603              }
604              push @out, ('  ' x --$indent) . '</li>' if $level;
605          } else {
606              while ($level < $target_level) {
607                  ++$level;
608                  push @out, ('  ' x ++$indent) . '<li>' if @out && $out[-1]=~ /^\s*<ul/;
609                  push @out, ('  ' x ++$indent) . "<ul$id>";
610                  $id = '';
611              }
612              ++$indent;
613          }
614
615          next unless $level;
616          $space = '  '  x $indent;
617          push @out, sprintf '%s<li><a href="#%s">%s</a>',
618              $space, $h->[1], $h->[2];
619      }
620      # Splice the index in between the HTML headers and the first element.
621      my $offset = defined $self->html_header ? $self->html_header eq '' ? 0 : 1 : 1;
622      splice @{ $self->{'output'} }, $offset, 0, join "\n", @out;
623  }
624
625  if (defined $self->html_footer) {
626    $self->{'scratch'} .= $self->html_footer;
627    $self->emit unless $self->html_footer eq "";
628  } else {
629    $self->{'scratch'} .= "</body>\n</html>";
630    $self->emit;
631  }
632
633  if ($self->index) {
634      print {$self->{'output_fh'}} join ("\n\n", @{ $self->{'output'} }), "\n\n";
635      @{$self->{'output'}} = ();
636  }
637
638}
639
640# Handling code tags
641sub start_B { $_[0]{'scratch'} .= '<b>' }
642sub end_B   { $_[0]{'scratch'} .= '</b>' }
643
644sub start_C { push(@{$_[0]{'in_code'}}, 'C'); $_[0]->start_code($_[0]{'in_code'}[-1]); }
645sub end_C   { $_[0]->end_code(pop(@{$_[0]{'in_code'}})); }
646
647sub start_F { $_[0]{'scratch'} .= '<i>' }
648sub end_F   { $_[0]{'scratch'} .= '</i>' }
649
650sub start_I { $_[0]{'scratch'} .= '<i>' }
651sub end_I   { $_[0]{'scratch'} .= '</i>' }
652
653sub start_L {
654  my ($self, $flags) = @_;
655    my ($type, $to, $section) = @{$flags}{'type', 'to', 'section'};
656    my $url = $self->encode_entities(
657        $type eq 'url' ? $to
658            : $type eq 'pod' ? $self->resolve_pod_page_link($to, $section)
659            : $type eq 'man' ? $self->resolve_man_page_link($to, $section)
660            :                  undef
661    );
662
663    # If it's an unknown type, use an attribute-less <a> like HTML.pm.
664    $self->{'scratch'} .= '<a' . ($url ? ' href="'. $url . '">' : '>');
665}
666
667sub end_L   { $_[0]{'scratch'} .= '</a>' }
668
669sub start_S { $_[0]{'scratch'} .= '<span style="white-space: nowrap;">' }
670sub end_S   { $_[0]{'scratch'} .= '</span>' }
671
672sub emit {
673  my($self) = @_;
674  if ($self->index) {
675      push @{ $self->{'output'} }, $self->{'scratch'};
676  } else {
677      print {$self->{'output_fh'}} $self->{'scratch'}, "\n\n";
678  }
679  $self->{'scratch'} = '';
680  return;
681}
682
683=head2 resolve_pod_page_link
684
685  my $url = $pod->resolve_pod_page_link('Net::Ping', 'INSTALL');
686  my $url = $pod->resolve_pod_page_link('perlpodspec');
687  my $url = $pod->resolve_pod_page_link(undef, 'SYNOPSIS');
688
689Resolves a POD link target (typically a module or POD file name) and section
690name to a URL. The resulting link will be returned for the above examples as:
691
692  https://metacpan.org/pod/Net::Ping#INSTALL
693  https://metacpan.org/pod/perlpodspec
694  #SYNOPSIS
695
696Note that when there is only a section argument the URL will simply be a link
697to a section in the current document.
698
699=cut
700
701sub resolve_pod_page_link {
702    my ($self, $to, $section) = @_;
703    return undef unless defined $to || defined $section;
704    if (defined $section) {
705        $section = '#' . $self->idify($self->encode_entities($section), 1);
706        return $section unless defined $to;
707    } else {
708        $section = ''
709    }
710
711    return ($self->perldoc_url_prefix || '')
712        . $self->encode_entities($to) . $section
713        . ($self->perldoc_url_postfix || '');
714}
715
716=head2 resolve_man_page_link
717
718  my $url = $pod->resolve_man_page_link('crontab(5)', 'EXAMPLE CRON FILE');
719  my $url = $pod->resolve_man_page_link('crontab');
720
721Resolves a man page link target and numeric section to a URL. The resulting
722link will be returned for the above examples as:
723
724    http://man.he.net/man5/crontab
725    http://man.he.net/man1/crontab
726
727Note that the first argument is required. The section number will be parsed
728from it, and if it's missing will default to 1. The second argument is
729currently ignored, as L<man.he.net|http://man.he.net> does not currently
730include linkable IDs or anchor names in its pages. Subclass to link to a
731different man page HTTP server.
732
733=cut
734
735sub resolve_man_page_link {
736    my ($self, $to, $section) = @_;
737    return undef unless defined $to;
738    my ($page, $part) = $to =~ /^([^(]+)(?:[(](\d+)[)])?$/;
739    return undef unless $page;
740    return ($self->man_url_prefix || '')
741        . ($part || 1) . "/" . $self->encode_entities($page)
742        . ($self->man_url_postfix || '');
743
744}
745
746=head2 idify
747
748  my $id   = $pod->idify($text);
749  my $hash = $pod->idify($text, 1);
750
751This method turns an arbitrary string into a valid XHTML ID attribute value.
752The rules enforced, following
753L<http://webdesign.about.com/od/htmltags/a/aa031707.htm>, are:
754
755=over
756
757=item *
758
759The id must start with a letter (a-z or A-Z)
760
761=item *
762
763All subsequent characters can be letters, numbers (0-9), hyphens (-),
764underscores (_), colons (:), and periods (.).
765
766=item *
767
768The final character can't be a hyphen, colon, or period. URLs ending with these
769characters, while allowed by XHTML, can be awkward to extract from plain text.
770
771=item *
772
773Each id must be unique within the document.
774
775=back
776
777In addition, the returned value will be unique within the context of the
778Pod::Simple::XHTML object unless a second argument is passed a true value. ID
779attributes should always be unique within a single XHTML document, but pass
780the true value if you are creating not an ID but a URL hash to point to
781an ID (i.e., if you need to put the "#foo" in C<< <a href="#foo">foo</a> >>.
782
783=cut
784
785sub idify {
786    my ($self, $t, $not_unique) = @_;
787    for ($t) {
788        s/<[^>]+>//g;            # Strip HTML.
789        s/&[^;]+;//g;            # Strip entities.
790        s/^\s+//; s/\s+$//;      # Strip white space.
791        s/^([^a-zA-Z]+)$/pod$1/; # Prepend "pod" if no valid chars.
792        s/^[^a-zA-Z]+//;         # First char must be a letter.
793        s/[^-a-zA-Z0-9_:.]+/-/g; # All other chars must be valid.
794        s/[-:.]+$//;             # Strip trailing punctuation.
795    }
796    return $t if $not_unique;
797    my $i = '';
798    $i++ while $self->{ids}{"$t$i"}++;
799    return "$t$i";
800}
801
802=head2 batch_mode_page_object_init
803
804  $pod->batch_mode_page_object_init($batchconvobj, $module, $infile, $outfile, $depth);
805
806Called by L<Pod::Simple::HTMLBatch> so that the class has a chance to
807initialize the converter. Internally it sets the C<batch_mode> property to
808true and sets C<batch_mode_current_level()>, but Pod::Simple::XHTML does not
809currently use those features. Subclasses might, though.
810
811=cut
812
813sub batch_mode_page_object_init {
814  my ($self, $batchconvobj, $module, $infile, $outfile, $depth) = @_;
815  $self->batch_mode(1);
816  $self->batch_mode_current_level($depth);
817  return $self;
818}
819
820sub html_header_after_title {
821}
822
823
8241;
825
826__END__
827
828=head1 SEE ALSO
829
830L<Pod::Simple>, L<Pod::Simple::Text>, L<Pod::Spell>
831
832=head1 SUPPORT
833
834Questions or discussion about POD and Pod::Simple should be sent to the
835pod-people@perl.org mail list. Send an empty email to
836pod-people-subscribe@perl.org to subscribe.
837
838This module is managed in an open GitHub repository,
839L<https://github.com/perl-pod/pod-simple/>. Feel free to fork and contribute, or
840to clone L<git://github.com/perl-pod/pod-simple.git> and send patches!
841
842Patches against Pod::Simple are welcome. Please send bug reports to
843<bug-pod-simple@rt.cpan.org>.
844
845=head1 COPYRIGHT AND DISCLAIMERS
846
847Copyright (c) 2003-2005 Allison Randal.
848
849This library is free software; you can redistribute it and/or modify it
850under the same terms as Perl itself.
851
852This program is distributed in the hope that it will be useful, but
853without any warranty; without even the implied warranty of
854merchantability or fitness for a particular purpose.
855
856=head1 ACKNOWLEDGEMENTS
857
858Thanks to L<Hurricane Electric|http://he.net/> for permission to use its
859L<Linux man pages online|http://man.he.net/> site for man page links.
860
861Thanks to L<search.cpan.org|http://search.cpan.org/> for permission to use the
862site for Perl module links.
863
864=head1 AUTHOR
865
866Pod::Simpele::XHTML was created by Allison Randal <allison@perl.org>.
867
868Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
869But don't bother him, he's retired.
870
871Pod::Simple is maintained by:
872
873=over
874
875=item * Allison Randal C<allison@perl.org>
876
877=item * Hans Dieter Pearcey C<hdp@cpan.org>
878
879=item * David E. Wheeler C<dwheeler@cpan.org>
880
881=back
882
883=cut
884